Detecting people in a space and using their silhouettes as masks for live video

Carlos Gomez de Llarena

Hello everyone,

I'm working on a patch for a video installation where I need to isolate the shape, motion and colors of anyone walking through the gallery space and overlay this footage on a white background. Imagine something like chroma keying, except that this space has no single flat color to key against.

I will have a ceiling-mounted camera pointing straight down at the floor (like a floor plan view). So far I have made a patch that does this with the following steps:

A. Take a snapshot of the empty space to use as a reference frame.
B. Use jit.op and jit.rgb2luma to calculate the luminance difference between the empty space and the live video.
C. Use jit.op to apply this luminance mask to the live video, so that only the colors of the isolated subject remain.
D. Apply jit.op one last time to swap the black background with a white one.
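
For readers without Max at hand, steps A to D can be sketched in plain Python. This is only an illustration of the idea, not the patch itself: the function names, the threshold of 30 and the Rec. 601 luminance weights are assumptions chosen for the example, and the "frames" are tiny hand-made pixel grids.

```python
# Pixels are (r, g, b) tuples with 0-255 channels; an image is a list of rows.

def luma(pixel):
    """Luminance of one RGB pixel (Rec. 601 weights, assumed for this sketch)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def silhouette_mask(reference, live, threshold=30):
    """Step B: 1 where the live luminance differs from the reference, else 0."""
    return [
        [1 if abs(luma(p) - luma(q)) > threshold else 0
         for p, q in zip(live_row, ref_row)]
        for live_row, ref_row in zip(live, reference)
    ]


def composite_on_white(live, mask):
    """Steps C and D: keep live colors inside the mask, white everywhere else."""
    white = (255, 255, 255)
    return [
        [pixel if keep else white for pixel, keep in zip(row, mask_row)]
        for row, mask_row in zip(live, mask)
    ]


# Step A: a snapshot of the empty space (here a tiny uniform gray floor).
floor = (120, 120, 120)
reference = [[floor] * 4 for _ in range(3)]

# A live frame in which a dark red "person" covers two pixels.
live = [row[:] for row in reference]
live[1][1] = live[1][2] = (150, 20, 20)

mask = silhouette_mask(reference, live)
result = composite_on_white(live, mask)
```

One limit of a luma-only difference is that anything whose brightness matches the floor drops out of the mask even if its color is different. Comparing the three channels separately and taking the largest difference is one tweak that may give more solid shapes.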

I am attaching the files for the patch below. My question is whether you can think of a better way to achieve this effect, or of any tweaks to this one that would improve the output. The silhouette luma masks seem to work well, but I would like to achieve even better colors and more solid shapes within the final colored silhouettes that are composited in step D.

I appreciate any help in advance.

Thanks,

Carlos

Attachment: Space tracker.zip (1.11 MB)

phiol

Hello Carlos,

If you're on Windows, use a Kinect and its depth camera.
The Kinect 2 has HD resolution; the sample here is from a Kinect 1.
You could also use jit.gl.cornerpin to compensate for the camera angle so that the Kinect image and the real image line up correctly.

Of course, use the RGB camera from the same Kinect if you want to show the person.

Dale's dp.kinect2 will be your solution.
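
To show why a depth camera makes the masking easier, here is a rough Python sketch of thresholding a top-down depth image. It does not reproduce the attached sample or the dp.kinect2 API: the function name, the millimetre readings, the 150 mm margin and the convention that a reading of 0 means "no data" are all assumptions made for the illustration.

```python
def depth_mask(depth_mm, floor_mm, margin_mm=150):
    """Return 1 where something is at least margin_mm above the floor, else 0.

    A reading of 0 is treated as "no data" (an assumption) and left out.
    """
    return [
        [1 if 0 < d < floor_mm - margin_mm else 0 for d in row]
        for row in depth_mm
    ]


# Hypothetical readings in millimetres from a sensor looking straight down:
# the floor is about 4200 mm away and a person's head and shoulders are closer.
depth = [
    [4200, 4195, 4210],
    [4205, 2500, 0],
    [4190, 2600, 4200],
]

mask = depth_mask(depth, floor_mm=4200)
```

Because the mask depends on distance rather than brightness, clothing that matches the floor color does not break it.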

anyways, good luck

-----
Hmm. I've tried to upload the .zip file twice and the forum won't let me.
Here is the link:
https://www.dropbox.com/s/ch9s5knhkoqzaai/kinect_mask.zip?dl=0

Carlos Gomez de Llarena

This is pretty clever, Phiol. Someone had mentioned to me the Kinect could help with this. I do have a PC running the patch at the gallery.

The tricky part might be the optics: particularly the angle of view needed by the camera for the space, which I could only achieve by using a Canon EOS 60D DSLR with an 8mm fish-eye lens (see photo below).

I need to test whether the Kinect camera can take in the entire space (44'w x 21'd x 14'h) from the ceiling, and figure out whether I use its built-in RGB values or bring both the Kinect and DSLR feeds into the patch and somehow make them match to pull this off. I was hoping to do everything with the DSLR to keep it simple, but the masking example you shared looks very clean and like it's just what I need.

So to get a Kinect to work with Max, I only need to buy a Kinect, right?

Thanks!

-Carlos

Gallery space photographed with Canon EOS 60D with 8mm fish-eye lens.

phiol

And sadly, you need to purchase dp.kinect2 from Dale.
https://hidale.com/shop/dp-kinect2/

Never understood why C74 never made native externals like these.

hollyhook

If you can go with Windows, you could also try OpenPose, which works on normal 2D images, so there is no need for a Kinect. I just tried it with an image taken from above and it seems to work. I made a Max external which lets you use OpenPose: https://www.patreon.com/posts/openpose-for-max-14183758

Floating Point

You should probably have a look at this page too for a great set of Max/Jitter computer vision objects, if you haven't already:
http://jmpelletier.com/cvjit/

Carlos Gomez de Llarena

I did check out the cv.jit patches, but I was not quite sure which object is the right one to simply get the most solid white silhouette of a person seen by the camera. For some reason I was getting better results with jit.op. But I would think there's also a way to make cv.jit output the white silhouette and black background I need for masking.

As for the OpenPose suggestion, that looks interesting too. I may try a patch with that as well, although I don't really need the skeleton gesture data for this particular project. What I need is a contiguous white silhouette of the people seen from above, as illustrated by the sequence of images below (I need improvements on step B in particular).

The gallery has a polished concrete floor. I'm only interested in the shapes, movement and colors of people in the space.

I can achieve something like this with jit.op, but it doesn't capture the whole person; there's some clipping. Ideally the silhouettes would look more like this.

Using jit.op again, I can use the silhouette mask to remove the (black) background and keep just the people, their movement and their colors over a white background.

If cv.jit or OpenPose can produce a great mask (step B), I would like to know which object or parameter I should be using for that. I think the Kinect would work well in general, but it won't work in this space because of the angle-of-view problem (I need a wide-angle lens for the space). I could figure the rest out.

Thanks!

Floating Point

Maybe have a look at the floodfill and close examples in the cv.jit package; they can help clean up a mask.
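
As a rough idea of what a "close" step does to a mask, here is the textbook morphological closing (dilate, then erode) in plain Python. It is a sketch of the general technique with a 3x3 neighbourhood, not a description of how the cv.jit object is implemented; the function names and the test mask are made up for the example.

```python
def _morph(mask, combine):
    """Apply `combine` (any or all) over each pixel's 3x3 neighbourhood.

    Neighbourhoods are clipped at the image border, so pixels outside the
    image are simply ignored.
    """
    height, width = len(mask), len(mask[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(1 if combine(neighbours) else 0)
        out.append(row)
    return out


def dilate(mask):
    """Grow white regions: a pixel turns on if any neighbour is on."""
    return _morph(mask, any)


def erode(mask):
    """Shrink white regions: a pixel stays on only if all neighbours are on."""
    return _morph(mask, all)


def close_mask(mask):
    """Closing = dilation followed by erosion; fills small holes and gaps."""
    return erode(dilate(mask))


# A 3x3 white blob with a one-pixel hole in the middle of a 7x7 mask.
mask = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        mask[y][x] = 1
mask[3][3] = 0

closed = close_mask(mask)
```

In this example the hole at the centre is filled while the outline of the blob stays the same size, which is the kind of cleanup a patchy difference mask needs.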

Carlos Gomez de Llarena

Thanks for all of the suggestions shared here.

I wanted to show what I was able to do in the gallery space with cv.jit.close and some of the jit.gl masking code that Phiol had shared earlier. It's working quite well for what I need it to do (it's one of the steps in my video processing workflow). However, if you find any tweaks to this patch that make the silhouette detection look even cleaner without a Kinect, I'd be curious to know. I still have a few days to make some tweaks to this.

Just in case the attachment didn't work here, you can download the patch and files from here:
https://www.dropbox.com/s/hkrshxgp9k6ju12/Semifinal.zip?dl=0

-Carlos

Maksym Prykhodko

I'm super late to the party, but @Carlos Gomez De Llarena, is the patch still somewhere out there in the world?