Kinect: better for tracking heads or tracking bodies?
Hi All,
So I have just got a Kinect, and want to use it to track positions of audience members in a room. I don't need limb movements or anything, I just want to detect any movement in the space.
I then want to divide the video into a grid so that different MIDI triggers are sent depending on where audience members are standing.
So, what I'm asking is:
- is there much difference in ease of tracking people between ceiling-mounted and front-on Kinect placement?
- what's the best way to filter the image to detect just the bodies (filtering out static objects like floors/tables)?
- is there a better way to divide a matrix into grids for MIDI purposes, e.g. scissoring a matrix window, then scaling the 0-255 value of each individual grid tile to a 0-127 MIDI out (roughly the idea sketched below)?
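To make that last point concrete, here's roughly what I mean as a Python/numpy sketch rather than a Jitter patch (the 4x4 grid and the brightness-to-velocity scaling are just placeholder choices):

```python
import numpy as np

# Rough sketch: split one video frame into a grid and turn each cell's
# average brightness (0-255) into a MIDI value (0-127).
def frame_to_midi(frame, rows=4, cols=4):
    """frame: 2D numpy array of 8-bit luminance values."""
    h, w = frame.shape
    midi_values = []
    for r in range(rows):
        for c in range(cols):
            cell = frame[r * h // rows:(r + 1) * h // rows,
                         c * w // cols:(c + 1) * w // cols]
            # Scale the 0-255 mean brightness down to the 0-127 MIDI range.
            midi_values.append(int(cell.mean() / 255 * 127))
    return midi_values

# Example: a fake 240x320 frame of random brightness values.
frame = np.random.randint(0, 256, (240, 320), dtype=np.uint8)
print(frame_to_midi(frame))  # 16 values, one per grid cell, each 0-127
```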
Thanks!
Firstly, the Kinect sensor won't cover a whole room of people with its infrared camera; it has a minimum and maximum working distance, and the maximum isn't very far.
Check out the cv.jit objects (http://jmpelletier.com/cvjit/). They work with the Kinect or a standard camera and let you do blob tracking: people in the room are recognised as different from the background, so each gets a blob around them, and the blob's position, size, and x/y data can then be used to control MIDI.
Far easier than using Jitter matrices and Jitter effects.
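If it helps to see the idea outside of Max, here's a minimal sketch of the same blob-tracking approach using OpenCV in Python instead of cv.jit (the camera index, threshold, and minimum blob area are just example values):

```python
import cv2

# Blob-tracking sketch: subtract the static background, threshold the result,
# and report the position/size of each remaining blob.
backsub = cv2.createBackgroundSubtractorMOG2()
cap = cv2.VideoCapture(0)  # 0 = default camera; the Kinect RGB feed appears the same way

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = backsub.apply(frame)                        # moving pixels become white
    mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)[1]
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) < 500:                   # ignore tiny specks of noise
            continue
        x, y, w, h = cv2.boundingRect(c)               # the "blob"
        cx, cy = x + w // 2, y + h // 2                # centre position -> could drive MIDI
        print(cx, cy, w * h)
    cv2.imshow("blobs", mask)
    if cv2.waitKey(1) == 27:                           # press Esc to stop
        break
cap.release()
cv2.destroyAllWindows()
```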
If you put the camera facing forward in the room, then people can interfere with each other's blobs, but you do get a bigger picture. Ceiling-mounted would be best, but without a wide-angle lens it's probably difficult to get the distance you need to cover the whole floor.
Don't forget the Kinect has a great high-quality RGB camera built in, so it's way better than a standard webcam.
The Kinect has three different operation modes:
- RGB camera: same as any camera; use e.g. cv.jit to analyse it
- depth camera: gives you a depth map of the scene in front of the camera, which can be used to subtract floors/walls from the image; what's then left are your users. Again, cv.jit can be useful here (see the sketch after this list).
- user/skeleton tracking: using the Kinect SDK it will tell you where up to 6 people are (centre-of-mass position) and do skeleton tracking of up to 2 people. This is with the Kinect SDK, i.e. the dp.kinect external for Max; not sure if OpenNI/jit.openni has the same limits. It works only with the camera oriented fairly horizontally, i.e. front-on, and has a maximum range of 4 m. The other two modes will work in a ceiling position too and don't have a maximum range, though accuracy degrades with distance.
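As a rough illustration of the depth-camera route (not real Kinect capture code; it assumes you already have the depth map as a 2D array of millimetre values, and the 100 mm threshold is just an example):

```python
import numpy as np

# Depth-based background removal sketch: capture the empty room once as a baseline,
# then flag any pixel that is noticeably nearer than that baseline as part of a person.
def find_people(depth_mm, background_mm, min_diff_mm=100):
    """depth_mm, background_mm: 2D arrays of depth in millimetres (0 = no reading)."""
    depth = depth_mm.astype(np.int32)
    background = background_mm.astype(np.int32)
    valid = (depth > 0) & (background > 0)             # ignore pixels with no depth reading
    return valid & (background - depth > min_diff_mm)  # closer than the empty room -> person

# Example with fake data: a flat background at 3000 mm and a person-sized patch at 1500 mm.
background = np.full((424, 512), 3000, dtype=np.uint16)
frame = background.copy()
frame[100:300, 200:260] = 1500
mask = find_people(frame, background)
print(mask.sum(), "pixels flagged as person")
```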
Btw, dp.kinect can also use the Kinect's microphones to locate sound sources and do speech recognition.