Today I got a Kinect camera and played around. First of all, thanks to Jean Marc and Nesa and all the other contributors for making it possible to work with this great tool in Max.
I found out that in a small room it works just perfect, even in darkness, which is quite impressive. It's very fast in response and framerate.
There is one problem I found: if the room is too big, the distance sensor seems to have problems and the left output of jit.freenect.grab becomes stop-and-go. This makes it difficult to work on stage. As soon as I face the floor, everything is OK. Is there a possibility to solve this problem? Something like: "set all distances to black as soon as there is no proper distance value". Or is this a hardware problem only Microsoft can solve? :-?
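(Side note: outside of Max, the "set all invalid distances to black" idea is straightforward to sketch in NumPy. The array values, the zero-means-no-reading convention, and the 400–4000 mm range below are illustrative assumptions, not Kinect specifications.)

```python
import numpy as np

def mask_invalid_depth(depth, near=400, far=4000):
    """Zero out ("set to black") pixels whose raw depth reading
    falls outside a plausible range. A value of 0 (no reading) or
    anything beyond `far` is treated as invalid."""
    cleaned = depth.copy()
    invalid = (depth < near) | (depth > far)
    cleaned[invalid] = 0
    return cleaned

# Toy 2x3 "depth map" in millimetres; 0 and 9000 stand in for bad readings.
frame = np.array([[0, 1200, 9000],
                  [800, 2500, 3999]])
print(mask_invalid_depth(frame))
```

In a Max patch the same thresholding could be done with jit.op on the depth matrix, but the principle is the same: decide on a valid range and clamp everything outside it to black.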
You raise an important limitation of not only Kinect but all existing camera-based depth-sensing technology.
The Kinect estimates depth by projecting a dense pattern of infrared dots using a laser. (That's the red-shining eye on the left side of the device.)
Through simple perspective, dots that fall on closer objects look farther apart to the camera than those that fall on more distant surfaces. This method, called "structured light", is the one used by the Kinect. Although the output depth map is 640×480 pixels, it actually needs to capture a higher-resolution image to be able to resolve dots that are far away. After a certain distance, though, the IR dots are either too faint or too close together for the camera to tell them apart. This is the limit to how far the Kinect can see. It will also fail to measure distance for surfaces that don't reflect IR well, or that are flooded with IR that is too bright (like sunlight).
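The geometry behind this (shared by structured light and stereo) is simple triangulation: depth is inversely proportional to the observed shift ("disparity") of a dot between the projector and the camera. A tiny sketch, with made-up numbers for the focal length and the projector–camera baseline:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulation: depth = focal length x baseline / disparity.
    All parameter values used below are illustrative, not Kinect specs."""
    return focal_px * baseline_m / disparity_px

# Same hypothetical setup (580 px focal length, 7.5 cm baseline),
# two different dot shifts:
print(depth_from_disparity(580, 0.075, 20.0))  # large shift -> close surface
print(depth_from_disparity(580, 0.075, 5.0))   # small shift -> far surface
```

This also shows why range is limited: far surfaces produce tiny disparities, so a sub-pixel measurement error translates into a depth error of many centimetres or worse.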
It appears that if it fails to measure depth for too many pixels, the Kinect stops outputting depth maps. Even if it did output something, it would likely be unusable.
The other two techniques used to estimate depth are "time of flight" and stereo vision. Time-of-flight cameras emit a brief flash of infrared light and measure the time it takes for the light to bounce back to the image sensor. The few models available cost several thousand dollars, output very low-resolution images, and still only have a practical range of about 8 meters, which is arguably a little better than the Kinect's 6 meters or so.
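The time-of-flight arithmetic makes it clear why these cameras are expensive: distance is half the round trip times the speed of light, so an 8-meter range means resolving timing differences on the order of nanoseconds. A quick back-of-the-envelope check:

```python
C = 299_792_458  # speed of light in m/s

def tof_distance(round_trip_seconds):
    """Time of flight: the light travels out and back, so the
    distance to the surface is half the round trip times c."""
    return C * round_trip_seconds / 2

# A pulse returning after ~53 nanoseconds corresponds to roughly 8 m:
print(tof_distance(53e-9))
```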
The last, and oldest, technique is stereo vision. This works like our eyes: software compares the images from two cameras and tries to determine which points in one image correspond to which points in the other. Don't be fooled by how good we are at doing this; it's a very difficult task. Stereo vision requires fairly intensive processing by the computer. There again, the depth limit is actually very close: points that are far away will appear in essentially the same position in both images. (It's the same for humans; stereo vision is only good for close to mid-range distances, and beyond that we use other cues to figure out distance.) The cheaper stereo cameras still cost a few thousand dollars, and in my experience the results are much poorer than the Kinect's. I found the usable depth range to be much less than 6 meters. Their main advantage is that they can be used in sunlight.
So, in short, no, there is no way to solve your problem. The Kinect is an awesome device for its cost but it has its limitations and distance is one of them.
Hope this helped. (I’m pretty sure that’s not the answer you were wishing for though.)
Thanks a lot for your detailed response. Your answer helps me a lot with planning. I surely wished you had written "No problem, I fixed this last night, here is the update", but having dealt with computers and hardware for many years, I felt this answer was not very likely ;-). But then again, it's still an incredibly big step to have this tool, as well as your entire cv.jit library, in Max.
+1 for that great explanation and for your cv.jit objects!!
def +1, this explanation helped so much
Boom! + 1!!!
Clear explanation again!
One question: how can I read, more or less exactly, the distance (depth) of a nearby person with the Kinect?
A lot of posts (I have read them all, I think) are about getting the Kinect depth, but I have seen no real depth-detection system. Most of the posts about getting the Kinect depth use [jit.3m] or mass detection, but that isn't really the depth.
The mass IS the depth, in a sense: it's a visualization of the depth of a cluster of points in an area. So it's not hard to derive an average depth for a mass, or a specific depth at an X/Y point (or a multitude of points). It's just a matter of the right approach for the right application.
But the question is: how are you identifying what your region of interest is? The Kinect is just a blunt instrument. It doesn't know the difference between a chair and a person. You have to sort the information it gives you into what's useful and what's not.
Sorry to resurrect an old post, but I've been looking into this issue as well because it's come up while troubleshooting small-room installations. One tool I've come across, but have not had a chance to implement, is a separate lens attachment accessory that alters the field of view to a shallow, wide-angle perspective. Curious to know if anyone else has found the tool helpful…
data sheet: http://www.nyko.com/$assets$/5ec66dda-cf3c-46a4-a94d-3bbd35ad1c3a/ZoomSetupGuide.pdf
Hi, this is interesting, I didn't know about it. Let us know how it goes if you do test it out. I'm particularly curious about how it affects the depth range; I guess it's reduced.
I got the Nyko wide lens.
The depth map is vignetted (black corners and borders), so the field of view is only a little bigger.
Plus, the depth map is much more unstable, with many more dead areas.
I suppose the optical components are bad, bad, bad.
So I found it unusable…