Forums > MaxMSP

basic motion tracking questions

October 18, 2012 | 7:50 pm

Can someone explain the difference between using a depth map made with luma displacement and one made with IR light? Where is the advantage?

Is there any difference between the Kinect and the Sony Eye?
To use the Eye camera with Jitter.. do I have to remove the IR blocking filter first?

Any help will be much appreciated!!

edit: sorry, this should be in the Jitter area..

October 19, 2012 | 10:02 am

It’s a large topic…

Thanks to IR light you can work under unpredictable lighting conditions (e.g. if you are working on a performance or an installation and somebody wants to take a photo with a flash, thanks to IR you can be sure it will not interfere with your work). The depth map (in the Kinect) is based on an IR image too, and is useful when you want to measure the distance between objects and the camera, or if you are using some more advanced mocap techniques (e.g. skeleton tracking).

The Kinect is a "ready-made" IR camera (with a built-in IR light). The PS3 Eye you have to "hack" a bit for IR (remove the IR blocker, add a visible-light blocker).

October 19, 2012 | 11:26 am

Ok, I get it. Thanks a lot.

One more thing.. using IR light will give better depth/distance results for the objects than a normal luma-displace technique, is that right?

October 19, 2012 | 11:46 am

The main difference is that the depth map lets you work with 3-dimensional data (a depth map is 3-dimensional "ex definitione"), whereas typical luma-differencing techniques work with "classic" 2D images taken from cameras.
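Outside of Max, the luma-differencing side of this can be sketched in a few lines of Python/NumPy (the array sizes and threshold here are just illustrative assumptions, not from the thread):

```python
import numpy as np

def luma_diff(frame_a, frame_b, threshold=30):
    """Classic 2D luma differencing: compare two greyscale frames
    and mark the pixels that changed. No depth is involved - the
    result is still a flat 2D mask."""
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    return diff > threshold  # boolean motion mask

# Two fake 4x4 greyscale frames: one pixel "moves"
a = np.zeros((4, 4), dtype=np.uint8)
b = a.copy()
b[1, 2] = 200
mask = luma_diff(a, b)
print(int(mask.sum()))  # 1 changed pixel
```

The point is that the output is only "where did pixels change" in 2D; there is no distance information anywhere in it.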

October 19, 2012 | 3:54 pm

I can't resist asking another question.. :)

so a basic technique using IR light would be to unpack the incoming matrix and use the z plane to extrude a mesh, without converting to greyscale..

October 19, 2012 | 6:14 pm

The image from an infrared camera is "flat" (2-dimensional), like from any other kind of camera except a stereoscopic one (e.g. Kinect) – so there are no x, y, z planes there. The depth map from the Kinect is just a 2-dimensional, single-plane matrix – every cell of that matrix holds a value that depends on the distance to the camera (so it is in fact 3-dimensional data, because you can calculate the x and y coords from the cell positions).
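That last sentence is the whole trick, and it can be sketched in Python/NumPy (a toy 2x2 depth matrix, purely for illustration):

```python
import numpy as np

def depth_map_to_points(depth):
    """Turn a single-plane depth matrix into (x, y, z) triples:
    x and y come from the cell position, z from the cell value.
    This is why a 'flat' depth map is effectively 3D data."""
    rows, cols = depth.shape
    ys, xs = np.mgrid[0:rows, 0:cols]       # cell coordinates
    return np.dstack([xs, ys, depth])       # shape (rows, cols, 3)

depth = np.array([[10, 20],
                  [30, 40]])                # fake depth values
pts = depth_map_to_points(depth)
print(pts[1, 0])  # cell at row 1, col 0 -> x=0, y=1, z=30
```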

For typical motion capture in Max you can use the cv.jit library by Jean-Marc Pelletier – it's a great tool for 2D motion tracking. The example patches that come with cv.jit should be very instructive for you. You can use those objects for IR-based tracking or to work with images taken from any other kind of camera. You can even use those objects to process the depth map from the Kinect, but it's not the best way to work with it.

I can also dig up some of my own patches for motion tracking, but not today… I think I can do it tomorrow. Maybe my patches will be a good "kickstart" ;-). But anyway, you should just start with your own experiments. It's not really complicated.

October 20, 2012 | 9:05 am

Ok, that makes sense. Thanks a lot for your time Yaniki. I would love to see one of your examples, especially an example of how to set up the depth map with a Kinect or another device like the Sony Eye.

Have a nice weekend!

October 20, 2012 | 10:52 am


This is an entry-level motion tracking patch using only MaxMSP/Jitter built-in objects (no additional libraries, externals, etc.) – it should work with any type of camera, even with the Kinect (but in that case you have to replace "jit.grab" with e.g. "jit.freenect" or another external that receives data from this device [you may want to check other threads on the forum for more info on this topic]).

My patch demonstrates the typical structure of a motion tracking process: from image filtering and background subtraction to converting the image into "numerical data". Actually, every camera-based motion tracking system is a variation of this model. The "jit.bounds" object (which is the most important object in the patch) works very stably, but for more features (especially detecting multiple objects at the same time) you have to use (as I mentioned in a previous post) some additional tools, and I strongly recommend the cv.jit library by Jean-Marc Pelletier.

  1. moCap1.maxpat
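The pipeline the patch implements (background subtraction, threshold, then a bounding box like jit.bounds reports) can be sketched outside Max in Python/NumPy; the frame sizes and threshold below are illustrative assumptions:

```python
import numpy as np

def track(background, frame, threshold=25):
    """Minimal version of the patch's pipeline: background
    subtraction -> threshold -> bounding box of the moving blob
    (roughly what jit.bounds reports in Jitter)."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    mask = diff > threshold
    if not mask.any():
        return None                      # nothing moved
    ys, xs = np.nonzero(mask)
    # left, top, right, bottom of the changed region
    return (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))

bg = np.zeros((6, 6), dtype=np.uint8)    # static background
frame = bg.copy()
frame[2:4, 1:3] = 255                    # a 2x2 "object" appears
print(track(bg, frame))  # (1, 2, 2, 3)
```

With only one blob this single bounding box is all you get, which is exactly the limitation mentioned above: for multiple simultaneous objects you need connected-component labelling, e.g. from cv.jit.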

October 20, 2012 | 10:55 am

> Have a nice weekend!

Ach… not this time. Work, work, work… But thanks, anyway ;-)

October 20, 2012 | 2:52 pm

Another variant of the basic motion tracking structure is in the attached patch – I think this one will be better. Have fun ;-)

  1. moCap2.maxpat

October 21, 2012 | 11:03 am

Hey Yaniki, thanks so much for that, very cool stuff.. but I'm a bit confused. My goal is to use the Eye camera and extrude a mesh with the depth map, in the hope of getting better depth results than with the luma-displacement technique. In your examples you are also using ayuv2luma.. what for? And what, finally, would be the depth map, and how can I connect this map to gl.mesh to extrude on the z plane?

Thanks for your time and patience!

October 21, 2012 | 12:23 pm

For a depth map you need an OpenNI device (e.g. Kinect) – and the PS3 Eye is not an OpenNI device. It's just a nice USB camera ;-).

If you need to convert the depth map from the Kinect into a mesh, you just need a Kinect and a few objects in Max. It's simple. I made some installations using this feature, and I can post you a Max patch next week, but it's not a big deal: just a matrix storing depth values (received from the Kinect via jit.freenect) and some simple calculations (scaling, etc.) before sending it to jit.gl.mesh.
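The "simple calculations (scaling, etc.)" step can be sketched in Python/NumPy; the GL coordinate range and the fake 11-bit depth values are assumptions for illustration, not from the thread:

```python
import numpy as np

def depth_to_mesh_vertices(depth, z_range=1.0):
    """Scale raw depth values into a GL-friendly range and build a
    grid of mesh vertices (x, y in -1..1, z from depth) - the same
    idea as feeding a scaled depth matrix to a GL mesh object."""
    rows, cols = depth.shape
    xs = np.linspace(-1.0, 1.0, cols)
    ys = np.linspace(-1.0, 1.0, rows)
    gx, gy = np.meshgrid(xs, ys)
    z = depth.astype(np.float64) / depth.max() * z_range  # normalise depth
    return np.dstack([gx, gy, z])        # (rows, cols, 3) vertex grid

depth = np.array([[0, 512],
                  [1024, 2047]])         # fake 11-bit Kinect-style values
verts = depth_to_mesh_vertices(depth)
print(verts.shape)  # (2, 2, 3)
```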

October 21, 2012 | 12:24 pm

hmmm… something is wrong with the link to the video, so again:

October 23, 2012 | 5:58 pm

Nice video! I thought the Sony Eye would make a depth map too. So, in terms of depth quality, is it worth getting a Kinect?

October 25, 2012 | 12:05 pm

For "real" depth map you need a stereoscopic device (eg. kinect), not a typical camera. But if you just want to create a 3d mesh from 2d image you can use pixel’s luminance for "z" coords ( – another my video ;-) ). It’s simple, but I can attach the patch if needed.
