which sensor is recommended for depth and body tracking with Max?

R_Gol's icon

Which is better to integrate with MaxMSP: the Leap Motion or the Azure Kinect?

A related question: is the price difference between the Azure Kinect and the Leap Motion justified?

Any other recommendations?

Edit: the Azure Kinect is way out of my budget. I'm now deciding between the Leap Motion and the older Kinect V2 or V1.

I mainly work on macOS, but the sensor should be compatible with both Mac and Windows.

TFL's icon

The Leap Motion is for hand tracking only, not full-body skeleton tracking like the Kinect. But it has the advantage of working on both Mac and Windows via the Ultraleap package (available in the Package Manager).

For the Kinect V1, V2, and Azure, you can get them working quite easily on Windows with the (paid) dp.kinect, dp.kinect2, and dp.kinect3 externals. On Mac, you might be able to get data from the depth sensor of the Kinect V1 and V2 using freenect-based externals, but nothing for skeletons. And I'm not even sure that works at all on Apple Silicon.

Another approach worth considering is Google's Mediapipe Pose Landmarker AI, which allows real-time full-body skeleton tracking using a simple webcam. You can find an example of how to bring this into Max here (both Mac and Windows). But it obviously won't give you a depth map of your scene (as it doesn't rely on a depth sensor), only skeletons (in 3D space, though). I also don't know whether it can track multiple bodies at once.

R_Gol's icon

Another approach worth considering is Google's Mediapipe

Thanks for this suggestion!!

R_Gol's icon

So I looked into the jweb-pose-landmarker.maxpat from the GitHub link you shared.

The incoming data is very jittery. What would be the best method to smooth it?

Max Patch
Copy patch and select New From Clipboard in Max.

R_Gol's icon

Video showing the jittering:

TFL's icon
  • I guess it's because it is a body-detection model: it performs poorly when it can only see a chest and head. If the camera can see you entirely, you might get better results. Good lighting and close-fitting clothes might help too.

  • The Mediapipe models are focused on efficiency rather than precision, as they're meant to run on smartphones. Also, the image actually used to perform the detection is automatically cropped and downsized to 256x256, so don't expect very precise results anyway.

  • The provided example uses outdated resources; more up-to-date versions may give better results. To update, check the external JS scripts linked in jweb-pose-landmark.html and remove the '@0.1' suffix at the end of each URL to get the most up-to-date version of each. Same for the resources linked in jweb-pose-landmark.js: remove the '@0.10.0' at the end of "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.0", as the latest version is 0.10.2. It appears twice in the file (lines 3 and 221)!

  • You can find various ways to smooth the data a bit in Max by searching for "smooth data" on the forum.
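As a starting point for that last bullet, here is a minimal sketch of the simplest common approach: an exponential moving average (a one-pole lowpass) applied to each landmark coordinate. This is illustrative code, not part of the jweb-pose-landmark.js example; the function names and the `{x, y, z}` landmark shape are assumptions. An `alpha` near 1 means light smoothing, near 0 means heavy smoothing but more lag.

```javascript
// Sketch: exponential moving average smoothing for jittery landmark data.
// makeSmoother returns a function you call once per frame with the
// current landmark array; it returns the smoothed array.
function makeSmoother(alpha) {
  let prev = null; // previously smoothed landmarks
  return function smooth(landmarks) {
    // landmarks: array of {x, y, z} points (assumed shape)
    if (prev === null) {
      // first frame: nothing to blend with yet
      prev = landmarks.map(p => ({ ...p }));
      return prev;
    }
    // blend each new point with the previous smoothed point
    prev = landmarks.map((p, i) => ({
      x: alpha * p.x + (1 - alpha) * prev[i].x,
      y: alpha * p.y + (1 - alpha) * prev[i].y,
      z: alpha * p.z + (1 - alpha) * prev[i].z,
    }));
    return prev;
  };
}
```

You could equally do this on the Max side with [slide] (or [line] for ramping) on the unpacked values; the JS version just smooths the data before it leaves jweb.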

R_Gol's icon

@TFL can't the jweb object work with my Mac's built-in camera?

TFL's icon

What is your question? It works for me with my built-in camera (MacBook).

R_Gol's icon

It won't recognize my Mac's built-in camera.

R_Gol's icon

OK, it's working now. I didn't have all the necessary files in the same folder as my patch.