Designing a Face/Body Motion Tracking With DSP Live Rig

fas11030

Hi everyone,

I’m starting a project to build a live performance rig that incorporates motion tracking with a camera feed, which I’ll be processing in Max with various DSP effects (mosaic tiling, fractalling, video delay, etc.) before projecting the visuals in a music venue. As I’m getting started, I have a few questions and would love some insights from those with experience in this area.

I’m looking for a camera with strong motion tracking capabilities, particularly one that excels in accurate and reliable face and body detection. The camera will be positioned in front of a stage and needs to consistently track a performer moving across the full width—including the extreme edges—so precision and reliability are key.

From my experience, many motion-tracking cameras can be hit-or-miss in terms of reliability, so I’d love recommendations on specific models that perform well in this type of scenario. Alternatively, are there any software solutions that can improve tracking accuracy for standard PTZ cameras?

Additionally, I’d love suggestions on useful Max externals or patches—particularly those related to computer vision, motion tracking, or PTZ camera control—that might be relevant to this project. My background in Max is primarily in audio DSP, so any advice on getting started with these kinds of video applications would be greatly appreciated.

Thanks in advance for any recommendations or insights!

TFL

As far as I know, you have two possible approaches:

  • Kinect V2 or V3 (Azure Kinect) on Windows with the dp.kinect2 or dp.kinect3 external

  • A regular low-latency camera plus a machine learning solution to perform face/body detection. You can find examples of this on the forum if you search for "MediaPipe", which is quite efficient but maybe not the most accurate. There are probably other ML solutions I'm not aware of that you could pipe into Max through Node.js or OSC.
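
To give an idea of the OSC route in the second option, here is a minimal sketch in Python that sends one tracked point to Max over UDP. It uses only the standard library and builds the OSC packet by hand. The detector itself is left out: the address `/pose/nose`, port 7400, and the normalized x/y values are all placeholders I made up for illustration, and on the Max side I'm assuming a `[udpreceive 7400]` object is listening.

```python
import socket
import struct


def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address: str, *values: float) -> bytes:
    """Build an OSC message whose arguments are all 32-bit big-endian floats."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", v) for v in values)
    return osc_pad(address.encode("ascii")) + osc_pad(type_tags.encode("ascii")) + payload


def send_landmark(sock: socket.socket, host: str, port: int,
                  name: str, x: float, y: float) -> None:
    """Send one landmark as /pose/<name> x y (hypothetical address scheme)."""
    sock.sendto(osc_message(f"/pose/{name}", x, y), (host, port))


if __name__ == "__main__":
    # Assumed setup: Max on the same machine with [udpreceive 7400].
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # In a real rig these values would come from the detector, once per frame.
    send_landmark(sock, "127.0.0.1", 7400, "nose", 0.5, 0.25)
    sock.close()
```

In Max, something like `[udpreceive 7400]` into `[route /pose/nose]` should then give you the two floats to drive your video effects. Sending normalized 0..1 coordinates keeps the patch independent of the camera resolution.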