Cheaper alternative to thermal infrared for capturing human motion?
Here’s the setup: I’m projecting my video onto a screen, and I want people to be able to interact with the projected image – for example, someone goes up and waves their hand in front of the image, and the video of just their motion gets sent to Jitter. I need to be able to capture just the people and not the video, otherwise motion in the video gets captured and causes a runaway feedback loop. The easiest way I can think of to capture just the people in front of the screen is using a thermal infrared camera, which would detect body heat and not detect the projected image at all. Unfortunately these cameras appear to run in the tens of thousands of dollars, which is out of budget for my broke ass. I’m trying to come up with a cheaper alternative that accomplishes the same thing.
Can anybody suggest a relatively cheap way to capture just the people in front of the screen while filtering out the video projection? Or do you know where I could get a thermal infrared camera for cheaper, like under $1000?
Well, assuming that human beings are the only "invasions" on your video image, have you considered frame differencing?
This would probably mean a fair bit of patch tweaking, but better than $1000+!!!
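The core of frame differencing is just an absolute difference plus a threshold (in Jitter, something along the lines of jit.op @op absdiff feeding a comparison). A minimal pure-Python sketch of the idea, with frames as flat grayscale lists and a threshold value that's just an assumption:

```python
def frame_difference(prev, curr, threshold=16):
    """Return a binary motion mask: 1 where a pixel changed
    by more than `threshold` between frames, else 0."""
    return [1 if abs(c - p) > threshold else 0
            for p, c in zip(prev, curr)]

# Two 2x2 grayscale "frames": only the last pixel changes
# enough to count as motion.
prev = [10, 10, 200, 200]
curr = [12, 10, 200, 120]
mask = frame_difference(prev, curr)  # -> [0, 0, 0, 1]
```

The threshold is what keeps camera noise from registering as motion; it would need tuning to the room.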
Thermal cameras are definitely the best way to do this, but obviously out of reach for most people. FLIR is a company to look at, but even though they are cheaper than others they are still well above $1K.
A cheaper approach:
- IR light source: IR illuminators that will cover your space with an even, diffused pool of IR light. These can be found for relatively little money, but they are generally LEDs made for "spotlight" applications, so you may need to look at diffusion filters or multiple illuminators. One alternative is incandescent theatrical lights covered with many layers of dark red filters, which essentially provide a lot of IR light with only a faint red visible glow.
- B&W cameras that are sensitive in the near-IR spectrum, with in-line filters that block visible light but pass near-IR. These will give you an image that only shows the IR light reflected by the objects you are tracking. Supercircuits is a good source for B&W cameras; look for the low-light cams. Lee is one company that makes the filters I’m talking about.
- Any projectors you are using must have IR-blocking filters to prevent IR spill from the projected image.
You should be able to put together a reasonable rig using this approach for under $1K.
Thanks Jesse, that’s great info! I’d been thinking of doing something like this but wasn’t sure of specifics. I’ll check out the manufacturers you listed.
Paul, I’ve attempted to solve this problem with frame differencing in the past. As far as I have found, there are two ways to do it and each has an issue:
Method 1: difference the current frame from the previous frame. This has the downside of capturing the motion of the projected video from frame to frame as well. Since the motion affects the video animation, this leads to the runaway feedback I mentioned in my original post.
Method 2: difference the camera’s input frame against the output frame being projected. This makes more sense logically, but the main issue is that the camera has to be lined up absolutely perfectly to capture the projected image exactly as it appears in the original output. With color shift, blur, screen texture, image degradation, etc., I’ve found matching these up perfectly to be difficult if not impossible.
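To make Method 2 concrete, here's a pure-Python sketch (flat grayscale frames; the tolerance value is an assumption). The failure mode is also visible here: the comparison is strictly per-pixel, so if the expected frame is misaligned with the camera by even a pixel, every edge in the projected image lands on the "wrong" observed pixel and lights up as a false positive.

```python
def presence_mask(expected, observed, tolerance=30):
    """Mask pixels where the camera sees something other than the
    projected image (tolerance absorbs blur/color shift/noise)."""
    return [1 if abs(o - e) > tolerance else 0
            for e, o in zip(expected, observed)]

# The frame we projected vs. what the camera captured.
expected = [50, 50, 50, 50]
# A hand occludes the last two pixels; the first two differ
# only by camera noise and color shift, within tolerance.
observed = [55, 48, 180, 175]
mask = presence_mask(expected, observed)  # -> [0, 0, 1, 1]
```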
If you can suggest a way to overcome either of these issues, I’m all ears. I’d love to find an in-patcher alternative.
Sorry to resurrect an old topic, but I find it interesting myself. Generally I would go the B&W-camera-plus-filter route suggested by Jesse, but from a theoretical standpoint I’m interested in seeing how far one can get with some variant of your Method 2 – differencing the camera’s input frame against the input frame you’d expect if the space were empty. To know what to expect you would first have to train the system somehow – project some calibration patterns, capture them back, and work out what’s going on (all those distortions you mention, plus geometric distortion). I guess it’s similar to projection mapping / structured-light scanning. The result of the calibration would be a filter preset that models the projection environment to some extent. You could then keep projecting into the empty room to validate the model (not to make it more accurate, but to estimate how accurate it is – how the real input differs statistically from the model’s prediction).
In the actual tracking mode you would compare the camera input against model prediction and mark only the pixels that differ above the calibrated model error.
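The calibrate-then-threshold step could be sketched like this in pure Python: record per-pixel prediction errors while the room is empty, then in tracking mode flag pixels whose error deviates from that baseline by more than k standard deviations. The error metric and the value of k are assumptions, not anything from an existing patch.

```python
from statistics import mean, stdev

def calibrate(samples):
    """Per-pixel baseline of the model's prediction error,
    measured over several empty-room frames.
    `samples` is a list of error frames (flat lists)."""
    per_pixel = list(zip(*samples))  # group errors by pixel
    return ([mean(p) for p in per_pixel],
            [stdev(p) for p in per_pixel])

def track(errors, mu, sigma, k=3.0):
    """Mark pixels whose current prediction error sits more than
    k standard deviations from the calibrated baseline."""
    return [1 if abs(e - m) > k * s else 0
            for e, m, s in zip(errors, mu, sigma)]

# Four empty-room error frames over a 2-pixel image.
samples = [[2, 1], [3, 2], [2, 1], [3, 2]]
mu, sigma = calibrate(samples)

# Pixel 0 stays within its usual error; pixel 1 jumps
# (someone is standing there).
mask = track([2.4, 9.0], mu, sigma)  # -> [0, 1]
```

In a real patch the same logic would run per frame on the camera matrix (cv.jit or jit.op territory), but the statistics are the same idea.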
I wonder if anybody went this or similar route and what one can do with jitter / cv.jit.
"Method 1: difference the current frame from the previous frame. This has the downside of capturing the motion of the projected video from frame to frame as well. Since the motion affects the video animation, this leads to the runaway feedback I mentioned in my original post."
Could you invert the projected video onto itself?
That has the same implications as Method 2: the camera "sees" the video filtered (distorted, blurred, colour-shifted, etc.); to compensate properly, you’d have to filter what you project in a similar way.
(And both approaches will suffer from the lack of synchronization between projector and camera – you don’t know for sure which projected frame is in the camera frame. This will be a problem if the visuals are dynamic.)