Felix

Hello. I have a technical question about motion tracking that I'll like to 'share' in this forum. 

I'm working in an interactive installation where i would like visitors to interact with a video projection. The idea in fact, is to make an interactive floor (something like 'the lighting district' from time's up). 

The tracking system is a fire-i with an IR filter, and a video projector, both with fish eye, mounted in the ceiling, and 'pointing' to the floor. 

To simplify the understanding of the interaction, we can imagine that the installation works like this: the video projection creates an image of a grid in the floor, when people enter each rectangle; the motion tracking system detects it, and gives an audiovisual feedback...

My question is: What is the smartest way to conceive and program an 'auto calibration system'?. 

In other words: Who can the fire-i recognize the squares, or interactive zones, that are projected? (Don't forget, the fire-i is only sensible to IR).

Or, how can the motion tracking zones automatically correspond to the video projection zones?.

The installation is going to be permanent, so i need the system to recalibrate itself (The different motion tracking zones must fit in the image grid).

I thought of different solutions. For example using reflective 3M tape to mark the floor according to a fixed video projection area, so that the IR camera can recognize it and adjust the motion tracking grid. 

Or maybe i could use another fire-i without an IR filter, to do the calibration work.

I hope my question is clear, and that it will contribute the discussions about motion tracking in the forum (I haven't found any info about solutions to this specific situation).

Hello. I have a technical question about motion tracking that I'll like to 'share' in this forum. 
I'm working in an interactive installation where i would like visitors to interact with a video projection. The idea in fact, is to make an interactive floor (something like 'the lighting district' from time's up). 
The tracking system is a fire-i with an IR filter, and a video projector, both with fish eye, mounted in the ceiling, and 'pointing' to the floor. 
To simplify the understanding of the interaction, we can imagine that the installation works like this: the video projection creates an image of a grid in the floor, when people enter each rectangle; the motion tracking system detects it, and gives an audiovisual feedback...
My question is: What is the smartest way to conceive and program an 'auto calibration system'?. 
In other words: Who can the fire-i recognize the squares, or interactive zones, that are projected? (Don't forget, the fire-i is only sensible to IR).
Or, how can the motion tracking zones automatically correspond to the video projection zones?.
The installation is going to be permanent, so i need the system to recalibrate itself (The different motion tracking zones must fit in the image grid). 

I thought of different solutions. For example using reflective 3M tape to mark the floor according to a fixed video projection area, so that the IR camera can recognize it and adjust the motion tracking grid. 
Or maybe i could use another fire-i without an IR filter, to do the calibration work.
Or ...

I hope my question is clear, and that it will contribute the discussions about motion tracking in the forum (I haven't found any info about solutions to this specific situation).
Thanks!. 
Felix


specific-motion-tracking-situation

>My question is: What is the smartest way to conceive and program an 

>In other words: Who can the fire-i recognize the squares, or 

>interactive zones, that are projected? (Don't forget, the fire-i is 

>Or, how can the video projection zones automatically, correspond to 

>The installation is going to be permanent, so i need the system to 

>recalibrate itself (The different motion tracking zones must fit in 

>I thought of different solutions. For example using reflective 3M 

>tape to mark the floor according to a fixed video projection area, 

>so that the IR camera can recognize it and adjust the motion 

>Or maybe i could use another fire-i without an IR filter, to do the 

reading quickly through your questions, it makes me think that it 

sounds like a job for softVNS - square zones (defined in vns) are 

the b-a ba of the program you can define as many as you want - for a 

somehow similar project(s) i often used grids forming 64 zones - each 

of them being then used (or not) to control a different parameter (or 

you can also hand-draw the zones (in any shape) in the soft itself 

__as for the auto-calibration, i have done such a patch in softVNS - 

i could send it to you, (don't have it right now, there) but I 

remember it was not difficult to make - and it worked (stayed a few 

days in galleries, where it had to deal by itself with day light, 

>
>My question is: What is the smartest way to conceive and program an 
>'auto calibration system?.
>In other words: Who can the fire-i recognize the squares, or 
>interactive zones, that are projected? (Don't forget, the fire-i is 
>only sensible to IR).
>Or, how can the video projection zones automatically, correspond to 
>the motion tracking zones?.
>The installation is going to be permanent, so i need the system to 
>recalibrate itself (The different motion tracking zones must fit in 
>the image grid).
>
>I thought of different solutions. For example using reflective 3M 
>tape to mark the floor according to a fixed video projection area, 
>so that the IR camera can recognize it and adjust the motion 
>tracking grid.
>Or maybe i could use another fire-i without an IR filter, to do the 
>calibration work.
>Or ...

reading quickly through your questions, it makes me think that it 
sounds like a job for softVNS -  square zones (defined in vns) are 
the b-a ba of the program you can define as many as you want  - for a 
somehow similar project(s) i often used grids forming 64 zones - each 
of them being then used (or not) to control a different parameter (or 
sets of parameters of course)

you can also hand-draw the zones (in any shape) in the soft itself 
and even have zones overlap

__as for the auto-calibration, i have done such a patch in softVNS - 
i could send it to you, (don't have it right now, there) but I 
remember it was not difficult to make  - and it worked (stayed a few 
days in galleries, where it had to deal by itself with day light, 
electric lights, etc etc

kasper
-- 
  Kasper T. Toeplitz
noise, composition, bass, computer
http://www.sleazeArt.com

I haven’t tried SoftVns for the moment. I was patching with cv.jit. 

Good to know that there is a vns object for tracking square zones!. 

Yes, I’ll love to see your patch to understand how you auto-calibrate the motion tracking ‘system’. I hope you can send it to me once you find it … ;-)

The auto-calibration function must respond to the situation: What happens if the camera is moved a little bit?, or if anyone changes parameters from the projector (like for example the zoom) ???. The motion tracking grid must automatically re-adjusts to this situations. 

What I don’t really get is: How can the IR camera detect the position of the projected grid in the floor and re-adjusts, if she is ‘Blind’ to normal light?.

Hello Kasper,
Thank you for your reply!.
I haven’t tried SoftVns for the moment. I was patching with cv.jit. 
Good to know that there is a vns object for tracking square zones!. 
Yes, I’ll love to see your patch to understand how you auto-calibrate the motion tracking ‘system’. I hope you can send it to me once you find it … ;-)
The auto-calibration function must respond to the situation: What happens if the camera is moved a little bit?, or if anyone changes parameters from the projector (like for example the zoom) ???. The motion tracking grid must automatically re-adjusts to this situations. 
What I don’t really get is: How can the IR camera detect the position of the projected grid in the floor and re-adjusts, if she is ‘Blind’ to normal light?.
Thanks,
Felix


Quote: Felix _ OtherSounds wrote on Mon, 25 September 2006 04:11

> The auto-calibration function must respond to the situation: What happens if the camera is moved a little bit?, or if anyone changes parameters from the projector (like for example the zoom) ???. The motion tracking grid must automatically re-adjusts to this situations.

I know this might not be the answer you're looking for, but: make sure no one touches the camera or the projector. Not everything is best done in software and good image analysis starts with a good physical setup.

What you want to do is much more in SoftVNS' alley than cv.jit. However, you can do the analysis part using standard Jitter objects only. The method works this way: prepare a "map" image of your zones. This can be done in a drawing program or using jit.lcd. The map is a greyscale image where each zone has different-valued pixels. Make sure to use a lossless format when saving and that there is no anti-aliasing.

Then, identify the foreground pixels using the method of your choice. Take the resulting binary image and multiply it with the map image. That way, all foreground pixels in "zone 1" are going to have the value 1. Using jit.histogram, you can then easily find how many foreground pixels are in each zone.

> The auto-calibration function must respond to the situation: What happens if the camera is moved a little bit?, or if anyone changes parameters from the projector (like for example the zoom) ???. The motion tracking grid must automatically re-adjusts to this situations. 

Then, identify the foreground pixels using the method of your choice. Take the resulting binary image and multiply it with the map image. That way, all foreground pixels in "zone 1" are going to have the value 1. Using jit.histogram, you can then easily find how many foreground pixels are in each zone.


>I haven’t tried SoftVns for the moment. I was patching with cv.jit.

>God to know that there is a vns object for tracking square zones!.

>Yes, I’ll love to see your patch to understand how you 

>auto-calibrate the motion tracking ‘system’. I hope you 

>The auto-calibration function, must respond to the situation: What 

>happens if the camera is moved a little bit?, or if anyone changes 

>parameters from the projector (like for example the zoom) ???. The 

>motion tracking grid must automatically re-adjusts to this 

>What I don’t really get is: How can the IR camera detect the 

>position of the projected grid in the floor and re-adjusts, if she 

in what i did the grid is not on the floor/wall, but in the software

for what you say (if somebody chages the setting, moves the camera, 

changes the zoom etc, the best would be to hire a guard to prevent it 

then maybe with what is called "head-tracking" in softvns??? in any 

case, in my works, i just made sure no one would touch the camera - 

should be easy specially if you hang it on the ciling...

the auto-calibration of course depends of the nature of the tracking 

system you choose and the general way of conceiving the patch...

>Hello Kasper,
>Thank you for your replay!.
>I haven’t tried SoftVns for the moment. I was patching with cv.jit.
>God to know that there is a vns object for tracking square zones!.
>Yes, I’ll love to see your patch to understand how you 
>auto-calibrate the motion tracking ‘system’. I hope you 
>can send it to me once you find it…
>The auto-calibration function, must respond to the situation: What 
>happens if the camera is moved a little bit?, or if anyone changes 
>parameters from the projector (like for example the zoom) ???. The 
>motion tracking grid must automatically re-adjusts to this 
>situations.
>What I don’t really get is: How can the IR camera detect the 
>position of the projected grid in the floor and re-adjusts, if she 
>is ‘Blind’ to normal light?.

for what you say (if somebody chages the setting, moves the camera, 
changes the zoom etc, the best would be to hire a guard to prevent it 
!!!!

then maybe with what is called "head-tracking" in softvns???  in any 
case, in my works, i just made sure no one would touch the camera - 
should be easy specially if you hang it on the ciling...

the auto-calibration of course depends of the nature of the tracking 
system you choose and the general way of conceiving the patch...

if you just need a 8by8 grid consider doing so with simple floor sensors. this could prove more relaible.

> I know this might not be the answer you're looking for, but: make >sure no one touches > the camera or the projector. Not everything is > best done in software and good image 

> analysis starts with a good physical setup.

You are totally right, but I wanted to think about the worst scenarios … 

So, to adjust the projected video image, to the motion tracking ‘map’, you and Kasper seems to agree that the best way is to bee sure the physical setup will stay perfectly fixed.

> What you want to do is much more in SoftVNS' alley than cv.jit.

Thanks, if you say so … now I’m sure I need SoftVNS

>However, you can do the analysis part using standard Jitter objects >only. The method works this way: prepare a "map" image of your >zones. This can be done in a drawing program or using jit.lcd. The >map is a greyscale image where each zone has different-valued >pixels.

Do you mean that each zone should have a different grayscale value?

>Make sure to use a lossless format when saving and that there is no >anti-aliasing.

>Then, identify the foreground pixels using the method of your >choice.

This way I’ll make the Background elimination?

> Take the resulting binary image and multiply it with the map >image. That way, all >foreground pixels in "zone 1" are going to >have the value 1. Using jit.histogram, you >can then easily find how >many foreground pixels are in each zone.

Ok. If I understand, this way I adjust the motion tracking grid to the foreground and they become a ‘value 1 zone’, so that I can detect ‘presence’ in every zone. 

Thank you both for your answers, they are of great help.

Of course, I’ll appreciate any other comments on the subject.

Hello Jean-Marc!
Thank you for your reply.
> I know this might not be the answer you're looking for, but: make >sure no one touches > the camera or the projector. Not everything is > best done in software and good image 
> analysis starts with a good physical setup.

You are totally right, but I wanted to think about the worst scenarios … 
So, to adjust the projected video image, to the motion tracking ‘map’, you and Kasper seems to agree that the best way is to bee sure the physical setup will stay perfectly fixed.  

> What you want to do is much more in SoftVNS' alley than cv.jit. 

>However, you can do the analysis part using standard Jitter objects >only. The method works this way: prepare a "map" image of your >zones. This can be done in a drawing program or using jit.lcd. The >map is a greyscale image where each zone has different-valued >pixels. 

>Make sure to use a lossless format when saving and that there is no >anti-aliasing.
>Then, identify the foreground pixels using the method of your >choice.

Ok. If I understand, this way I adjust the motion tracking grid to the foreground and they become a ‘value 1 zone’, so that I can detect ‘presence’ in every zone. 
Am I right?

Thank you both for your answers, they are of great help.    

I'have worked with sensitive floors in the past. 

They are not super solid for permanent installations... 

Hello Yair!.
Do you mean, piezo sensors?
I'have worked with sensitive floors in the past. 
They are not super  solid for permanent installations... 
Thanks.
F


peizo sensors are an overkill, if what you need is onoff pulses there are

many of-the-shelf industrial contact mats, but you can build one for

2 sheets of Plexiglas with a heavy duty aluminum foil glued to the

sides+some kind of foam to keep the circuit open. i recommand a midi input

board, as they are fast and expect up to 128 digi inputs per unit, per 1ms~

if you are short you can use a hacked keyboard . if you plan to make the

installation outdoor this can get tricky, but not more then (for one)

dealing with changing lighting in a cv tracking setup....

> I'have worked with sensitive floors in the past.

> They are not super solid for permanent installations...

peizo sensors are an overkill, if what you need is onoff pulses there are
many of-the-shelf industrial contact mats, but you can build one for
yourself that will satisfy your needs.
2 sheets of Plexiglas with a heavy duty aluminum foil glued to the
sides+some kind of foam to keep the circuit open. i recommand a midi input
board, as they are fast and expect up to 128 digi inputs per unit, per 1ms~
, cheap at midibox.de/
if you are short you can use a hacked keyboard . if you plan to make the
installation outdoor this can get tricky, but not more then (for one)
dealing with changing lighting in a cv tracking setup....

2006/9/25, Felix :
>
>
> Do you mean, piezo sensors?
> I'have worked with sensitive floors in the past.
> They are not super  solid for permanent installations...
> F
> --
> Felix Luque
> http://www.othersounds.net
> felix@othersounds.net
>


> if you are short you can use a hacked keyboard .

it was mentioned before, but hacked computer keyboards only allow few 

keys to be pressed simultaneously (try it yourself using Apple 

Keyboard viewer). I learned it the hard way.

Also setting up the installation when your guinea pigs are typing 

into any selected object might be frustrating - i;m doing this often, 

but it is way more satisfying to use hacked game controller, as they 

don't interfere with normal operation of computer.

>
> if you are short you can use a hacked keyboard .

it was mentioned before, but hacked computer keyboards only allow few  
keys to be pressed simultaneously (try it yourself using Apple  
Keyboard viewer). I learned it the hard way.

Also setting up the installation when your guinea pigs are typing  
into any selected object might be frustrating - i;m doing this often,  
but it is way more satisfying to use hacked game controller, as they  
don't interfere with normal operation of computer.

> Do you mean that each zone should have a different grayscale value?

Yes. Fill in the pixels that belong to zone 1, with a greyscale value of 1, 2 for zone 2, and so on. Using a paint program is good because your zones can be of any shape you want. You can load in a still picture taken by your camera and paint the zones on top of it to make sure the map image matches your setup perfectly.

> This way I?ll make the Background elimination?

You need to distinguish people (foreground) from floor (background). How you do this depends on your particular setup, but in the end pixels that correspond to people should end up with greyscale values of 255, and floor pixels should be at 0.

> Ok. If I understand, this way I adjust the motion tracking grid to the foreground and they become a ?value 1 zone?, so that I can detect ?presence? in every zone.

With a histogram you can count the number of "people pixels" for each zone. The binary foreground image you calculated becomes a mask. Look at the patch fragment below. It's not functional as-is, there's some stuff you'll need to fill in, but it shows you the workflow.

> Ok. If I understand, this way I adjust the motion tracking grid to the foreground and they become a ?value 1 zone?, so that I can detect ?presence? in every zone. 

Sorry for the delay of this reply, but lately i didn't have any time to work in max/jitter.

I have been doing some 'quick colage' with your jit.histogram example, and a background subtraction patch that you post in the forum sometime ago (it woks pretty well with my setup). 

In this 'colage patch' (see below), I first get the mask (255 for foreground, 0 for background), then i multiply it with the map (that I did as you explain to me: an image divided in 9 square zones, each one with a different greyscale value (values 1 to 9)). the output goes to the jit.histogram object.

Then it gets complicated for me. I don't understand how can i distinguish, from the output of the jit.histogram, the number of foreground pixels of each of the 9 different zones of the map image (with greyscale values from 1 to 9).

Histogram gives back 256 values, it creates automatically a jit.cellblock of 1 row and 256 columns. The only values that 'change' are row 1 - column 1 and row 1 - column 255.

It seems to me that in the jit.cellblock, row 1 - column 1 value is the number of 'black' pixels, and that row 1 - column 255 value is the number of 'white pixels' or foreground pixels of the entire image.

I'm sorry for my 'incompetency', but I'm new to jitter. 

Hello Jean-Marc.
Thank you for your help!.
Sorry for the delay of this reply, but lately  i didn't have any time to work in max/jitter.
I have been doing some 'quick colage' with your jit.histogram example, and a background subtraction patch that you post in the forum sometime ago (it woks pretty well with my setup). 
In this 'colage patch' (see below), I first get the mask (255 for foreground, 0 for background), then i multiply it with the map (that I did as you explain to me: an image divided in 9 square zones, each one with a different greyscale value (values 1 to 9)). the output goes to the jit.histogram object.
Then it gets complicated for me. I don't understand how can i distinguish, from the output of the jit.histogram, the number of foreground pixels of each of the 9 different zones of the map image (with greyscale values from 1 to 9).

Histogram gives back 256 values, it creates automatically a jit.cellblock of 1 row and 256 columns. The only values that 'change' are  row 1 - column 1 and  row 1 - column 255.
It seems to me that in the jit.cellblock, row 1 - column 1 value is the number of 'black' pixels, and that row 1 - column 255 value is the number of 'white pixels' or foreground pixels of the entire image.

I'm sorry for my 'incompetency', but I'm new to jitter. 
Thanks again,
Felix 

> Histogram gives back 256 values, it creates automatically a jit.cellblock of 1 row and 256 columns. The only values that 'change' are row 1 - column 1 and row 1 - column 255.

> It seems to me that in the jit.cellblock, row 1 - column 1 value is the number of 'black' pixels, and that row 1 - column 255 value is the number of 'white pixels' or foreground pixels of the entire image.

I just tried it, it works fine for me. Are you sure your map image is loaded properly? When I try it, I get changing values for columns 0 to 9, as expected.

> Histogram gives back 256 values, it creates automatically a jit.cellblock of 1 row and 256 columns. The only values that 'change' are  row 1 - column 1 and  row 1 - column 255.
> It seems to me that in the jit.cellblock, row 1 - column 1 value is the number of 'black' pixels, and that row 1 - column 255 value is the number of 'white pixels' or foreground pixels of the entire image.

I just tried it, it works fine for me. Are you sure your map image is loaded properly? When I try it, I get changing values for columns 0 to 9, as expected.


I think I do everything right. I use exactly the patch and map image I send you. And I use the exact message “importmovie greyscale160_120.tif” to load the file. The file is in the same folder as the patch file, and when I load it, the print object send me back: 

“print: importmovie greyscale160_120.tif 1”. This looks ok to me …

Are you saying that you get different results using the exactly same patch and map image that I send you?. You have columns 0-9 changing in a “jit.cellblock” of 256 columns?.

I only get values changing in column 0 and in column 255. When I move in the room, the mask changes perfectly according to the motion tracking. But values only change in these two columns. I think column 0 is the total number of black pixels and column 255 is the total number of White pixels, because when I reduce the threshold, this two values change proportionally.

I’m using Max/Msp 4.5.5 and jitter 1.5.0 in a dual G5 with OSX 10.4.4.

Hello!.
I think I do everything right. I use exactly the patch and map image I send you. And I use the exact message “importmovie  greyscale160_120.tif” to load the file. The file is in the same folder as the patch file, and when I load it, the print object send me back: 
“print: importmovie greyscale160_120.tif 1”. This looks ok to me … 

I only get values changing in column 0 and in column 255. When I move in the room, the mask changes perfectly according to the motion tracking. But values only change in these two columns. I think column 0 is the total number of black pixels and column 255 is the total number of White pixels, because when I reduce the threshold, this two values change proportionally.   

I’m using Max/Msp 4.5.5 and jitter 1.5.0 in a dual G5 with OSX 10.4.4. 

Thanks for your help and time … 
Felix


This is weird, I tried again and at first it didn't work, as you described. I sent another bang, and now it's working. So, try to see if banging the map matrix again doesn anything.

I'm going to try the patch in a Windows XP computer right now, and i'll tell you if it works.

No, it doesn't ... 
Weird!.
I'm going to try the patch in a Windows XP computer right now, and i'll tell you if it works.
thanks.
Felix


I tried it in a Windows XP computer with max 4.5.5 and jitter 1.5.1 and it works.

But I only get changing values in 8 columns (8 +1). Do you get 9 (9+1)?. 

For the OSX bug, do you think that it could be solve with a max/jitter upgrade?

I think this patch could work pretty well, how could I implement the analysis?. 

Could I take advantage of any softVNS object??. 

(Kasper talk about “square zones (defined in vns)” and “head-tracking”).

Weird, it must be a bug...
I tried it in a Windows XP computer with max 4.5.5 and jitter 1.5.1 and it works.
But I only get changing values in 8 columns (8 +1). Do you get 9 (9+1)?. 
For the OSX bug, do you think that it could be solve with a max/jitter upgrade?
I think this patch could work pretty well, how could I implement the analysis?. 
Could I take advantage of any softVNS object??.  
(Kasper talk about “square zones (defined in vns)” and “head-tracking”).
Thanks,
Felix


I change the map image, giving RGB values from 2-10 to the 9 different square zones. Now everything looks ok. I get changing values in 9 columns (9 +1). 

Finally.
I change the map image, giving RGB values from 2-10 to the 9 different square zones. Now everything looks ok. I get changing values in 9 columns (9 +1). 
…. 
I think this patch could work pretty well, how could I implement the analysis?. 
Could I take advantage of any softVNS object??. 
(Kasper talk about “square zones (defined in vns)” and “head-tracking”).
…
Thanks,
Felix


Specific Motion Tracking Situation