Detecting spatialization of binaurally recorded stereo file?

November 13, 2012 | 4:48 am

Hi everyone,

I’m working on a project in which I’d like to map the dynamics/intensity of various points of the stereo field to LED-driven lanterns. The source is a nighttime symphony of frogs, specifically. It’s kind of the complement to something I’ve done a lot of before, which is to chop up a Jitter image into several vertical slabs, and then use the brightness etc. to control sounds.

How I thought I’d do it is something like this:

(a) play the stereo recording in Max (sfplay~ or similar) – it’s quite spatial because it was recorded with a Sonic Studios DSM mic
(b) somehow, on the fly, divide it into 14 channels, left to right (these will not be going out to actual channels, though, just used for numbers)
(c) measure the intensity of each of these streams and spit it out as a number, say 0. to 1.
(d) translate each of these numbers into 0-255, and send each one out via Arduino Mega to 14 LEDs to control dimming.
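
Steps (c) and (d) are the easy part and can be sketched outside Max. The following is only a rough illustration in Python, not a Max patch: it assumes the 14 streams from step (b) already exist as blocks of float samples, and the names `rms`, `to_pwm`, `blocks` and the `ceiling` parameter are hypothetical. Sending the values to the Arduino is not shown.

```python
import math

def rms(block):
    """Root-mean-square level of one block of samples (floats in -1..1)."""
    if not block:
        return 0.0
    return math.sqrt(sum(s * s for s in block) / len(block))

def to_pwm(level, ceiling=1.0):
    """Scale a level in 0..ceiling to an integer 0..255, clamping anything outside."""
    scaled = max(0.0, min(level / ceiling, 1.0))
    return round(scaled * 255)

# Hypothetical stand-in for the 14 streams: sine bursts of rising amplitude,
# one block per lantern, left to right.
blocks = [
    [(i / 13) * math.sin(2 * math.pi * n / 32) for n in range(64)]
    for i in range(14)
]

# A full-scale sine has an RMS of sqrt(0.5), so that is used as the ceiling here.
pwm_values = [to_pwm(rms(b), ceiling=math.sqrt(0.5)) for b in blocks]
print(pwm_values)
```

Running this per short block (a few tens of milliseconds) rather than over the whole file is what would make the lanterns flicker with the attacks.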

I think I have most of it figured out, but I’m banging my head against how to analyze the stereo file and divert it into 14 areas.

Essentially, what I want is this: if there are discrete sounds/attacks at the same time, say, far to the left, 3/4 of the way to the right, and all the way to the right, but nowhere else, then the LEDs in those locations would light up. Preferably through dimming, so they flicker.

Any and all ideas welcome. I could be missing something easy – I hope I am.


November 13, 2012 | 6:01 am

Here is a beginning, but it’s not working as expected, likely due to the fact that I don’t know anything about ambisonics!

It uses Graham Wakefield’s ambisonic externals.

– Pasted Max Patch –

November 13, 2012 | 5:28 pm

detecting the azimuth and/or elevation will only work by comparison; you need a reference signal of the same sound, or the source you are going to analyze must already contain a movement (of static/repeating sound material).


November 13, 2012 | 11:41 pm

Hi Roman, thanks. The sound file I have has sections which are pretty static/repetitive, i.e. lots of clicking frogs. What do I do with that?

Everyone: what I’m trying to achieve is: imagine a sine wave panned hard left to hard right. Imagine a series of 14 lights placed left to right on the stage. Imagine the lights flickering from left to right. Now imagine that instead of a sine wave you have a stereo file creating a much more complex pattern of lights.

Is this possible?
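
For the panned sine wave alone, the idea can be sketched as follows. This is a minimal illustration under strong assumptions, not a solution for the frog recording: it assumes a single source placed by constant-power amplitude panning, it looks only at the level difference between the two channels, and it ignores time and spectral cues entirely. The names `pan_position`, `light_index` and `panned_sine` are hypothetical.

```python
import math

NUM_LIGHTS = 14  # one lantern per zone, left to right

def rms(block):
    """Root-mean-square level of one block of samples."""
    return math.sqrt(sum(s * s for s in block) / len(block)) if block else 0.0

def pan_position(left, right):
    """Estimate pan position from channel levels only.

    Returns 0.0 for hard left, 1.0 for hard right, or None for silence.
    Assumes constant-power panning: left = cos(theta), right = sin(theta).
    """
    l, r = rms(left), rms(right)
    if l == 0.0 and r == 0.0:
        return None
    return math.atan2(r, l) / (math.pi / 2)

def light_index(position):
    """Map a 0..1 pan position to a lantern number 0..13."""
    return min(NUM_LIGHTS - 1, int(position * NUM_LIGHTS))

def panned_sine(position, n=64, period=32):
    """Hypothetical test signal: a sine wave panned with a constant-power law."""
    theta = position * math.pi / 2
    mono = [math.sin(2 * math.pi * i / period) for i in range(n)]
    return [math.cos(theta) * s for s in mono], [math.sin(theta) * s for s in mono]

# Sweep the sine from hard left to hard right in 28 steps and see which light fires.
sweep = [light_index(pan_position(*panned_sine(p / 27))) for p in range(28)]
print(sweep)
```

Because this yields one position per block, two sounds at once would collapse into a single in-between light; handling several simultaneous sources would need at least a per-frequency version of the same comparison.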

November 14, 2012 | 2:51 am

the mic you are using is a stereo mic, so you can’t decode it into multiple channels. any sense of spatialization is a perception induced by interaural time differences and level differences, plus possibly some spectral cues due to the design of the mic (pseudo-binaural);

in order to do what you want with the current recording you have, you would have to get the computer to mimic the way the brain processes these cues before the computer could ‘tell’ where the sound is coming from. This is not possible unless you have a few decades of research time spare.

My advice is to go back to the pond with an ambisonic mic, or perhaps even better, place a number of omni mics around the area, and do a multitrack recording, one track per led light.

November 14, 2012 | 5:35 pm

Hi Terry,

Thanks. I was afraid of this. Unfortunately, the recording is from rural Thailand, so I won’t be getting back there anytime soon.

I’d hoped that there might be an algorithm that *could* decode those subtle time differences as well as differences in level; I guess I’m happy that brains are still more powerful by far than computers, but it would be cool to translate stereo spatialization visually.

Oh well. Thank you for saving me more time trying to track down the impossible!

November 14, 2012 | 9:12 pm

the impossible ideas are always the best…
