Forums > MaxMSP

Strategies of spatialization for moving objects & ears….

Jun 07 2012 | 12:47 pm

Hi there,

As part of my myUniverse project, I need to study sound spatialization.

I’d need some leads to work on my spatialization in myUniverse (= 3D Space with a cam and moving objects emitting sound)
My cam is the place where ears are.
I still don’t know if I’ll use more than one microphone. One microphone = 1-dimensional panning, which is enough for now.
My objects emit sounds.

Inspired by the discussion here about doppler, I guess the key is to put a spatialization module inside each of my objects.
I’d have to calculate angles & positions relative to the objects.
Depending on those values, I’d have to tweak the spatialization module of the considered object.

Am I on the right track?

If I want quadraphony, I guess I can approximate it using only 1 virtual mic.

Imagine that the mic is represented by the normal to my cam, pointing in the forward direction (direction of view).

That way, I’d be able to calculate the angle between my direction and every object around me.

If the source is at 270° (totally on the left), then the front L and rear L channels will put out the same volume of sound. I could even make the front R and rear R channels put out a slightly more filtered sound, to make the sense of spatialization a bit more noticeable.

Does it make sense ?
Are there formulas to calculate the gain/volume for each of the 4 points depending on angles/distances?
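
One possible answer to the gain question, as a minimal sketch rather than a definitive model: pan with a constant-power law between the two speakers that bracket the source angle, then scale by distance. The speaker angles (45°, 135°, 225°, 315°, clockwise from straight ahead), the inverse-distance rolloff and the names `quad_gains` and `ref_distance` are all assumptions made for illustration.

```python
import math

# Assumed layout: azimuths in degrees, clockwise from straight ahead.
SPEAKERS = [("front_right", 45.0), ("rear_right", 135.0),
            ("rear_left", 225.0), ("front_left", 315.0)]

def quad_gains(azimuth_deg, distance, ref_distance=1.0):
    """Return one gain per speaker for a source at the given angle and distance."""
    az = azimuth_deg % 360.0
    gains = {name: 0.0 for name, _ in SPEAKERS}
    for i in range(len(SPEAKERS)):
        name_a, az_a = SPEAKERS[i]
        name_b, az_b = SPEAKERS[(i + 1) % len(SPEAKERS)]
        span = (az_b - az_a) % 360.0      # angular width of this speaker pair
        offset = (az - az_a) % 360.0      # source position inside the pair
        if offset < span:
            frac = offset / span
            # Constant-power crossfade: the squared gains always sum to 1.
            gains[name_a] = math.cos(frac * math.pi / 2)
            gains[name_b] = math.sin(frac * math.pi / 2)
            break
    # Assumed rolloff: inverse distance, held constant inside ref_distance.
    atten = ref_distance / max(distance, ref_distance)
    return {name: g * atten for name, g in gains.items()}

left = quad_gains(270.0, 1.0)   # source totally on the left
```

With this layout a source at 270° gives equal gain on front L and rear L and nothing on the right side, which matches the behaviour described above.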

I’m just a bit worried now, because I’d have to mix all my sources without clipping.
If I have 7 objects around me, I can send everything to my 4 virtual channels, but I’d surely have to use a limiter.
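
To illustrate the mixing concern, here is a small sketch that sums several sources on one channel and squashes the result with `tanh`. This soft clipper is only a stand-in for a real limiter (no attack/release, no look-ahead), and the function name `mix_soft_limited` and the `drive` parameter are hypothetical.

```python
import math

def mix_soft_limited(sources, drive=1.0):
    """Sum equally long sample lists, then squash the sum into (-1, 1) with tanh."""
    mixed = [sum(frame) for frame in zip(*sources)]
    return [math.tanh(drive * s) for s in mixed]

# Seven sources each peaking at 0.5 would reach 3.5 if summed without limiting.
sources = [[0.5, -0.5, 0.1]] * 7
out = mix_soft_limited(sources)
```

The output never reaches full scale, at the cost of some distortion on loud passages; scaling each source down before summing is the cleaner alternative when the number of sources is known.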

Any leads or ideas would be appreciated :)

Oh … I’d use SuperCollider, in order to keep things split across different binaries.

Jun 07 2012 | 1:20 pm


One possibility would be to use ViMiC (virtual microphone technique) from the Jamoma distribution. You could then first describe the position of your virtual sources, and set up one or more microphones that you would dynamically move around according to your camera movements.

If you set up 4 microphones configured as a 1st order ambisonic microphone (omni and 3 figure 8 mics) you could pass the 4-channel signal on to the eminent but expensive Harpex plugin for binaural decoding. It might be a challenge to do all of this on one laptop, as ViMiC and Harpex are both CPU-demanding, but this way you would get distance attenuation, doppler, directional cues and binaural decoding.


Jun 07 2012 | 4:16 pm

Hi there,
thanks a lot for your answer.
One of the reasons I'm building my own system is that I don't want a big solution where I have to switch off 50% of the features because I don't need them.
Even if I have to design all the parts I need, it would result in a better fit (even if messier at first…)

I got the concept of the doppler effect: a delay line whose delay time varies continuously with the distance between the sound source & the cam.
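
As a rough sketch of that delay-line idea: the delay time is the distance divided by the speed of sound, and the pitch shift heard at the output follows from how fast that delay changes. The speed of sound of 343 m/s and the function names are assumptions for illustration.

```python
SPEED_OF_SOUND = 343.0  # metres per second, assumed (dry air at about 20 °C)

def delay_seconds(distance_m):
    """Propagation delay to feed into the delay line for a source at this distance."""
    return distance_m / SPEED_OF_SOUND

def pitch_ratio(prev_distance_m, distance_m, dt_s):
    """Pitch ratio produced by a delay that moved between two control updates.

    A shrinking delay (source approaching) raises the pitch,
    a growing delay (source receding) lowers it.
    """
    delay_slope = (delay_seconds(distance_m) - delay_seconds(prev_distance_m)) / dt_s
    return 1.0 - delay_slope

approaching = pitch_ratio(34.3, 0.0, 1.0)   # closing in at 34.3 m/s
```

The delay time has to be interpolated smoothly between updates, otherwise the read position jumps and produces clicks instead of a pitch glide.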

I'd like to be able to sketch the spatialization stuff.
Can someone help me with a model to use?
I mean, intuitively, I would measure the angle between my view direction and the segment going from the cam to the object.
If 0° is the line going straight ahead of me, 45° would mean all the sound is in the front right speaker … 180° would mean the same volume in rear left & rear right.. for instance.
I would handle the 3D by projecting all vectors onto the cam plane.. and then measure the angles..
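
The projection and angle measurement described here could be sketched as follows. It assumes the cam's `forward` and `up` vectors are unit length and perpendicular, in a right-handed coordinate system, and that angles run clockwise from the view direction (so 90° is to the right); the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def azimuth_deg(cam_pos, forward, up, obj_pos):
    """Angle of the object around the cam, in degrees, 0 = straight ahead."""
    rel = tuple(o - c for o, c in zip(obj_pos, cam_pos))
    right = cross(forward, up)
    # Taking only the right/forward components drops the part along `up`,
    # which is the projection onto the cam's horizontal plane.
    x = dot(rel, right)
    z = dot(rel, forward)
    return math.degrees(math.atan2(x, z)) % 360.0

cam = (0.0, 0.0, 0.0)
forward = (0.0, 0.0, -1.0)   # assumed OpenGL-style camera looking down -z
up = (0.0, 1.0, 0.0)
angle = azimuth_deg(cam, forward, up, (-1.0, 0.0, 0.0))   # object on the left
```

An object directly above or below the cam projects to a zero-length vector, so its angle is not meaningful and would need special handling.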

Here is a little sketch made with Max (even if I won't do that, with Max)
Does it make sense ?


  1. sketchingSpatializationSTuff.PNG


Jun 07 2012 | 4:22 pm

Hm… I don’t know if it’s of any help, but since I’m working on something with a similar aim, I’ll point out IRCAM’s Spat suite of Max objects, which does this: doppler effects on sound sources in a virtual space, and reverberation too. I’d be more than interested to see a less expensive take on the problem, though (Spat is part of the IRCAM Forum, is closed source and horribly expensive, but it does the job well). Maybe there are articles on the topic somewhere on the web that would help?

edit : ninja’d

Jun 07 2012 | 7:28 pm

Hi vichug,
I won’t go with something closed & expensive.
But above all, I won’t go with an already packaged existing system, as I wrote before.

Btw, this is a nice solution I didn’t know about before :)

Jun 07 2012 | 8:08 pm

I’d like to be of help to you, since this is something that interests me a great deal… but the thing I made for much the same purpose is really messy and 2-dimensional only. I’ll describe it in case it helps, but the only part I really made myself is the UI for placing items and moving.
It uses the [nodes] object to place sound sources on a plane, each one corresponding to a numbered node. The problem with it is that you can’t have more than 64 nodes (sound sources) in one [nodes]. A first [nodes] is linked to another [nodes]: it sends all the position information for each node to the other one, and I modify the second [nodes] after modifying the first. One is used to place the sounds in the space and the other to place the triggers of the sources in the space. By default those positions are the same, but sometimes you want to trigger a distant sound and some nearer sounds at the same time.
The spatialisation is handled exclusively by IRCAM’s Spat. I have a poly~ with 64 instances, one for each sound source, and inside each instance is a spat, so the position of each source is handled separately.
To ‘play’ this thing you need to move the cursor in [nodes], obviously. For now I haven’t used the radius change, and I assume the observer always faces the same direction. It does produce some nice effects sometimes.

From all this, if i have learnt anything while doing it, it is : you should have one way to place sound sources in space, and one other way to place the triggers of the sounds. It really depends on what you want to do, but for me it was quite necessary…

edit : by the way, are you willing to share any of it later maybe? :) If you want a closer look at my attempt, no problem, but it’s really a mess, far from finished, not intended for anybody else to use for now and not necessarily relevant to what you’re trying to achieve.

Jun 07 2012 | 8:14 pm

yeah very interesting system :)
I didn’t play a lot with [nodes] but I will.
I like the global ideas you developed.

Indeed that UI ..
I’m currently testing an external UI, because I really need to separate tasks if, one day, I want to use more than one computer, but especially to avoid adding load to the main core part.
Indeed, while performing, I won’t use the GUI anymore and might not even run it.

Actually, the GUI, which is a very, very early prototype, is made using OpenFrameworks / C++.
It is fed by my Java core in my Max6 patch, and it sends information back when I use it.
All messages are transported by UDP.

Jun 07 2012 | 8:36 pm

Hey Julien,

I am very interested in this as well. I am looking to
develop a multichannel performance environment that
lets me place and move sounds in 3d space.

One thing I am looking in to is using ambisonics.
It is multichannel encoding that allows you to
specify x,y,z coordinates of a sound object.
It also supports variable speaker configurations.

Here is a link to the Max externals…


Jun 07 2012 | 8:51 pm

Hi Anthony and thanks a lot for your link and info.

Unfortunately (or fortunately, I still don’t know!), I’m using SuperCollider.
I want to keep tasks in really separate "threads", also because I’ll probably use more than one computer in the future.
Also, I cannot rely on stuff that comes without sources and is a bit old (and maybe unsupported).
This project is too important to me; I need to control each part of my system.

Btw, I’ll definitely have a look at this stuff.

Jun 07 2012 | 9:22 pm

I’m thinking of separating the UI as well… it’s in a separate patcher already. The idea would be to do the calculations about source positions etc. in this patch, then send all that UI-related information through udpsend and receive it on another computer.

Btw, yet another set of (more recent :p) ambisonics tools: not sure if they are still under development, but the sources are open, though it’s all in Max…

Jun 08 2012 | 9:14 am

yes, vichug.
Separating things is (almost) always better, and it also permits unit tests etc.

About ambisonics:
I would not do my sound processing in Max but in SuperCollider (separating things here too),
BUT I’ll probably prototype in Max.
Maybe, if it works fine in Max, I’ll use it .. (wow, breaking news here)
Not sure yet.

Jun 08 2012 | 10:25 am

Thanks Anthony & Vichug… now I’m tempted to do everything in Max6 ..
The fact that I wanted (and still want) to use an external sound generator, here SuperCollider, was/is driven by wanting to be able to separate modules easily.
Indeed, I could even use another computer with the sound generator … in Max6 (with 2 licenses of course :D)

I mean, positions/distances/angles could be sent using OSC even to … the same patch if I want to stay close to my first idea…

dudes, you just tempted me :D

I tested ICST stuff, btw.
Indeed it works fine.
I’m still not sure that one module like that in each voice of my polyphonic machines would be kind to the CPU.
Tests required . . .!

Jun 08 2012 | 11:42 am


I’ve been reading your posts with great interest, since we (my colleagues and I) made a framework (in Java with a Max/Jitter overlay) to study interaction with sound-objects in an immersive environment.

For sound we use an octophonic setup and the vbap objects with the ICST ambimonitor. It also took me a lot of time to decide what is best for our needs.

here is a panning structure that works for us:

[Pasted Max patch, not preserved here]

Jun 08 2012 | 11:57 am

hi llumen,
thanks for your interest.
Definitely, it is hard to decide.

Why didn’t you go with the whole ICST ambisonics package?
I’m very interested.

vbap seems nice.

In my case, each object would have to integrate the ambisonic stuff..
I mean, each voice.
Indeed, this is the only strategy I have in mind.

A voice (represented by a visual object) will "know" distance to cam, angle to direction of view etc and will be able to emit sounds to the correct speaker…
I’m afraid of CPU burning :D

Jun 08 2012 | 12:33 pm

Which set of vbap objects are you talking about? The define_loudspeakers object doesn’t exist for me.

Jun 08 2012 | 12:41 pm

Hm, is the latest version Windows only?

edit : no it’s not

Jun 08 2012 | 1:40 pm

Here is an ugly sketch.

I need to be able to distinguish each object both in visual & sound, of course.
I mean, one object = one visual stuff + one voice.
I wrote voice, indeed, it could be a synth, it would be easier maybe.
Not sure yet.

Anyway, I'd like to know what you'd think about the place & the way to go with:
– distance attenuation
– doppler shift
– ambisonic stuff (= I mean the amount of sound sent to each speaker depending on the position of the objects)

Intuitively, I would say I'd put all these 3 modules in my poly~
poly~ would have a lot of voices but not a lot playing at the same time
(I'd instantiate one voice per object, statically, for messaging purposes)

distance attenuation + doppler shift cannot be done outside.. I mean, on a global output or whatever like that.
indeed, it totally depends on the object itself, which means .. depending on the voice.

My question is about ambi stuff.

would you put ALL inside the poly ?
Is there a way to encode in the poly, then to decode outside ??
It would reduce the CPU load:
1/ case all in poly => if n poly, there would be n ambicode + n ambidecode
2/ case only ambicode in poly => if n poly, there would be n ambicode + 1 ambidecode
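
The second case works in principle because ambisonic encoding and decoding are both linear: the encoded signals of all voices can be summed and decoded once. The sketch below shows this with a textbook first-order horizontal encode and a basic, unoptimised decode. It is not the algorithm used by any particular Max external, and the function names and the counter-clockwise angle convention are assumptions.

```python
import math

def encode(sample, azimuth_deg):
    """First-order horizontal B-format (W, X, Y) for one mono sample."""
    a = math.radians(azimuth_deg)
    return (sample / math.sqrt(2.0), sample * math.cos(a), sample * math.sin(a))

def decode(bformat, speaker_azimuths_deg):
    """Basic decode: one virtual cardioid pointing at each speaker."""
    w, x, y = bformat
    outs = []
    for az in speaker_azimuths_deg:
        a = math.radians(az)
        outs.append(0.5 * (math.sqrt(2.0) * w + x * math.cos(a) + y * math.sin(a)))
    return outs

def mix_bformat(signals):
    """Sum several encoded signals channel by channel."""
    return tuple(sum(ch) for ch in zip(*signals))

QUAD = [45.0, 135.0, 225.0, 315.0]
voices = [(0.8, 45.0), (0.3, 200.0), (0.5, 310.0)]   # (sample, azimuth) pairs
encoded = [encode(s, az) for s, az in voices]

# Case 2: n encoders inside the voices, one shared decoder outside.
decoded_once = decode(mix_bformat(encoded), QUAD)
# Case 1: one decoder per voice, speaker feeds summed afterwards.
decoded_per_voice = [sum(ch) for ch in zip(*(decode(e, QUAD) for e in encoded))]
```

Both paths give the same speaker feeds, so the choice only affects CPU cost and how many channels have to leave each voice (3 encoded channels here instead of 4 speaker feeds).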

ANY ideas would be appreciated :)


  1. polyAndAmbi2.PNG


Jun 09 2012 | 8:26 am


I did indeed end up doing all the calculations in each voice. The patch I added is a subpatch that runs in a poly~.

I used to use only the vbap objects, but for some reason the calculation of the position of the sound was always an issue.

I then tried to port it to the ICST ambisonics tools, but I found myself going back to vbap. I really just needed panning, not ambisonics.

Jun 09 2012 | 8:57 am

I probably didn’t understand the full power of ambisonics.

In my case, I need to position sources in order to make :
– sound sources distribution on my 4 speakers
– distance attenuation (quite easy)
– doppler fx (quite easy too)

The first one is done well by the ICST tools.
I’m still not sure I’ll put that in poly~ as is.
I’m a bit afraid of burning the CPU at some point!

vbap seems very nice.
Can I dispatch my sound to 4 speakers with vbap by sending the position of my source relative to the cam, either as relative x,y,z or as azimuth/elevation/distance?

Jun 11 2012 | 10:46 pm

ICST is low on CPU as it is not truly Ambisonics. They do something called "Ambisonic Equivalent Panning".
It is similar to vbap, but ICST is more elegant and has an excellent interface. True Ambisonics can be more processor-heavy and can also be too particular for most audience situations.

You can do calculations to translate AED to XYZ and vice versa using cartopol and the like.
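
For reference, the conversion between AED (azimuth, elevation, distance) and XYZ can be sketched as below. Axis and angle conventions differ between tools, so this one is an assumption: x to the right, y to the front, z up, azimuth clockwise from the front and elevation upward from the horizontal plane, both in degrees.

```python
import math

def aed_to_xyz(azimuth_deg, elevation_deg, distance):
    """Spherical (AED) to Cartesian under the convention stated above."""
    a = math.radians(azimuth_deg)
    e = math.radians(elevation_deg)
    x = distance * math.cos(e) * math.sin(a)
    y = distance * math.cos(e) * math.cos(a)
    z = distance * math.sin(e)
    return x, y, z

def xyz_to_aed(x, y, z):
    """Cartesian to spherical (AED); the origin maps to (0, 0, 0)."""
    distance = math.sqrt(x * x + y * y + z * z)
    if distance == 0.0:
        return 0.0, 0.0, 0.0
    azimuth = math.degrees(math.atan2(x, y)) % 360.0
    # Clamp to guard asin against rounding just outside [-1, 1].
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, z / distance))))
    return azimuth, elevation, distance

x, y, z = aed_to_xyz(270.0, 30.0, 2.0)
back = xyz_to_aed(x, y, z)
```

Whatever convention is chosen, it has to match the one expected by the panner that receives the values.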

With ICST I’ve had 10 sound sources going into an octophonic system, which I used in a live context, so I needed little to no latency as musicians were performing live, and it worked perfectly. Here’s an article I did on it:
That was using Max4! ;)



Jun 12 2012 | 1:04 am

As for the SuperCollider option: there is a great VBAP implementation (in sc3-plugins). There is also a very advanced Ambisonic implementation ("Ambisonic Toolkit" in quarks). IMO, if you intend to use only 4 channels, Ambisonics (esp. at higher orders) is the proverbial cannon for killing flies. As a dynamic/elastic/light DSP engine, MSP/poly~ is nowhere near scsynth, BUT if you haven’t used SC before you can expect many surprises, esp. when translating MSP stuff to SC UGenGraphFuncs. You have direct control of all resources and processing; otoh this means intensive bookkeeping on your side. Do you plan to use bare-bones scsynth, or the whole SC environment (sclang+scsynth)?


Jun 12 2012 | 7:57 am

Thanks a lot for your answers, all !

Andrzej, I haven’t used SC yet…
but nothing really frightens me (not because I feel like a genius, I just fight difficulties to the death :p)
I WOULD use scsynth only, pushing the synthdefs/creations, mute, etc. via OSC from Max.

Probably, I’ll prototype using MSP.

The idea:
I’d like to design a kind of "synth-template" in SC.
That one would react / act when it would receive OSC messages.
Basically, the messages will only be:
– aed
– mute/unmute when distance between cam and object related to that source is (range + x) (where x is a safety distance)

all aed calculations will be done in max
the synth creation would happen at the startup only, then, I’ll unmute things, mute the others and play :)
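
To make the message side of this plan concrete, here is a sketch that builds the two kinds of OSC packets by hand. `/n_set` and `/n_run` are scsynth server commands and 57110 is its usual default UDP port, but the control names `azi`, `ele` and `dist`, the node ID and all function names are hypothetical: they depend on how the synthdef is written.

```python
import socket
import struct

def _osc_string(text):
    """OSC string: ASCII bytes, null-terminated, padded to a multiple of 4."""
    data = text.encode("ascii")
    return data + b"\x00" * (4 - len(data) % 4)

def osc_message(address, *args):
    """Build one OSC message from int, float and str arguments."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)
        elif isinstance(arg, str):
            tags += "s"
            payload += _osc_string(arg)
        else:
            raise TypeError("unsupported OSC argument: %r" % (arg,))
    return _osc_string(address) + _osc_string(tags) + payload

def set_aed(node_id, azimuth, elevation, distance):
    """Update the (hypothetical) aed controls of one running synth."""
    return osc_message("/n_set", node_id,
                       "azi", float(azimuth),
                       "ele", float(elevation),
                       "dist", float(distance))

def set_running(node_id, running):
    """Pause (mute) or resume (unmute) one synth node."""
    return osc_message("/n_run", node_id, 1 if running else 0)

def send(packet, host="127.0.0.1", port=57110):
    """Send one packet over UDP; host and port are assumptions."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))

mute_packet = set_running(1000, False)   # 1000 is an arbitrary example node ID
```

In practice the same messages would be sent from Max itself; the point here is only the shape of the traffic: one small `/n_set` per position update and one `/n_run` per mute or unmute.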

if I can achieve the synth template design, it would be very fine !

Jun 14 2012 | 11:36 am

Your way is of course possible, but to achieve interesting sound/music you will have to build quite large synthdefs (and this works for drones or very simple musical ideas). Scsynth is much like a very efficient but dumb performer. It’s not good at making music. A separate synth is more like a piano key (or even a hammer, string or soundboard) than a whole instrument with pianist included, sent on a space voyage. ;-) If you want to create a bunch of synths at startup and then pause/resume (= mute/unmute in SC nomenclature) them, this is a very static approach that doesn’t take advantage of the dynamic character of the engine. The output of a synth can be (and most often is) sent (by mixing with or replacing the contents) to a number of global busses, and other synths can read from them. This aspect doesn’t exist in Max, where you patch from point to point and cannot reuse the same cord. So in SC the order of processing is very important (and there are many ways of controlling it efficiently, in real time of course).

Controlling scsynth with OSC is quite easy and well documented, but creating and sending synthdefs can be a challenge[1]. Writing them in sclang ensures proper construction (there are a few signal rates (ir/kr/ar/dr), multiple channels to deal with, etc.) and optimizations (ugens are connected by so-called ‘wirebuffers’, whose number is limited (per synth), but they can be reused).
There is the ‘jcollider’ Java library[2], which may be a little dated, but may also be of some help. Its creator has moved his interests to the Scala language. Also see [3].
You can write synthdefs to files on your hard disk, and then tell scsynth to load them (or, if you put them in the proper place, it will load them on startup).

Just download it, and try to make some noise. Years ago I started from Max (4.x) and then moved to SC, and I remember the habits from using Max were obstacles to conceive clear solutions in SC. But maybe it was my bumpy learning curve… :)




Jun 14 2012 | 11:49 am

Thanks a lot for sharing your valuable experience & information, Andrzej.

My words weren’t totally exact because I don’t know SC architecture well.
Yes, as you wrote, my way would be to design those synthdefs before and to tell scsynth to load them.

I’d first prototype in MSP.
Maybe, I still don’t know, I’ll keep MSP for sound too. not sure. It will depend on quality & performances of my system using that.
About not reusable wires, I wouldn’t need that in my system.
this would be more like a bunch of synths wired statically to a fx & master mixer chain, all dynamically muted/unmuted, modulated etc.
Even when I compose in my system, I’ll create/remove objects (each object will probably instantiate a poly)

I’d probably test in SC if I have problems doing things in MSP and/or if the quality isn’t good enough (I don’t mean better or worse from a software point of view at all, just my own ear’s appreciation of the synth/sound generator I’ll create)
