
Book Review(s): Reading Up on Spatial Audio

It probably won’t surprise many of you to discover that our customers ask us about resources for learning more about the concepts they work with as Max users. While there are lots of you out there who have easy access to mentors or professors or friends in the know, there are a lot of Max and Max for Live users out there whose burgeoning hunger for knowledge may not be as easy to satisfy with a local call or over a cup of coffee.

Unsurprisingly, the recent release of the Audio Routes tools for multichannel audio routing in Max for Live has resulted in quite a number of those requests. While the world is full of Max for Live (and Max) users who have fired up the tools and marvel at converting their audio playback into a veritable Holiday On Ice exercise in audio choreography, there are people out there who are now more curious about the subject in general - trying to understand the relationship between what they see in the UI and what they hear, wondering what they could “do next,” or just wishing to know a little more about the tools they’re using.

And of course - there’s the arrival of Max 8’s MC multichannel audio possibilities….

The Terrains of Possibility

The term “spatial audio” covers a multitude of topics - surround sound (the spatial audio system that lots of us have in our homes already… hooked to our TV), binaural audio, vector-based amplitude panning, ambisonics, multispeaker arrays, and so on. Although those terms are all subsumed under the general idea of spatialized audio, they all have different meanings, different histories in terms of development, and - since we’re talking about Max - different in-house and third-party resources that support spatial audio in those different forms.

As a reader, you might be interested in learning what the differences between the approaches are, or in knowing more about a single approach in detail. It would seem as though there are lots of you out there who may not have thought much about life beyond stereo who now find yourselves with some questions and that little itch to know more.

I’d like to recommend a pair of books that may be of use to you.

Let’s start with Francis Rumsey’s Spatial Audio - one of the volumes in Focal Press’ Music Technology series. It’s a great low-impact introduction to the psychoacoustic aspects of the subject, and provides you as a reader with the chance to pin down and conceptualize your own current understanding of stereo and binaural audio. If you don't consider yourself as someone intimate with or fascinated by mathematics or a more academic approach to questions whose answers are presented in the form of circuit diagrams, you'll find this book a useful starting point.

From the opening introductory material, the book expands out to multichannel and surround sound presentation and (very usefully) monitoring. Besides the great context and backgrounding and the process of laying out the transition from stereo/binaural sound to multichannel audio, this book really shines for me in its final two chapters on recording techniques - mics, panning, and reverberation topics associated with spatial audio.

A disclaimer/note: This is the book I started with, and the person who initially recommended it to me was using it as a resource for their beginning music tech students. I found it a great help, and have only recently moved on to the book I'll recommend to you next.

The second book I’d recommend to you is a bit more academic in terms of its organization and content, but one I’d consider to be a standard work in the field. Immersive Sound: The Art and Science of Binaural and Multi-Channel Audio is also published by Focal Press, but in collaboration with the Audio Engineering Society (AES). This one’s an anthology, with content from authors in various parts of the field - industry people, researchers, and recording engineers all share the space in the Table of Contents. What initially caught my attention in the book was a chapter on the history of 3-d sound, which was full of things I was unaware of.

The anthology delves into similar territory as the Rumsey book, but at a different level of detail. Here’s a simple example: there are separate chapters on binaural audio through headphones and binaural audio through speaker systems (and yes, there is a difference).

While touching on ambisonics and multichannel mixing techniques, it also pushes into a few different areas - object-based audio, speaker height, and wave field synthesis (the chapter on this was worth the price of admission for me all by itself).

So here are a few places to start if you've got the urge to learn a bit more about spatial audio. If any of our readers, particularly those of you who are educators, have suggestions of your own on useful resources, please post them here, and thanks!

by Gregory Taylor on March 3, 2020

Roman Thilenius


not sure if the vanilla mc. objects would, for us max users, be the main gamechanger. in my estimation they do not really save CPU or make patching connections easier compared to before.

i am using formats such as 8 or 10 speakers in a circle with a custom channel format (i.e. avoiding b-format wherever possible, i found other ways how to position point-sources), and until now i was always very happy with prepend/route signal connections.

of course there are exceptions: using mc.vst~ is for sure easier than doing the same with 10 vst~ objects - my abstraction which does that once was half a day of really dirty work.

what is more interesting about mc. is that third party objects like HOA might one day use it, so that you can start patching multichannel stuff with multicores from the beginning on, without the need to make some abstraction for it yourself.

what i generally find interesting about the topic is that ambisonics is in theory so much "better" than all of that dolby prologic home theater shit - yet it is, after 30 years of ambisonics, still impossible to share music made in such formats.
that is because there is no industry standard/exchangeable format (except b-format(s)) for it, so everyone has a different setup of speakers and what technology you use mostly depends on the personal situation and application, and because ambisonics only makes sense in a halfway anechoic studio room - or at an outdoor festival - and that is just not how my mother listens to music.
huge arrays with a hot spot bigger than a single person exist, but those all require an incredible lot of encoding...

John Carter

Great introduction, Gregory and great references, too.

At Bose (a 'pioneer', so to speak, in the importance of the spatial and temporal aspects of perception, along with the tonal, of course), we believed then, and still do now, that people often prefer spatially diffuse sound fields (low interaural cross-correlation - see Manfred Schroeder's work) over less diffuse fields. Schroeder influenced the team at Lexicon to produce the most realistic reverberator at the time.

Of course Max and Max with MC can do way (way) more than this. For me it is a constant reminder to consider spatial effects as being rich (yet neglected) areas to improve performance and synthesis.

bbob drake

At $78 and $98 for paperback, definitely time to hit the library. Any recommendations for more affordable resources?

John Carter

If you are comfortable with technical material (like upper division math), you might want to read Schroeder's papers on perception and reverberation. You can find a good list here:
https://muse.jhu.edu/article/393605

But this is only one small facet of the breadth of these books. Sorry, I don't know of comparable resources.

Gregory Taylor

Bbob: I am mister used book guy, myself