Sound design: spectral manipulation between audio sources
Mmmm, tricky title.
Frustrated with commercial audio-manipulation tools for sound design, I'm stepping back into Max after a long hiatus.
What I want to do is manipulate one SFX (audio sample) using the spectral characteristics of another. For amplitude envelopes this is straightforward, but say for example...
- I have a long constant sound I like the timbre of, say a hissy, ethereal ghost voice (sfx_A)
- I have a sound with a cool shape in terms of amplitude and frequency content, say a cat screaming/fighting (sfx_B)
I want to emboss sfx_A with all (or just some) of the spectral info of sfx_B: its amplitude, frequency, and pitch envelopes.
Can anyone point me in the right direction for controlling this??
max objects/threads/existing plugins/DAW techniques etc.
Thanks!
if you know how to do it for the overall envelope, you are already halfway there, because in an FFT the amplitudes per bin (read: per frequency band) are already laid out for you, and it is not really difficult to manipulate them.
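to make that concrete, here is a tiny pure-Python sketch (naive DFT on one short frame, no windowing or overlap; all function names are mine, not Max objects) of scaling one sound's bins by another's magnitudes:

```python
import cmath

def dft(x):
    """Naive DFT (O(n^2)); fine for a short illustrative frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT, returning the real part of each sample."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def emboss_frame(frame_a, frame_b):
    """Scale each bin of A by the (peak-normalized) magnitude of the
    same bin in B, keeping A's phases -- i.e. imprint B's spectral
    envelope onto A for this frame."""
    A, B = dft(frame_a), dft(frame_b)
    peak = max(abs(b) for b in B) or 1.0
    return idft([a * (abs(b) / peak) for a, b in zip(A, B)])
```

a real patch would of course do this per frame inside pfft~ with windowing and overlap-add; this only shows the bin-by-bin arithmetic.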
the main issue with all of this is that human perception and the code which gets you there often do not match at all.
i also have the feeling that you still need to make it more clear to yourself what "spectral characteristics" actually means in terms of... err... numbers.
there is far more you can analyze than the amplitudes and phases bin by bin. if you want to sort stuff by noisiness, bark, dynamic transients, harmonicity, fundamental frequency and formants, liveliness or harshness, find similarities or group the inputs into classes, you need to derive a lot of additional data from the FFT bins, such as the centroid, median, average, the maximum length, or the count of bands which are louder than 0.1 but quieter than 0.3... whatever helps to describe the event best... and then still give the user a dozen parameters to influence the analysis, to find the best setting sound by sound. :)
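two of the descriptors mentioned above are easy to sketch in a few lines of Python (the thresholds 0.1 and 0.3 are just the example values from the post, not magic numbers):

```python
def spectral_centroid(mags, sample_rate, fft_size):
    """Amplitude-weighted mean frequency of a magnitude spectrum --
    a rough 'brightness' measure."""
    freqs = [k * sample_rate / fft_size for k in range(len(mags))]
    total = sum(mags)
    return sum(f * m for f, m in zip(freqs, mags)) / total if total else 0.0

def bands_in_range(mags, lo=0.1, hi=0.3):
    """Count the bins whose magnitude falls between two thresholds."""
    return sum(1 for m in mags if lo < m < hi)
```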
I suggest you look into using a vocoder (like the classic vocoder from the Max examples). This is a spectral technique that does not use the FFT. Rather, the limited number of filters is modeled on ear perception.
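The classic vocoder in the Max examples uses resonant filters, but the underlying idea (the modulator's per-band energy shaping the carrier) can be sketched with simple bin grouping. This is my own toy approximation, not the Max patch:

```python
def band_energies(mags, n_bands):
    """RMS energy of the modulator in equal-sized bin groups -- a crude
    stand-in for a vocoder's analysis filter bank."""
    size = max(1, len(mags) // n_bands)
    return [(sum(m * m for m in mags[i:i + size]) / size) ** 0.5
            for i in range(0, len(mags), size)]

def vocode(carrier_mags, modulator_mags, n_bands=8):
    """Scale each carrier band by the modulator's energy in that band."""
    env = band_energies(modulator_mags, n_bands)
    size = max(1, len(carrier_mags) // n_bands)
    return [c * env[min(i // size, len(env) - 1)]
            for i, c in enumerate(carrier_mags)]
```

A real vocoder would use perceptually spaced (e.g. Bark-like) bands rather than equal-width groups, which is exactly the "modeled on ear perception" point above.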
Another technique is FFT-based "amplitude-multiplication cross-synthesis". There is an example among the Max examples (cross-dog, if my memory serves me well). Sonic results depend on your source material. Keep in mind that with this technique, for each frequency bin, you multiply the energy from one sound with that of the other. If both sounds are rich but share no energy (their energy lies in different bins), then the "cross-synthesis" result is null.
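The bin-by-bin multiplication, and the "null result" pitfall, fit in one line of Python (a sketch of the idea, not the cross-dog patch itself):

```python
def cross_synth(mags_a, mags_b):
    """Bin-by-bin magnitude multiplication (keep A's phases when
    resynthesizing). Any bin where either sound is silent comes out
    silent -- so fully disjoint spectra yield total silence."""
    return [ma * mb for ma, mb in zip(mags_a, mags_b)]
```

For example, two rich but non-overlapping spectra like `[1, 1, 0, 0]` and `[0, 0, 1, 1]` multiply out to all zeros.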
'Timbral/Spectral Interpolation/Morph' are other names i've heard for similar ideas.
many techniques still undiscovered... (i've been toying with a simple one based on interpolating between 'instantaneous frequencies' of two sources, using a certain amplitude threshold... but i don't share that patch, as it's my most favorite thing to work with in Max of all time... the description i've given here, though, is enough to give it all away 😅 ...the technique of interpolation is key, and i've not learned how to make it ideal yet)...
advice for when you start creating: there's a great amount of data when combining two FFT sources that is either redundant or unnecessary, so figuring out ways to tame the phase-based information with creative amplitude-based detail can be very helpful (you can also filter the sources beforehand, but that introduces extra processing).
detailed phase-processing using Max's FFT capabilities will often require understanding/use of objects like 'framedelta~', 'frameaccum~', and 'phasewrap~' ;)
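the core arithmetic behind those two objects is small enough to sketch in Python (framedelta~ takes the per-bin phase difference between successive frames, phasewrap~ wraps it back into [-π, π); this is my sketch of that math, not the objects' source):

```python
import math

def framedelta(phases_prev, phases_cur):
    """Per-bin phase difference between successive FFT frames,
    in the spirit of framedelta~."""
    return [c - p for p, c in zip(phases_prev, phases_cur)]

def phasewrap(deltas):
    """Wrap each phase difference into [-pi, pi), in the spirit of
    phasewrap~. The wrapped delta per hop gives you the deviation of
    each bin from its nominal center frequency."""
    return [(d + math.pi) % (2 * math.pi) - math.pi for d in deltas]
```

running the wrapped deltas through a running sum is essentially what frameaccum~ does on the resynthesis side.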
you could study 'Musimathics Vol. II' (Vol I is also great) to get some of the theory involved,
as well as these tutorials from Tim; this is just the link to the last one, but it contains links to all the others before it:
and then there's this set of tutorials that have always helped me immensely:
welcome back to the rabbit hole! 🐇
interpolating the pitch from one frequency to another (in cases where fA and fB are quite near) - instead of only mixing the amplitudes against each other - is one of the fields where you could probably make dozens of interesting experiments to try out what could make sense... and for what material or type of effects.
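one simple way to experiment with this: morph a single partial from (fA, ampA) toward (fB, ampB), interpolating frequency geometrically so the steps are equal in pitch rather than in Hz. purely my own illustration of the idea:

```python
def interp_partial(f_a, amp_a, f_b, amp_b, t):
    """Morph a partial from (f_a, amp_a) to (f_b, amp_b), t in [0, 1].
    Linear in amplitude, geometric in frequency (equal pitch steps)."""
    freq = f_a * (f_b / f_a) ** t          # interpolate in log-frequency
    amp = amp_a + (amp_b - amp_a) * t
    return freq, amp
```

halfway (t = 0.5) between 220 Hz and 880 Hz this lands on 440 Hz, i.e. the midpoint in octaves, which usually sounds more natural than the linear midpoint of 550 Hz.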
the good old max examples for vocoding using reson~ and simple cross synthesis in pfft~ are definitely good starting points, as well as jfc's spectral tutorials.
however i always feel that inside a pfft~ there are so many paths to take that it can get quite hard to settle on one.
you sit there and learn and program for 5 hours, and when all is done and you are modifying the spectrum of sound A using the spectrum of sound B while applying the envelope of sound C to it, you might find out that it sounds dull no matter what you send into it.