Re: Extremely precise sonogram ?

Mar 23 2010 | 3:14 am

Sorry for the late response, but I was looking a bit more at RavenPro and at your quite interesting FFT/Jitter tutorials on the share pages.

First, I have to explain a little more why I want deep horizontal AND vertical resolution: I'm working on additive synthesis and would like to examine fine details in the sounds of acoustic instruments such as bassoon, clarinet, contrabass, etc., to get inspiration on ways to reproduce them in an expressive additive-synthesis system (200 harmonics with blur factors) for a two-pen Wacom screen. (see below**)

>> Oh, but you want both a great frequency AND a great time resolution.
>> Both the butter and the money for the butter ("le beurre et l'argent du beurre"). Well, this is simply impossible.
>> It’s the audio / wave equivalent of Heisenberg’s uncertainty principle.

Of course I don't agree with this.
It is only true for the standard FFT algorithm, but not necessarily true for every time/frequency view in the world. Listen to the attached mp3 below; it is played from your FFT patch "3-record-play-speed-control". OK, there is a Jitter FFT view, and we listen to the sound computed back from that view. This is a destructive transformation: the rhythmic fidelity is poor (as you said, we have the frequency precision, so we don't have the time precision). This sound is just "a vague memory of my sound", and so the graphic view is also only a vague memory. At this point I think that, in spite of his uncertainty principle, Heisenberg would rather logically have agreed with me: if a "blurry pâté" view is only a vague memory of a sound, then something is missing…

If building any sound by additive synthesis is virtually possible, then there must exist in the universe a way to decompose any sound, without anything missing.

option a :

"Multi-resolution" in the way that iZotope explains it. I'm not sure, but maybe with a 3D matrix, the third dimension holding 12 different FFT sizes from 16 to 32768. Then add or multiply (or something in between) the 12 planes of this third dimension.
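
A minimal Python sketch of this idea (not a Max/Jitter patch). It assumes one "pixel" of the sonogram is analysed with Hann windows of several lengths centred on the same instant, and it compares two ways of merging the planes: the arithmetic mean ("add") and the geometric mean (a normalised "multiply"). The function names, the three window sizes and the default arguments are illustrative assumptions only.

```python
import cmath
import math


def windowed_magnitude(signal, sr, centre, n_win, freq):
    """Magnitude at `freq` (Hz) of a Hann-windowed frame of n_win samples
    centred on sample index `centre`. Samples outside the signal count as silence."""
    start = centre - n_win // 2
    acc = 0j
    wsum = 0.0
    for k in range(n_win):
        w = 0.5 - 0.5 * math.cos(2.0 * math.pi * k / n_win)  # Hann weight
        wsum += w
        idx = start + k
        if 0 <= idx < len(signal):
            acc += signal[idx] * w * cmath.exp(-2j * math.pi * freq * idx / sr)
    return abs(acc) / wsum


def multires_pixel(signal, sr, centre, freq, sizes=(64, 256, 1024)):
    """Return (arithmetic mean, geometric mean) of the magnitudes seen
    by every window size for one time/frequency point."""
    mags = [windowed_magnitude(signal, sr, centre, n, freq) for n in sizes]
    arithmetic = sum(mags) / len(mags)
    # A tiny offset keeps log() defined when a plane is exactly silent.
    geometric = math.exp(sum(math.log(m + 1e-12) for m in mags) / len(mags))
    return arithmetic, geometric
```

With a tone that starts abruptly, the long windows already "see" it a little before the onset while the short ones do not. The arithmetic mean keeps that smear, whereas the geometric mean stays near zero until every plane agrees, which is the reason multiplying the planes looks attractive here.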

option b :

Wavelet transform? On the Wikipedia page that I linked above, they say:
"[about fft:] A narrower window gives good time resolution but poor frequency resolution. (…) This is one of the reasons for the creation of the wavelet transform, which can give good time resolution for high-frequency events, and good frequency resolution for low-frequency events."
Thanks Vanille for the link to [wavelet~]. Well, I don't understand how to use it to make a sonogram… the only thing I was able to make was a pitch-stretch (in attachment). Has anybody seen a nice sonogram made from a wavelet transform?
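
To make the quoted idea concrete, here is a rough Python sketch of a wavelet-style sonogram (a scalogram). It does not use or imitate the [wavelet~] object; it assumes a complex Morlet wavelet with a fixed number of cycles, so the analysis window automatically gets shorter as the frequency rises. The function name and the `n_cycles` and `hop` parameters are hypothetical.

```python
import cmath
import math


def morlet_scalogram(signal, sr, freqs, n_cycles=6.0, hop=64):
    """Return rows[i][t]: magnitude of the signal correlated with a complex
    Morlet wavelet tuned to freqs[i], evaluated every `hop` samples."""
    rows = []
    for f in freqs:
        sigma = n_cycles / (2.0 * math.pi * f)    # Gaussian width in seconds
        half = int(math.ceil(4.0 * sigma * sr))   # kernel spans +/- 4 sigma
        kernel = []
        for k in range(-half, half + 1):
            t = k / sr
            envelope = math.exp(-t * t / (2.0 * sigma * sigma))
            kernel.append(envelope * cmath.exp(-2j * math.pi * f * t))
        norm = sum(abs(w) for w in kernel)        # sum of the Gaussian envelope
        row = []
        for centre in range(half, len(signal) - half, hop):
            acc = 0j
            for k, w in enumerate(kernel):
                acc += signal[centre - half + k] * w
            row.append(abs(acc) / norm)
        rows.append(row)
    return rows
```

Each row can then be drawn as one horizontal line of the sonogram. Low-frequency rows use long kernels (fine in frequency, smeared in time) and high-frequency rows use short ones, which is exactly the behaviour the Wikipedia quote describes.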

option c :

Playing with the sampling rate? Well, I feel that if you upsample, the frequency resolution should go down, and when you downsample, the frequency resolution goes up… for the preserved low frequencies. (Plus another idea for the high frequencies, not sure: highpass filter > freqshift~ (down) > downsampling => better frequency resolution for high frequencies too?)
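
A small Python experiment for the downsampling half of this idea, under stated assumptions: a fixed 256-point Hann-windowed DFT, a crude block-averaging decimator standing in for a proper low-pass filter, and two test tones 10 Hz apart. All names are illustrative. Note that the 256 decimated samples span 16 times more of the original signal, so the extra frequency detail is paid for in time.

```python
import cmath
import math


def two_tone(f1, f2, sr, n):
    """n samples of two summed sine waves at f1 and f2 Hz."""
    return [math.sin(2 * math.pi * f1 * k / sr) + math.sin(2 * math.pi * f2 * k / sr)
            for k in range(n)]


def decimate(signal, factor):
    """Crude decimation: average each block of `factor` samples
    (a weak low-pass) and keep one value per block."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal) - factor + 1, factor)]


def hann_dft_magnitudes(frame):
    """Magnitudes of the first len(frame)//2 DFT bins of a Hann-windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * k / n))
                for k, x in enumerate(frame)]
    mags = []
    for b in range(n // 2):
        acc = 0j
        for k, x in enumerate(windowed):
            acc += x * cmath.exp(-2j * math.pi * b * k / n)
        mags.append(abs(acc))
    return mags


def count_peaks(mags, rel_floor=0.25):
    """Count local maxima that rise above rel_floor times the largest bin."""
    floor = rel_floor * max(mags)
    return sum(1 for i in range(1, len(mags) - 1)
               if mags[i] > floor and mags[i] > mags[i - 1] and mags[i] >= mags[i + 1])
```

At 8000 Hz, a 256-point frame has bins 31.25 Hz wide, so tones at 100 Hz and 110 Hz merge into one peak; after decimating by 16 the same 256 points have bins about 1.95 Hz wide and the two tones separate.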

I'm not sure, maybe a mix between option a and option c. Hmm, a bit of an "usine à gaz"… (French expression, literally "a bit like a gas factory", meaning an overcomplicated contraption.)
I was hoping that someone already had a nice solution, because it is not that I'm lazy, but there are so many things to work on. I'm not sure I will start on this now, plus I'm not so experienced with Jitter.

Anyway, it's cool to read people interested in this topic,


** About additive synthesis for a good imitation of natural sounds: I want to understand exactly where, within the period of the waveform, the energy of each frequency lies, and how all this changes over time during the attack and the sustain of the sound. Trying with my example "funny_additive-synth" ( ), I see that the phase information is dramatically important for low-frequency instruments (less important for high-frequency instruments). Also, I'd like to see, for a soft violin for example, how blurred the harmonics are, and which ones, etc. (Additive synthesis from resonators can make some interesting blurred harmonics: mp3 example in: )

P.S.: the "million pixels" view was only a figure of speech; I didn't mean "one million frequency bins". I think 16384 or 32768 would be enough.
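
For scale, a short Python sketch of what those bin counts imply with a plain FFT. It assumes a 44.1 kHz sample rate and that "bins" means the useful half of a real FFT (FFT size = 2 × bins); both are assumptions, and the function name is made up for illustration.

```python
def fft_view_cost(n_bins, sample_rate=44100.0):
    """Return (FFT size, bin width in Hz, frame length in ms)
    for a plain FFT giving n_bins useful frequency bins."""
    n_fft = 2 * n_bins                         # a real FFT yields n_fft / 2 useful bins
    bin_hz = sample_rate / n_fft               # spacing between neighbouring bins
    frame_ms = 1000.0 * n_fft / sample_rate    # duration covered by one frame
    return n_fft, bin_hz, frame_ms


for n_bins in (16384, 32768):
    n_fft, bin_hz, frame_ms = fft_view_cost(n_bins)
    print(f"{n_bins} bins: FFT size {n_fft}, "
          f"{bin_hz:.3f} Hz per bin, {frame_ms:.0f} ms per frame")
```

Under these assumptions 16384 bins means about 1.35 Hz per bin but a frame roughly 0.74 s long, which is why the options above look for something other than a single large FFT.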

