Re: Extremely precise sonogram ?

Aug 09 2010 | 7:23 pm

>> jebb said:
>> Alexandre, did you find some solutions for your problem ?

Sorry, I didn’t find the time to go over all this, plus AlexHarker is right that it would be better done in C or Java (at least for the REASSIGNMENT algorithm).

>>But this oversampling technique … That was Alexandre’s goal I think.

No! It was just a useless idea I had at the beginning of this thread. As AlexHarker pointed out many times in the thread (thanks for your patience, Alex :-)) and explained to us FFT beginners, OVERSAMPLING in itself will NOT increase the resolution of the FFT.
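A quick way to convince yourself, as a rough sketch only: the spacing between FFT bins is the sample rate divided by the window length in samples, so for a window covering the same stretch of sound the bin spacing stays the same however much you oversample. The function name and the 100 ms window below are just assumptions for illustration.

```python
def bin_width_hz(sample_rate_hz, window_seconds):
    """Spacing between FFT bins for a window of the given duration."""
    n_samples = round(sample_rate_hz * window_seconds)  # window length in samples
    return sample_rate_hz / n_samples

# Same 100 ms window analysed at 1x, 2x and 4x the sample rate (assumed values).
widths = {fs: bin_width_hz(fs, 0.1) for fs in (44100, 88200, 176400)}
for fs, width in widths.items():
    print(f"{fs} Hz sample rate -> bins every {width:.3f} Hz")
```

Oversampling only adds bins above the old Nyquist frequency; the bins in the audible range do not get any closer together unless the window gets longer in time.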

>> Is it possible to find such a High Resolution sonogram ??

I’m sure it is. When I said above: "this reassignment method is only the 3/5 in the way to the best sonogram that can be done." I should have said the 1/5…

We should think of the pixels in a sonogram as "probabilities" that a frequency is present. As many people pointed out above, at a single instant of sound, at a single sample, there are no frequencies at all; we can only estimate the probability of a frequency being around. A sonogram is nothing real or physical. It is just a picture we imagine of a sound, exactly like the processing our brain does when we listen to music and sounds. So looking for "the deep truth of the signal" is not the right way to think about sonograms. The only physical truth is the signal itself. What I dream of as an "extremely precise sonogram" is just something that approaches the "special treatment of probabilities" that my brain does when I listen to sounds and music. The "REASSIGNMENT method" of Izotope RX, used with *32 X and Y overlaps, is far better than a standard FFT, but it’s still far from ideal because of the spiderweb-like artifacts produced everywhere…

…How do we know that they are artifacts and not real pitched sounds? Because they change position completely every time you change the FFT window size! They are MOVING! And some areas of the sonogram are full of these spiderweb artifacts. What does this mean? A potentially infinite number of frequencies in that area or, put differently, an equal probability for all the frequencies in the area: this is what is called NOISE.

Now take 10, 20 or even 60 different sonograms like that, each one made with a different Fourier window size, then mix them all, and you’ll approach, I think, the holy grail: the spiderwebs should disappear and leave rather soft "fields of noise", while "very-high-probability-paths-of-pitched-content" will still look like really pitched content. You will SEE WHAT YOU HEAR. Not blurry pâtés nor spiderwebs.

>>…be able to show just a 1-second-long line of 1 pixel of width at 440hz, but instead, there will always be artefacts.

Using a bunch of mixed sonograms made with the "reassignment method", it will be possible because, again, the artefacts MOVE when you change the FFT size:
Take 5 Izotope RX sonograms (with the reassignment option checked and *32 X and Y overlaps) at 5 different FFT sizes. Now do it again after resampling your sound at 8 different sample rates between 32 kHz and 48 kHz.
(Because FFT tools usually work only with power-of-2 sizes, for efficiency reasons from what I understood, resampling the signal can be a workaround: it will be equivalent to using non-power-of-2 Fourier window sizes.)
So now you get 5*8 = 40 screenshots… then mix them all together using Jitter or even Photoshop: what should happen is that the artifacts produced by the "reassignment method" just disappear, because they are different in each screenshot, while the 440 Hz tone stays.
Do you guys start to see what I mean?
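To make the resampling trick concrete, here is a small sketch of the arithmetic. A window of N samples taken after resampling covers N × (original rate / new rate) samples of the original sound. The source rate of 44.1 kHz, the five FFT sizes, the evenly spaced rates and the function name are all assumptions for illustration, and the rates are assumed to run from 32 kHz to 48 kHz.

```python
def effective_window_samples(n_fft, original_rate_hz, resampled_rate_hz):
    """Window length, counted in samples of the original sound."""
    return n_fft * original_rate_hz / resampled_rate_hz

original_rate_hz = 44100                      # assumed rate of the source file
fft_sizes = [512, 1024, 2048, 4096, 8192]     # assumed: 5 power-of-2 sizes
# 8 resampling rates spread evenly from 32 kHz to 48 kHz (assumed spacing)
rates_hz = [32000 + i * 16000 / 7 for i in range(8)]

windows = sorted(
    effective_window_samples(n, original_rate_hz, r)
    for n in fft_sizes
    for r in rates_hz
)
print(len(windows), "effective window sizes")
print([round(w, 1) for w in windows[:8]])
```

Because 48 kHz / 32 kHz is less than 2, the effective sizes coming from one power of 2 never reach the next one, so the 40 combinations all give different window lengths.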

> seejayjames said:
> ..especially the saw~ one. Would love a high-res version of that too

Hehe, you Americans like UFOs! Below is a not-so-much-higher-res version of the UFOs. You can make your own using the free trial of RX. This is not a real "saw" but a kind of dephased saw made from cosines instead of sines* (made using that : ) (*the first second of the sound attached below)

> Was wondering if there’s any way to put poly~ to work on this

Go ahead!
I see the steps like this:

1- Time-overlap is already an option in the FFT objects in Max, but frequency-overlap is not. Creating it by shifting the sound a little, 16 or 32 times (using [freqshift~], maybe), should work, I think (shift amount = bin width divided by 16 or 32). Then interleave all 16 or 32 FFTs in one Jitter matrix.

2- Find a way to apply the "Reassignment method" to this, either by writing a C external object using LORIS: (maybe even swap the first step and process all the FFT inside the external), or by using the example from volker: (patch near the end of the thread)

3- At this point, I think each poly~ should be used to compute a different sonogram with a different FFT window size. Some of the poly~ should also use a resampled sound, as I explained above: it will be equivalent to non-power-of-2 Fourier window sizes. Examining reassigned sonograms in RX, my feeling is that power-of-2 window sizes alone are not enough to completely clear the artifacts.

4- Simply mix all the sonograms together. The more sonograms you compute from different FFT sizes, the cleaner the result, I think.
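Outside of Max, steps 3 and 4 can be sketched in a few lines of Python with NumPy. Note that this is only a rough illustration: it uses a plain Hann-windowed STFT, NOT the reassignment method, it skips the frequency-overlap and resampling parts, and it "mixes" by a simple average. All function names and parameter values are hypothetical.

```python
import numpy as np

def stft_magnitude(signal, n_fft, hop):
    """Magnitude STFT with a Hann window; rows are frames, columns are bins."""
    window = np.hanning(n_fft)
    # Pad by half a window so frame k is centred on sample k * hop for every n_fft.
    padded = np.pad(signal, (n_fft // 2, n_fft // 2))
    n_frames = 1 + (len(padded) - n_fft) // hop
    frames = np.stack([padded[i * hop : i * hop + n_fft] for i in range(n_frames)])
    # Dividing by the window sum keeps levels comparable across FFT sizes.
    return np.abs(np.fft.rfft(frames * window, axis=1)) / window.sum()

def mixed_sonogram(signal, sample_rate, fft_sizes, hop, n_freq_points):
    """Average several sonograms after putting them on one frequency grid."""
    target_freqs = np.linspace(0.0, sample_rate / 2, n_freq_points)
    layers = []
    for n_fft in fft_sizes:
        mag = stft_magnitude(signal, n_fft, hop)
        bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
        # Linear interpolation of each frame onto the shared frequency axis.
        layers.append(np.stack([np.interp(target_freqs, bin_freqs, f) for f in mag]))
    return target_freqs, np.mean(layers, axis=0)

# Test signal: one second of a 440 Hz sine at an assumed 8 kHz sample rate.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

freqs, mix = mixed_sonogram(tone, sample_rate, [256, 512, 1024], hop=64, n_freq_points=1025)
middle_frame = mix[mix.shape[0] // 2]
peak_hz = freqs[np.argmax(middle_frame)]
print("strongest frequency in the middle frame:", peak_hz, "Hz")
```

Since the hop is the same for every FFT size and the frames are centred, all layers share the same time axis and only the frequency axis needs interpolating. Whether a plain average is the best way to mix is an open question; taking the minimum or a geometric mean across layers would be other things to try.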

Note: Even if you have a 24-core CPU, I think you’ll stay far from a real-time sonogram from adc~…
(At how many terahertz are our brains analyzing music?)

