and would there be any way to intentionally achieve this smearing, other than having a massive bank of oscillators equal to the number of bins?

In essence, a number of short-duration FFTs are taken one after another in sequence, and the changes in the amplitude & phase of the frequency bins from one frame to the next constitute the changing information over time. The longer each frame, the better the frequency resolution but the greater the temporal smearing (i.e. poorer time resolution), and vice versa. Getting a good FFT of a quickly-changing signal is a tradeoff between frequency and temporal accuracy.
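To put rough numbers on that tradeoff, here is a small sketch. The function names are made up for illustration and the 44.1 kHz sample rate is an assumption, not something stated above.

```c
#include <stdio.h>

/* Spacing between adjacent FFT bins in Hz: sample rate / frame size. */
double bin_spacing_hz(double sample_rate, int frame_size)
{
    return sample_rate / (double)frame_size;
}

/* Duration of one frame in milliseconds: the time span one FFT covers. */
double frame_duration_ms(double sample_rate, int frame_size)
{
    return 1000.0 * (double)frame_size / sample_rate;
}

/* Print the tradeoff for a few frame sizes (44.1 kHz is an assumed rate). */
void print_tradeoff(void)
{
    const int sizes[] = { 256, 1024, 4096, 16384 };
    const double sr = 44100.0;
    for (int i = 0; i < 4; i++)
        printf("N=%5d  bin spacing %8.3f Hz  frame length %8.3f ms\n",
               sizes[i], bin_spacing_hz(sr, sizes[i]),
               frame_duration_ms(sr, sizes[i]));
}
```

Doubling the frame size halves the bin spacing and doubles the stretch of time each frame lumps together.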

The disjunctions across frame edges are handled by ‘windowing’ each frame, i.e. tapering it towards its edges, and overlapping successive frames; the choice of window shape and amount of overlap alters the quality of the result in various ways.

I’d recommend a perusal of any good computer music or digital audio text – Roads, Moore, and Dodge/Jerse are good starting points.

> if an fft is performed on a piece of audio, then an inverse fft is

> performed, the original audio is reproduced perfectly, but i’ve never

> understood why this is – where in a series of frequency bins is the

> information about time retained?

The short answer is: the information is in the phase, or more exactly in

the difference between the phase value of a specific bin in consecutive

frames…

Stefan

–

Stefan Tiedje————x——-

–_____———–|————–

–(_|_ —-|—–|—–()——-

– _|_)—-|—–()————–

———-()——–www.ccmix.com


> Does that mean that the phase difference is interpolated

> when run through the IFFT?

Think of it as if there are two sine waves with a small difference in

frequency (within the band of a single bin). Then you overlap them with

the window function… That creates something like an interpolation, but

not exactly… ;-)

Stefan


> frequency (within the band of a single bin). Then you overlap them with

> the window function… That creates something like an interpolation, but

> not exactly… ;-)

hang on, there is no window function in what i’m talking about (i think) just a single fft..

> time information, i.e. the frequency information is smeared across

> the entire frame.

i wouldn’t say this is entirely true.

the time information is in there (otherwise you couldn’t transform

the data back without loss),

but you can’t “easily see” it.

it’s the same as when you try to “see” the frequency information in

a time domain representation.

in both cases all information is there, but you use different kind of

representations to have access to what you are interested in.

one way to get rid of the time information is indeed to zero-out the

phases.

but then you will lose a lot of the frequency information, too.

(phase is a complex animal…)

the resulting sound will always have a harmonic spectrum based on the

delta bin freq of the fft (with an fft size of 1024 at a 44.1 kHz

sample rate this will be around 43 Hz).

if you are talking about a really large fft/dft, then this could

work, as your frequency bin resolution is very high – resulting in a

very low fundamental.

volker.

> work, as your frequency bin resolution is very high – resulting in a

> very low fundamental.

it is a very large fft indeed. the trouble is, removing the phase resulted in a big attack at the start of the IFFT audio result, since all the frequencies were in phase with each other initially. i’ve also tried randomising the phase of each bin, which works really well, but is still not perfect because it seems that time based information is still there in some way, even if a random way, in the interactions between frequencies – destructive/constructive interference etc.

what i really want is for the resulting sound to remain as static as possible, as if there were a bank of unchanging oscillators as big as the number of frequency bins, each with its amplitude set by the magnitude from the fft. of course, there would still be interference between frequencies here, but not in a “set” repeating pattern of the length of the original audio.

>

>> if you are talking about a really large fft/dft, then this could

>> work, as your frequency bin resolution is very high – resulting in a

>> very low fundamental.

>

> it is a very large fft indeed. the trouble is, removing the phase

> resulted in a big attack at the start of the IFFT audio result,

> since all the frequencies were in phase with each other initially.

> i’ve also tried randomising the phase of each bin, which works

> really well, but is still not perfect because it seems that time

> based information is still there in some way, even if a random way,

> in the interactions between frequencies – destructive/constructive

> interference etc.

>

> what i really want is for the resulting sound to remain as static

> as possible, as if there were a bank of unchanging oscillators as

> big as the number of frequency bins, with each its amplitude set by

> the magnitude from the fft. of course, there would still be

> interference between frequencies here, but not in a “set” repeating

> pattern of the length of the original audio.

>

yes, randomizing the phases is the better way. i have just tried it

and it sounds very nice.

as i said, when you mess with the phases, you will lose frequency

information, resulting in a purely harmonic sound with a fundamental

corresponding to the size of the fft (number of bins). if the fft is

large you might get a good enough frequency resolution cause the

fundamental is very low. but although the periodicity of the harmonic

sound might not be perceived as an audible pitch, it still repeats

periodically, i.e. you hear it as a repeating rhythm.

is that what you are talking about?

maybe you can upload the original and processed sound somewhere to

compare results.

volker.

> and it sounds very nice.

>

> as i said, when you mess with the phases, you will lose frequency

> information, resulting in a purely harmonic sound with a fundamental

> corresponding to the size of the fft (number of bins). if the fft is

> large you might get a good enough frequency resolution cause the

> fundamental is very low. but although the periodicity of the harmonic

> sound might not be perceived as an audible pitch, it still repeats

> periodically, i.e. you hear it as a repeating rhythm.

>

> is that what you are talking about?

> maybe you can upload the original and processed sound somewhere to

> compare results.

original:

http://www.thirdmeaning.net/misc/guitarscale.wav

processed (looped, with overlapped windowing):

http://www.thirdmeaning.net/misc/outputtest.wav

you can hear the “pulsing” which is the length of the file, which you would of course expect, but the question is: is there a way to avoid this, to just generate the frequencies at the right magnitudes infinitely, rather than just creating something the same length as the input and then looping it?

if so then it would still have periodicity as you say, but i think it would be far less obvious, and very large.

> original:

> http://www.thirdmeaning.net/misc/guitarscale.wav

>

> processed (looped, with overlapped windowing):

> http://www.thirdmeaning.net/misc/outputtest.wav

>

> you can hear the “pulsing” which is the length of the file, which

> you would of course expect,

the pulsing i hear in your processed sound is much shorter than the

size of the original file.

so i guess there’s still something wrong in how you calculate the thing.

compare it to this one:

http://www.esbasel.ch/Downloads/outputtest-vb.wav

this has a period of roughly 3 secs, since the fft routine i use can

only process file lengths that are powers of two.

> but the question is, is there a way to avoid this, to just generate

> the frequencies at the right magnitudes infinitely, rather than

> just creating something the same length as the input and then

> looping it.

>

hm, you would need an oscillatorbank or a realtime version of the

ifft (of the same size as your fft).

then find the peaks and calculate the true frequencies (from phase

increment or magnitudes) etc.

quite expensive for large fft sizes.

if you don’t take the infinity too seriously, you could also use an

fft which is even larger than your file, and zero-pad the rest of the

frame.
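A rough sketch of the zero-padding step, with hypothetical helper names. It rounds the frame up to a power of two only because the fft routine mentioned above needs that; the 44.1 kHz rate in the usage note is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Smallest power of two that is >= n (for n >= 1). */
size_t next_pow2(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Copy len input samples into a newly allocated frame of frame_len samples,
   leaving the tail at zero. Returns NULL if frame_len < len or if the
   allocation fails; the caller frees the result. */
double *zero_pad(const double *in, size_t len, size_t frame_len)
{
    if (frame_len < len)
        return NULL;
    double *frame = calloc(frame_len, sizeof *frame);
    if (!frame)
        return NULL;
    memcpy(frame, in, len * sizeof *in);
    return frame;
}

/* Length in seconds of the loop you get back from a frame of this size. */
double loop_seconds(size_t frame_len, double sample_rate)
{
    return (double)frame_len / sample_rate;
}
```

For example, a file of 300000 samples (a made-up length) would be padded to 524288, and `loop_seconds(524288, 44100.0)` comes out just under 12 seconds.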

this file is processed from a frame size of 524288 samples (~12sec.)

http://www.esbasel.ch/Downloads/outputtest-vb2.wav

volker.

> you can hear the “pulsing” which is the length of the file, which you

> would of course expect, but the question is, is there a way to avoid

> this, to just generate the frequencies at the right magnitudes

> infinitely, rather than just creating something the same length as

> the input and then looping it.

If you randomise the phase each time you play the complete frame it

should not pulse any more if you have the ideal windowing. But maybe you

need a really good random generator, which is the domain of Peter

Castine’s objects…

Stefan


> size of the original file.

> so i guess there’s still something wrong in how you calculate the thing.

maybe it’s because of my overlapping and adding with a window?

>

> compare it to this one:

> http://www.esbasel.ch/Downloads/outputtest-vb.wav

>

> this has a period of roughly 3 secs, since the fft routine i use can

> only process file lengths that are powers of two.

yes, this sounds much better. how are you looping the result? (wouldn’t there be a click if you just looped it straight?)

> hm, you would need an oscillatorbank or a realtime version of the

> ifft (of the same size as your fft).

> then find the peaks and calculate the true frequencies (from phase

> increment or magnitudes) etc.

> quite expensive for large fft sizes.

sounds difficult..

> if you don’t take the infinity too seriously, you could also use an

> fft which is even larger than your file, and zero-pad the rest of the

> frame.

this sounds like a very good idea, so the magnitudes would remain unaffected, except they will all be proportionally lower?..

> this file is processed from a frame size of 524288 samples (~12sec.)

> http://www.esbasel.ch/Downloads/outputtest-vb2.wav

that’s really good, just what i’m looking for! thanks for your help, i’m finding this stuff really interesting to play around with.

> should not pulse any more if you have the ideal windowing. But maybe you

> need a really good random generator, which is the domain of Peter

> Castine’s objects…

one thing i did try was generating 10 (for example) different random versions, then when looping them, randomly choose a different one each time. there was still pulsing, which i guess is because my windowing function isn’t right, but could also be because of a poor prng – i’m just using the rand() standard C function
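If you want to rule out rand() as the cause, a small self-contained generator is easy to drop in. The sketch below is a xorshift64* generator, picked only as an example; it is not what Peter Castine's objects use, and whether the prng is actually responsible for the pulsing is not established here.

```c
#include <stdint.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* State for a xorshift64* generator (must never be zero). */
typedef struct { uint64_t state; } rng64;

/* Seed the generator; a zero seed is replaced by a fixed non-zero value. */
void rng64_seed(rng64 *r, uint64_t seed)
{
    r->state = seed ? seed : 0x9E3779B97F4A7C15ULL;
}

/* Next 64-bit value. */
uint64_t rng64_next(rng64 *r)
{
    uint64_t x = r->state;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    r->state = x;
    return x * 0x2545F4914F6CDD1DULL;
}

/* Uniform double in [0, 1), built from the top 53 bits. */
double rng64_uniform(rng64 *r)
{
    return (double)(rng64_next(r) >> 11) * (1.0 / 9007199254740992.0);
}

/* Uniform random phase in [-pi, pi). */
double rng64_phase(rng64 *r)
{
    return 2.0 * M_PI * rng64_uniform(r) - M_PI;
}
```

Seeding two generators with the same value gives the same sequence, which also makes it easier to compare results between runs.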

>

> yes, this sounds much better. how are you looping the result?

> (wouldnt there be a click if you just looped it straight?)

no, you don’t need any windowing, just loop the processed buffer –

that’s the cool thing about it. since all the harmonics are integer

multiples of a (very low) fundamental, there can’t be clicks.

at least this is true for the power of 2 buffer length i’m using.

don’t know about fftw – should be the same though.

>

>> if you don’t take the infinity too seriously, you could also use an

>> fft which is even larger than your file, and zero-pad the rest of the

>> frame.

>

> this sounds like a very good idea, so the magnitudes would remain

> unaffected, except they will all be proportionally lower?..

don’t know exactly what you mean by “proportionally lower”.

magnitudes will be more or less the same, there will just be more of them…

zero padding is a common trick in dsp to get a finer frequency grid

(it interpolates the spectrum rather than adding true resolution).

with larger fft sizes you get more bins over the same frequency range

(0 -> SR),

i.e. a lower fundamental and thus a longer loop if you transform it

back to time.

>

>> this file is processed from a frame size of 524288 samples (~12sec.)

>> http://www.esbasel.ch/Downloads/outputtest-vb2.wav

>

> that’s really good, just what i’m looking for! thanks for your

> help, i’m finding this stuff really interesting to play around with.

fine.

this is a single fft frame transformed back into the time domain.

should be seamlessly “loopable”.

volker.

>

> If you randomise the phase each time you play the complete frame it

> should not pulse any more if you have the ideal windowing.

this would require calculating the ifft in realtime, which for

_large_ fft sizes is not practical.

v

http://www.thirdmeaning.net/misc/outputtest3.wav

as if the phases are lining up a bit at the start and end.

i’m doing this for each bin, does it look right?:

mag = sqrt((fftresult[i][0] * fftresult[i][0]) + (fftresult[i][1] * fftresult[i][1]));

phase = (PI*Random()) - (PI/2.0f);

fftresult[i][0] = mag * (cos(phase));

fftresult[i][1] = mag * (sin(phase));

> phase = (PI*Random()) - (PI/2.0f);

> fftresult[i][0] = mag * (cos(phase));

> fftresult[i][1] = mag * (sin(phase));

just in case anyone is interested, it turns out the problem with the above is that phase should be in the range -PI to PI, rather than -PI/2 to PI/2 like i was doing for some reason.

about the nuts and bolts of FFT/IFFT.

Are you guys doing all this in a Max patch or a source code

external? If there is an example patch, can someone please post it?

> external? If there is an example patch, can someone please post it?

sorry, i’m doing it in an external – i wouldn’t know where to begin doing it as a patch. the msp fft objects seem to be geared towards “real-time” frequency analysis, by doing partitioned ffts (short-time fourier transform?), rather than letting you move things around in buffers outside of audio time.. if you see what i mean.

source code is here if you want it:

http://www.thirdmeaning.net/misc/staticresynth~.c

it just needs the fftw library (http://www.fftw.org/), and this random number stuff:

a try.