FFT, overlap and latency
Hi, I've been searching for this information for weeks. I think I found something, but I can't figure out how to set it up in Max to prove it.
Well, one thing is sure: an FFT introduces latency. For an FFT of size 512, you first need to wait for 512 samples, then apply the FFT, do your processing and output the result. At this point we're at 512 samples + processing time.
Let's consider overlap. We need overlap so that we can manipulate FFT bins and still have everything reconstruct smoothly. Using an overlap of 2 gives a latency of 768 samples + processing time, i.e. 512 + 256 (I need my 2 blocks to reconstruct everything, and the second one is delayed by half the FFT size).
Now the interesting part. If I decide to use an overlap of 4, then I'll have 4 blocks of 256 (to each of which I apply an FFT of 512 by zero padding). Then my latency should be 384 samples + processing time, i.e. 256 + 128 (I need 2 blocks to reconstruct, and my second block is delayed by 128). That is half of my latency with an overlap of 2! But twice the processing.
So, supposedly, I can lower the latency by raising the overlap.
Did I get anything wrong?
How can I show this in Max?
I'm working on a noise reduction algorithm. Maybe I understood everything wrong; here is the post that put me onto this.
Thanks!
I don't think overlap has much to do with latency; it is all about the window size of the fft (along with the vector size of the max msp audio engine, which is what you are calling "processing time").
the fft algorithm would (i think) depend on a whole window of data (dft certainly does), but the overlap is just a simple linear addition of corresponding samples with constant temporal offset, which can be "vectorized" by the audio engine; it is conceivable that, if the code isn't optimized, the latency might be some function of window size and overlap, but it's more likely to be independent of overlap.
you can ignore overlap.
and you must take care: several things in max are a bit different from what you can find in academic literature. fft length count or biquad coefficients are only two of many examples. :)
if you want to be sure how to do it right, you don't have to understand the theory anyway. just measure things yourself. the best method to get a latency compensation right is to send a spike (click~ object?) through your code and compare it against the original.
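The spike test described above can also be tried offline before building anything in Max. Here is a minimal Python sketch. `block_passthrough` is a hypothetical stand-in for a frame-based process and is not a model of how fft~ or pfft~ work internally: it simply holds each frame back until it is full. `measure_latency` (also a made-up helper name) compares the position of the click before and after.

```python
from collections import deque

def block_passthrough(signal, frame_size):
    """Hypothetical frame-based process: a frame is released only once it is full."""
    in_buf = []
    out_queue = deque()
    output = []
    for x in signal:
        # Emit what has already been processed; silence until the first frame is ready.
        output.append(out_queue.popleft() if out_queue else 0.0)
        in_buf.append(x)
        if len(in_buf) == frame_size:
            out_queue.extend(in_buf)  # identity "processing" for this sketch
            in_buf = []
    return output

def peak_index(signal):
    """Index of the largest absolute sample, i.e. where the click sits."""
    return max(range(len(signal)), key=lambda n: abs(signal[n]))

def measure_latency(dry, wet):
    """Delay in samples between the click in the dry and the processed signal."""
    return peak_index(wet) - peak_index(dry)

click = [0.0] * 2048
click[10] = 1.0                      # the spike
wet = block_passthrough(click, 512)
latency = measure_latency(click, wet)
print(latency)
```

With this particular stand-in the measured delay equals the frame size. A real patch may add further delay, which is exactly why measuring is worthwhile.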
The thing is that I'm studying this algorithm in Max so that I can later implement it in C/C++.
But then I guess I won't be able to study this part in Max.
What I still can't figure out is this: if I use an FFT of 256, my latency is half my latency for an FFT of 512.
If I zero-pad my 256-sample input to 512 and apply a 512 FFT to it, I'll get 512 bands while keeping the latency of a 256 FFT, right?
To me this looks like increasing the overlap?
Once again, I might be wrong somewhere, I just don't see where.
well, on planet earth, fft latency in samples is: window size minus ( fft size / overlap )
in max/msp the window size is always identical to the fft size, and everything else is a matter of trial and error and not getting any answers from the developer :)
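To make the arithmetic of that formula concrete, here is a small Python sketch. It encodes the formula exactly as quoted in the post above, without checking it against Max or any other implementation. The function name `latency_samples` is made up for illustration.

```python
def latency_samples(window_size, fft_size, overlap):
    """Formula as quoted in the post: window size - (fft size / overlap)."""
    return window_size - fft_size // overlap

# Window size equal to FFT size, as the post says is always the case in Max/MSP.
for overlap in (2, 4, 8):
    print(overlap, latency_samples(512, 512, overlap))
# prints:
# 2 256
# 4 384
# 8 448
```

Note that, taken at face value with window size equal to FFT size, this formula makes the latency grow with the overlap rather than shrink.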
OK, many thanks!
...
Wait... what? In the real world, with window size equal to FFT size and an overlap of 2, that would lead to a latency of half the window size?
...
This thing drives me crazy.
Hi Nicolas,
Windowing plays no role in the latency calculation.
If you have an fft~ object with a frame size of 512, then your latency is 512 samples. The presence of an additional fft~ object that is processing 256 samples later doesn't change this. That fft~ starts 256 samples later and ends 256 samples later. It also has a latency of 512 samples. But they are two independent processes happening in parallel on different parts of the audio stream.
If you notice additional latency then there are some places you can look. For example, if you are doing an fft~ and an ifft~ then the ifft~ will also add 512 samples of latency. Thus the round trip into and back out of your FFT routine will be 1024 samples.
Cheers,
Tim
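The timing Tim describes can be written out as a small schedule. This is a rough Python sketch that assumes each overlapped frame starts one hop (frame size / overlap) after the previous one and is ready once all of its samples have arrived. `frame_schedule` is a hypothetical helper, not anything provided by Max.

```python
def frame_schedule(frame_size, overlap, n_frames):
    """(start, ready) sample times for each overlapped frame."""
    hop = frame_size // overlap
    return [(k * hop, k * hop + frame_size) for k in range(n_frames)]

schedule = frame_schedule(512, 2, 3)
# Every frame needs frame_size samples, whenever it starts.
per_frame_latency = {ready - start for start, ready in schedule}
# fft~ followed by ifft~, using the accounting from the post above.
round_trip = 2 * 512

print(schedule)           # [(0, 512), (256, 768), (512, 1024)]
print(per_frame_latency)  # {512}
print(round_trip)         # 1024
```

In this model the overlap only changes the start times; each frame still spans 512 samples from its first sample to the point where it can be transformed.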
Hi Tim, thanks for your answer.
"Windowing plays no role in the latency calculation."
Well, what I don't get is this: what's inside a 256-sample time frame with an overlap of 2 and 256 samples of zero padding is the same thing as what's inside a 512-sample frame with an overlap of 4 and no zero padding.
Then theoretically, using an overlap of 4 for a 512-sample time frame, it would be possible to buffer only 256 samples, zero-pad to 512, and apply a 512 FFT, which would give a latency of 256. I understand that it doesn't work that way in Max, but I would love to understand how it works IRL.
Another way to ask this: does it work the same way if we do a 512 FFT on two sets of 256 samples or on 512 samples?
In analysis, they say we lose frequency resolution and that zero padding only adds interpolation. What happens when you use this to manipulate FFT bins?
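The "zero padding only adds interpolation" statement can be checked numerically. The Python sketch below uses a naive DFT and small sizes (16 padded to 32) so that it runs quickly; the same identity holds for 256 padded to 512. The test signal is arbitrary. It only covers the analysis side and says nothing about what happens once the bins are modified and resynthesized.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform, enough for a small demonstration."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

N = 16
# Arbitrary test signal that does not fall exactly on a bin.
frame = [math.sin(2 * math.pi * 3.3 * t / N) for t in range(N)]
padded = frame + [0.0] * N          # zero-pad from N to 2N

spec_short = dft(frame)             # N bins
spec_padded = dft(padded)           # 2N bins

# Every even bin of the padded spectrum equals a bin of the unpadded one.
max_diff = max(abs(spec_padded[2 * k] - spec_short[k]) for k in range(N))
print(max_diff < 1e-9)              # True
```

The even bins carry the same values as the shorter transform, and the odd bins are interpolated values in between, computed from the same 16 input samples.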
Hi Nicolas,
I'm not 100% sure I'm following your question. Perhaps a Max patcher with concrete example would help?
Since we're talking about latency, I assume you are doing the operation in real time. In real time all 256-bin FFTs will have a minimum of 256 samples of latency. All 512-bin FFTs will have a minimum of 512 samples of latency.
The number of overlaps does not affect this; it only affects when the collection of samples to process begins.
Cheers