Gen PhaseVocoder Sound Quality issues

geddonn's icon

Hi guys,

I've been trying to port the PhaseVocoder example from Tutorial 26 over to Gen to join all of my other Gen FFT patches.

The problem I've run into is that the versions of phaseaccum~ and phasewrap~ that I copied from the Gen pfft patches in the Examples folder seem to give a very low-quality sound with blurry transients.

Here is a patch that does a direct comparison of these processes in Gen and in MSP. Notice how the MSP patch sounds like the unchanged input, while the Gen patch sounds all messed up.

Is this a problem with Gen? Or a special feature of the MSP objects to prevent these artifacts?
Is there anything I can do to retain the quality of the sound coming from MSP in Gen land?

Thanks

genPhaseVocoder.zip
Peter McCulloch's icon

Your vectorsize parameter (the signal vector size for Max) is not necessarily the same as the FFT size (1024), so there's a mismatch between the two rates and your framedelta is not syncing up with pfft's windowing.

When these two values are equal, it seems to work fine, so you probably need to pass the FFT size into gen~. You could get this from fftinfo~.
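To illustrate the mismatch, here is a rough Python sketch (not gen~ code). It assumes phase values arrive one bin per sample, frame after frame, and that the frame difference is taken with a plain sample delay. `FRAME` and the test signal are made up for the example; which size is the right one in a given pfft~ setup is exactly what would come from fftinfo~.

```python
def frame_delta(stream, delay):
    """Subtract the value `delay` samples earlier from each sample (zeros before the start)."""
    return [x - (stream[i - delay] if i >= delay else 0.0)
            for i, x in enumerate(stream)]

FRAME = 8  # hypothetical number of bins per frame in the stream

# Bin k advances its phase by k * 0.1 every frame; frames are laid end to end.
stream = [frame * k * 0.1 for frame in range(4) for k in range(FRAME)]

good = frame_delta(stream, FRAME)      # delay equals the frame length
bad = frame_delta(stream, FRAME // 2)  # delay taken from some unrelated size

# What each bin should report: its own per-frame phase advance.
expected = [k * 0.1 for k in range(FRAME)]
```

With the matching delay, each bin is compared with itself one frame earlier. With the wrong delay, each bin is compared with a different bin, so the result no longer describes any bin's phase advance.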

Also, you might want to be more explicit in your settings for the gen~ delays, by specifying @feedback 1 @interp step. I think the default for feedback is off, so that could really muck things up if that's the case...

geddonn's icon

Hi Peter.

Thanks for your tips.

I've implemented them. Thankfully Gen has an fftinfo object too, so there's no need to pipe it in from MSP land.

The sound quality is better, but the transients are still quite blurry whereas the transients in the MSP version are crisp, very strange.

genPhaseVocoder-2.zip
Peter McCulloch's icon

I think the problem lies in the feedback on the input (where framedelta~ is in the MSP example). You don't need feedback here, since you're taking the derivative, not the integral (i.e. the difference between successive frames, not a running sum). The feedback is causing a lowpass filtering of sorts on that input, hence the fuzziness. Once you remove that, it seems to work better. I haven't seriously A/B'ed it, but it seems to be the same.

Also: I think (?!) you should be using the spectral frame size rather than the frame size for the delay.

There shouldn't be any special magic in terms of frameaccum~/framedelta~. You can always test this against your code, though, and I'd encourage you to do so.
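The derivative/integral distinction can be sketched in plain Python as a way to test it outside Max. This is only a model of what framedelta~ and frameaccum~ are understood to do per bin, not their actual implementation; the function names and the phase values are invented for the example.

```python
def frame_delta(frames):
    """Per-bin difference between each frame and the previous input frame (no feedback)."""
    prev = [0.0] * len(frames[0])
    out = []
    for frame in frames:
        out.append([x - p for x, p in zip(frame, prev)])
        prev = frame  # remember the input, never the output
    return out

def frame_accum(frames):
    """Per-bin running sum; this is the stage that does need feedback."""
    total = [0.0] * len(frames[0])
    out = []
    for frame in frames:
        total = [t + x for t, x in zip(total, frame)]
        out.append(total)
    return out

# Three bins whose phases advance by 0.0, 0.5 and 1.0 per frame.
phases = [[0.0, 0.0, 0.0], [0.0, 0.5, 1.0], [0.0, 1.0, 2.0], [0.0, 1.5, 3.0]]

deltas = frame_delta(phases)   # constant per-bin advance after the first frame
rebuilt = frame_accum(deltas)  # summing the differences restores the phases
```

In this model the only state the difference stage keeps is the previous input frame, while the running sum feeds its own output back in.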

Max Patch
Copy patch and select New From Clipboard in Max.

Here's the gen~ patch that I'm using:

geddonn's icon

Aha, that's perfect. I wonder if the Spectral Delay patch in the example folder might sound less blurry with this version of framedelta.

I've done a few modifications to play around with this sound, getting some quite interesting effects over here.

Max Patch
Copy patch and select New From Clipboard in Max.

Here's the Gen patch:

Graham Wakefield's icon

Strictly speaking the @interp none/step shouldn't change the sound, so long as the delay times coming in are integers (should be). But @interp none/step will probably be more efficient; I'll change the gen examples accordingly. (For anyone reading, @interp none and @interp step are synonyms, they have the same effect.)

I'm not sure if @feedback 1 is really needed either; this is the default for [delay]. Specifying [delay @feedback 0] has two effects: 1) it does not allow direct signal feedback in the patcher, and 2) it allows delay times less than 1 sample. In a pfft context, however, a 1-sample delay equates to a frequency shift by 1 bin. If no frequency shifting is desired, the minimum practical delay time depends on the FFT frame size.
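A small Python sketch of why a 1-sample delay acts as a bin shift, assuming spectral data is streamed one bin per sample, frame after frame. The frame length and the values are arbitrary, chosen so each number shows which frame and bin it came from.

```python
def delay_samples(stream, n):
    """Delay a stream by n samples, padding the start with zeros."""
    return [0.0] * n + stream[:len(stream) - n]

def to_frames(stream, size):
    """Cut a flat stream back into frames of `size` bins."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]

FRAME = 4  # hypothetical bins per frame
frames = [[10.0, 11.0, 12.0, 13.0], [20.0, 21.0, 22.0, 23.0]]
stream = [value for frame in frames for value in frame]

# One-sample delay: every value lands one bin higher than where it started.
shifted_one = to_frames(delay_samples(stream, 1), FRAME)

# Delay of a whole frame: values stay in their own bin, one frame later.
shifted_frame = to_frames(delay_samples(stream, FRAME), FRAME)
```

Here `shifted_one[1]` comes out as `[13.0, 20.0, 21.0, 22.0]`, so bin 1 now holds what was in bin 0, while `shifted_frame[1]` is simply the first frame again.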

I've found that using framedelta/frameaccum is 'sharp' as long as there is no time/freq shift, but once shifted it becomes blurry, and removing the shift does not remove the blurriness. Presumably this is because although the phase derivatives (angular momenta?) of the bins agree, the absolute phases are no longer in sync. I confess I don't totally understand it though...
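As a toy illustration of that guess (not a verified explanation of the blurriness), the Python sketch below accumulates the same phase differences from two different starting states. The starting offset of 1.0 is an arbitrary stand-in for whatever state the accumulator is left in after a shift.

```python
def accumulate(deltas, start=0.0):
    """Running sum of phase differences, beginning from `start`."""
    total = start
    out = []
    for d in deltas:
        total += d
        out.append(total)
    return out

def successive_differences(values):
    """Differences between neighbouring values."""
    return [b - a for a, b in zip(values, values[1:])]

deltas = [0.3] * 5  # identical per-frame phase advance in both cases

reference = accumulate(deltas, start=0.0)  # accumulator that was never disturbed
disturbed = accumulate(deltas, start=1.0)  # same deltas, different leftover state

# Frame-to-frame differences agree, but the absolute phases never meet again.
offsets = [d - r for r, d in zip(reference, disturbed)]
```

The differences match in both runs while the offset stays at 1.0 for good. If different bins were left with different offsets, their relative phases would stay out of step even after the shift is removed.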

toydrones's icon

Over ten years later this thread is still relevant!

I’ve been having a similar issue trying to port the tutorial phase vocoder patch into RNBO. Trying out the patches posted here both in RNBO and in the tutorial’s pfft~ produced the same (or near-identical) results for me: clean resynthesis of the original signal when the stretch factor is 1, but unwanted phasiness and blurring when the stretch factor != 1.

I’ve outlined this issue in more detail here. If anyone has any guidance, I’d greatly appreciate it.