Monophonic PitchShift engines benchmark / (PSOLA in Gen ?)

    Jan 10 2014 | 4:05 am
    Hi, in my musical research project we are designing an electric violin with a separate pickup mic for each string (all C strings). Each signal is sent into its own pitch shifter (a different shift for each string) to build nice harmonies in Max from the original violin expression played by a prestigious violinist… Knowing that the input sound is MONOPHONIC (only one violin string is played at a time per sound channel), we are looking for the best possible monophonic pitch shifter for that purpose:
    - The SHORTEST latency possible.
    - The CLEANEST sound.
    - If possible, manual formant control.
    - A range of basically 1 octave down to 1 or 2 octaves up (from -3 to +5 octaves is welcome too ;-) ).
    We created this benchmark patch to get a clear idea of the behaviour of the different real-time pitch-shifting algorithms. (One of them, supervp.trans~, is not free; it is available bundled with a couple of other externals on the Ircam website.)
    Take a look at the patch (a sample rate of 48 kHz and all vector sizes set to 128 or 256 are recommended, with the scheduler in overdrive!!)
    Our goal is to transpose in real time a monophonic audio source coming from a violin. We are strongly concerned about latency, since the instrument is played in real time. We came to the conclusion that we can't use FFT-based algorithms, because the FFT introduces too much latency. SuperVP sounds lovely, with amazing formant-style parameters and sound-material remixes, but we just can't afford that much delay. (In some cases you can set a short FFT window and get less latency… but then you lose pitch precision and it sounds bad.)
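The trade-off described above can be put in numbers. The following is a minimal sketch, not a model of supervp.trans~ or any other specific object: it assumes the algorithm has to buffer at least one full analysis window before it can output anything, and the function name `fft_tradeoff` is made up for illustration. It reports the window duration (a lower bound on latency under that assumption) and the spacing between FFT bins at the 48 kHz rate recommended for the patch.

```python
def fft_tradeoff(window_size, sample_rate=48000):
    """Return (window duration in ms, FFT bin spacing in Hz) for one analysis window."""
    window_ms = 1000.0 * window_size / sample_rate
    bin_hz = sample_rate / window_size
    return window_ms, bin_hz


if __name__ == "__main__":
    # Compare a few common power-of-two window sizes.
    for size in (256, 512, 1024, 2048, 4096):
        ms, hz = fft_tradeoff(size)
        print(f"{size:5d} samples: {ms:6.2f} ms window, {hz:7.2f} Hz per bin")
```

Halving the window halves the buffering time but doubles the bin spacing, which is one way to see why a short window costs pitch precision on low notes.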
    The best option we have at the moment seems to be, rather logically, an algorithm dedicated to monophonic pitch shifting, the PSOLA algorithm:
    - In shifter~ from Tristan Jehan. Latency is less bad here, around 30 ms… But there is no formant control, and in this object the formants actually move with the change of pitch, which often makes the sound lose its brightness when shifting up…
    - In the open-source FTM bundle from Ircam (in our zip file), the formant parameter is just great!! …But here we have more latency, around 50 ms...
    After scanning the web we couldn't find any relevant code example (and we can't really say we're feeling confident with our C/C++ skills). Has anyone ever thought about porting PSOLA techniques to Gen, or could anyone give us some directions?
    (Note: as seen in this other topic about monophonic pitch shifters, we also tried a simple tapin/tapout pitch shifter, carefully adjusting its window size to try to avoid beats and modulation effects, but this only works more or less for octave pitch shifting; the rest of the time it's awfully inharmonic. Still, for octave pitch shifting, this solution has the shortest latency (though an anti-aliasing filter is still missing when shifting up).)
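For reference, the tapin/tapout approach mentioned here can be sketched offline as two crossfaded taps reading a delay line whose delay ramps across one window. This is a generic textbook-style sketch in Python, not the contents of the benchmark patch; `delay_line_pitch_shift` and its parameters are names chosen for illustration, and no anti-aliasing filter is included.

```python
import math


def delay_line_pitch_shift(signal, ratio, window_samples):
    """Pitch-shift `signal` by `ratio` using two crossfaded, ramping delay taps."""
    out = []
    phase = 0.0
    # A delay that changes by (1 - ratio) samples per sample scales pitch by `ratio`.
    step = (1.0 - ratio) / window_samples
    for n in range(len(signal)):
        y = 0.0
        for offset in (0.0, 0.5):  # the two taps sit half a window apart
            p = (phase + offset) % 1.0
            pos = n - p * window_samples  # fractional read position
            i = math.floor(pos)
            frac = pos - i
            a = signal[i] if 0 <= i < len(signal) else 0.0
            b = signal[i + 1] if 0 <= i + 1 < len(signal) else 0.0
            # Hann-shaped gain: zero where the tap wraps, and the two gains sum to 1.
            gain = math.sin(math.pi * p) ** 2
            y += gain * (a + frac * (b - a))  # linear interpolation
        out.append(y)
        phase = (phase + step) % 1.0
    return out
```

In this sketch the two taps are always half a window apart, so they stay in phase when half the window spans a whole number of periods of the input. Otherwise they drift against each other, which produces the kind of beating and modulation described above. The latency is on the order of the window length, since each tap's delay sweeps between zero and one window.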
    Also, as we are working with violin bass notes that are never lower than 115 Hz, I have the feeling that there must be a way either to optimise a PSOLA algorithm or to create an algorithm that reduces the latency as much as possible... an algorithm that, to my feeling, should not need more than twice the period of the fundamental: 17 ms...
    Thanks a lot! Alexandre
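The 17 ms figure follows directly from the lowest note. As a quick check (a sketch only; the helper name `periods_to_ms` is hypothetical), two periods of a 115 Hz fundamental last:

```python
def periods_to_ms(f0_hz, periods=2):
    """Duration in milliseconds of `periods` cycles of a tone at f0_hz."""
    return 1000.0 * periods / f0_hz


if __name__ == "__main__":
    print(f"{periods_to_ms(115.0):.1f} ms")  # prints "17.4 ms"
```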

    • Jan 10 2014 | 8:04 am
      Did you see this one in the gen examples folder:
    • Jan 10 2014 | 1:33 pm
      PSOLA in Gen ... Sounds like a challenging project !!
    • Jan 13 2014 | 11:28 pm
      Hey Mark, thank you for pointing out this gen example! Actually I had already tried it, and I was about to answer that it wasn't fluid, being similar to the sjt.doppler in my patch... But after integrating it into my patch to test its blur function (which is in fact useless in my case), and synchronizing the window size with the fundamental of the source pitch like I did for the sjt.doppler, it was quite fluid at any pitch shift! And I realised that my "synchronisation" in sjt.doppler wasn't done the right way!!
      So this gen example doesn't sound bad at all when the window size is correctly synchronised. And the latency is only a few milliseconds!
      I still need to test it more with my violin. I think in some cases PSOLA still sounds better... Also, I would need to put some low-pass anti-aliasing filtering before the pitch shift, depending on the shift amount.
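One way to pick the cutoff for that low-pass filter is sketched below. It rests on one assumption: the shifter scales every input frequency by the shift ratio, so anything above Nyquist divided by that ratio would land above Nyquist after shifting. The function name and the semitone parameter are illustrative only.

```python
def antialias_cutoff_hz(shift_semitones, sample_rate=48000):
    """Highest input frequency that stays below Nyquist after the shift."""
    nyquist = sample_rate / 2.0
    ratio = 2.0 ** (shift_semitones / 12.0)
    # Shifting down (ratio <= 1) cannot push content above Nyquist.
    return nyquist / max(ratio, 1.0)
```

At 48 kHz this gives 12 kHz for one octave up and 6 kHz for two octaves up; for downward shifts it simply returns the Nyquist frequency.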
      So while this gen pitch shifter is cool, I still lack formant control... Any idea how to add that to the algorithm?
      Here is the latest test patch. By the way, I put the externals and abstractions in the wrong folders the first time; here everything should work:
      (Also, I get a strange result when I try to modify my sjt.doppler algorithm in MSP so that it is exactly the same as the gen example: it doesn't sound the same! gen sounds better!?! …I'm asking about it in the gen forum.)
    • Jul 02 2014 | 10:33 pm
      Hey Alexandre,
      I'm a new Max user, trying to make my own live performance instrument, for which I need to pitch shift voices with as little latency and timbre modulation as possible. I am really thankful for your benchmark test of different pitch shifters. I spent all day studying it; it was really kind of you to upload that. I am wondering what technique you ended up using? I am using shifter~, for lack of gen skills. I came across this extraordinary video on YouTube, and I am really impressed. All the best, and thanks for posting your progress online.