best voice pitch analysis 2017

DailyM's icon

Hi all
I'm trying to find a tool for pitch analysis of monophonic audio of voices. I know there are lots of forum posts about pitch analysis tools, but many of them are quite old, or not necessarily ideal for what I'm hoping to achieve.

What I need:
-not necessarily real-time: it could analyze a buffer with up to a 5-7 second delay
-consistent: analyzing the same file should produce the same results
-optimized for voice

Eventually, the plan is to have changes in the spectrum trigger notes, so that an instrument's note attacks line up with new words or consonants.
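One common way to turn "changes in the spectrum" into note triggers is spectral flux: sum the bin-by-bin increases in magnitude between consecutive FFT frames and treat peaks in that sum as onsets. This is a minimal sketch of the idea, not what any of the externals in this thread do internally. It assumes you already have magnitude spectra as lists of floats, and the names `spectral_flux`, `pick_onsets`, `threshold` and `min_gap` are hypothetical.

```python
def spectral_flux(frames):
    """Half-wave-rectified spectral flux between consecutive magnitude frames.

    frames: list of equal-length lists of bin magnitudes (assumed input format).
    Returns one flux value per frame; the first frame has no predecessor, so 0.0.
    """
    flux = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        # Count only increases in energy: a new consonant or word adds energy.
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


def pick_onsets(flux, threshold, min_gap=3):
    """Return frame indices that are local flux peaks above `threshold`.

    min_gap is a hypothetical debounce: onsets closer together than this
    many frames are ignored so one attack does not trigger several notes.
    """
    onsets = []
    for i in range(1, len(flux) - 1):
        is_peak = (flux[i] > threshold
                   and flux[i] >= flux[i - 1]
                   and flux[i] > flux[i + 1])
        if is_peak and (not onsets or i - onsets[-1] >= min_gap):
            onsets.append(i)
    return onsets


if __name__ == "__main__":
    # Toy data: silence, then energy appears in two bins, then in a third.
    frames = [
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
        [1.0, 2.0, 0.0],
        [1.0, 2.0, 0.0],
        [1.0, 2.0, 0.0],
        [1.0, 2.0, 3.0],
        [1.0, 2.0, 3.0],
    ]
    print(pick_onsets(spectral_flux(frames), threshold=1.0))
```

Because the function is deterministic and works on a stored buffer, the same file always gives the same onsets, which fits the "consistent" and "not necessarily real-time" requirements above. The threshold would need tuning by ear for a given voice and microphone.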

I have tried a number of tools and haven't yet found the results I'm hoping for. It might be that I haven't found the right tool, or that I just don't know how to tune the many (many, many) parameters to suit my needs. I'm only beginning to wrap my head around FFT stuff, so it's probably the latter.

I've tried:
-sigmund~
-fiddle~
-analyzer~
-descriptor~

descriptor~ is appealing because it is relatively light computationally: you can be selective about what gets analyzed. It's also pretty confusing!

There are maybe some others I've forgotten about.
If I could get analysis results as good as when I drag an audio file onto an Ableton MIDI track, I'd be happy. Is one of these externals, or some other one, ideal for vocal analysis? And what settings should I use?



Jean-Francois Charles's icon

Make sure you try the pitch detection capabilities of [retune~] (comes included with Max 7, works on Mac & PC).

DailyM's icon

Thanks, Jean-Francois, but I'm still back on Max 6.

kleine's icon

Well, Max 6 is quite 2012, not 17... ;-)

Bill 2's icon

6 is far more legible, though.

DailyM's icon

Yes, haven't made the jump.

...anyhow I've been having pretty good results with analyzer~ with the following settings:
@buffersize 4096
@fftsize 8192
@hopsize 512
@windowtype hamming
(I arrived at these thinking this buffer size would be large enough to capture the bottom range of the human voice)
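To check the reasoning about the buffer size, here is the arithmetic as a small sketch. It assumes a 44.1 kHz sample rate (the post doesn't state one), that @buffersize is the analysis window length in samples, and that about 80 Hz is a reasonable bottom for a low male voice. All three are my assumptions.

```python
SAMPLE_RATE = 44100  # Hz; assumed, not stated in the post


def window_ms(n_samples, sr=SAMPLE_RATE):
    """Duration of n_samples in milliseconds."""
    return 1000.0 * n_samples / sr


def bin_spacing_hz(fft_size, sr=SAMPLE_RATE):
    """Frequency distance between adjacent FFT bins."""
    return sr / fft_size


def periods_in_window(f0_hz, n_samples, sr=SAMPLE_RATE):
    """How many full cycles of a tone at f0_hz fit in the window."""
    return f0_hz * n_samples / sr


if __name__ == "__main__":
    print(round(window_ms(4096), 1))                 # 92.9 (ms per analysis window)
    print(round(bin_spacing_hz(8192), 2))            # 5.38 (Hz between bins)
    print(round(window_ms(512), 1))                  # 11.6 (ms between analyses)
    print(round(periods_in_window(80.0, 4096), 1))   # 7.4 (cycles of an 80 Hz tone)
```

So under those assumptions a 4096-sample window holds roughly seven cycles of an 80 Hz fundamental, which is comfortably enough for a low voice, and a new analysis frame arrives about every 11.6 ms. Note that an FFT size larger than the window only zero-pads: it gives finer bin spacing but does not by itself separate partials any better than the 4096-sample window allows.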

the one parameter that I've found problematic is
@numpeakstofind:
For high voices, 7-10 peaks work well
For low voices, 2 or 3 is much better
Compromising at 5 just means it's not as good at either.

I've been experimenting with @npartials, which is supposed to change how the partials are weighted in the analysis, but I can't really hear a difference, maybe because I'm low-pass filtering before analyzing.

Any other hints for getting the best results?

alistairz's icon

I was looking at analyzer~ in Max 7 recently, and even though it may have been given extra functionality, it still works better in Max 6. Maybe someone would like to have a look at it: when the help file is loaded, some of the patch cords coming from the analyzer~ object itself disconnect…
Max 6 may still be stuck in 2012, but at least it's sturdier.

DailyM's icon

Okay, well, as noted, this thread should probably be titled best-voice-pitch-analysis-2012, but anyway, I'm getting okay results using analyzer~ in Max 6 now. It helped to compress the signal before sending it into analyzer~, and I added some math afterwards to try to convert the "attack" and "loudness" outputs into MIDI velocities. Still working on that part. Here it is: it takes audio in and sends MIDI note and velocity out.
feedback appreciated!

Max Patch
Copy patch and select New From Clipboard in Max.
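For anyone reading along without opening the patch, the loudness-to-velocity step mentioned above could look something like this outside of Max. It is only a sketch of one possible mapping, not the math in the patch: it assumes the loudness value is in dB, and `floor_db`, `ceil_db` and `curve` are hypothetical tuning parameters.

```python
def loudness_to_velocity(loudness_db, floor_db=-60.0, ceil_db=0.0, curve=1.0):
    """Map a loudness reading in dB (assumed unit) to a MIDI velocity 1-127.

    floor_db and ceil_db are hypothetical calibration points: readings at or
    below floor_db give velocity 1, readings at or above ceil_db give 127.
    curve > 1 makes quiet input quieter; curve < 1 boosts quiet input.
    """
    span = ceil_db - floor_db
    x = (loudness_db - floor_db) / span   # normalize to 0..1
    x = min(max(x, 0.0), 1.0)             # clamp out-of-range readings
    return max(1, min(127, round(1 + 126 * x ** curve)))


if __name__ == "__main__":
    for db in (-100.0, -60.0, -30.0, 0.0, 10.0):
        print(db, loudness_to_velocity(db))
```

Since the signal is compressed before analysis, the useful loudness range will be narrower than the defaults here, so the floor and ceiling would need adjusting to whatever range the analysis actually outputs after compression.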