Sinewave speech, formant tracking, filtercoeff~ in reverse, etc?

Feb 26, 2012 at 10:43pm

So, this is a bit of an open-ended inquiry, as I can already see a few different potential ways to approach the problem… Basically, I’m looking for a way to experiment with the “sinewave speech” effect described at…

http://www.haskins.yale.edu/featured/sws/sws.html

…which entails tracking formants (their approach is LPC-based, I believe) and producing a coarse approximation of speech from a handful of sine waves. Ideally, I’d like to end up with an abstraction that can be fed recorded speech and will generate (not necessarily in real time) a list of frequencies and amplitudes for each “track.” I understand that a tool like Praat (http://www.fon.hum.uva.nl/praat/) will give me either rendered audio or filter coefficients, but probably not the frequencies themselves? Can anyone speak to this?
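For reference, the resynthesis half of the effect is the simple part. Below is a minimal sketch in Python (not a Max patch), assuming the analysis has already produced one (frequency, amplitude) pair per track per frame. The function name `render_sinewave_tracks`, the frame layout, and the hop size are all hypothetical choices for illustration, not taken from the Haskins implementation.

```python
import math

def render_sinewave_tracks(frames, hop_size, sample_rate=44100.0):
    """Render a bank of sine oscillators from per-frame track data.

    frames: list of frames; each frame is a list of (freq_hz, amp) pairs,
            one pair per track, with the same track count in every frame
            (assumed layout).
    hop_size: number of output samples between consecutive frames.
    """
    if len(frames) < 2:
        return []
    n_tracks = len(frames[0])
    phases = [0.0] * n_tracks  # one running phase per oscillator
    out = []
    for cur, nxt in zip(frames, frames[1:]):
        for i in range(hop_size):
            t = i / hop_size  # position between the two frames, 0..1
            sample = 0.0
            for k in range(n_tracks):
                # Linearly interpolate frequency and amplitude across the hop
                freq = cur[k][0] + t * (nxt[k][0] - cur[k][0])
                amp = cur[k][1] + t * (nxt[k][1] - cur[k][1])
                sample += amp * math.sin(phases[k])
                # Accumulate phase so frequency changes do not cause clicks
                phases[k] += 2.0 * math.pi * freq / sample_rate
            out.append(sample)
    return out
```

Three or four tracks is the usual count for this effect; the output is a plain list of floats that would still need to be scaled and written to a sound file.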

Having searched the forums, I’m aware of the externals that exist for LPC-related work (Mark Cartwright’s LPC Toolkit, Gabor/FTM), but, LPC being what it is, they seem geared toward generating filter coefficients. That leads me to wonder whether there’s an object that acts like an inverse filtercoeff~. In other words, how might one derive a set of higher-level filter specifications (frequency, amplitude, Q) from a set of lower-level ones (coefficients)? Is anyone working along similar lines? If you are, please excuse the remedial nature of the question; I’m no DSP head.
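For a single two-pole section, the coefficient-to-specification direction can at least be worked out by hand from the pole positions. Here is a minimal sketch in Python, assuming the feedback coefficients follow the convention where the denominator is 1 + b1*z^-1 + b2*z^-2 (which I believe matches biquad~, but treat that as an assumption). The function name is hypothetical and not part of any Max external. The bandwidth formula is the usual narrow-resonance approximation, so it gets less accurate for heavily damped poles. Amplitude is not recovered here, since the peak gain also depends on the feedforward coefficients.

```python
import cmath
import math

def resonance_from_feedback(b1, b2, sample_rate=44100.0):
    """Estimate (frequency_hz, bandwidth_hz, q) of a two-pole section.

    Assumes the transfer-function denominator is 1 + b1*z^-1 + b2*z^-2.
    """
    # The poles are the roots of z^2 + b1*z + b2 = 0
    disc = cmath.sqrt(b1 * b1 - 4.0 * b2)
    pole = (-b1 + disc) / 2.0
    radius = abs(pole)
    if not 0.0 < radius < 1.0:
        raise ValueError("pole is not strictly inside the unit circle")
    # Pole angle maps to frequency; pole radius maps to bandwidth
    angle = abs(cmath.phase(pole))
    freq = angle * sample_rate / (2.0 * math.pi)
    bandwidth = -math.log(radius) * sample_rate / math.pi
    q = freq / bandwidth
    return freq, bandwidth, q
```

The same idea extends to an LPC frame: find the roots of the full prediction polynomial, keep one root from each complex-conjugate pair, and convert each root's angle and radius the same way. That needs a general polynomial root finder, which this sketch does not include.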

Finally, I’m aware that LPC analysis and formant tracking are not equivalent tasks. My real aim here is formant tracking, but ultimately, I’m going more for “awesomeness” than “correctness” and am more than willing to audition results from either column.

Thoughts?

#61998
Feb 27, 2012 at 10:30am

I’ve had good results using Miller Puckette’s sigmund~, which has a tracks output. Its help patch includes an example, and you can reduce the number of partials to just a few and still understand what a resynthesised voice is saying.

#223906
Feb 27, 2012 at 10:44am

Yeah, I second what Oli is saying there. It seems like sigmund~ does almost exactly what you want. There is a download link over here:

http://crca.ucsd.edu/~tapel/software.html

#223907
Feb 27, 2012 at 4:12pm

Of course! Yeah, I should be able to make this work with sigmund~. Thanks very much for the tip!

#223908
Feb 27, 2012 at 10:03pm

sigmund~ does indeed get me in the neighborhood of the intended results. Am I right to assume that the high-frequency “flutter” I’m getting across all tracks is the result of the algorithm being thrown off by the unpitched component of the speech? Reducing these artifacts by tweaking the parameters of sigmund~ is turning out to be more difficult than expected.

Of course, this is a pretty remedial question, so feel free to point me to relevant reading on the topic. Just trying to get a better understanding of what I’m working with.

#223909
