Emotion and Speech

Jul 22, 2009 at 3:30pm

Hi All,

I’m trying to find ways to automate a granular synthesis patch so that I can change the pitch / volume / tempo of a spoken sentence to make the speaker sound angry / happy / sad. In turn, I want to come up with some presets for each emotion.

I’ve been playing about with the function object but keep experiencing problems. Sometimes it automates the start of the sound file and then just stops. Is there a way of making the domain exactly the same as the length of the sample played through the buffer?

I’ve also been experimenting with the counter / metro objects. Is there a way of changing what the counter counts? Specifically, I want it to count upwards between 0.1 and 2, so that it begins 0.1, 0.2, 0.3, 0.4 etc…

Any other suggestions of how to automate would be much appreciated.

#44837
Jul 22, 2009 at 3:57pm

counter 1 – 20
|
* 0.1

will output the range you are after. the other bit about triggering a soundfile with function is difficult to understand without an example patch. but if you want a sample-accurate alternative to function, the help file for adsr~ is worth looking at.
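for anyone who finds it easier to read as text, here is a rough Python sketch of the same idea (not a Max patch): count whole numbers from 1 to 20 and multiply each by 0.1. the function name and the rounding step are my own additions, the rounding only being there to hide floating-point noise.

```python
def scaled_counter(lo=1, hi=20, step=0.1):
    """Yield lo*step .. hi*step, like an integer counter feeding a multiply."""
    for n in range(lo, hi + 1):
        # round() trims floating-point noise such as 0.30000000000000004
        yield round(n * step, 10)


values = list(scaled_counter())
print(values[:4], values[-1])  # [0.1, 0.2, 0.3, 0.4] 2.0
```

counting in integers and scaling afterwards avoids the drift you would get from repeatedly adding 0.1 to a float.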

#161409
Jul 22, 2009 at 4:15pm

thanks!

I’ve used ADSR before when synthesizing midi notes. Do you think this would work on a longer sentence like “The orange is round” ? Would you recommend using one ADSR for the entire phrase or several for each word or syllable?

#161410
Jul 22, 2009 at 5:00pm

With [function] you can program in multiple break-points per sound file, with [adsr~] you only get the three. As for making the [function] last as long as the sound file, you can get the length using [info~] (assuming you’re using [buffer~] to play your sound) and then use this value as the argument to the “domain” or “setdomain” message to [function].
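To show what matching the domain to the sound file means in practice, here is a small Python sketch (an illustration only, not how [function] is implemented). It assumes the breakpoints are stored with x normalised to 0..1 and stretches them to a buffer length in milliseconds. The name make_envelope and the 1500 ms duration are hypothetical; in the patch the length would come from [info~].

```python
from bisect import bisect_right


def make_envelope(points, duration_ms):
    """points: (x, y) pairs with x normalised to 0..1. Returns f(t_ms)."""
    # Stretch the normalised x positions to the full length of the sound.
    pts = sorted((x * duration_ms, y) for x, y in points)
    xs = [x for x, _ in pts]

    def value_at(t_ms):
        # Hold the first/last value outside the envelope's domain.
        if t_ms <= xs[0]:
            return pts[0][1]
        if t_ms >= xs[-1]:
            return pts[-1][1]
        # Linear interpolation between the two surrounding breakpoints.
        i = bisect_right(xs, t_ms)
        (x0, y0), (x1, y1) = pts[i - 1], pts[i]
        return y0 + (y1 - y0) * (t_ms - x0) / (x1 - x0)

    return value_at


duration_ms = 1500.0  # hypothetical buffer length
env = make_envelope([(0.0, 0.0), (0.25, 1.0), (0.75, 0.5), (1.0, 0.0)], duration_ms)
print(env(375.0), env(750.0), env(1500.0))  # 1.0 0.75 0.0
```

Because the last breakpoint always lands exactly on the end of the sound, the envelope cannot finish early and leave the rest of the file unautomated.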

lh

#161411
Jul 22, 2009 at 10:56pm

This seems very interesting.

You will want to use bonk~ with info~ to identify when words start in the sentence in relation to the buffer~. Presets for a whole sentence will be hard because the sentence would change with context, but you could have a preset that is applied to each word.
I would stick to function with many points but that’s because I’ve always used function and haven’t had a reason to use adsr~. But also depending on how detailed you would like to go – I think longer words may need more than 3 breakpoints.

#161412
Jul 23, 2009 at 10:14am

Thanks for this.

Are “bonk~” and “info~” objects? They don’t seem to work when I create them as such. Should they be used within the argument of the buffer?

Please help!!

#161413
Jul 23, 2009 at 10:22am

info~ is a native msp object, but needs to have an argument (the name of the buffer it will refer to) for it to instantiate – check out the help file.

bonk~ is 3rd party, made by Miller Puckette I think. Go to http://www.maxobjects.com and search for it.

If you have the max window open while you patch, it will tell you when things go wrong, like “missing arguments for info~”. It’s a big time saver.

BTW, when you download bonk~ the way to “install” is to drag/copy+paste the bonk~.mxo into Applications/Max5/Cycling74/MSP Externals folder. And place the bonk~.maxhelp into the MSP Help folder.

#161414
Jul 24, 2009 at 12:40pm

Hi thanks for your help.

How do I use bonk~ with info~ to find the word length? Which part of bonk~ do I connect to info~, or is it the other way round?

#161415
Jul 24, 2009 at 4:43pm

info~ won’t connect to bonk~. you would want to use info~ to get pitch / sample / loop length information from the buffer~ that you link it to by name. in your case, use info~ to retrieve the overall length of your recording.

bonk~ does live attack detection – it bangs when it detects an onset. if you have your voice recorded in a buffer~ you could connect bonk~ to something like timer / clocker and use coll in some way to collect your data for when the attacks/words occur. experiment with it.
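to make the timer + coll idea a bit more concrete, here is a rough Python sketch of the data you would end up with. it is not how bonk~ itself works: it uses a plain amplitude threshold as a stand-in for the onset detector, and the function name, threshold, window size and test signal are all made up for illustration. the returned dict plays the role of the coll (index, then onset time in ms).

```python
def detect_onsets(samples, sample_rate, threshold=0.2, window=4, min_gap_ms=50.0):
    """Return {index: onset_time_ms} for quiet-to-loud transitions."""
    onsets = {}
    was_loud = False
    last_ms = -min_gap_ms
    for start in range(0, len(samples), window):
        block = samples[start:start + window]
        # Peak level of this block, compared against a fixed threshold.
        loud = max(abs(s) for s in block) >= threshold
        t_ms = 1000.0 * start / sample_rate
        # An onset is a quiet-to-loud transition, not too soon after the last.
        if loud and not was_loud and t_ms - last_ms >= min_gap_ms:
            onsets[len(onsets) + 1] = t_ms
            last_ms = t_ms
        was_loud = loud
    return onsets


# Synthetic "speech": silence, a burst, silence, a second burst (1000 Hz rate).
signal = [0.0] * 100 + [0.8] * 100 + [0.0] * 200 + [0.6] * 100
print(detect_onsets(signal, sample_rate=1000))  # {1: 100.0, 2: 400.0}
```

the difference between consecutive onset times gives a rough word length, which is the number you would then use to scale an envelope for each word.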

a short time ago somebody posted a link to a new external that was kind of a cross between analyzer~ and a beat slicer. can’t remember who, but search the forums for mpeg7 – it might be of use/interest.

#161416
Aug 7, 2009 at 12:51pm

goodparleyandorfing wrote on Fri, 24 July 2009 10:43

bonk~ does live attack detection – it bangs when it detects an onset. if you have your voice recorded in a buffer~ you could connect bonk~ to something like timer / clocker and use coll in some way to collect your data for when the attacks/words occur. experiment with it.

Hi,

I’ve had a go at this but I’m really struggling with how to connect timer / bonk~ / coll together to detect the attacks for the words in my speech file. Any suggestions?

#161417
