
Cutting up an audio file automatically for help in transcribing speech

September 23, 2008 | 5:26 pm

Hey,

I normally work with Jitter, but I have a programming puzzle and I thought Max/MSP might be a good solution…

I’m trying to create a Max patch that cuts an audio file into a series of ~30-second chunks so I can transcribe it more easily. Since I need to hear complete words, I can’t split the file in the middle of one. Does anybody have any ideas on this?



Eli
September 23, 2008 | 6:19 pm

Is there a decent bit of silence (a second or so) in between the 30-second chunks?
If the silence between the chunks is moderately longer than the silence between words, you could have Max detect when the volume drops below a certain level. Then, if that "silence" lasts a certain length of time, cut the file there (end the recording to the file, or switch the output to another buffer).

I’m in class so this is just an outline; somebody else can explain the technicalities to you.
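
For anyone who wants to try the idea outside of Max first, here is a minimal Python sketch of the outline above: find runs where the level stays below a threshold for at least a minimum duration. The function name `find_pauses` and the default values for `threshold` and `min_pause_s` are assumptions for illustration, and the audio is assumed to be a plain list of floats in the range -1 to 1.

```python
def find_pauses(samples, sample_rate, threshold=0.02, min_pause_s=0.5):
    """Return (start, end) sample indices of quiet runs.

    A run counts as a pause when every sample in it has an absolute
    value below `threshold` and the run lasts at least `min_pause_s`.
    Both defaults are guesses and will need tuning per recording.
    """
    min_len = int(min_pause_s * sample_rate)
    pauses = []
    run_start = None  # index where the current quiet run began
    for i, s in enumerate(samples):
        if abs(s) < threshold:
            if run_start is None:
                run_start = i
        else:
            # A loud sample ends the quiet run; keep it only if long enough.
            if run_start is not None and i - run_start >= min_len:
                pauses.append((run_start, i))
            run_start = None
    # Close a quiet run that reaches the end of the file.
    if run_start is not None and len(samples) - run_start >= min_len:
        pauses.append((run_start, len(samples)))
    return pauses
```

Short dips below the threshold (for example at zero crossings inside a word) are discarded because they never reach the minimum length.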


September 23, 2008 | 6:25 pm

The audio itself isn’t organized in any way; the 30-second cut positions could land anywhere. That’s OK to start with, but I’d like it to scan ahead until there is a pause between words. That way I could give one chunk of audio to one person and a different chunk to another, then later put their transcriptions together without losing a word in the middle. I don’t really know how to do this in Max though, so if anyone could get me started I would really appreciate it.
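
To make the goal concrete, the "scan until there is a pause" rule can be sketched in Python, independent of Max. This assumes the pause positions are already known (as times in seconds, sorted ascending); `choose_cuts` and its parameters are hypothetical names, not part of any Max object.

```python
def choose_cuts(pause_times, total_s, target_s=30.0):
    """Pick cut times so each chunk runs at least `target_s` seconds
    and always ends on a pause.

    pause_times: sorted pause positions in seconds (assumed given).
    total_s: total length of the recording in seconds.
    """
    cuts = []
    next_target = target_s  # earliest time the next cut may happen
    for t in pause_times:
        if next_target <= t < total_s:
            cuts.append(t)
            # Measure the next chunk from the actual cut, not from the
            # nominal 30-second grid, so chunks never get too short.
            next_target = t + target_s
    return cuts
```

With pauses at 5, 28, 31.5, 40, 62, 70 and 95 seconds in a 100-second file, the cuts fall at 31.5, 62 and 95 seconds, so every chunk is a little over 30 seconds and none ends mid-word.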


September 23, 2008 | 7:57 pm

Not sure how you’re playing back the file and cutting it, but there may be third-party externals that help you detect when the amplitude drops below a certain level. Actually, now that I look again, there’s also the thresh~ object.

Otherwise (probably not the best solution, and there must be many others) you can use a simple setup with an object that opens/closes a gate, which in turn switches a file-recording process on and off. You’ll have to fine-tune this, because your file may not have absolute silence between words, so you may need something like the downward expander/gate within omx.comp~ (see its help file) to cut out low-level noise. You may also want to increase or decrease the amount of timed silence it takes to cut the recording, in case there’s a bit of silence within a single word or not much silence between words. In addition, you’ll need to sync the on/off of the playback with the on/off of the recording, so that every time you make a cut you can open a new file and start a new recording without losing information from the playback. Sorry for how convoluted my explanation is, but hopefully this gets you started:

– Pasted Max Patch –
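
As a rough illustration of the noise-gating step mentioned above, here is a very simple gate in Python. It is not the algorithm used by omx.comp~; it is just a decaying peak follower with a hard threshold, and `noise_gate`, `threshold` and `release` are names and values assumed for this sketch.

```python
def noise_gate(samples, threshold=0.05, release=0.99):
    """Zero out samples while the signal envelope is below `threshold`.

    The envelope is a peak follower: it jumps up to any louder sample
    and otherwise decays by the factor `release` per sample, so the
    gate stays open briefly after a loud sound instead of chattering.
    """
    env = 0.0
    out = []
    for s in samples:
        env = max(abs(s), env * release)
        out.append(s if env >= threshold else 0.0)
    return out
```

After gating, the quiet stretches are true zeros, which makes a later "is this silent?" test much easier to tune. A `release` closer to 1 keeps the gate open longer after each word.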

September 23, 2008 | 8:18 pm

Hello,

Is there a way to start recording automatically only when a certain threshold is reached?
Is there a way to automatically cut the silence from the beginning and the end of a live-recorded file?

Thanks

Luis Marques


September 23, 2008 | 10:34 pm

Hi, IcedDragon, it sounds like you are asking for the same basic thing, so you could use the same patch I posted but rework it so that recording starts automatically when the signal is above 0. This will cause an abrupt cut, though, with an audible click/pop at the beginning and end of the recording unless you also synchronize a fade-in/fade-out. You may also have to work in a noise gate to zero any background noise (audience, etc.) before the live recording. I’m not sure exactly what you have in mind, since it would almost be better to start and stop the recording manually, but this patch could help you get started as well. It is just a reworking of the patch I posted before, and you can rework it further to add the fade-in/fade-out. Again, there are probably better ways to do this, but this is the quickest/simplest way I could think of:

– Pasted Max Patch –
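
For the non-Max view of the same idea, this Python sketch keeps only the part of a recording between the first and last sample above a threshold, then applies a short linear fade at each end to avoid the click. The function names, the default threshold and the fade length are all assumptions for illustration.

```python
def active_region(samples, threshold=0.01):
    """Return the slice from the first to the last sample whose
    absolute value exceeds `threshold` (empty list if none does)."""
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]


def apply_fades(samples, fade_len):
    """Apply a linear fade-in and fade-out of `fade_len` samples each."""
    n = len(samples)
    fade_len = min(fade_len, n // 2)  # fades must not overlap
    out = list(samples)
    for i in range(fade_len):
        gain = i / fade_len  # ramps from 0 toward 1
        out[i] *= gain          # fade-in from the start
        out[n - 1 - i] *= gain  # mirrored fade-out at the end
    return out
```

A typical use would be `apply_fades(active_region(recording), fade_len)` with a fade of a few milliseconds' worth of samples; longer fades sound smoother but eat into the attack of the first note or word.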

September 23, 2008 | 10:36 pm

If, however, you are both looking for non-realtime solutions (non-realtime operations over audio recorded into a buffer~, for example), I would look into MXJ as a possible answer. This post could help you learn more about that:

http://www.cycling74.com/forums/index.php?t=msg&goto=150413


September 24, 2008 | 12:14 am

Well, there’s another option for you (I haven’t read the whole thread, I apologize). It’s not pretty, but it’d be fast and wouldn’t harm your end goal at all: instead of cutting at 30-second intervals, overlap the cuts. So cut 1 is 0:00 to 0:30, cut 2 is 0:25 to 0:55, cut 3 is 0:50 to 1:20, etc.
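
The overlap scheme is easy to compute ahead of time. Here is a small Python sketch that lists the start and end of each cut in seconds; `overlapping_chunks` is a made-up name, and the 30-second length and 5-second overlap are taken from the example above.

```python
def overlapping_chunks(total_s, chunk_s=30, overlap_s=5):
    """Return (start, end) times in seconds for chunks of `chunk_s`
    seconds that each overlap the previous one by `overlap_s`."""
    if overlap_s >= chunk_s:
        raise ValueError("overlap must be shorter than the chunk length")
    step = chunk_s - overlap_s
    chunks = []
    start = 0
    while start < total_s:
        end = min(start + chunk_s, total_s)  # last chunk may be shorter
        chunks.append((start, end))
        if end >= total_s:
            break
        start += step
    return chunks
```

For an 80-second file this gives (0, 30), (25, 55) and (50, 80), matching the cuts listed above. Any word split at the end of one chunk is heard whole at the start of the next, as long as the overlap is longer than the longest word.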


September 24, 2008 | 10:16 am

Yes, it’s a really easy way to do it; it’s only the main core. When I asked about starting and stopping automatically, I was talking about doing it with instruments like a guitar or a bass, not voice. But I have another question, about what happens after recording: cutting. Is there an easy way to automatically cut the silence in the recorded file, or at least everything below a certain threshold?

Luis Marques

