Live phoneme/speech recognition

damu's icon

I'm interested in using my voice to control parameters in Max, primarily to trigger drums live by beatboxing into a microphone, although there's tonnes of additional potential if the concept works.

It seems that any successful implementation of this would use machine learning to recognise the speech patterns and send a trigger. Op.recognise and Aka.listen are 32-bit only and no longer supported.

The technology has moved on a lot since then, and there are several open source libraries for speech recognition which are more than capable of achieving this. DeepSpeech and SpeechBrain are both well developed. There is also the Vochlea Dubler software, which looks excellent and demonstrates this functionality, apparently using machine learning. But it's closed source and I want to customise my own interfaces.

Has anyone tried doing this recently? Do you have any advice to offer? Or do you know of a Max project where this has been achieved using up-to-date technology, which I could look at?

Thank you.

damu's icon

Bump. Has no one been working on this at all? I'll have a go myself, but if there are some giants' shoulders I can stand on, that'd be even better :)

11OLSEN's icon

Hi, yes, I've worked on that, but I didn't get very far. I think transient detection is harder here than with real drum material. Those little "ss ss ss", "tz tz tz" hi-hat sounds and ghost notes in beatbox material sometimes don't have clear transients. But my attempts were a long time ago. Back then there was definitely no FluCoMa Max package, which might include everything you need to give it a try (and there are other Max packages for ML, audio feature extraction and so on).
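To illustrate the transient problem outside Max, here is a rough Python sketch of a naive energy-ratio onset detector. It is not how FluCoMa or any of the tools mentioned here work. The function names, the synthetic test tones, and the `ratio` and `floor` values are all assumptions picked for the demonstration.

```python
import math

def frame_energies(samples, frame_size=256):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def detect_onsets(samples, frame_size=256, ratio=4.0, floor=1e-2):
    """Return indices of frames whose energy is at least `ratio` times
    the previous frame's energy (clamped below by `floor`)."""
    energies = frame_energies(samples, frame_size)
    onsets = []
    for i in range(1, len(energies)):
        prev = max(energies[i - 1], floor)  # avoid dividing by near-silence
        if energies[i] > floor and energies[i] / prev >= ratio:
            onsets.append(i)
    return onsets

def burst(n, attack, sr=44100, freq=200.0):
    """Hypothetical test tone: a sine with a linear attack ramp of `attack` samples."""
    return [
        min(1.0, (i + 1) / attack) * math.sin(2 * math.pi * freq * i / sr)
        for i in range(n)
    ]

silence = [0.0] * 1024
kick_like = silence + burst(4096, attack=16)    # near-instant attack
hiss_like = silence + burst(4096, attack=4096)  # slow fade-in, no clear transient

kick_onsets = detect_onsets(kick_like)
hiss_onsets = detect_onsets(hiss_like)
print("kick-like onsets:", kick_onsets)
print("hiss-like onsets:", hiss_onsets)
```

With these settings the fast-attack burst is flagged once, while the slow fade-in is never flagged even though it ends up just as loud. That is roughly why soft "ss"/"tz" sounds tend to need spectral features or a trained classifier rather than a plain energy threshold.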

damu's icon

Thank you! Great suggestion. I've spent a while looking at the FluCoMa tools for this application and it's very promising.

This hour-long Music Hackspace tutorial is excellent.

If anyone else can suggest any other approaches, I would be most interested.

Ostin Solo's icon

Hi, contact me on Instagram, you can find me under the name ostin.solo. I have been developing something similar, which could be very useful for your research. I am looking for collaborators to finalise it.

Gabriele Strada's icon