How to make Wekinator recognize a sound in Max?

Thomas

Hi!

I once went to a conference/event/small gathering thing about different audio tinkering stuff, and there was this guy who had been messing with Wekinator and audio inputs. I remember he had made a voice-controlled version of Wolfenstein. I found a video demo here: http://picbear.com/media/1597832627128957439_4400702211
I'm trying to do something similar yet simpler. I have Max and Wekinator talking and it's working fine. I'm trying to figure out how to make Wekinator analyse the sound from a mic and then send back a 0 or 1, depending on what's going into the mic, e.g. whistling or growling like in the video. I don't know how to process the audio from the mic to make it into something that Wekinator can interpret.

Anyone have any good ideas? Thanks!

Zancudo

Hi Thomas,
Wekinator does not do audio analysis. You would need to do the analysis in Max and send the values to Wekinator. Have a look at the zsa.descriptors package.
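
To make "send the values to Wekinator" concrete, here is a minimal Python sketch of the OSC message involved. It assumes Wekinator's default input settings (listening on UDP port 6448 for the address /wek/inputs with one float per input); check your own project's settings if you changed them. The function names are made up for illustration. In Max itself the same message would normally come from something like [prepend /wek/inputs] feeding [udpsend 127.0.0.1 6448], so this is only meant to show what ends up on the wire.

```python
import socket
import struct


def osc_pad(data):
    """Null-terminate a byte string and pad it to a multiple of 4 bytes (OSC rule)."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, values):
    """Encode an OSC message carrying a list of floats (hypothetical helper)."""
    type_tags = "," + "f" * len(values)  # one 'f' tag per float argument
    payload = b"".join(struct.pack(">f", v) for v in values)  # big-endian float32
    return osc_pad(address.encode("ascii")) + osc_pad(type_tags.encode("ascii")) + payload


def send_features(values, host="127.0.0.1", port=6448):
    """Send one frame of features; host and port are assumed Wekinator defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(osc_message("/wek/inputs", values), (host, port))
```

Calling send_features([0.12, 0.80, 0.33]) once per analysis frame would correspond to a Wekinator project set up with three inputs. The number of floats has to match the number of inputs configured in the project.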

Thomas

Cool, I'll take a look at it. Thanks!

Carlo Cattano

In the ml.start package there is a frequency band splitter that you can use if analyzer~ is not working for you. Then just choose "dynamic time warping" in Wekinator and send values from the frequency bands.
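
For anyone wondering what dynamic time warping does with those band values, here is a rough Python sketch of the textbook algorithm: it compares two sequences of feature frames while letting one stretch in time relative to the other. This is not Wekinator's own code, which may differ in its details; the function names and the template dictionary are made up for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of equal-length feature frames."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def closest_template(templates, recording):
    """Return the label of the recorded example with the lowest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(templates[label], recording))
```

Each frame here would be one list of band energies (or MFCCs) per analysis step, and each template one recorded example of a gesture such as a whistle or a growl.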

mzed

zsa.mfcc~ would be my choice for audio feature extraction into Wekinator.

Carlo Cattano

@MZED great object, didn't know about it, thanks!

Thomas

I'll take a look at the zsa.mfcc~ and ml.start package. Didn't manage to get it to work yet, so I'll have another try when I get time. Thanks.

Thomas

I can't manage to give Wekinator anything it can use. I'm attempting to make Wekinator distinguish between 3 different samples, so that should be easy enough. I've managed to make Wekinator recognise streams of numbers with dynamic time warping, so I'm sure it works, but I can't get it to recognise sounds. I've tried with mfcc~ and mel~. I unpack the lists and send them to Wekinator (you can check the patch if you want). Maybe I'm doing something wrong with the zsa objects. Any pointers would be greatly appreciated! :)

Wekinator Simple In Out - Dynamic Time Warping.maxpat

Chris_DeCh

Thomas - I haven't looked at your patch but I think you need a bit more grounding in Wekinator. You need to train the model you select. I would highly recommend spending some time taking Rebecca Fiebrink's excellent course on Machine Learning for Musicians and Artists on Kadenze. Rebecca is the developer of Wekinator.

Additionally, I have a video that more or less demonstrates what you are trying to do, which can be found here: https://youtu.be/HcoFWN89BQw

It uses zsa.mfcc~ as the feature extractor in a Max program which sends the MFCCs over the network to Wekinator. In Wekinator I recorded about 20K data points of a flute, clarinet, trumpet, tympani and cello for model training. I used Finale to record all the training data as files to play back in my feature extractor program.

In Wekinator I tried training many different models, and k-NN (k-nearest neighbor) worked about the best for classifying the instrument timbres, with better than 90% accuracy. The test file I play in the video plays the Bach Minuet in G, one instrument at a time, and this was also done in Finale. My Max program, in addition to sending MFCCs to Wekinator, also receives the classifier values back from Wekinator over the network. The program lights up the class value and also graphs the probabilities of each class value. The probabilities are quick-and-dirty, quasi-Bayesian values from Wekinator.
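
As a rough illustration of the nearest-neighbor idea, here is a small Python sketch that classifies one feature vector by majority vote among its k closest training examples and reports the vote fractions as a crude confidence per class. It is only a sketch under those assumptions: Wekinator's own k-NN and its probability values may well be computed differently, and the function name and data layout are made up for illustration.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify query by majority vote among its k nearest training examples.

    train is a list of (feature_vector, label) pairs, e.g. MFCC frames
    labelled with an instrument name. Returns (label, vote_fractions).
    """
    nearest = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    label = votes.most_common(1)[0][0]
    # Share of the k neighbours voting for each class (a crude confidence value)
    fractions = {lab: count / len(nearest) for lab, count in votes.items()}
    return label, fractions
```

With real data, each training pair would be one MFCC frame from the recorded examples, and the query would be the frame currently arriving from the feature extractor.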

As the file plays you can see it very accurately identifies the flute, clarinet, trumpet, tympani, and cello. That is all it was trained on, so that's great. But I wanted to see if this basic training could be used to identify the family of the instrument accurately from just training on one of its members. It did better than I thought it would, around 50% or better. For me that means this technique is promising. With a little more training data, it should be able to identify all solo instruments in the orchestra at 90% or better.

Chris