creating a patch that can detect pitch

    May 07 2013 | 10:03 am
    Hi. I'm doing a patch for a school project.
    The concept is a game that can compare two audio samples (melodies) against each other, and based on how similar they are, give a score to the player. How do I approach this?
    I've found an external object called fiddle that can do pitch detection, but I don't know how to use it in my case.
    Any help would be greatly appreciated.

    • May 10 2013 | 1:38 pm
      Getting this working really well is likely to be trickier than a typical school project, but the following ideas may be good enough to get you going.
      - Using [fiddle~] you can convert each melody into a list of integers (MIDI values). You can either take values out of the data stream at regular fixed time intervals (say, every 100 or 500 msec — experiment to see what works better) or, alternately, you could just track changes in the MIDI values (round the output from fiddle~, which is in floats, to integer MIDI values and then use the [change] object).
      - At the end of the above you should have a list for each melody. Assuming they are of equal length (more on the length thing below), you can calculate a measure of distance. There are a lot of these in the literature; the simplest approach with a decent statistical pedigree is to sum the squared differences between corresponding values. [vexpr] may be your friend for this.
      - The above will return zero for a perfect match and larger positive values as the "difference" between the two lists increases. You may want to normalize the maximum value to scale your calculations to the unit range [0 .. 1]. Part of this would be scaling by the length of your lists, but you'll also need to calculate what the maximum "distance" can be. The latter will depend on assumptions you make about lowest and highest notes.
      - Now, what do you do if the melodies are of unequal length? You can either pad the shorter list with zeros (or some other value); you could trim the longer list (probably the worst approach); or you could implement some kind of stretching algorithm for the shorter list, interpolating intermediary values.
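      To make the steps above concrete, here is a rough sketch in Python rather than Max, just to show the arithmetic. Everything in it is an assumption for illustration: the function names are made up, the note lists stand in for whatever you collect from fiddle~, and the 36..96 MIDI range is only a guess at your lowest and highest notes. It uses the stretching option for unequal lengths.

```python
def drop_repeats(pitches):
    """Round float pitches to integer MIDI values and keep a value only
    when it differs from the previous one (roughly what [change] does)."""
    out = []
    for p in pitches:
        note = round(p)
        if not out or note != out[-1]:
            out.append(note)
    return out


def stretch(notes, length):
    """Resample a list to `length` items using linear interpolation."""
    if not notes:
        raise ValueError("cannot stretch an empty list")
    if length == 1 or len(notes) == 1:
        return [float(notes[0])] * length
    step = (len(notes) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(notes) - 1)
        frac = pos - lo
        out.append(notes[lo] * (1 - frac) + notes[hi] * frac)
    return out


def similarity(a, b, lowest=36, highest=96):
    """Return a score in [0, 1], where 1.0 means a perfect match.

    The distance is the sum of squared differences. It is normalized by
    the largest value it could take if every pair of notes were
    (highest - lowest) apart, which is an assumed pitch range.
    """
    n = max(len(a), len(b))
    a, b = stretch(a, n), stretch(b, n)
    dist = sum((x - y) ** 2 for x, y in zip(a, b))
    max_dist = n * (highest - lowest) ** 2
    return 1.0 - min(dist / max_dist, 1.0)


if __name__ == "__main__":
    target = drop_repeats([60.1, 59.9, 62.2, 64.0, 64.1])
    attempt = drop_repeats([60.0, 62.1, 65.3])
    print(target, attempt, similarity(target, attempt))
```

      With a wide assumed range the score stays close to 1 even for fairly poor matches, so you would probably want to narrow the range or rescale the score before showing it to a player.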
      I'm not sure how detailed your work is supposed to be, so I'm just throwing out some ideas for how you might approach this. There are further things you could try if you have time for more research. I know there are more recent (and sophisticated) approaches to pitch recognition from audio data, but I think fiddle~ (and friends, there are a few objects floating around that tweak the fiddle~ algorithm) is still the most real-time friendly.
      Hope this gives you some ideas, and good luck with your project!
    • May 10 2013 | 2:19 pm
      Yeah, this is tricky.
      fzero~ also does pitch, but you might want something richer: analyzer~ might work. If you can stand to be non-realtime, the Echonest object is also interesting: