Music Recognition and Matching
Hello,
I asked the question below back in 2016 and have decided to revisit this project. Since it is almost 10 years later, I was wondering whether any new possibilities have emerged. I have been looking at spectral recognition as one option, for example using Zsa.Descriptors. Any help or ideas would be greatly appreciated.
Original post:
I am trying to take music from a vinyl record and have the audio recognized and matched so that a digital video syncs to it in real time. I want to be able to put the needle anywhere on the record and have the video jump to that exact place too. The sync has to be very precise. I don't want to use timecode vinyl. I need to use just regular music records. I am initially trying to do this for a specific vinyl of my own music and a specific video that goes with the music.
I am more of a Jitter person from way back in the day and have been getting back into Max recently, but I am extremely rusty.
Any advice or direction on how to achieve this would be greatly appreciated.
Thanks
Way beyond my (in)expertise, but have you looked at the FluCoMa package? Lots of data and analysis tools in there.
Ok, cool. Thank you!
Could we get a bit more context?
What are you going to do with the vinyl? Are scratching or pitch changes being considered?
What's the nature of the source material? Does it contain repeating loops?
If the answer to the second question in either pair is yes, then I have doubts about the descriptor/FluCoMa approach. Maybe pitch changes or even scratching can be handled (not sure though), but if your audio contains repeating loops, I can't imagine how you would know where you are in time, given that you can put the needle anywhere.
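To make the loop concern concrete, here is a minimal Python sketch (not Max, FluCoMa, or MuBu code) of matching a short run of live descriptor frames against a precomputed reference. Everything in it is an assumption for illustration: the function names, the toy two-number "descriptor" frames, and the tolerance used to decide that two positions match equally well. Real descriptors would be noisy, so exact ties would become near ties, but the ambiguity is the same.

```python
def window_distance(reference, query, offset):
    """Sum of squared differences between query and the reference window at offset."""
    total = 0.0
    for i, q_frame in enumerate(query):
        r_frame = reference[offset + i]
        total += sum((r - q) ** 2 for r, q in zip(r_frame, q_frame))
    return total


def locate(reference, query, tolerance=1e-9):
    """Return (first_best_offset, candidates).

    candidates lists every offset whose distance is within `tolerance`
    of the best one; more than one candidate means the position is ambiguous.
    """
    n = len(reference) - len(query) + 1
    if n <= 0:
        raise ValueError("query is longer than reference")
    distances = [window_distance(reference, query, k) for k in range(n)]
    best = min(distances)
    candidates = [k for k, d in enumerate(distances) if d - best <= tolerance]
    return candidates[0], candidates


# Toy reference (hypothetical data): a 4-frame loop played twice, then unique material.
loop = [(0.1, 0.9), (0.4, 0.2), (0.8, 0.5), (0.3, 0.3)]
unique = [(0.9, 0.1), (0.7, 0.7), (0.2, 0.6)]
reference = loop + loop + unique

# Needle dropped inside the loop: two positions match equally well.
_, loop_candidates = locate(reference, loop[1:3])

# Needle dropped in the unique section: a single position matches.
_, unique_candidates = locate(reference, unique[0:2])
```

With this toy data the loop query comes back with two candidate offsets and the unique query with one. A longer query window only helps once it extends past the end of the repeated material, which costs latency after every needle drop.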
If you really want to go that way, it's probably worth checking IRCAM's MuBu package as well. It offers features similar to FluCoMa's, but with a quite different approach.
Out of curiosity, why don't you want to use timecoded vinyl? Assuming it can be interpreted in Max, it might be the most robust solution around. After a quick search, I found quite a few old discussions on that topic, and it seems that MS Pinky is still a viable solution.
Another solution that might be worth exploring is computer vision tracking: you put a red dot on top of the part holding the needle (at the very end of the tonearm), you put a camera above it, and after some calibration you can estimate your position in time from the red dot's position in space. It comes with its own range of constraints and difficulties, but at this point maybe no more than the full audio approach.
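A rough Python sketch of the calibration step, just to show the idea (it is not Jitter code, and the dot tracking itself is left out). The calibration numbers, the function names, and the choice of "distance from the spindle in pixels" as the measured quantity are all hypothetical; the sketch assumes you drop the needle at a few known times, note where the dot appears, and interpolate linearly in between.

```python
import math
from bisect import bisect_left

# Hypothetical calibration pairs:
# (distance of the dot from the spindle in pixels, seconds into the side).
# The distance shrinks as the record plays toward the label.
CALIBRATION = [
    (400.0, 0.0),
    (330.0, 300.0),
    (260.0, 620.0),
    (200.0, 900.0),
]


def dot_radius(dot_xy, spindle_xy):
    """Pixel distance between the tracked dot and the spindle centre."""
    return math.hypot(dot_xy[0] - spindle_xy[0], dot_xy[1] - spindle_xy[1])


def radius_to_time(radius, calibration=CALIBRATION):
    """Piecewise-linear interpolation from measured radius to playback time.

    Values outside the calibrated range are clamped to the nearest endpoint.
    """
    points = sorted(calibration)          # ascending radius for bisect
    radii = [r for r, _ in points]
    if radius <= radii[0]:
        return points[0][1]
    if radius >= radii[-1]:
        return points[-1][1]
    i = bisect_left(radii, radius)
    (r0, t0), (r1, t1) = points[i - 1], points[i]
    frac = (radius - r0) / (r1 - r0)
    return t0 + frac * (t1 - t0)


# Example: a dot measured halfway between the first two calibration points.
estimate = radius_to_time(365.0)
```

One constraint worth keeping in mind: at 33 1/3 rpm a revolution lasts 1.8 seconds, and the tonearm only moves inward by one groove spacing in that time, so the camera resolution limits how fine the time estimate can be. It may work best as a coarse position that something else refines.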
Another idea: did you think about scratching VHS tapes?