Fastest onset detection? (native or external)
I've used a couple of different things (spectral stuff) and have settled on this for all my onset detection.
It's peakamp~ based with some autothreshold adjustment stuff going on.
Anyone have anything faster or more accurate than this? (preferably native, but externals are good too)
This, ripped straight from Peter McCulloch's transient designer patch ( http://www.subtlesonic.com/envelopeshaper/ ), works well for me,
Cheers
Roger
That looks pretty good.
I'm trying to figure out an objective way to measure the difference in time between them but it doesn't seem to be working like I think it should be working (trying to use timer to tell the difference between attack and its detection).
Couldn't figure out how to measure the difference in response time between the two, but doing some less objective testing it seems like the volume differential version handles repeated fast attacks better. I need to try to figure out an autothreshold for the audio rate one so it doesn't need tweaking between input types.
edit: Hmm, the volume differential has a real hard time detecting the vibes example audio attack.
I usually use different rise and fall times in the envelope-following part, replacing the average~ in your patches (or the bucket, which is only averaging random-ish values from the speedlim) with the following stuff. I know it is not by-the-book, but it is much more responsive, and I can dynamically change the attack and the release of the detection...
p
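For anyone reading without the patch, here is a rough Python sketch of the abs/slide idea (not the actual patch). It assumes the smoothing formula from the slide~ reference, y[n] = y[n-1] + (x[n] - y[n-1]) / slide, with one slide value while the input is rising and another while it is falling. The function names and the default times are made up for illustration.

```python
def slide(signal, slide_up, slide_down):
    """Smoother modelled on slide~: y[n] = y[n-1] + (x[n] - y[n-1]) / slide."""
    out = []
    y = 0.0
    for x in signal:
        # Use the "up" value while the input is above the output, else "down".
        s = slide_up if x > y else slide_down
        y += (x - y) / max(s, 1.0)
        out.append(y)
    return out


def envelope(signal, attack=10.0, release=500.0):
    """Rectify (like abs~), then smooth with separate attack/release (in samples)."""
    return slide([abs(x) for x in signal], attack, release)


# A full-scale burst followed by silence: fast rise, slow fall.
burst = [1.0, -1.0] * 50 + [0.0] * 100
env = envelope(burst)
```

A short attack keeps the follower quick on transients, while the long release stops it from dropping between the cycles of the waveform.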
Not 100% sure I follow. So you're saying replace the average~ by your abs/slide, then still go into the slide that's there?
By doing this it works more or less exactly the same EXCEPT it actually tracks the vibes attack (whereas the average~ version does not).
I've dropped the bucket/speedlim thing as it's doing the same thing as the other onset detector but in the max scheduler.
Here's a comparison with what Roger posted and what I think you're suggesting.
I think it's pretty much a swings and roundabouts thing; average~ works better in some cases, slide~ in others, depending on what you set the threshold and range values to and what the source is.
For example, if you try the drumLoop, setting the threshold to 0.3 and the range to 0.9 dB, average~ seems to be better at picking up the little flams, but that's only good if that's what you want. If you just want the basic beat, you could say that it's more susceptible to false triggers, and slide~ gives better results.
However, if you try the same settings for the vibes loop, slide~ is the one giving false triggers, giving a double trigger on each onset!
I wouldn't know where to begin trying to automate the process - the thresholds are always going to depend on the nature of the source,
Cheers
Roger
I think the slide~ works best for my general purpose stuff.
Now I'm trying to extract a reasonable velocity value from the attack too, but I'm not finding a good place for this. Since this new way (slide~, or even average~) is audio rate, it happens faster than my Max scheduler version, which extracted a velocity.
Here is what I was doing before, but adapted to the slide~ onset detection. By the time the bang happens the velocity is lower than it peaked at.
I also tried sah~ to keep it audio rate, but the velocity value still isn't very good.
Any thoughts?
Don't suppose our brush with mayan apocalypse gave anyone any insight to extracting velocity from onset detection?
Yea but it's only accurate to 74 Haab'..
But not if I run the onset detection in the Haab scheduler.
I prefer this envelope follower for percussive signals, since you get the RMS from average~, but it preserves the transient characteristic and exponential decay of the sound:
average~ 100 rms
|
*~ 1.4132
|
slide~ 200 2500
Other tip: make two scope~s, one for the signal, and one for the envelope. Set one of them transparent, then overlay. This really helps with seeing how things work together. (you also may need to delay the input signal to account for the latency of the envelope follower)
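A rough Python sketch of that chain, for readers without Max open. It assumes average~ in rms mode behaves like a running RMS over a fixed window of samples, that the 1.4132 scaling is there to bring a sine's RMS back up to roughly its peak level, and that slide~ follows y[n] = y[n-1] + (x[n] - y[n-1]) / slide. The function names are made up.

```python
import math
from collections import deque


def running_rms(signal, window=100):
    """Running RMS over the last `window` samples (assumed model of average~ N rms)."""
    squares = deque()
    total = 0.0
    out = []
    for x in signal:
        squares.append(x * x)
        total += x * x
        if len(squares) > window:
            total -= squares.popleft()
        out.append(math.sqrt(max(total, 0.0) / window))
    return out


def slide(signal, slide_up, slide_down):
    """Smoother modelled on slide~, with separate up and down slide values."""
    out = []
    y = 0.0
    for x in signal:
        s = slide_up if x > y else slide_down
        y += (x - y) / s
        out.append(y)
    return out


def percussive_envelope(signal):
    """average~ 100 rms -> *~ 1.4132 -> slide~ 200 2500, as listed in the post."""
    scaled = [v * 1.4132 for v in running_rms(signal, 100)]
    return slide(scaled, 200.0, 2500.0)


# A full-scale sine with a 100-sample period settles close to 1.0.
sine = [math.sin(2 * math.pi * n / 100) for n in range(5000)]
env = percussive_envelope(sine)
```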
Please correct me if I'm wrong, but
thresh~ 0.5 5.
|
change~
|
>~ 0.
|
edge~
gives the same result as
thresh~ 0.5 5.
|
edge~
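One way to check this outside Max is a small Python model. It is only a sketch under assumptions: thresh~ is modelled as a hysteresis gate (goes to 1 at or above the high value, back to 0 at or below the low value), change~ as outputting 1 / -1 / 0 for rising / falling / unchanged input, and edge~'s left outlet as firing on every zero to non-zero transition. It ignores the fact that edge~ reports at most once per signal vector.

```python
def thresh(signal, low, high):
    """Hysteresis gate: 1 once the input reaches `high`, 0 once it falls to `low`."""
    state = 0
    out = []
    for x in signal:
        if state == 0 and x >= high:
            state = 1
        elif state == 1 and x <= low:
            state = 0
        out.append(state)
    return out


def change(signal):
    """1 when the input increased, -1 when it decreased, 0 when unchanged."""
    out = []
    prev = 0
    for x in signal:
        out.append(1 if x > prev else -1 if x < prev else 0)
        prev = x
    return out


def rising_edges(signal):
    """Sample indices of zero to non-zero transitions (edge~ left outlet)."""
    edges = []
    prev = 0
    for i, x in enumerate(signal):
        if prev == 0 and x != 0:
            edges.append(i)
        prev = x
    return edges


levels = [0, 1, 6, 6, 3, 0.4, 0, 7, 7, 0.2]
gated = thresh(levels, 0.5, 5.0)
direct = rising_edges(gated)
via_change = rising_edges([1 if c > 0 else 0 for c in change(gated)])
```

In this model both chains report the same onset positions. What would differ is the non-zero to zero transition (edge~'s right outlet): with change~ in the chain it follows the onset by one sample instead of marking where the gate closes.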
fwiw, fzero~ does some onset detection.
@Rodrigo: revisiting this thread in response to another post, how about peakamp~ for velocity?
At signal rate, you could use a running maximum that gets reset with each onset. One other tweak would be to look at the derivative of the onset for detecting the peak. When it switches to negative, patch the previous value. (this is premised on a percussive signal with a smooth envelope)
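Both ideas can be sketched in a few lines of Python. This is only an illustration working on a list of envelope values with known onset positions; the function names and the test data are made up.

```python
def peaks_by_running_max(envelope, onsets):
    """Running maximum that restarts at each onset; returns one peak per onset."""
    onset_set = set(onsets)
    peaks = []
    running = None
    for i, x in enumerate(envelope):
        if i in onset_set:
            if running is not None:
                peaks.append(running)  # report the peak of the previous hit
            running = x
        elif running is not None:
            running = max(running, x)
    if running is not None:
        peaks.append(running)
    return peaks


def peak_by_derivative(envelope, onset):
    """Walk forward from the onset; when the slope turns negative, keep the previous value."""
    for i in range(onset + 1, len(envelope)):
        if envelope[i] - envelope[i - 1] < 0:
            return envelope[i - 1]
    return envelope[-1]


env = [0.0, 0.2, 0.6, 0.9, 0.7, 0.3, 0.1, 0.4, 0.5, 0.45, 0.2]
onsets = [1, 7]
running_peaks = peaks_by_running_max(env, onsets)
derivative_peaks = [peak_by_derivative(env, o) for o in onsets]
```

The running maximum only knows the peak once the next onset arrives (or the signal ends), whereas the derivative version can report as soon as the envelope turns over, which is why it depends on the envelope being smooth.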
Both of those are good ideas. I'll give that a spin.
In the end (a couple of years ago!) I ended up keeping it signal rate and taking the value 180 samples after the onset is detected (I think I played a bunch to a sweet spot value).
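As a sketch of the "sample the level a fixed number of samples after the onset" approach, in Python rather than Max: the 180-sample delay is the value mentioned above, while the linear mapping to 0-127 and the full_scale parameter are assumptions for illustration.

```python
def velocity_at_delay(envelope, onset, delay=180, full_scale=1.0):
    """Read the envelope `delay` samples after the onset and map it to 0-127."""
    index = min(onset + delay, len(envelope) - 1)  # clamp near the end of the data
    level = max(0.0, min(envelope[index] / full_scale, 1.0))
    return round(level * 127)


# Ten samples of silence, then an envelope ramping up over 200 samples.
env = [0.0] * 10 + [i / 200 for i in range(200)]
velocity = velocity_at_delay(env, onset=10)
```

Sampling slightly after the onset gives the envelope time to reach something close to its peak, at the cost of that many samples of extra latency on the velocity value.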
I also built a drum-trigger type lockout so you can have 2 triggers on different surfaces and there won't be any cross triggering.
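The patch itself isn't reproduced here, so the following Python sketch is just one assumed way a lockout like this could work: once any trigger fires, every trigger is ignored for a short window. The 441-sample window (10 ms at 44.1 kHz) and the function name are made up.

```python
def apply_lockout(events, lockout=441):
    """Keep an onset only if no accepted onset occurred within `lockout` samples before it.

    `events` is a list of (sample_time, channel) tuples from all triggers.
    """
    accepted = []
    last_time = None
    for time, channel in sorted(events):
        if last_time is None or time - last_time >= lockout:
            accepted.append((time, channel))
            last_time = time
    return accepted


events = [
    (1000, "kick"), (1020, "snare"),   # snare picks up bleed from the kick hit
    (5000, "snare"), (5100, "kick"),   # kick picks up bleed from the snare hit
    (9000, "kick"),
]
clean = apply_lockout(events)
```

The obvious trade-off is that genuinely simultaneous hits on both surfaces collapse into one, so the window needs to be as short as the crosstalk allows.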
Here's the audio I used to test with (direct recording from DDrum triggers on my kick/snare using my soundcard).
Hi,
I looked at your example. I gather that when a p is in an object it's referring to another object file. What is "p onset_velocity" in your example? Would you be willing to share the complete patch that includes the "p onset_velocity" object? I mean, when I type "onset" within an object it's light brown, so I don't think there's such a thing as an "onset" object.
Further, do you have any example patches that are for live music? I would like to see if I could adapt them to send MIDI or OSC, for which I see some examples.
About myself, I'm new to Max and still don't completely get the inputs and outputs nor special characters. The examples I've seen can be like spaghetti. I find it difficult to pinpoint where all the lines are going. I wish there were an easier way to make sense of objects that have been smashed together.
Best regards,
Truth
If you double click "p onset_velocity" it will open up and show the guts. "p" just stands for "patcher" or a subpatch, which is embedded in the patch itself. An abstraction is typically named and needs to be in the search path.
In terms of sending MIDI or OSC, it depends on where you want to go. The examples above spit out a velocity based on the incoming audio. Here is a more cleanly labelled version of things that takes incoming audio (or whatever) at the top, then spits out a "bang" on the left, or a velocity (0-127) on the right. You can then use those to send MIDI or OSC messages to whatever program you'd like.
Thank you Rodrigo. That is awesome. I added MIDI out to it.
I copied and pasted them in order to make multiple outputs, so I wanted a switch to enable and disable them. First I tried a switch object, which didn't work, but then I found the gate object. The gate works, but I have to stop and start the speaker icon before Max recognizes the change. Is there a way to make the X toggle the gates on after the audio is on? For example, let's say it's all working. If I uncheck the X toggle the signal keeps triggering. But if I turn off the audio and then turn it back on then the MIDI triggering won't start. It's just that it's the audio being cycled from off to on that seems to recognize changes in the X toggle icons.
If I understand right, this should do the trick. Each one has a gate at the very end that you can close/open. If you turn off the audio (by turning off the speaker) that literally freezes everything where it was, so anything else happening (with audio) in the patch will stop too.
With this "gate at the end" method, the onset detection keeps happening, you just turn off the reporting of it.
Yes, I think you understand. I want the audio to stay on, uninterrupted. That works but I was also hoping to reduce CPU usage by disabling the ones not being used. It's okay though because it's very easy to quickly turn the audio off and then back on.
I copied the way you made the gate. Why are the loadmess 1 and the 1 1 in the gate object needed? Seems like idiosyncrasies to me as a newbie. I obviously need to read more and watch more tutorials. Anyways, now I have two toggles, one at the beginning and one before the bang.
I suppose that I can try to make something that cycles the audio off and back on when the gate is toggled. I wonder what the best way to do that would be?
It's generally good practice to NEVER turn audio off during a performance. If you want to save CPU you have to do other things (poly~ is what is normally used), but it's not a straightforward process. Either way, things like this use next to no CPU on a modern computer.
So don't think of the speakers as things you turn on or off at all.
The [gate 1 1] is a handy trick to make a gate with one outlet (the default), but having that one outlet be open, by default.
So [gate 1 (this one defines the amount of outlets) 1 (this one sets that outlet's state)].
The [loadmess 1] is there to turn the toggle 'on' when the patch opens. It's not technically necessary as the [gate 1 1] already does that, but it's also good practice to have UI elements reflect the actual state of the patch. (In this case, the toggle needs to show 'on' because the gate is actually open).
You shouldn't use two gates, as it's redundant. Just the one gate at the end is enough. Also using [gate~] to control audio isn't good because it cuts the audio without any fade out/in, creating clicking. In this kind of patch, that can actually trigger an onset, since it makes a loud thumping sound.
I totally know what you mean, about having these weird rookie things that (seem to) make no sense. But that's part of the learning process. Hope this helps!
So been working on this stuff again recently after a residency with @PATremblay and although it's working better (better differential calculation, an absolute noise floor, and a much more accurate velocity measure (PA's wonderful contributions!)) it's still not great for doing fast rolls and such.
I've been looking at Sensory Percussion (http://sunhou.se) triggers recently, and the accuracy/response time is absolutely ridiculous. From the webpage it's clear that it's not just a regular trigger as there's a magnet (hall sensor for onset detection?), and possibly a contact mic on the body too (http://help.sunhou.se/start/magnet.html). Now I'm sure that helps in getting a clean signal with little to no crosstalk, but what caught my attention the most (in wrestling with this onset detection stuff) is their thresholding/differential configuration.
In this setup video you can see the panel that handles this:
https://youtu.be/4LoUqu-QpVE?t=35s
So the bottom is an envelope follower and the top is the volume differential, but it appears to be calculated quite differently. As in, it doesn't appear to be "a fast envelope follower minus a slow envelope follower" going into a thresh~ type thing. Each onset produces (is produced by) a positive differential spike followed by a negative one.
This is most evident in this part of the video:
https://youtu.be/4LoUqu-QpVE?t=1m23s
So this 'resonance removal' applies a slew to the uptake of the volume differential.
So the system appears to be extracting very clean and very fast onsets from a percussive signal, using a type of differential/thresholding that I've not come across before. A lot of this I'm sure comes down to the hardware side of things, particularly since the actual differential could be being calculated in a different manner, with the envelope follower being there purely for amplitude tracking.
Any thoughts on what's going on in the video?
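For comparison, the conventional "fast envelope follower minus a slow envelope follower going into a thresh~ type thing" can be sketched in Python like this. It is not what Sensory Percussion does (that is unknown), and every number here (slide times, the 6 dB trigger level, the 1 dB re-arm level, the -120 dB floor) is an arbitrary assumption.

```python
import math


def slide(signal, slide_up, slide_down):
    """Smoother modelled on slide~, with separate up and down slide values."""
    out = []
    y = 0.0
    for x in signal:
        s = slide_up if x > y else slide_down
        y += (x - y) / s
        out.append(y)
    return out


def to_db(amplitude, floor_db=-120.0):
    """Linear amplitude to dB, clamped to a floor so silence does not give -inf."""
    if amplitude <= 0.0:
        return floor_db
    return max(20.0 * math.log10(amplitude), floor_db)


def detect_onsets(signal, threshold_db=6.0, rearm_db=1.0):
    """Trigger when (fast envelope - slow envelope) in dB crosses the threshold."""
    rectified = [abs(x) for x in signal]
    fast = slide(rectified, 5.0, 200.0)
    slow = slide(rectified, 400.0, 400.0)
    onsets = []
    armed = True
    for i, (f, s) in enumerate(zip(fast, slow)):
        differential = to_db(f) - to_db(s)
        if armed and differential >= threshold_db:
            onsets.append(i)
            armed = False  # wait for the differential to fall before re-triggering
        elif not armed and differential <= rearm_db:
            armed = True
    return onsets


def burst(length=2000, decay=300.0):
    """A decaying full-scale burst, standing in for a drum hit."""
    return [((-1) ** n) * math.exp(-n / decay) for n in range(length)]


signal = [0.0] * 1000 + burst() + burst()
onsets = detect_onsets(signal)
```

With this kind of differential the re-arm only happens once the fast follower has decayed back towards the slow one, which is one reason fast rolls are hard to catch.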
the efficiency of the system in this video is fairly incredible, I'd like to know too how it works...
Indeed!
I had bumped up a super old thread on a median filter and gotten some improved results, and notably the differential now looks more like theirs (than using two regular envelope followers), but it still needs quite a bit more work:
Hi Rodrigo,
I was looking for an alternative to Bonk~ for Win 8 64-bit. Your patch works fine for me. Just wanted to let you know that the 2nd argument of your pak object is an integer, so it doesn't work the way you want, and I've just added a simple signal comparison at the top of the patch that enables/disables a simple gate system, which does the job of stopping onset reporting when no audio is going through. Thanks for sharing.
so many people are always using an a-to-db conversion when they do dynamics stuff. i wonder why? i always work with control signals of 0. to 1. after demodulating the input audio.
I imagine it's more about the more perceptually accurate measure of change in differential rather than actual numerical range. (then again, the initial patch I started building from (Peter's) had the atodb in there and I just kept rolling with it)
My thought was (probably) that the meaning of a +/- 0.01 difference matters more at small amplitudes than at large ones, hence the conversion to decibels. For just extracting the amplitude envelope (to apply to something else), I don't usually do the conversion, but YMMV.
Here's a great paper on strategies for building digital compressors that I have recently found helpful:
well if you use 0. 1. linear gain, you'd use multiplication of course.
let me rephrase my comment: i don't even understand the purpose of the atodb signal object.
unlike when using mtof~ i don't see where you would like to modulate (or produce) a signal which represents gain or relative gain difference in db/A.
Atodb~ is so you can do your gain calculations and smoothing in dB before converting back to linear gain through dbtoa~. Not all tasks require it, but it makes some calculations much simpler.
I think of atodb~ as being more similar to ftom~ than it is to mtof~; it converts "natural" values into units that relate to our working units and perception (am thinking octaves rather than semitones on this, but the value of working in a log domain is still helpful).
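To put numbers on the log-domain argument, here is a small Python sketch using the standard amplitude/decibel formulas (dB = 20 * log10(a) and a = 10^(dB / 20)). The -120 dB floor for zero input is an assumption added to avoid taking the log of zero.

```python
import math


def atodb(amplitude, floor_db=-120.0):
    """Linear amplitude to decibels; the floor for zero input is an assumption."""
    if amplitude <= 0.0:
        return floor_db
    return 20.0 * math.log10(amplitude)


def dbtoa(db):
    """Decibels back to linear amplitude."""
    return 10.0 ** (db / 20.0)


# The same +0.01 step in linear amplitude is a very different change in dB:
quiet_step_db = atodb(0.02) - atodb(0.01)   # about 6 dB
loud_step_db = atodb(0.91) - atodb(0.90)    # about 0.1 dB

# Halving the level is the same drop in dB wherever you start:
half_from_loud = atodb(0.5) - atodb(1.0)
half_from_quiet = atodb(0.01) - atodb(0.02)
```

So a fixed threshold on a dB differential reacts to the same relative jump at any playing level, whereas a fixed threshold on a linear differential needs retuning for quiet versus loud sources.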
sometimes i think i am optimizing too much.