Any leads for analyzing audio à la the VDMX audio analyzer?

Julien Bayle's icon

Hi there,
I've used the VDMX audio analyzer a couple of times (http://vidvox.net), but now I'd like to "translate" it into the MSP world.

This involves a multiband frequency analysis and, I guess, converting the mean signal amplitude of each band into a control value...

Is there something built-in around here?
I explored bonk~ and the others, but I wasn't satisfied with the multi-band analysis...

I guess I cannot do that using gen~ because it is sample-based and the analysis requires time windows...

any leads would be nice.

best,
julien

pure's icon

fffb~ ?

Julien Bayle's icon

sounds like the right object.
I'm so happy to understand some parts very well & to still be discovering objects I've never used...

I'd probably have to pre-filter my signal to get precise bands.
Then I'll take the mean of the values in each band,
then convert it to a scaled float...
I guess that can work.
Posting more as soon as it works!
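
A rough sketch of that plan in Python/NumPy rather than a Max patch (the band edges, filter order, block size, and normalization here are arbitrary illustration values, not anything from the thread):

```python
import numpy as np
from scipy.signal import butter, sosfilt

sr = 44100
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 110 * t) + 0.2 * np.sin(2 * np.pi * 3000 * t)  # test tone

# one band of a hypothetical 4-band split: 80..250 Hz bandpass
sos = butter(4, [80, 250], btype="bandpass", fs=sr, output="sos")
band = sosfilt(sos, x)

# mean absolute amplitude per 1024-sample block...
block = 1024
means = np.abs(band[: len(band) // block * block]).reshape(-1, block).mean(axis=1)

# ...converted to a scaled float, here simply normalized by the loudest block
control = means / means.max()
print(control[:8])
```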

pure's icon

well, you will find out through trial and error what yields the best results, but I don't see the need for pre-filtering because fffb~ is a filter: you can freely define the frequency bands you want to look at. I'd make them adjustable so that you can scan for the frequencies with the most meaningful values. you might find the bands much faster if you stick a spectroscope in front of it (that's what I always do in Ableton Live when I need to EQ stuff)...
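
For illustration, here is that "stick a spectroscope in front of it" idea as an offline Python scan; the candidate band edges below are made up, just to show the kind of per-band energy check pure describes:

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 110 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)  # test signal

spectrum = np.abs(np.fft.rfft(x)) ** 2
freqs = np.fft.rfftfreq(len(x), 1 / sr)

edges = [0, 200, 1200, 5000, 20000]  # candidate band edges in Hz
for lo, hi in zip(edges[:-1], edges[1:]):
    energy = spectrum[(freqs >= lo) & (freqs < hi)].sum()
    print(f"{lo:>5}..{hi:<5} Hz: energy {energy:.3e}")
```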

Julien Bayle's icon

pure,
okay about that.

In my case, the analysis will drive some video effects according to the spectral content of my music.
This music is almost all synthesized (with M4L devices or native Ableton Live ones, btw) and made specifically for the video.

I'm looking for the best way to do my analysis.
I'd like to do this:
- define 4 bands for each audio piece
- calculate, in real time, a significant amplitude value for each band

maybe the value could be a mean, or something normalized... I don't know (see the sketch after this post).

[Max patch attached: copy it and select New From Clipboard in Max.]

the patch above isn't even a prototype yet.

maybe it would be nice to push that calculation into matrices with jit.poke~ etc.
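
On the "mean, or something normalized" question above: a tiny Python comparison of three candidate per-block values. The comments name the MSP objects they loosely correspond to (loosely, since each object has its own reporting scheme):

```python
import numpy as np

rng = np.random.default_rng(0)
block = rng.normal(0, 0.3, 1024)  # stand-in for one block of a band signal

mean_abs = np.abs(block).mean()          # roughly what avg~ reports
rms      = np.sqrt((block ** 2).mean())  # roughly average~ in rms mode
peak     = np.abs(block).max()           # roughly what peakamp~ reports

print(mean_abs, rms, peak)  # rms sits between the mean and the peak
```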

Julien Bayle's icon

jit.catch~ & https://cycling74.com/docs/max6/dynamic/c74_docs.html#jitterchapter48 are currently helping me a lot...
more info soon

Roman Thilenius's icon

you wouldn't need windowing... you would just accumulate the signal to RMS it.

if you don't need specific crossover frequencies and/or linear bands, making an analyzer with an FFT is very easy and low on CPU.

when using filters such as fffb~ or biquad~, downsampling by x16 should be absolutely enough for making control data for video and similar applications.

-110
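
A minimal Python sketch of that accumulate-and-RMS idea, no FFT windowing involved (the 16-sample accumulation just mirrors the x16 figure above; a real patch would likely average over far more samples):

```python
import numpy as np

sr = 44100
x = np.sin(2 * np.pi * 55 * np.arange(sr) / sr)  # test signal

decim = 16
acc, controls = 0.0, []
for i, s in enumerate(x):
    acc += s * s                       # accumulate squared samples
    if (i + 1) % decim == 0:           # emit control data at 1/16th rate
        controls.append(np.sqrt(acc / decim))
        acc = 0.0

print(len(controls), controls[:4])
```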

Julien Bayle's icon

Hi Roman and thanks for your answer.
I still don't have a nice result.

I played a bit with Emmanuel's zsa externals (http://www.e--j.com/?page_id=499).
I guess I would have to combine things, but indeed, it's maybe a weapon of mass destruction for a much smaller purpose.

In the snapshot, something I made with jit.catch~.
I'd like to have something a bit universal (I mean a global machine I could use every time), with presets for each track for instance.
I know how to make presets etc., but I'm still struggling with the core part...

[Attached screenshot: 4212.analyzer.PNG]
pure's icon

the jit.catch~ thing looks like total overkill to me, unless you want exactly that. what I forgot about the fffb~ is that it would of course only "prepare" the source. to get control data from it, you have to measure the 4 outlets by their loudness. there are peakamp~, avg~, average~ and the right outlet of meter~, which all do this in similar but different ways. then you have a stream of floats to throw at any Jitter parameter you like.
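
All of those objects boil down to some flavor of envelope following. As one hedged illustration, here is a simple peak follower in Python with instant attack and exponential release (the 0.9995 release coefficient is an arbitrary choice, not taken from any of those objects):

```python
import numpy as np

def follow(band, release=0.9995):
    """Instant-attack, exponential-release envelope follower."""
    env, e = np.empty_like(band), 0.0
    for i, s in enumerate(band):
        e = max(abs(s), e * release)  # jump up instantly, decay slowly
        env[i] = e
    return env

t = np.linspace(0, 1, 44100)
x = np.sin(2 * np.pi * 3 * t) * np.sin(2 * np.pi * 440 * t)  # pulsing tone
print(follow(x)[::4410])  # read the follower at a low control rate
```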

pure's icon

oh, and then there are Tristan Jehan's objects for sound analysis... a must!

Julien Bayle's icon

thanks a lot, pure.
ok about the Jitter way, which could be a bit "exaggerated" for my purpose here (indeed!)

yep, Tristan's stuff is amazing.

I guess I have everything in hand now.
I have to test several approaches to see which works better (performance + accuracy, considering what I need).
I'd prefer to stay close to native objects.

pure's icon

one more piece of advice: you might want to send the amplitude floats through [atodb] to iron out the 20log-blabla-ness of the loudness before further mapping. happy fiddling!
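
For reference, atodb is the standard amplitude-to-decibel conversion, dB = 20 * log10(a). A Python equivalent (the -120 dB floor is this sketch's guard against log10(0), not necessarily what the Max object does):

```python
import math

def atodb(a, floor=-120.0):
    # guard against log10(0); the floor value is an arbitrary choice here
    return max(20.0 * math.log10(a), floor) if a > 0 else floor

for a in (1.0, 0.5, 0.1, 0.01, 0.001):
    print(f"amplitude {a:>6} -> {atodb(a):8.2f} dB")
```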

Julien Bayle's icon

here is something working a bit better.
The avg~ help file suggests using a metro, but considering what I'm trying to do, I don't need high reporting accuracy.

I'm using an arbitrary value as a multiplier;
this part + the clip have to be adaptive, of course.

[Attached screenshot: 4214.analyzer.PNG]
Julien Bayle's icon

crossed messages =)

yes about the non-linear scaling,
checking that now

Julien Bayle's icon

ok.
I'd need better band selection... I mean steeper filters, which means higher order :-/ which means more CPU.
This is my main problem: being able to select frequencies in precisely defined bands...

woyteg's icon

"In my case, the analysis is to drive some videos effects according to the spectral content of my music.
This music is almost all synthesized (with device m4l or native in Ableton Live, btw) and specifically made for the video"
By the way, it may yield much more accurate results if you rather used the data that was creating the music, instead of the music itself. Envelopes, LFOs, NOTEONs, etc. Sorry if this is obvious.
Cheers

pure's icon

better band selection: then you can stack as many biquad~s in bandpass mode as you need; each additional biquad~ doubles the filter steepness. without knowing the sound you are tracking it's hard to say, but my intuition is that the problem is not the fffb~ filter Q...
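
The "doubles the steepness" claim is just cascading: magnitude responses multiply, so attenuations in dB add. A quick SciPy check (the peaking filter and its values are illustrative stand-ins for a biquad~ bandpass):

```python
import numpy as np
from scipy.signal import iirpeak, tf2sos, sosfreqz

# one 2nd-order resonant bandpass (biquad-like), centered at 1 kHz, Q = 5
sos = tf2sos(*iirpeak(1000, Q=5, fs=44100))
w, h1 = sosfreqz(sos, worN=4096, fs=44100)
h2 = h1 * h1  # the same filter applied twice in series

idx = np.argmin(np.abs(w - 2000))  # look one octave above the center
print("one pass  :", 20 * np.log10(abs(h1[idx])), "dB")
print("two passes:", 20 * np.log10(abs(h2[idx])), "dB")  # doubled attenuation
```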

Julien Bayle's icon

@woyteg, yes, I thought about that, but more in the sense of analyzing audio directly on the master track of Live. Indeed, a note-on message is nice, but sometimes it triggers a long decaying sound, sometimes a more percussive one, etc. I mean, it wouldn't be enough for me, since I want to really illustrate visually the character of the sound itself.
About this idea, I'm already using some data from Live, like panning values and mixer tracks' values in general.
And you don't have to be sorry!! Nothing is obvious, and I'm always happy to discuss anything I could have forgotten or done badly. So thanks for your interest :-)

@pure, the idea is to use the analyzer on whatever sound comes in. While I'm often inside ambient soundscapes, I can be more percussive too...
I have 10 songs to play, and each one will have particular analyzer parameters (basically, I'll tweak the bands to best fit the audio of each song... I mean: in order to get the most significant values).
I'll add some other parameters, like the brightness of my signal... probably it will tweak the noisiness of my video.

How would you use fffb~ to get bands like, for instance, 0 to 200, 200 to 1200...?
The question is more about how to use biquads with it... after each fffb~ output would make sense, but I'm worried about performance...
And I don't want to buy VDMX only for that... even if it seems to be a very nice toolbox too...

pure's icon

with fffb~ you don't define regions as such (actually you never do with any filter); you define the center frequencies of bandpass filters and the steepness of the attenuation. I don't have exact numbers for these filters, but if you send "freq 0 120 666. 5200., Q 0 100 100 100" to the fffb~ in the help file, you will hear that it isolates those frequencies pretty well in the white noise. if you use biquad~ (instead of fffb~) you would do exactly the same, except you'd need an object per frequency. but if you never need to track more than 4 frequencies, I wouldn't be concerned about performance. just make some quick test patches and compare fffb~ with biquad~. you could also put them into a poly~ and downsample it (that's what Roman already suggested, I think).
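
For a feel of those numbers: a resonant bandpass with center f0 and quality Q has a -3 dB bandwidth of roughly f0/Q, so Q 100 gives very narrow slices rather than the broad 0..200 / 200..1200 regions asked about. A small arithmetic check:

```python
for f0 in (120.0, 666.0, 5200.0):
    q = 100.0
    bw = f0 / q  # approximate -3 dB bandwidth of a resonant bandpass
    print(f"center {f0:>7.1f} Hz, Q {q:.0f} -> ~{bw:6.2f} Hz wide "
          f"({f0 - bw / 2:.1f} .. {f0 + bw / 2:.1f} Hz)")
```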

Julien Bayle's icon

hi pure,
I'm almost okay.
fffb~ seems indeed enough in my case.
I'm using downsamp~ with 16 as an argument, placed before the fffb~ (I tried it after as well; not much change).

My main problem is the scaling.
Scaling of inputs is always something to tweak carefully.
Indeed, while I'm okay with using frequency lists and Q lists as presets, I'd like to be able to scale the resulting signals correctly, dynamically.

Normalizing the signal before calculating the average helps, but should I use another strategy?

I also tried average~ instead of avg~ (which needs a bang to produce a result).
I find average~ a bit smoother, especially with 100 ms & rms mode.

pure's icon

you are talking about scaling the input, not about mapping the output, right?
i am not sure how much I agree. if you play sound files, you would normalize them beforehand as part of the preparation work, but normalizing a live signal before analyzing it takes out some of the information you are looking for (amplitude does not equal volume, but still...). a quiet signal should give a small value from a volume analysis. if the resolution of the analysis is high enough, a small value is not a problem imo. "high enough" means the analysis outcome can manipulate the target parameter to a satisfactory degree / with sufficient precision. or am I missing your point?

Julien Bayle's icon

I would scale after fffb~.
By scaling inputs, I just meant that when we use external signals (external to the core system under consideration, I mean), we have to deal with that.
Sensors, etc., have to be scaled.

Here, I just meant:
if I have a big low sound and a very small high one, and I don't scale what comes out of my analysis, my low sounds could trigger my system a lot while the high sounds don't. If I want the high one to trigger things like the other (which would mean a situation where the high sound has a low volume BUT I want it to trigger things), I need to zmap or scale my values somehow.

Normalizing the signals coming out of the fffb~ would be a way to get all my signal levels high enough to trigger things... then, inside my visuals system (the one that needs to be modulated), I could rescale things appropriately, since all incoming values would be normalized.

does it make sense?
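
One way to sketch that per-band normalization in Python: divide each band's value by a slowly forgetting running maximum, so even a quiet band eventually reaches 1.0 (the decay constant and starting peak are arbitrary choices):

```python
def make_normalizer(decay=0.9999):
    peak = 1e-9  # avoid division by zero before any input arrives
    def step(value):
        nonlocal peak
        peak = max(value, peak * decay)  # slowly forgetting maximum
        return value / peak              # 0..1 whatever the band's level
    return step

norm = make_normalizer()
for v in (0.001, 0.002, 0.0005, 0.002):  # a weak band still maps near 1.0
    print(round(norm(v), 3))
```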

pure's icon

makes more sense now =) i think the term "mapping" describes better what you are trying to do, because your scaling decisions depend on the target parameter you want to control.
let's say you get the volume in dB as a value between 0. and -infinity in 32-bit float precision. this gives... uhm... how many steps? my math is too DIY to tell immediately, but it's A LOT (I think it's 2^16). where are the math cracks?

and you want to map it onto the brightness of a movie -> 256 steps. the bottleneck will always be the 8 bits of the brightness, not the slice you take of the 32-bit range from the audio domain.
so the (esthetically) decisive and tricky part is the transformation from one range to the other, not the starting range. methinks.

Julien Bayle's icon

Yep, I agree.
But what I meant is: if a particular song contains very few low frequencies (from -inf to -30 dB, for instance), I need to scale/map -inf..-30 to, for instance, 0..1.
This is why I thought about normalizing everything (indeed, it destroys the differences between my bands).

Maybe the solution is to have a zmap just before each visual value to modify, and to map every value for each song... because it totally depends on the frequency content of each song, of course :-)

pure's icon

why not just [scale -120. 0. 0 255], where -120. and 0. are changeable? and yes, you probably need different presets for each song...
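
That [scale -120. 0. 0 255] idea as a Python function with changeable endpoints, plus a per-song preset dict (the song names and ranges are made up; the clipping is this sketch's addition, echoing the clip mentioned earlier in the thread, since [scale] itself extrapolates outside its input range):

```python
def scale(x, in_lo, in_hi, out_lo, out_hi):
    x = min(max(x, in_lo), in_hi)  # clip first, then map linearly
    return out_lo + (x - in_lo) * (out_hi - out_lo) / (in_hi - in_lo)

presets = {"song1": (-120.0, 0.0), "song2": (-60.0, -10.0)}  # dB range per song

lo, hi = presets["song2"]
print(scale(-30.0, lo, hi, 0, 255))  # dB level -> 8-bit brightness, here 153.0
```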