Optimizing Large "Soundbank" Parsing

Loukas Perreault's icon

Hello!

Currently, I'm sorting all my samples into two categories: those with a duration of 3 seconds or less, and those with a duration greater than 3 seconds, using the sfinfo object.

In my "master folder," which includes subfolders, I have approximately 16,000 samples. To gather their filenames, I employ a recursive search using JavaScript, resulting in an array of filenames. Subsequently, I feed this array into the sfinfo object to retrieve the length of each sample.

The process of obtaining the array of filenames takes around 3-4 seconds, which is efficient. However, the subsequent step, where sfinfo analyzes the files, is notably slow, taking approximately 3-4 minutes. During this time, Max appears unresponsive, with the spinning wheel suggesting a potential crash.

My query is whether there's a method to execute this task in the background or optimize the process for speed. Essentially, I'm looking for a way to offload this heavy-duty analysis to another thread or improve its efficiency to eliminate the unresponsive behavior.

Thanks!
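
The sorting step described above can be sketched in plain JavaScript. This is only an illustration, assuming each sample has already been reduced to a path plus a duration in milliseconds; the entry shape and the names splitByDuration, atOrBelow and above are hypothetical, not taken from the original patch.

```javascript
// Hypothetical threshold: 3 seconds, expressed in milliseconds.
var THRESHOLD_MS = 3000;

// Splits entries of the assumed form { path: String, durationMs: Number }
// into two lists of paths: duration <= threshold, and duration > threshold.
function splitByDuration(entries, thresholdMs) {
  var result = { atOrBelow: [], above: [] };
  for (var i = 0; i < entries.length; i++) {
    var entry = entries[i];
    if (entry.durationMs <= thresholdMs) {
      result.atOrBelow.push(entry.path);
    } else {
      result.above.push(entry.path);
    }
  }
  return result;
}
```

The split itself is cheap; the cost in the scenario above comes from obtaining the durations, not from comparing them.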

Source Audio's icon

You probably do this only once and store the info somewhere, or even move or rename the samples.

Why, then, are a few minutes so relevant?

You could insert audio length detection in your JavaScript,

or better control the timing between the JavaScript execution and the sfinfo query?

Or outsource this completely to, for example, ffprobe or other software,

and create a text file or whatever you need?
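
For the ffprobe route, a rough sketch of the two pieces involved: the argument list for a call that prints only the duration, and a parser for what it prints. It assumes ffprobe is installed and reachable; the helper names ffprobeDurationArgs and parseDurationSeconds are made up for illustration.

```javascript
// Builds the argument list for one ffprobe call that prints only the
// container duration in seconds (for example "2.345000").
function ffprobeDurationArgs(filePath) {
  return [
    "-v", "error",
    "-show_entries", "format=duration",
    "-of", "default=noprint_wrappers=1:nokey=1",
    filePath
  ];
}

// Parses ffprobe's standard output. Returns seconds as a number,
// or null when the output is not a usable duration (for example "N/A").
function parseDurationSeconds(stdout) {
  var value = parseFloat(String(stdout).trim());
  return isFinite(value) && value >= 0 ? value : null;
}
```

In Node, the array could be handed to child_process.execFile("ffprobe", args, callback). Passing the path as its own array element, rather than building one command string, also sidesteps quoting problems with spaces in folder names.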

Source Audio's icon

Actually, it takes much less time to get the length using sfinfo. Here, almost 15,000 samples took about 2 ms per sample, and Max (833) on a Mac Intel quad-core i7 had no strain at all.

Roman Thilenius's icon

one could defer the uzi or even use metro 5, but then it takes even longer.

why is foreground responsiveness during the process important? what about opening a "please wait, doing something" kind of nagscreen which informs the user about the process?
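
The idea of spreading the scan over timer ticks, combined with a progress display, might look roughly like this in JavaScript. It is a sketch under assumptions: createBatchScanner and its batchSize parameter are hypothetical, and the timer that calls step() (a metro in the patch, a Task in Max's js, setInterval elsewhere) is left to the caller.

```javascript
// Creates a scanner that handles a list in small batches, so that whatever
// drives it can let the interface redraw between calls to step().
function createBatchScanner(items, batchSize, handleItem) {
  var index = 0;
  return {
    // Handles up to batchSize items and returns progress in the range 0..1.
    step: function () {
      var end = Math.min(index + batchSize, items.length);
      for (; index < end; index++) {
        handleItem(items[index], index);
      }
      return items.length === 0 ? 1 : index / items.length;
    },
    // True once every item has been handled.
    done: function () {
      return index >= items.length;
    }
  };
}
```

The trade-off mentioned above shows up in the numbers: with one file per 5 ms tick, 16,000 files need at least 80 seconds of ticks alone, so a larger batch per tick shortens the total run at the cost of choppier redraws.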

Loukas Perreault's icon

Thank you for your answers!

@Source Audio I'm just finalising my device and maybe want to share it with you guys, but I find the "import sound section" kind of clunky and not very pleasant... I looked for audio length detection inside JavaScript but I haven't found anything... Is there an object that can do this?

I will look inside my patch cause there might be something causing the process to be slower...

@Roman It was purely UX wise, I didn't want people using the device thinking "oh why is max freezing"...
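
On the question of audio length detection inside JavaScript: for uncompressed WAV files, one possible approach is to read the RIFF header and divide the size of the data chunk by the byte rate. The sketch below assumes the caller has already read the first bytes of the file into an array of numbers (how that is done in Max's js is not shown), and all function names are hypothetical. It does not cover AIFF, compressed formats, or files whose header sizes are placeholders.

```javascript
// Reads a little-endian unsigned 32-bit integer from an array of byte values.
function readU32LE(bytes, offset) {
  return bytes[offset] +
         bytes[offset + 1] * 256 +
         bytes[offset + 2] * 65536 +
         bytes[offset + 3] * 16777216;
}

// Reads a four-character chunk identifier such as "RIFF" or "data".
function readTag(bytes, offset) {
  return String.fromCharCode(bytes[offset], bytes[offset + 1],
                             bytes[offset + 2], bytes[offset + 3]);
}

// Returns the duration in seconds, or null if the bytes are not a WAV header
// this sketch understands. Only the header is needed, not the audio data.
function wavDurationSeconds(bytes) {
  if (bytes.length < 12) return null;
  if (readTag(bytes, 0) !== "RIFF" || readTag(bytes, 8) !== "WAVE") return null;

  var byteRate = 0;
  var offset = 12; // the first chunk starts right after the 12-byte RIFF header
  while (offset + 8 <= bytes.length) {
    var tag = readTag(bytes, offset);
    var size = readU32LE(bytes, offset + 4);
    if (tag === "fmt ") {
      if (offset + 20 > bytes.length) return null;
      // The byte rate sits 8 bytes into the body of the "fmt " chunk.
      byteRate = readU32LE(bytes, offset + 16);
    } else if (tag === "data") {
      return byteRate > 0 ? size / byteRate : null;
    }
    offset += 8 + size + (size % 2); // chunk bodies are padded to an even length
  }
  return null;
}
```

Because only the header is inspected, this kind of check reads a few dozen bytes per file instead of opening each sample fully.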

Source Audio's icon
Max Patch

try this patch.

I don't know what else you do with the files, but from this one could easily combine the full path, or only the names, with the length etc., or split by length.

Loukas Perreault's icon

@Source Audio your solution is really neat, I love it. I really like seeing the sounds being scanned, that's exactly what I meant by "execute this task in the background". I don't know if it's the pipe object that makes this so elegant but wow, thank you so much!! I'm lucky you stumbled across my post!!! :D

What I do with the files is store the full paths and filenames in a file. I might be able to do so with the File object inside js. At first I wanted to store them using colls but that doesn't seem efficient...

Next there will be a series of customizable regexp to split into 8 different files... Something like that :P
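
Splitting a list of paths into several groups with customizable regular expressions could be sketched as below. The rule names and patterns are invented for illustration, and routeByPattern is a hypothetical helper; the first rule that matches the file name wins, and anything unmatched lands in a fallback group.

```javascript
// Hypothetical rules, for illustration only. The first matching pattern wins.
var exampleRules = [
  { name: "kicks", pattern: /kick/i },
  { name: "snares", pattern: /snare/i },
  { name: "hats", pattern: /hat/i }
];

// Returns the last path segment, accepting both "/" and "\" as separators.
function baseName(path) {
  var parts = path.split(/[\\/]/);
  return parts[parts.length - 1];
}

// Groups paths by the first rule whose pattern matches the file name.
function routeByPattern(paths, rules, fallbackName) {
  var buckets = {};
  var r;
  for (r = 0; r < rules.length; r++) buckets[rules[r].name] = [];
  buckets[fallbackName] = [];
  for (var i = 0; i < paths.length; i++) {
    var target = fallbackName;
    var name = baseName(paths[i]);
    for (r = 0; r < rules.length; r++) {
      if (rules[r].pattern.test(name)) {
        target = rules[r].name;
        break;
      }
    }
    buckets[target].push(paths[i]);
  }
  return buckets;
}
```

Each group could then be written to its own file. Matching on the file name rather than the whole path keeps a folder called "kicks" from pulling every file inside it into one group, which may or may not be what is wanted.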

Source Audio's icon

pipe here helps to avoid a stack overflow when triggering the next message.

The output of the menu can be rerouted by different objects into a text object, which can then write it to disk,

if that is what you mean.

You can also split the folder hierarchy into different files.

Loukas Perreault's icon

Wow thanks again that's a nice tip, I think it will work better than my js script hihi.

Here is an example of what I'm working on (@Source Audio you did mostly all the work...)
I'm thinking of maybe using colls instead of text because the text object doesn't seem to have a length method... unless I'm wrong. I could've used counters but it seems a little bit tedious.

Max Patch

Source Audio's icon

If you only want to include samples of 3 seconds and longer, you can simplify a bit.

The text object can tell you how many lines it has; see the query message in the help file.

One problem is the added CR (new line) after each input, which can be fixed by reducing the output by 1.

Another little bug is that text always has 1 line, even when one clears it.

But all that can be fixed using some logic.
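
The off-by-one described here is easy to reproduce with an ordinary string. The sketch below is plain JavaScript, not a model of how the text object works internally; countEntries is a hypothetical helper showing the kind of "logic" that handles both the trailing new line and the empty case.

```javascript
// Counts entries in newline-separated text, ignoring the phantom empty line
// left by a trailing line break and treating empty text as zero entries.
function countEntries(text) {
  if (text === "") return 0;
  var lines = text.split(/\r\n|\r|\n/);
  if (lines[lines.length - 1] === "") {
    lines.pop(); // a trailing line break does not start a new entry
  }
  return lines.length;
}
```

A naive split on line breaks would report 3 for "a.wav\nb.wav\n" and 1 for empty text, which are the two quirks mentioned above.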

Loukas Perreault's icon

Thanks a lot @Source Audio!

Whoopsie for not having seen the query message...

I decided to go with the coll still.

I will also keep my samples under 3 seconds, I might use them in the future.

Thanks a lot! The pipe and sel, combined with gate and next for iteration, are super efficient, I love it!

I also created a little loading bar hihi :D

Max Patch

Source Audio's icon

Does your patch assume that a folder named lib exists next to it?

If it does not, the read/write functions will fail.

Also, if there is a space in the path, it will get broken;

you need to use sprintf symout %s.

Loukas Perreault's icon

Yes, the folder is already in the patch folder. Good tip on the spaces, thanks! I also need to format for Mac/Windows.

Source Audio's icon

I hope you don't mind if I repeatedly suggest things.

This puts a strain on the coll object; it could tell it to output its length before reading the file is done.

This is a better way to get the coll length,

also when you query the length after storing files.

Loukas Perreault's icon

I truuuuly don't mind, your tips are awesome!! I will implement this method right away! Thanks a lot!