Advanced Max: Learning About Threading

Understanding how the threading model in Max works will help you patch more efficiently, and also be on the lookout for potential bottlenecks and trouble spots. In this 20-minute video, I'll briefly describe the threading model that Max uses, and show you some Max externals you can use to optimize your patches.

threadcheck.zip
application/zip 486.57 KB
Download the files used in this tutorial.

by Timothy Place on November 8, 2016

mvf's icon

Thank you so much for this video. It changed my world even before tonight's election is over ;) Could you maybe provide your magic thread-meter object?

mvf's icon

Uh, just woke up and am shocked...

Julien Vincenot's icon

Thanks for this Timothy. I didn't know the qlim object, exactly what I needed! I did a few tests and it seems to have rescued a patch with big timing issues.
Basically I use line objects to send values to several multisliders, to make some kind of "karaoke" cursors for musicians to follow during a performance, on top of score images.
It's supposed to be really accurate, since everything is synchronized with soundfiles. Before, my patch accumulated around 400 ms of delay by the end of the score. Now, with qlim between the line objects and the multisliders, the delay is down to about 1 ms...
Maybe my way of measuring this (with cpu time) is not bulletproof, but at least it seems like big progress!
Thank you again

Rodrigo's icon

These more advanced videos are great! (or I know enough to appreciate them now)

I've used [delay 0] in the past when stuff was acting weird and never knew why it worked. Threading, in general, still seems like voodoo stuff, but at least I have a clearer idea of what type of voodoo to try when timing issues arise.

sepulcky's icon

It's not voodoo when you imagine the scheduler thread as Java's PriorityQueue, where the tasks are ordered according to their execution timeouts, and the main thread as a simple Queue, where tasks are ordered in a FIFO/LIFO manner. When you “send a message” you actually put a task in one of these queues, where they are executed in parallel, one by one, starting from the bottom of the queue.
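The mental model in this comment can be sketched in a few lines of Python. This is only an illustration of the analogy, not Max's actual implementation: the class names `SchedulerQueue` and `MainQueue` are made up for the example, tasks are plain strings, and the main queue is modelled as FIFO only.

```python
import heapq
from collections import deque


class SchedulerQueue:
    """Timed tasks, served earliest due time first (a priority queue)."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker: equal due times keep insertion order

    def schedule(self, due_ms, task):
        heapq.heappush(self._heap, (due_ms, self._seq, task))
        self._seq += 1

    def run_until(self, now_ms):
        """Pop and return every task whose due time has been reached."""
        ran = []
        while self._heap and self._heap[0][0] <= now_ms:
            _, _, task = heapq.heappop(self._heap)
            ran.append(task)
        return ran


class MainQueue:
    """Untimed tasks, served in arrival order (a plain FIFO queue)."""

    def __init__(self):
        self._queue = deque()

    def defer(self, task):
        self._queue.append(task)

    def run_all(self):
        ran = []
        while self._queue:
            ran.append(self._queue.popleft())
        return ran


sched = SchedulerQueue()
sched.schedule(30, "c")
sched.schedule(10, "a")
sched.schedule(20, "b")
print(sched.run_until(20))  # ['a', 'b']: ordered by due time, "c" still waits

main = MainQueue()
main.defer("x")
main.defer("y")
print(main.run_all())  # ['x', 'y']: ordered by arrival
```

The point of the analogy is only the ordering rule: the scheduler cares about when a task is due, while the main queue cares about when it arrived.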

Timothy Place's icon

For those looking for the objects, the package can be downloaded at https://1cyjknyddcx62agyb002-max7.s3.amazonaws.com/threadcheck.zip .

Nikolas K's icon

Great video Timothy, quite enlightening! One question still remains: where are JS scripts run? Any chance there will be a video about that?
(P.S. the link seems to be broken)

Rodrigo's icon

"Something's wrong here...

Uh oh, Bitly couldn't find a link for the bitly URL you clicked.
Most Bitlinks are 4-6 characters, and only include letters and numbers (and are case sensitive)."

Timothy Place's icon

Link has been updated to one that is long and ugly, but works :-) Sorry for the snafu.

Timothy Place's icon

Good question about JS, @Nikolas.

The thread support for javascript has changed over the years. There used to be a way to allow methods to be called at high-priority (scheduler thread) but in every version of Max over the last 8 or 9 years all messages are deferred to the main thread.

Rodrigo's icon

Are there any plans to do a follow up about audio threading?

I remember that, from Max 6 (or was it Max 7), top-level patches run in their own threads (is that right?), but it would be great to have some info/guide on how (or if it's possible) to do this inside the same top-level patch. For example running a really expensive VST (Kontakt) inside its own poly/thread.

johyde's icon

That was super useful Timothy - thanks so much! I've been looking for more info on this for ages. I'd like to echo what Rodrigo just said though - I'd LOVE to learn a bit more about audio threading. I specifically want to know about poly~ with the parallel attribute on, and how to consider what the thread count might be set at - I'm using this to sort out a super-expensive audio patch. I think it kind of works (I've settled on 8. I also have a 4 core machine - this number was just arrived at by trial and error and I don't really know what I'm doing). Hi Rodrigo - I've not forgotten that you gave me some useful answers on this on the forum not so long ago - thanks again!

davebeckford's icon

Hi, just tried min.threadcheck on Win 32-bit. It came up orange with "Error 126 loading min.threadcheck".

Roald Baudoux's icon

Thanks for the video, Timothy, very interesting. Didn't know about qlim, quite useful!

A follow-up with a complete description of the scheduler's settings would be quite helpful too.

Herr Markant's icon

Jitter users should also keep an eye on [jit.qball].

Timothy Place's icon

@Herr Markant: Yes, jit.qball in usurp mode works in the same way as qlim in my example. In defer mode it works the same as the deferlow object.
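As a rough illustration of the difference between the two modes, here is a small Python sketch. It assumes that "usurp" means a newer message replaces any message still waiting to be delivered, and that "defer" means every message is queued and delivered in order. The class names `UsurpSlot` and `DeferQueue` are hypothetical and do not correspond to anything in the Max API.

```python
from collections import deque


class UsurpSlot:
    """Holds at most one pending message; a newer one replaces the older."""

    def __init__(self):
        self._pending = None
        self._has_pending = False

    def send(self, value):
        self._pending = value  # overwrite whatever was still waiting
        self._has_pending = True

    def service(self):
        """Called by the low-priority side; delivers the latest value only."""
        if not self._has_pending:
            return []
        value = self._pending
        self._pending = None
        self._has_pending = False
        return [value]


class DeferQueue:
    """Queues every message; nothing is dropped."""

    def __init__(self):
        self._queue = deque()

    def send(self, value):
        self._queue.append(value)

    def service(self):
        delivered = list(self._queue)
        self._queue.clear()
        return delivered


usurp, defer = UsurpSlot(), DeferQueue()
for value in (1, 2, 3):  # three messages arrive before the next service pass
    usurp.send(value)
    defer.send(value)

print(usurp.service())  # [3]
print(defer.service())  # [1, 2, 3]
```

Under this reading, the usurp behaviour suits display updates where only the latest value matters, while the defer behaviour suits messages that must all arrive.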

The ref page for jit.qball may serve as useful reading in this subject area.

@Roald: I agree that the other scheduler settings would also make a good topic for discussion as they seem to be shrouded in mystery for most people. Will consider that for the future.

@Rodrigo: I decided to avoid the subject of top-level patchers with their own threads and also Max-for-Live devices in order to simplify the material. For most people things will work as I described. The subject to which you refer is that top-level patchers operate with their own audio threads (and schedulers) *if* the "Enable Mixer Parallel Processing" preference is turned on, which is not the default.

@Johyde: "What happens in poly~ stays in poly~" :-). Since I was specifically addressing messages that are passed between objects I also saved the subject of poly~ for a rainy day. In general each audio voice of a poly~ can potentially run on its own audio thread (if the parallel processing attributes are enabled). Within the patcher in a poly~ everything works as I stated, and outside of the poly~ everything happens as I showed -- the poly~ outputs mix everything together to hand it off to the parent patcher audio thread.

These have been some great questions! I wasn't sure how much interest there was in the subject, so it's good to see that it has sparked some inquiry and discussion!

Timothy Place's icon

@Dave What version of Max are you using? These objects should work with Max 7.3 but may not work on earlier versions of Max.

Rodrigo's icon

@Timothy
Good to know about the "Enable Mixer Parallel Processing" thing. I wouldn't have thought of that, but either way, I don't run multiple top-level patches.

About poly~'s parallel processing, does that apply to the first voice as well? (i.e. using poly~ as a dsp-muting container, and not really as a poly~ voice handler)

@johyde
I remember! And no problem :)

Adam Murray's icon

This video was so good. Very clear and very helpful. Please do more like this!

Joseph Hyde's icon

Hi Timothy,

Thanks so much for answers there. It seems cheeky of me to ask yet further questions, but I would really appreciate a bit more clarification if you have a moment. I feel like I'm nearly there with understanding this! So, with the parallel attribute set to 1, patchers embedded in the poly~ object will run on separate threads, in addition to the three main ones, right? Will this only be true of audio objects/networks - ie, if there are regular Max objects in there might they run on the Main or Scheduler threads? And finally, can you give me any pointers as to the optimum number of threads to specify (using the threadcount message) - I neglected to mention that the reason I've specified 8 is that I'm running 8 instances of the patch in the poly~ object, but I've no idea if that's a good reason.

Thanks again,

Jo

Dan Nigrin's icon

Thanks Tim - agree with all that this was really helpful.

I'll also point out Joshua's wonderful post from a LONG time ago - I still refer back to it. Might be helpful to others.

Chris H.'s icon

Thank you for this nicely composed video, Timothy! Do you know about any other resources on performance analysis of Max/MSP patchers? Ideally it would be nice to be able to benchmark them externally. The CPU-usage monitor in the software is nice but a more detailed analysis of the system resources being used would be really helpful! I'd like to be able to watch and log those measures as well. Currently, I am working on Windows and use perfmon to do that.

Timothy Place's icon

You guys are keeping me on my toes!

@Rodrigo: yes, if you have poly~ parallel processing turned-on with a threadcount of 1 then (based on my quick read of the source code) it should run that audio on a separate thread.

@Joseph: The documentation for poly~ suggests using a threadcount equal to the number of cores on your machine. My experience has been that a threadcount of 2x the number of cores works best for balancing the load. Ultimately it really depends on what your poly~ voices are doing, and so you have to test it empirically to know what is best for your situation.

On a related note, the maximum threadcount for poly~ is 64 threads.
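For anyone who wants a starting point for that empirical testing, the two rules of thumb above can be written down as a tiny Python helper. The function name `candidate_threadcounts` is made up for this example, and note that `os.cpu_count()` reports logical processors, which may be more than the number of physical cores.

```python
import os

MAX_POLY_THREADS = 64  # maximum threadcount quoted in the comment above


def candidate_threadcounts(cores):
    """Return the two suggested starting points: cores and 2x cores, capped."""
    return [min(n, MAX_POLY_THREADS) for n in (cores, 2 * cores)]


cores = os.cpu_count() or 1  # cpu_count() can return None
print(candidate_threadcounts(cores))
```

On a 4-core machine this yields 4 and 8 as the values to compare. Which one wins still depends on what the poly~ voices are doing.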

@Dan: Thanks for digging up that link! It goes into much more detail than I had time for regarding things like feedback loops and a host of other topics.

@Chris: This is a really deep topic (time profiling and benchmarking) which I plan to address in a future video. Stay tuned!

Roald Baudoux's icon

I read Joshua's post about scheduler and priority a long time ago however I have never been sure how to interpret this sentence: "Note that only the high priority scheduler maintains timing information, so if you wish to schedule a low priority event to execute at a specific time in the future, you will need to use the delay or pipe objects connected to the defer or deferlow objects."

Should a pipe or delay object be connected after a defer or deferlow or is it the opposite?

Rodrigo's icon

@Timothy
Heh, indeed! This thread is turning into a Q&A session!

That's really good to know about the singular poly~ working on its own thread, as that would be my main interest in threading (isolating out 'expensive' plugins).

@Dan
Handy post that one!

Timothy Place's icon

@Rodrigo: yes, this is pretty fun!

@Roald: What I believe Joshua is saying is that if the timing is critical, but you want the event delivered in the main thread, then you keep it in the scheduler until the last moment and defer at the end.

In fact, this is exactly how the qlim object works internally. If you specify an argument, the speed-limiting timer runs in the scheduler thread, and at the end it defers back to the main thread.
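The ordering described here (time the event in the scheduler first, defer to the main thread last) can be modelled with a short Python simulation. This is a sketch of the idea only, not of how qlim or Max is implemented: the function `run`, the event names, and the tick times are all invented for the example, and both "threads" are simulated in a single loop.

```python
import heapq
from collections import deque


def run(timed_events, tick_times_ms):
    """Simulate delay-then-defer.

    timed_events: list of (due_ms, name) held by the scheduler.
    tick_times_ms: the moments at which the queues are serviced.
    Returns a list of (delivery_time_ms, name) as seen by the main thread.
    """
    heap = list(timed_events)
    heapq.heapify(heap)
    main_queue = deque()
    delivered = []
    for now in tick_times_ms:
        # Scheduler side: the event waits here until its due time...
        while heap and heap[0][0] <= now:
            _, name = heapq.heappop(heap)
            main_queue.append(name)  # ...and only then is deferred
        # Main-thread side: drain whatever has been deferred so far
        while main_queue:
            delivered.append((now, main_queue.popleft()))
    return delivered


print(run([(50, "b"), (20, "a")], [10, 20, 30, 40, 50]))
# [(20, 'a'), (50, 'b')]
```

In patching terms this corresponds to placing the delay or pipe object first and the defer or deferlow object after it.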

If there are more questions then keep them coming! :-)

Timothy Place's icon

I'd also like to point out, for reference, that there is some info in the discussion on the YouTube page for the video which may (or may not) be interesting for some of you...

Joseph Hyde's icon

Thanks again Timothy - I appreciate the extra effort in fielding quite a lot of further questions! Re. poly~ I guess I did exactly what you suggest of just empirically mucking about until I got what seemed to be the best result, and my findings were the same as yours - double the number of cores I think gives the best result. Nice to have that backed up!

Rich Smith's icon

Great video Timothy - appreciate the step by step examples.

Q: If I design a patcher for use in Max for Live, will any of the thread preferences or prioritization schemes be overwritten and relegated to the Audio thread?

davebeckford's icon

tried to install min.threadcheck on 7.3.1 64 bit. still came up orange
"Error 126 loading external min.threadcheck"

64 bit version of external on windows.

still have old max 7 32 bit on computer would that be an issue?
looking forward to trying this out.

Jason Palamara's icon

Hi Timothy...thanks for the great video.
Any news on this issue with Error 126?
I am running into the same issue that @davebeckford is having. Also on 64 bit windows with Max 7.3.1.
I do have Max 5.1 installed...perhaps there is some sort of issue with that?

Timothy Place's icon

An additional note on poly~:

In local mode, each poly~ voice has its own scheduler, which is run immediately before processing a vector of audio. In other words, it’s not asynchronous at all, quite the opposite.

This “run the scheduler before a voice processes a vector of audio” will be the case if parallel processing is on or off.

Others can correct me if I’m wrong but I think the only thing you really get with local on is that you can have SIAI on within the poly~ while keeping it off globally. If you adjust the poly~ vector size, local would mean the scheduler could run more often inside the poly~. This too could be a minor advantage if for some reason you didn’t want to run the scheduler so frequently everywhere in your patch.
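To put numbers on "the scheduler could run more often", here is the arithmetic as a small Python sketch. It assumes the scheduler runs exactly once per audio vector, as described above. The sample rate of 44100 Hz and the vector sizes 64 and 16 are example values chosen for illustration, not settings taken from the video.

```python
def scheduler_interval_ms(vector_size, sample_rate_hz):
    """Time between scheduler runs if it runs once per audio vector."""
    return 1000.0 * vector_size / sample_rate_hz


# Example values only: a 64-sample vector versus a 16-sample vector
for vector_size in (64, 16):
    interval = scheduler_interval_ms(vector_size, 44100)
    print(f"vector size {vector_size}: every {interval:.3f} ms")
```

With these example values the interval drops from roughly 1.45 ms to roughly 0.36 ms, so a smaller vector size inside the poly~ would service its scheduler about four times as often.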

Timothy Place's icon

Regarding the Windows externals, I have just updated the download package. The new Windows extern builds should be dated Dec 13.

I am unable to reproduce the 126 errors locally because I have Microsoft's developer tools installed, but I do think that these builds should solve that problem.

Cheers!

davebeckford's icon

https://1cyjknyddcx62agyb002-max7.s3.amazonaws.com/threadcheck.zip .
tried the amazon link above, here is what I get:

<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>734895F10E21AA57</RequestId>
<HostId>
HaxGWizlxZ4zT+bd5dtx8LXPF204P6t4SmTFryDT3aPqRCqu7hnlKKZAEwAIK/ysDHUlG6PxdN8=
</HostId>
</Error>

pdelges's icon

I can't download threadcheck.zip either.

Lilli Wessling Hart's icon

Sorry about that, folks. Looks like the permissions were incorrect. Please try again:
https://1cyjknyddcx62agyb002-max7.s3.amazonaws.com/threadcheck.zip

pdelges's icon

Thank you Lilli!

Ed Perkins's icon

Hi Timothy

Thanks for the great video, it's the clearest explanation of threading in Max I have heard and has really revolutionised my patching.

I'm working with the serial object and want the fastest possible route for the data stream to minimise latency from my hardware (a ribbon controller and accelerometer over XBee radio). I've tried to push everything to the scheduler using 'pipe' as I presume this is my best approach for low latency?

I'm also using the 'ZL' objects to format the serial data into something more useable and have noticed that the output from these objects (zl group and zl slice at least) seem to occur on both the main and scheduler thread simultaneously. Is this the way they were written or do I need to look at another way to organise my serial data?

Ed

Timothy Place's icon

Hi Ed,

I believe the zl object is thread-agnostic. Thus, any output will come out on the same thread on which it was received. This is the way most Max objects work.

It is difficult to reason about latency. One problem is that by "promoting" a message to the scheduler thread it actually is getting queued up to happen at a (slightly) later time instead of happening "now".

Cheers,
Tim

OAudioMonitor CDS's icon

Great tutorial, Very Educational !

Since running the scheduler in Overdrive is a bad idea when working in Jitter, especially with "qmetro 1", how can I make sure that all the resources are focused on the jit.grab and nothing else can bother it, for instance by giving it its own CPU? Changing the size of the patch window, for example, or even adding an MSP meter makes a huge difference: I can clearly see a loss of performance (drops in FPS).

Thanks!

Mariano

CTS's icon

Thanks for the great video! I'm also having trouble downloading the threadcheck object from any of the links provided above. Lilli's link takes me to a page with this:

This XML file does not appear to have any style information associated with it. The document tree is shown below.
<Error>
<Code>NoSuchKey</Code>
<Message>The specified key does not exist.</Message>
<Key>threadcheck.zip</Key>
<RequestId>9DB1ADD9E5A31958</RequestId>
<HostId>
8USlY5KpmROrolVR1EPgxDms3nuFxFMO4nYQ49tL9iSrId5l/d6lSLWRnUNY2HqV4hskkl1l8Mc=
</HostId>
</Error>

Timothy Place's icon

Thanks @CTS -- I appreciate you bringing it to our attention. I've now incorporated the proper version into the body of the main article up above. Hope this helps!

CTS's icon

Thank you! Really awesome external.

JFS's icon

Also having trouble with all of the links to the external..

This XML file does not appear to have any style information associated with it. The document tree is shown below.

<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>BAD38407694E6EA4</RequestId>
<HostId>
di3oCV/9uv7+oPY2APEBtuvjZ63Fm4MA1sofCstCj2TvTeh+vrjVuRNoCair35CvljjmmBMnYns=
</HostId>
</Error>

Kristof Kerti's icon

Thank you for this video, never used qlim before, will do in my future projects!

Roald Baudoux's icon

How is it possible to have a 0.1 ms grain within line while both the scheduler interval and event interval cannot go below 1 ms? I am puzzled now...

lysdexic's icon

This is still such vital info for Max users! Any chance of expanding this to discuss the audio thread?

Roman Thilenius's icon


"How is it possible to have a 0.1 ms grain within line while both the scheduler interval and event interval cannot go below 1 ms?"

not sure if that was a serious question, but it is not a wrong question to discuss.

it is possible the same way a metro can bang every 100.1 ms: the fact that the scheduler runs at 1 ms does not at all mean you could not have an event at 0.1 ms.

it only means that, when it can't be executed in time, it would be skipped or cause a buffer overflow at the next full ms, instead of being further delayed, as it would be in the main thread.
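One possible way to picture this is a scheduler that wakes up once per interval and runs every event that has become due since the last wake-up. The Python sketch below simulates only that reading and makes no claim about how line or the Max scheduler actually behave. The function name `service_ticks` is invented, and times are integer microseconds to keep the arithmetic exact.

```python
def service_ticks(event_times_us, tick_us, end_us):
    """Group events by the scheduler tick at which they would be serviced.

    Returns a list of (tick_time_us, [event_time_us, ...]) pairs.
    """
    pending = sorted(event_times_us)
    batches = []
    i = 0
    for now in range(tick_us, end_us + 1, tick_us):
        batch = []
        while i < len(pending) and pending[i] <= now:
            batch.append(pending[i])  # event keeps its own logical time
            i += 1
        batches.append((now, batch))
    return batches


# Events every 0.1 ms (100 us) for 3 ms, serviced every 1 ms (1000 us)
events = list(range(100, 3001, 100))
for tick, batch in service_ticks(events, 1000, 3000):
    print(f"tick at {tick} us runs {len(batch)} events")
```

In this model each 1 ms tick runs a burst of ten events, so events can be defined on a 0.1 ms grid even though the servicing interval is 1 ms.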

Andy's icon

@Timothy Thank you for the video! Any chance you could compile the external for apple silicon, or maybe make the repo public so we can try to compile it ourselves?

JBG's icon

If you're looking for min.threadcheck, it's available here

Nicolas Kaniak's icon

So you can have a multi-core pc and incredible threading, but max only makes use of 3 threads? Kind of disappointing

Rodrigo's icon

The main audio thread is just a single thread (unless you explicitly multithread using something like dynamicdsp~); otherwise you incur latency and/or a heavy CPU tax for jumping around threads.

The rest of the stuff, as far as I know, can get threaded about by the CPU (e.g. drawing to the screen, even though this happens in the low priority "thread", I always assumed this can be spread across multiple cores/threads as far as the actual CPU goes, even though to Max, it only presents as the "low priority thread").

Could be wrong about that part.

But the audio stuff exists on a single thread/core for latency/efficiency (and probably complex computational) reasons.

slo ~|•'s icon

For those interested I notice the [threadcheck] object is included as of Max 9.0