gen~ CPU usage rules of thumb

red

While reading Generating Sound & Organizing Time (which, by the way, is excellent, even, or especially, if you know a lot about gen~ and DSP), I've noticed a tendency to use objects like sin and cos in situations where I'd normally consider table lookup to avoid CPU expense. I imagine using trigonometric objects instead of table lookup makes the examples clearer, and I'm wondering what the difference in CPU cost is.

More generally, are there overall rules of thumb regarding CPU usage of gen~ objects? (I presume general programming rules still hold, such as avoiding divides where multiplies could be substituted.)

Graham Wakefield

Hi there,

Thanks for the kind words on the book! Yes, in the book we have opted for clarity rather than efficiency, not just to make the material easier to digest, but also because optimizing for efficiency can be quite dependent on the algorithm and hardware, and involves tradeoffs, so there's no easy answer.


Also, it's not always obvious how to optimize a patcher -- there are some things that can be tried, but sometimes they won't make much difference or might even make things slower. Part of the reason is that the underlying gen code generator does a few optimizations of its own, and then the target code translator (e.g. Clang/LLVM for gen~ in Max) will do even more code rearrangement & optimizations for you. E.g. avoiding divisions (and modulo) where possible can sometimes help, but again the compiler might already be doing this for you.
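To make the divide-versus-multiply idea concrete, here is a minimal, language-neutral sketch in Python (not gen~ code) of a phase ramp written two ways. The function names are hypothetical and chosen only for illustration. It shows the shape of the rewrite and that both versions produce the same ramp; whether it saves any CPU in a real patch depends on what the compiler is already doing, so it still has to be measured.

```python
def phasor_divide(freqs, sr):
    """Ramp in [0, 1) that divides by the sample rate on every sample."""
    phase = 0.0
    out = []
    for freq in freqs:
        out.append(phase)
        phase = (phase + freq / sr) % 1.0
    return out


def phasor_reciprocal(freqs, sr):
    """Same ramp, with the division replaced by a multiply per sample."""
    inv_sr = 1.0 / sr  # the only division, done once outside the loop
    phase = 0.0
    out = []
    for freq in freqs:
        out.append(phase)
        phase = (phase + freq * inv_sr) % 1.0
    return out
```

The two outputs can differ in the last few bits, because `freq / sr` and `freq * (1 / sr)` round differently, which is one reason a compiler may not make this substitution on its own.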

A good, classic general rule is 1) make it work 2) make it right 3) make it fast, i.e. only focus on optimization once the thing is functionally "done", not before then. And then, use the Pareto rule, in that 80% (or more) of the work is done by 20% (or less) of your patch, so find out what that expensive 20% is and only try to optimize that. To do that, disable or change one thing at a time and measure the CPU performance difference. The gen~ object has a @cpumeasure attribute that will give you a better indication than the Max audio performance meter, as it measures only the gen~ object itself -- see the gen~ help patcher.
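The "change one thing at a time and measure" workflow can be sketched outside of Max as well. The following is only an analogy in Python, with hypothetical helper names; in gen~ itself the numbers would come from the CPU measurement mentioned above rather than from a timer like this.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time of several runs of fn(), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare_variants(variants, repeats=5):
    """Time each named variant; return (name, seconds) pairs, slowest first."""
    timings = [(name, best_time(fn, repeats)) for name, fn in variants.items()]
    return sorted(timings, key=lambda pair: pair[1], reverse=True)
```

Passing a dictionary such as `{"original": run_a, "candidate": run_b}` (two hypothetical callables that differ in exactly one respect) gives a slowest-first list, which is the information needed to find the expensive 20%.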

A few quick options (worth trying first): There are some operators in gen~ that offer cheaper approximations of certain math operations, such as [fastsin], [fastcos], etc. -- make a new object and type "fast" to see what's available -- these are very cheap on the CPU and you might not be able to tell any difference outside of very sensitive signal paths. It's worth trying them out when you're aiming to reduce CPU cost.
The [cycle] operator uses table lookup, but might not actually be cheaper than the direct [sin] or [cos] calls -- it depends on the CPU hardware etc.
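For a feel of what a cheap approximation trades away, here is a minimal Python sketch of a well-known parabolic sine approximation. This is an assumption for illustration only: the thread does not say how [fastsin] is implemented, and this is not claimed to be it. All function names are hypothetical.

```python
import math


def parabolic_sin(x):
    """Cheap sine approximation for x in [-pi, pi] using one parabola."""
    return (4.0 / math.pi) * x - (4.0 / math.pi ** 2) * x * abs(x)


def refined_sin(x):
    """Same parabola plus one correction step; still no trig call."""
    y = parabolic_sin(x)
    return 0.225 * (y * abs(y) - y) + y


def max_error(approx, steps=10000):
    """Largest absolute deviation from math.sin over [-pi, pi]."""
    worst = 0.0
    for i in range(steps + 1):
        x = -math.pi + 2.0 * math.pi * i / steps
        worst = max(worst, abs(approx(x) - math.sin(x)))
    return worst
```

Calling `max_error(parabolic_sin)` and `max_error(refined_sin)` shows the worst-case deviation dropping from roughly 0.06 to roughly 0.001. An error of that size is likely fine for something like an LFO and possibly audible in a sensitive audio path, which matches the caveat above.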

Similarly, you might be able to make some things faster using lookup tables rather than calculations, especially when these are used over and over again in a patch (e.g. window functions in a resampler), but it can take some time to set these up.
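As a rough illustration of the lookup-table idea, the sketch below (Python, not gen~ code) fills a table with a Hann window once and then reads it with linear interpolation. The table size of 512, the choice of window, and all names are assumptions made for the example; in gen~ the table would more likely live in a buffer or data object.

```python
import math


def hann(phase):
    """Hann window evaluated directly; phase runs from 0 to 1."""
    return 0.5 - 0.5 * math.cos(2.0 * math.pi * phase)


def build_table(fn, size=512):
    """Sample fn at size + 1 evenly spaced phases (the extra point eases interpolation)."""
    return [fn(i / size) for i in range(size + 1)]


def lookup(table, phase):
    """Read the table at phase in [0, 1] with linear interpolation."""
    size = len(table) - 1
    pos = min(max(phase, 0.0), 1.0) * size
    i = min(int(pos), size - 1)  # keep i + 1 inside the table
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])
```

The setup cost is paid once in `build_table`; each later `lookup` replaces a cosine call with two reads, a subtract, and a multiply-add. Whether that is actually faster than the direct calculation is hardware-dependent, as noted for [cycle] above.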

I should mention that there's a common misconception that writing your patch in a codebox will make it faster -- this is not true. The patcher and codebox are converted to the same underlying abstract syntax anyway. The only concrete advantage of codebox is the ability to write if/for/while blocks. Even using codebox functions might end up being slower than using gen subpatchers & abstractions in some cases!

Another common misconception is that using an if() block to selectively enable/disable processing will save CPU -- this is *sometimes* the case, but often not. It has to do with how modern CPUs use prediction to speed up their operations. If the body of the if() block is fairly small, you will be better off just computing both paths and throwing one of them away, e.g. using [selector] or [switch] operators.
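The two shapes of that computation look like this in a small Python sketch (hypothetical functions, not gen~ code). Python cannot demonstrate the branch-prediction effect itself; the example only shows that "compute both, keep one" gives the same result as the if() version, which is the precondition for trying the swap and measuring it.

```python
def branching(cond, x):
    """Run only the path the condition picks, as an if() block would."""
    if cond:
        return x * 0.5 + 0.25
    return x * x


def compute_both(cond, x):
    """Evaluate both small paths, then keep one, in the spirit of a selector."""
    a = x * 0.5 + 0.25  # path taken when cond is true
    b = x * x           # path taken when cond is false
    return (b, a)[int(bool(cond))]
```

This substitution is only safe when both paths are free of side effects. If one path updates stored state (a history value or a buffer write, for example), computing it unconditionally changes the behavior of the patch.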

Again, the only real way to know is to try variants and test your specific patch. This can be quite time consuming, hence the "optimize last" and Pareto guidelines. This is nothing specific to gen~, it applies to code optimization in general.


Gregory Taylor

<Beavis and Butt-Head voice>
Heh heh heh... he said "Pareto...."

red

Thanks, Graham, for the thorough and helpful reply!

I appreciate that you do talk about efficiency in the book (branching code, codebox misconceptions, etc.), and it makes sense that many optimizations are so context-dependent, especially as many optimizations are already being done behind the scenes.

I've been going back through my gen~ patches and seeing where I can replace branches with [switch], and in general I use [fastcos] or [fastsin] for non-audio situations (e.g. LFOs).

It's sometimes a little frustrating to realize that the most elegant solutions are not necessarily the most efficient. :-)