Max 5: 256 [saw~] objects consume roughly 6% CPU at 64/64 vector settings (w/overdrive + scheduler in interrupt) on a 2.2GHz i7 MBP (Snow Leopard).
Max 6: same patch consumes 18% CPU.
I've noticed quite dramatic increases in CPU consumption in almost all of my patches.
I realise that MSP now uses 64 bit signals, so some additional overhead is unavoidable, but is a 3x slowdown something we just have to accept?
Somebody awesome (I forget who, no disrespect intended) recently shared a bunch of externals that use SSE parallelism to improve performance. When I saw this, I must confess I was dismayed to learn that Max 5 didn't already use SSE. I would have thought that the 64-bit overhead could be offset, or even beaten, by using SSE instructions to process two doubles at once (versus Max 5 processing only a single float at a time).
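To illustrate what I mean by "two doubles at once", here is a rough sketch in C using SSE2 intrinsics. The function name scale_vector and the gain example are made up for illustration; this is not code from Max or the Max SDK, just the general idea of handling 64-bit samples in pairs, with a plain scalar loop for any leftover sample (or for builds without SSE2).

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d holds two doubles */
#endif

/* Hypothetical helper (not part of any Max API):
 * out[i] = in[i] * gain for a vector of n 64-bit samples. */
void scale_vector(const double *in, double *out, size_t n, double gain)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Broadcast the gain into both lanes of a 128-bit register. */
    const __m128d g = _mm_set1_pd(gain);
    /* Process two doubles per iteration; unaligned loads and stores
     * are used so the buffers need no special alignment. */
    for (; i + 2 <= n; i += 2) {
        __m128d x = _mm_loadu_pd(in + i);
        _mm_storeu_pd(out + i, _mm_mul_pd(x, g));
    }
#endif
    /* Scalar path: the odd trailing sample, or everything without SSE2. */
    for (; i < n; ++i) {
        out[i] = in[i] * gain;
    }
}
```

With a signal vector size of 64, the SSE2 loop would run 32 times instead of 64. Whether that actually recovers the cost of moving to doubles depends on the object: a simple gain is probably memory-bound, and an oscillator like [saw~] has a lot more going on than a multiply.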
Now that Max 6 is Intel Mac only, I think it's time to start leveraging some extra processing power. Could someone confirm whether or not Max 6's code uses SSE intrinsics, and can we expect to see some performance improvements in the release version?