Gen~ efficiency comparison
May 15, 2012 at 9:58pm
Is it just me, or does gen~ require a far higher cpu load than standard msp objects to accomplish equivalent tasks? I’ve attached a patch that uses a poly from a recent patch I’ve been working on. With poly instances set to 64, the no gen~ version uses 2-3% of my cpu, and the gen~ version uses 12-13%. I’m trying to optimize it, and I used gen~ with the thought that, since it is essentially compiled, it would be ‘more’ cpu efficient. I also used it for the fact that its objects all handle 64 bit floating point numbers, which is useful because I’m trying to process unique combinations from a set of up to 127 elements (i.e. 170,141,183,460,469,231,731,687,303,715,884,105,728). My thinking is that it is BECAUSE it processes everything with 64-bit precision that it requires so much more cpu power. Thoughts? Clarifications? Similar experiences? All comments welcome.
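A quick sanity check on that number, as a sketch in Python rather than anything taken from the patch: the figure quoted is 2^127, the count of subsets of a 127-element set. It lies far beyond 2^53, the point up to which a 64-bit float can represent every integer exactly. The variable names are illustrative only.

```python
# Number of distinct subsets of a 127-element set.
subsets = 2 ** 127
print(f"{subsets:,}")

# A 64-bit float has a 53-bit significand, so every integer is
# exactly representable only up to 2**53.
limit = 2 ** 53
lost_increment = float(limit) + 1.0 == float(limit)
print(lost_increment)  # True means adding 1 was lost to rounding

# Near 2**127, neighbouring integers collapse onto the same double.
neighbours_collapse = float(subsets - 1) == float(subsets)
print(neighbours_collapse)
```

So 64-bit floats can hold values of this magnitude, but not every individual integer up to it.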
May 15, 2012 at 10:16pm
what version of max are you using? there was a bug in poly~ relating to some of these issues in general, which i think is fixed now (6.0.5). also, send~ receive~ is not supported in poly~ at the moment since max 6 i think.
in every gen~ patch i have ever made it has used considerably less cpu than an msp equivalent, depending on process of course. gen is definitely brilliant in terms of performance for me.
but i do not really know what i am talking about. maybe you have uncovered some other bug? i get the bad results with your patch, yes.
May 15, 2012 at 11:55pm
I’m using Max 6.0.5. What do you mean, send~ and receive~ aren’t supported in poly~? They have worked as they usually do so far, so that hasn’t been the issue, as far as I can tell. The only thing I can think of is that maybe a specific object in gen~ takes more power than its msp equivalent, whereas most objects take less power. My best guess is trunc. This is actually my first venture into using gen~, so I haven’t experienced this improved performance you mention. Would you mind demonstrating a situation in which the performance is better?
May 17, 2012 at 9:33am
hi. sorry maybe i shouldn’t post late at night half asleep.
anyway, this IS strange indeed. looking at your examples again and testing some edits, it appears that [trunc] IS to blame. however, i would suggest that the difference is so huge (6 – 7 %) i would consider it a bug. you should report it.
in the meantime, the example patch on performance found here:
is more informative than any example i could show you.
May 17, 2012 at 11:43am
Thanks! That’s exactly what I’m looking for. Although even in the performance patch there are inconsistent results. With the perf_gen patch running (at 44100 sample rate, 256 signal vector size), my cpu hangs around 8%, making sudden leaps to around 17% occasionally. The msp equivalent, perf_msp, stays around 19% with leaps to 24%. So gen’s the winner, right? Not so much, because perf_gen_biquads stays around 14% with leaps to 24%, and perf_msp_biquads chills out at 11%, with leaps to 15%. Sooooo… Not really sure what’s going on here. Perf_gen uses history, multiplication, and addition objects exclusively, so I suppose those perform better than msp’s equivalents (history = delay~). Perf_gen_biquads uses history, multiplication, subtraction, and pass (and param), while the other one just uses biquads.
I’m thinking it largely comes down to type and quantity of objects. With the biquad comparison, the gen version just has more objects in general, even though the processing isn’t drastically different, whereas the other patches have almost exactly, if not exactly, the same number of objects. I’m wondering if they meant to point that out with that patch.
And yes, I think I will report the trunc situation as a bug, because even though it does say in the help patch for trunc~ that the operation is computationally expensive, a 6 to 7% difference using gen~ just seems silly.
May 18, 2012 at 7:15am
Ooooook. I just finished an analysis of the cpu usage in a patch I’m making, and I thought I’d share my results. I only analyzed the audio objects that are actually in the patch (and a few others because I was curious), so there are a lot more that I haven’t touched. I’ve included the patch I used for the analysis in case anybody else wants to use it to find the relative cpu usage of an object. (Sample rate = 44100, Signal vector size = 256, Computer specs = Macbook Pro – OS X 10.7.3 – 8 GB ram – 2.5 GHz Intel Core i7)
The way I did the analysis was to make 1000 instances of an object, feed them all a constant value of 1 in an object-appropriate way, and then record the cpu utilization in a histogram for 10 seconds, at 100 millisecond intervals. From there, I extrapolated the range of cpu usage, the percentage it spent the most time on, and the per-instance cpu usage. These are listed in the format:
objectName minCPU-maxCPU, mostFrequentCPU@ratioOfOccurrence | CPUperInstance
You can get an idea of how evenly the cpu values were distributed (aka how frequently they changed) by looking at the ratio of occurrence, i.e. a ratio of 100/100 means it stayed on the most frequent cpu value the whole time, but if it’s 20/100, that means the longest cumulative time of any one particular value was 2 seconds; very distributed.
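For anyone who wants to reproduce the summary outside of Max, here is a minimal sketch in Python of the reduction described above. It is not the analysis patch itself. The function names, the sample readings, and the way the gen~ baseline is subtracted (per reading, clamped at zero) are all assumptions made for illustration.

```python
from collections import Counter

def summarize_cpu(samples, instances=1000, baseline=0):
    """Reduce a list of CPU readings (percent) to the posted format's fields.

    baseline is an assumed fixed cost (e.g. an empty gen~) removed from
    each reading; the original patch may subtract it differently.
    """
    adjusted = [max(s - baseline, 0) for s in samples]
    mode, occurrences = Counter(adjusted).most_common(1)[0]
    return {
        "min": min(adjusted),
        "max": max(adjusted),
        "mode": mode,                      # most frequent reading
        "occurrences": occurrences,        # how many readings hit the mode
        "total": len(adjusted),
        "per_instance": mode / instances,  # mode spread over all instances
    }

def format_row(name, s):
    """Render 'name min-max, mode@occurrences/total | perInstance'."""
    return (f"{name} {s['min']}-{s['max']}, "
            f"{s['mode']}@{s['occurrences']}/{s['total']} | {s['per_instance']}")

# Hypothetical readings: 100 samples (10 s at 100 ms intervals).
samples = [6] * 21 + [5] * 20 + [7] * 20 + [10] * 19 + [15] * 20
summary = summarize_cpu(samples)
print(format_row("nothing", summary))
```

The per-instance figure here is simply the most frequent reading divided by the instance count, which matches how a row such as `6@21/100 | 0.006` works out for 1000 instances.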
Anyway, enough explanation. Here you go!
buffer~ 0, 0@100/100 | 0.
---- GEN~ OBJECTS (gen~ cost included) ----
nothing 5-15, 6@21/100 | 0.006
---- GEN~ OBJECTS (gen~ cost subtracted) ----
nothing 0-0, 0@100/100 | *0.
* may just be 0 because I didn’t do the subtraction of gen~’s cost properly
Based on the above analysis, it would seem that the most cost-effective way to patch is to use one gen~ object, and if you can accomplish what you want to accomplish with just the *, +, gate, and selector objects (other unexplored simple arithmetic objects excluded), then go for it. And if you at all can, stay away from the pow, exp2, and % objects, as these are THE MOST EXPENSIVE objects analyzed. True, this may be because they are 64-bit, but man. As mentioned in previous posts, I was running into cpu problems in my patch, and that would explain why. I think I use two gen objects with one pow, one trunc, and one % object each, and that shoots my cpu right up to about 2 percent, which is not good for an abstraction you want to run multiple instances of.
I hope someone finds this information useful, and again, if anybody else wants to do some analysis of other objects with my patch and share, feel free.
May 22, 2012 at 3:56pm
Gen *can* result in better performance than using MSP objects, but this can’t be guaranteed in all cases. In general, Gen patchers start to show better performance compared to MSP patching as the number of objects and connections increases. Not only does this amortize the unavoidable overhead of the gen~ object itself, it also allows for much greater compiler optimization within the patch (which simply isn’t possible for MSP patchers).
I have attached a performance testing patch which creates 20 copies of a particular operator, and connects them up randomly. It then generates both MSP and gen~ versions, and alternately hosts them in a poly~ with 32 voices.
The results clearly show that for many operators the difference is quite significant even with only 20 boxes, but some operators have similar performance and a few (here trunc and mod) are worse.
We are focusing efforts on improving these particular operators for the 6.0.7 release; however, over the last few months we have been making major changes to the internals of Gen, and have been balancing the importance of performance improvements with the addition of major user-requested features.
: nothing.maxpat: avg 0. max 1.
: test_empty_gen.maxpat: avg 0.8 max 1.
: test_add_gen.maxpat: avg 2. max 4.
: test_cos_gen.maxpat: avg 25.3 max 37.
: test_delay_gen.maxpat: avg 22.1 max 31.
: test_delta_gen.maxpat: avg 4.1 max 8.
: test_div2_gen.maxpat: avg 2. max 3.
: test_div_gen.maxpat: avg 21.6 max 33.
: test_gate_gen.maxpat: avg 4.2 max 7.
: test_log_gen.maxpat: avg 30.3 max 43.
: test_maximum_gen.maxpat: avg 1.9 max 5.
: test_mod_gen.maxpat: avg 61.1 max 95.
: test_mul_gen.maxpat: avg 2.1 max 4.
: test_noise_gen.maxpat: avg 6.5 max 11.
: test_phasewrap_gen.maxpat: avg 12.5 max 19.
: test_poltocar_gen.maxpat: avg 21.7 max 32.
: test_sqrt_gen.maxpat: avg 3.2 max 6.
: test_trunc_gen.maxpat: avg 17.9 max 26.
Tested with Max 6.0.5 (1741622)
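To rank these figures quickly, result lines in the format above can be parsed and sorted. The following is a rough Python sketch, not part of the attached test patch; the helper name is made up for illustration, and only a handful of the lines listed above are included to keep it short.

```python
# A few of the result lines from the list above, copied verbatim.
RESULTS = """\
: test_empty_gen.maxpat: avg 0.8 max 1.
: test_add_gen.maxpat: avg 2. max 4.
: test_mul_gen.maxpat: avg 2.1 max 4.
: test_log_gen.maxpat: avg 30.3 max 43.
: test_mod_gen.maxpat: avg 61.1 max 95.
: test_trunc_gen.maxpat: avg 17.9 max 26.
"""

def parse_line(line):
    """Split ': name.maxpat: avg X max Y' into (name, avg, max)."""
    _, patch, stats = (part.strip() for part in line.split(":"))
    tokens = stats.split()  # e.g. ['avg', '61.1', 'max', '95.']
    return patch, float(tokens[1]), float(tokens[3])

rows = [parse_line(line) for line in RESULTS.splitlines() if line.strip()]

# Most expensive operators first, by average CPU.
ranked = sorted(rows, key=lambda row: row[1], reverse=True)
for patch, avg, peak in ranked:
    print(f"{patch:28s} avg {avg:5.1f}  max {peak:5.1f}")
```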
Jun 2, 2012 at 1:59am
That’s very helpful, thank you. I installed 6.0.5, and I’m looking for the reference for the API. I can only find tutorials. Is there a reference, or do I just figure it out from the tutorials?
Jun 2, 2012 at 4:38am
Links to the Gen reference pages should be there in the ‘?’ tab of the gen~ help file, and also under the ‘vignettes’ tab of the documentation sidebar – are you not seeing them there?
Jun 2, 2012 at 4:48am
Here in Sacramento, it’s too hot to turn on my i7 and OpenGL accelerator workstation at the moment.
I have only one Max license. I think it’s only legal for me to install on one machine. Is there a way I could upgrade so I could run it on my laptop too? Thank you for following up so late on a Friday evening )
Jun 3, 2012 at 5:46pm
This is a question you should ask to firstname.lastname@example.org
Jun 4, 2012 at 12:27am
Hi, I’m up) I find good information in the gen~ help *patcher* and the Gen Common Operators Reference *vignette.* And the examples are much more helpful than they were the last time I updated. Thank you )
Additionally the recipes on the cycling74 website are fun. It just takes some time to learn it all. There’s so much )
Jun 4, 2012 at 1:44am
jfyi, the MSP objects also do it all in 64 bit resolution, so that would not make a difference.