memory allocation again...


    May 05 2007 | 11:45 pm
    Hi guys,
    I know memory allocation issues have been discussed in the past, however I would like to get an update or reiteration, because it seems the previous discussions were in large part based around OS 9 idiosyncrasies, most of which have now disappeared on OS X.
    So, as an example, let's take a situation where I have an external that in response to a typed message will be scanning a list looking for a particular item. For each 'particular item' it finds in the list, the external will need to dynamically allocate some memory. Note that 'the particular item' might be more than one.
    I know that the amount of memory for each allocation call is not going to be bigger than 32k, however the total memory required by all the calls together (if the particular items in the list are many) might end up being bigger than 32k.
    In a situation like the above, I tried malloc, newhandle, getbytes, and sysmem_newptr, with the respective deallocation functions free, disposehandle, freebytes and sysmem_freeptr.
    At first glance, they all seem to work. However, I would like to know whether one is recommended over the others because of subtleties that I am not aware of, and whether one allocation function is preferable for the situation described above.
    Thank you.
    - Luigi
    ------------------------------------------------------------ THIS E-MAIL MESSAGE IS FOR THE SOLE USE OF THE INTENDED RECIPIENT AND MAY CONTAIN CONFIDENTIAL AND/OR PRIVILEGED INFORMATION. ANY UNAUTHORIZED REVIEW, USE, DISCLOSURE OR DISTRIBUTION IS PROHIBITED. IF YOU ARE NOT THE INTENDED RECIPIENT, CONTACT THE SENDER BY E-MAIL AT SUPERBIGIO@YAHOO.COM AND DESTROY ALL COPIES OF THE ORIGINAL MESSAGE. WITHOUT PREJUDICE UCC1-207. ------------------------------------------------------------
    Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
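For concreteness, the scenario in the question (one small allocation per matching list item, with an unbounded total) can be sketched in plain C as below. This is only an illustration under assumptions: `match_t`, `collect_matches`, `free_matches` and `demo_count_sevens` are hypothetical names, the list is a plain array of `long`, and standard `malloc`/`free` stand in for whichever allocator the external would actually use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-match record; not part of the Max SDK. */
typedef struct match {
    size_t index;         /* position of the matching item in the list */
    struct match *next;   /* next match, or NULL at the end */
} match_t;

/* Release every record in the list. */
void free_matches(match_t *head)
{
    while (head) {
        match_t *next = head->next;
        free(head);
        head = next;
    }
}

/* Scan `items` for `target` and allocate one small record per hit.
 * Returns the records in ascending index order and stores their number
 * in *count. On allocation failure everything is released, *count is
 * set to 0 and NULL is returned. */
match_t *collect_matches(const long *items, size_t n, long target, size_t *count)
{
    match_t *head = NULL;
    match_t **tail = &head;
    size_t found = 0;

    for (size_t i = 0; i < n; i++) {
        if (items[i] != target)
            continue;
        match_t *m = (match_t *)malloc(sizeof *m);
        if (!m) {
            free_matches(head);
            *count = 0;
            return NULL;
        }
        m->index = i;
        m->next = NULL;
        *tail = m;          /* append so the order follows the list */
        tail = &m->next;
        found++;
    }
    *count = found;
    return head;
}

/* Usage sketch: count how many times 7 occurs in a five-item list. */
size_t demo_count_sevens(void)
{
    const long items[] = {7, 3, 7, 9, 7};
    size_t count = 0;
    match_t *list = collect_matches(items, 5, 7, &count);
    free_matches(list);
    return count;
}
```

Each record is small, but the total grows with the number of matches, which is exactly the case where a per-call size limit is harmless and only allocation cost matters.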

    • May 06 2007 | 9:00 pm
      On OS X you can use the real malloc() any time you want (if memory serves, the same holds for NewPtr(), etc.). The caveat is that malloc/NewPtr can take a lot of time, whereas getbytes is fast.
      You can google around to find source code for malloc() implementations. One look at some of them should convince anyone that you don't want standard memory allocation routines called in timing-sensitive operations.
      So the old rules are really still valid, you just won't crash anymore for calling malloc() inside an interrupt.
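One common way to follow that rule is to do all the slow allocation up front and hand out fixed-size blocks from a free list in the timing-sensitive path. The sketch below shows the idea only; it is not how getbytes is implemented, the names (`pool_t`, `pool_get`, `pool_put`, and so on) and the sizes are assumptions, and it is not thread-safe as written.

```c
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE  64    /* bytes per block (assumed for the sketch) */
#define BLOCK_COUNT 128   /* blocks reserved up front (assumed) */

/* A block is either in use (bytes) or linked into the free list (next). */
typedef union block {
    union block *next;
    unsigned char bytes[BLOCK_SIZE];
} block_t;

typedef struct {
    block_t *storage;     /* single up-front allocation */
    block_t *free_list;   /* blocks currently available */
} pool_t;

/* Slow path: call once, outside timing-sensitive code. Returns 1 on success. */
int pool_init(pool_t *p)
{
    p->free_list = NULL;
    p->storage = (block_t *)malloc(BLOCK_COUNT * sizeof(block_t));
    if (!p->storage)
        return 0;
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        p->storage[i].next = p->free_list;
        p->free_list = &p->storage[i];
    }
    return 1;
}

/* Fast path: constant time, never calls malloc. NULL when exhausted. */
void *pool_get(pool_t *p)
{
    block_t *b = p->free_list;
    if (b)
        p->free_list = b->next;
    return b;
}

/* Fast path: push the block back onto the free list. */
void pool_put(pool_t *p, void *ptr)
{
    block_t *b = (block_t *)ptr;
    b->next = p->free_list;
    p->free_list = b;
}

void pool_destroy(pool_t *p)
{
    free(p->storage);
    p->storage = NULL;
    p->free_list = NULL;
}

/* Usage sketch: returns 1 if two distinct blocks were handed out and a
 * returned block was reused by the next request. */
int demo_pool(void)
{
    pool_t pool;
    if (!pool_init(&pool))
        return 0;
    void *a = pool_get(&pool);
    void *b = pool_get(&pool);
    pool_put(&pool, a);
    void *c = pool_get(&pool);   /* LIFO free list: same block as a */
    int ok = (a != NULL && b != NULL && a != b && c == a);
    pool_destroy(&pool);
    return ok;
}
```

The trade-off is a fixed block size and a fixed capacity: a request larger than `BLOCK_SIZE`, or one made when the pool is empty, has to be handled some other way.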
    • May 07 2007 | 7:35 pm
      So, is there any difference worth mentioning between getbytes and sysmem_newptr? (besides the fact that one has a 32k limit and the other doesn't)
      Do the sysmem functions call malloc under the hood ?
      Thanks.
      - Luigi
    • May 08 2007 | 4:35 pm
      On May 7, 2007, at 12:35 PM, Luigi Castelli wrote:
      > So, is there any difference worth mentioning between getbytes and
      > sysmem_newptr ?
      > (beside the fact that one has a 32k limit and the other doesn't)
      >
      > Do the sysmem functions call malloc under the hood ?
      Yes. getbytes uses an internal memory pool (which calls malloc if it runs out of space).
      In Max 5 these calls can be used interchangeably, and if it is a small memory allocation, we use our internal pool, and if it is a large allocation, it uses the standard OS memory pool. I don't think you should worry about using either one. They both should work fine for your purposes.
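The small/large split described here can be pictured with a sketch like the following. It is not the Max implementation: `sized_alloc`, `sized_free`, the 64-byte threshold and the single size class are all assumptions made for illustration. Like freebytes, the free function takes the size, which is what lets it decide where the block goes.

```c
#include <stddef.h>
#include <stdlib.h>

#define SMALL_LIMIT 64   /* assumed threshold between "small" and "large" */

/* Small blocks all share one size class so they can be recycled freely. */
typedef union node {
    union node *next;
    unsigned char bytes[SMALL_LIMIT];
} node_t;

static node_t *cache = NULL;   /* recycled small blocks (not thread-safe) */

/* Small requests come from the cache, falling back to malloc when the
 * cache is empty; large requests go straight to malloc. */
void *sized_alloc(size_t size)
{
    if (size <= SMALL_LIMIT) {
        if (cache) {
            node_t *n = cache;
            cache = n->next;
            return n;
        }
        return malloc(sizeof(node_t));
    }
    return malloc(size);
}

/* The caller passes the size back so the block can be routed to the
 * cache (small) or to free (large). */
void sized_free(void *ptr, size_t size)
{
    if (!ptr)
        return;
    if (size <= SMALL_LIMIT) {
        node_t *n = (node_t *)ptr;
        n->next = cache;
        cache = n;
    } else {
        free(ptr);
    }
}

/* Usage sketch: returns 1 if a freed small block is reused for the next
 * small request while a large request gets separate memory. */
int demo_dispatch(void)
{
    void *a = sized_alloc(16);
    if (!a)
        return 0;
    sized_free(a, 16);
    void *b = sized_alloc(32);     /* small: served from the cache */
    int reused = (b == a);
    void *big = sized_alloc(4096); /* large: served by malloc */
    int ok = reused && big != NULL && big != b;
    sized_free(big, 4096);
    sized_free(b, 32);
    return ok;
}
```

In this sketch cached blocks are never handed back to the system, which is a typical cost of keeping a private pool on top of the OS allocator.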
      Again, Luigi, you seem to be worrying about code performance too much before running into problems. I'd like to encourage you to spend your time implementing your design and optimizing only when you find that there are issues, and then use empirical tests to identify what those issues are (for example, using the Shark profiler, or instrumenting your code with something that takes explicit measurements). Don't waste the incredibly valuable time of your mind's CPU.
      -Joshua
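A minimal way to take explicit measurements without any Max-specific timer is the standard C `clock()` function. The helper below is a sketch under assumptions: `time_malloc_free_ms` is a hypothetical name, it measures CPU time rather than wall-clock time, and the numbers it produces depend on the machine, compiler and optimization level.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Time `iterations` malloc/free pairs of `size` bytes.
 * Returns elapsed CPU time in milliseconds, or -1.0 if size is 0 or an
 * allocation fails. */
double time_malloc_free_ms(size_t size, long iterations)
{
    volatile unsigned char sink = 0;   /* keeps the loop body observable */

    if (size == 0)
        return -1.0;

    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        unsigned char *p = (unsigned char *)malloc(size);
        if (!p)
            return -1.0;
        p[0] = (unsigned char)i;   /* touch the block so it is really used */
        sink = p[0];
        free(p);
    }
    clock_t end = clock();

    (void)sink;
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling, for example, `time_malloc_free_ms(16, 100000)` before and after a change gives a rough comparison; a profiler is still the better tool for finding where the time actually goes.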
    • May 08 2007 | 5:04 pm
      Has the cycling team done any perf analysis of getbytes vs. malloc? OSX's malloc routine has a staggered / pooled allocation strategy as well, eventually degenerating into vm_alloc as the allocated chunks approach 4k (the system page size).
      I'm just wondering if this isn't a bit like double-caching. In any event, it's probably useful to have an allocation interface that more or less works uniformly across platforms.
      _Mark
      On May 8, 2007, at 9:35 AM, Joshua Kit Clayton wrote:
      > On May 7, 2007, at 12:35 PM, Luigi Castelli wrote:
      >
      >> So, is there any difference worth mentioning between getbytes and
      >> sysmem_newptr ?
      >> (beside the fact that one has a 32k limit and the other doesn't)
      >>
      >> Do the sysmem functions call malloc under the hood ?
      >
      > Yes. getbytes uses an internal memory pool (which calls malloc if it
      > runs out of space).
      >
      > In Max 5 these calls can be used interchangeably, and if it is a
      > small memory allocation, we use our internal pool, and if it is a
      > large allocation, it uses the standard OS memory pool. I don't think
      > you should worry about using either one. They both should work fine
      > for your purposes.
      >
      > Again, Luigi, you seem to be worrying about code performance too
      > much before running into problems. I'd like to encourage you to
      > spend your time implementing your design and optimizing only when
      > you find that there are issues, and then use empirical tests to
      > identify what those issues are (for example, using the shark
      > profiler, or instrument your code with something that takes explicit
      > measurements). Don't waste the incredibly valuable time of your
      > mind's CPU.
      >
      > http://www.extremeprogramming.org/rules/optimize.html
      > http://www.extremeprogramming.org/stories/optimize2.html
      >
      > -Joshua
    • May 08 2007 | 6:15 pm
      On May 8, 2007, at 10:04 AM, Mark Pauley wrote:
      > Has the cycling team done any perf analysis of getbytes vs.
      > malloc? OSX's malloc routine has a staggered / pooled allocation
      > strategy as well, eventually degenerating into vm_alloc as the
      > allocated chunks approach 4k (the system page size).
      Good point. A quick test on my MBP shows that malloc is usually faster or approximately the same. Here are the results for the code snippet at the bottom, run with various fixed sizes. It seems malloc is about 2-3 times faster when quickly allocating and deallocating the same size over and over again, and then it's a bit of a toss-up when growing the memory pool or allocating random memory sizes.
      ITERATION COUNT: 100000

      ALLOC SIZE (BYTES)  TEST         getbytes (ms)  malloc (ms)
      16                  POOL REUSE   22.827000      11.202000
      16                  POOL GROW    30.482000      34.919000
      16                  POOL RANDOM  25.689000      12.784000
      128                 POOL REUSE   22.871000      10.882000
      128                 POOL GROW    65.705000      60.475000
      128                 POOL RANDOM  30.505000      35.556000
      1024                POOL REUSE   24.227000      8.754000
      1024                POOL GROW    258.164000     225.734000
      1024                POOL RANDOM  26.731000      22.175000
      > I'm just wondering if this isn't a bit like double-caching. In any
      > event, it's probably useful to have an allocation interface that
      > more or less works uniformly across platforms.
      Yes. Most of this is legacy based on NewPtr not being accessible in interrupt. We could probably change to use the malloc pool in all cases, or tune some of the bottlenecks in our mem pool implementation. However, as hinted in the last message, optimization of things which aren't the bottleneck isn't usually the best use of one's time. Will try to shark it and tune a bit before Max 5 release, though.
      -Joshua
      #define A_BIG_NUMBER 100000
      #define A_SMALL_NUMBER 16

      // malloc test
      {
          long i, count;
          char *p;
          char *q[A_BIG_NUMBER];
          double start, end;

          cpost("ITERATION COUNT: %d; ALLOC SIZE(BYTES): %d\n", A_BIG_NUMBER, A_SMALL_NUMBER);

          // POOL REUSE: allocate and immediately free the same size
          cpost("POOL REUSE \n");
          start = mactimer_gettime();
          for (i = 0; i < A_BIG_NUMBER; i++) {
              p = getbytes(A_SMALL_NUMBER);
              freebytes(p, A_SMALL_NUMBER);
          }
          end = mactimer_gettime();
          cpost("getbytes: %f ms\n", end - start);

          start = mactimer_gettime();
          for (i = 0; i < A_BIG_NUMBER; i++) {
              p = (char *)malloc(A_SMALL_NUMBER);
              free(p);
          }
          end = mactimer_gettime();
          cpost("malloc: %f ms\n", end - start);

          // POOL GROW: allocate every block first, then free them all
          cpost("POOL GROW \n");
          start = mactimer_gettime();
          for (i = 0; i < A_BIG_NUMBER; i++)
              q[i] = getbytes(A_SMALL_NUMBER);
          for (i = 0; i < A_BIG_NUMBER; i++)
              freebytes(q[i], A_SMALL_NUMBER);
          end = mactimer_gettime();
          cpost("getbytes: %f ms\n", end - start);

          start = mactimer_gettime();
          for (i = 0; i < A_BIG_NUMBER; i++)
              q[i] = (char *)malloc(A_SMALL_NUMBER);
          for (i = 0; i < A_BIG_NUMBER; i++)
              free(q[i]);
          end = mactimer_gettime();
          cpost("malloc: %f ms\n", end - start);

          // POOL RANDOM: allocate and free a random size from 1 to A_SMALL_NUMBER
          cpost("POOL RANDOM \n");
          start = mactimer_gettime();
          for (i = 0; i < A_BIG_NUMBER; i++) {
              count = (rand() % A_SMALL_NUMBER) + 1;
              p = getbytes(count);
              freebytes(p, count);
          }
          end = mactimer_gettime();
          cpost("getbytes: %f ms\n", end - start);

          start = mactimer_gettime();
          for (i = 0; i < A_BIG_NUMBER; i++) {
              count = (rand() % A_SMALL_NUMBER) + 1;
              p = (char *)malloc(count);
              free(p);
          }
          end = mactimer_gettime();
          cpost("malloc: %f ms\n", end - start);
      }
    • May 11 2007 | 1:27 pm
      jkc wrote on Tue, 08 May 2007 20:15:
      > Yes. Most of this is legacy based on NewPtr not being accessible in
      > interrupt.
      I was recently rereading Miller's early papers on Max, and I would venture that just as much of the legacy came from the concern that mid-80s state-of-the-art memory allocation took large (and indeterminate) amounts of time.
      But NewPtr() in interrupt was a pretty killer argument.
      -- P.