Forums > MaxMSP

Big coll: performance hit?

December 10, 2010 | 4:33 pm

I am wondering if anyone has experience with Very Very Large colls. Like with 50000 lines, or 150000 lines. I assume there is no maximum line limit. If I ask for line# 49834, is there a performance hit?


December 11, 2010 | 1:23 am

coll uses a linked list, so in order to find entry #49834 it has to iterate through everything before it. First apply the usual rule: don’t optimize, just code! And if you do have a performance issue, you can for instance split your data among multiple colls by doing something more sophisticated with the key.
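For illustration only (a Python toy model, not Max code — the bucket size and the helper names are made up), here is the difference between a coll-style front-to-back scan and the same data split across several stores keyed on the index:

```python
# Toy model (Python, not Max): a coll-style linear scan versus the same
# data split across several "colls", bucketed on index // 1000.

def linear_lookup(entries, index):
    """Scan from the head, as a linked list must."""
    steps = 0
    for key, value in entries:
        steps += 1
        if key == index:
            return value, steps
    return None, steps

def bucketed_lookup(buckets, index, bucket_size=1000):
    """Pick the right small store first, then scan only inside it."""
    return linear_lookup(buckets[index // bucket_size], index)

# 50 000 entries, plus the bucketed version of the same data.
entries = [(i, i * 2) for i in range(50_000)]
buckets = {}
for key, value in entries:
    buckets.setdefault(key // 1000, []).append((key, value))

value, steps = linear_lookup(entries, 49_834)      # visits 49 835 nodes
bvalue, bsteps = bucketed_lookup(buckets, 49_834)  # visits at most 1 000
```

Same answer either way; the bucketed version just scans a 1000-entry store instead of a 50 000-entry one.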


December 11, 2010 | 5:06 am

Thank you for the information.

Hmmmm... ah, but if I am stepping through the coll using a bang or next message, then perhaps it does not have to iterate through the whole list each time I step; it "remembers" where it is? In other words, if I step through 1, 2, 3, ... 29854, the time it takes to go from 2 to 3 is the same as the time to go from 29853 to 29854. Is this correct? If so, I am OK with what I am doing, I think...

I am not sure I understand the "usual rule" that you mention. Do you mean, "program what you want, worry about performance issues when they actually happen"? I am afraid I have done that in the past and then been very sad.


December 11, 2010 | 5:14 am

[text] doesn’t use a linked list, does it? I am wondering for an upcoming project where I’d need several thousand word entries, each with some category flags. Would this be best done with [text] on a line-by-line basis, or with [coll]? They wouldn’t need to be changed once they’re in there, just accessed with their flags.


December 11, 2010 | 8:15 am

you will often find that it is not possible to replace a [coll] with [text] so easily.


December 11, 2010 | 1:35 pm

I think my original follow-up question got buried:

So if I am STEPPING through a coll sequentially, does that avoid the possible performance issue of having to iterate through the linked-list for each step?


December 11, 2010 | 5:25 pm

@christopher

Yes it does. next/prev starts from the last known location in the linklist.


December 11, 2010 | 5:27 pm

Hello Christopher Bailey,

Test ?

– Pasted Max Patch, click to expand. –

December 12, 2010 | 12:23 am

Emmanuel
Is the struct-traversal algorithm smart enough to know that if two consecutive indexes are being addressed (i.e. 1, then 2), this is the same as using next (i.e. 1, then next)?


December 12, 2010 | 4:47 pm

Hello Terry McDermott,

> Is the struct traversal algorithm smart enough to know that if two consecutive indexes are being addressed (ie 1, then 2) then this is the same as using (next) ie (1, then next)?

according to this test, I should say NO ;-)

– Pasted Max Patch, click to expand. –

December 12, 2010 | 6:32 pm

Mmm… empirical data…


December 12, 2010 | 7:52 pm

We see so many topics about [coll] performance issues; sorry to repeat this again, but I find that, because of this "iterate-through-everything" necessity, coll is the most inefficient object in Max. Hundreds of thousands of computer cycles just to recover one single value. ( http://cycling74.com/forums/topic.php?id=28901 )

I might be wrong, but my feeling is that Maxers rarely use this "linked list" feature of coll, so I think that a kind of [arraylist] object is missing in Max. This ideal [arraylist] object wouldn’t need to iterate through everything to retrieve list n°49834 stored in it; it would just look up list n°49834 directly.
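As a sketch of that idea (Python, not Max — the [arraylist] object and these values are hypothetical), a plain preallocated array reaches slot n°49834 directly, with no traversal:

```python
# Sketch of the hypothetical [arraylist]: preallocated slots give direct
# O(1) access to entry 49834, instead of walking 49834 list nodes.

slots = [None] * 150_000            # one slot per stored list
slots[49_834] = [60, 100, 500]      # a made-up note/event list

def retrieve(n):
    """Jump straight to slot n."""
    return slots[n]

found = retrieve(49_834)
```

The trade-off is that indexes must be (reasonably dense) integers, whereas coll also accepts symbols as keys.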


December 12, 2010 | 8:41 pm

Bear in mind that coll can use symbols as indexes.

To improve coll’s performance with longer lists, a hashtab would probably be needed. Its performance would be worse than a linked list for (quite) short lists.

There’s a project for somebody. The coll source is available in the SDK – try rewriting it to use a t_hashtab. Doing a "next" in a hashtab is impossible(?), so you’d likely need a t_linklist in parallel anyway (storing just the coll indexes), i.e. you’re going to wind up with an additional performance hit when storing/removing indexes etc.
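The hashtab-plus-parallel-list idea can be sketched like this (a Python toy model, not the Max SDK API — the class and its methods are invented for illustration):

```python
# Toy model (Python, not the Max SDK): a hash table for O(1) lookup,
# with a parallel ordered key list standing in for the t_linklist that
# would support "next". Removal shows the extra bookkeeping cost.

class HashColl:
    def __init__(self):
        self.table = {}     # plays the role of t_hashtab
        self.order = []     # plays the role of the parallel t_linklist
        self.pos = -1       # cursor for next

    def store(self, key, value):
        if key not in self.table:
            self.order.append(key)
        self.table[key] = value

    def lookup(self, key):
        return self.table[key]          # no traversal needed

    def remove(self, key):
        del self.table[key]
        self.order.remove(key)          # the additional hit: O(n) here

    def next(self):
        """Step to the next stored key, wrapping at the end."""
        self.pos = (self.pos + 1) % len(self.order)
        key = self.order[self.pos]
        return key, self.table[key]

hc = HashColl()
for i in range(10):
    hc.store(i, i * i)
hc.remove(5)
```

Lookups and sequential stepping stay fast; only storing and removing keys pay for keeping the two structures in sync.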


December 13, 2010 | 5:23 am

Thanks for the discussion, everybody. I am using coll as a kind of storage for sequencer data (where each line contains a variable number of parameters for different kinds of note/events). So, except for an initial "Go To" event, I’ll be stepping through, so I suppose I am OK with that.

If anyone has other suggestions on how to do sequencing, I’m all ears. . .

I know I’m not the first to use coll this way..


December 13, 2010 | 7:06 am

Hello Christopher Bailey,

I use [jit.matrix] to store data for a sequencer; but I’m not sure that’s best …


December 13, 2010 | 10:00 am

Chris, you say:
I am using coll as a kind of storage for sequencer data (where each line contains a variable number of parameters for different kinds of note/events).

Isn’t [detonate] optimized for this kind of application? Have you looked at that?
T


December 13, 2010 | 11:32 am

[detonate] is really about MIDI notes… I’d probably use [qlist] as a more general tool. But I have also done the same with [coll], and it worked quite well.

@johnpitcairn: unfortunately the source for coll is not in the SDK… it might not be too difficult to replicate its basic behaviour using hashtabs, but it sounds like a very boring job! On the other hand, in the SDK there are a couple of examples dealing with databases, one of which is called "dbcuelist" – it looks quite basic, but not bad at all…

aa


December 13, 2010 | 11:36 am

… or you might give a look at the slots of bach.roll and bach.score – http://www.bachproject.net

aa


December 13, 2010 | 8:37 pm

> "I use [jit.matrix] to store data for a sequencer ; but not sure that’s best …"

From what I remember, in terms of efficiency it is the best option in Max for that – around 100x faster than coll.
A while ago, I made some efficiency tests storing sequence data in an array, comparing jit.matrix, coll, ftm, peek~, buffer~, etc., for storing and reading values:


December 14, 2010 | 7:20 am

Hello Alexandre,

[jit.matrix] has a few problems for data-collection use:

1. it cannot be zero-sized; at least one cell.
2. you always need [jit.qfaker] to run it at high priority.
3. it cannot easily be resized without losing values.

but it seems the best native object ;-)

PS: I didn’t test it with 150 000 values …


December 14, 2010 | 7:40 pm

@andrea agostini: The coll source is in the Max 4 SDK. I suspect it’s little different, if at all, in Max 5. I wonder what the comparative lookup times are for a database?


December 14, 2010 | 10:25 pm

mmm… I should have a Max 4 SDK somewhere. But if it’s a pre-obex version, quite a lot of things could have changed. I’ll go and check it out.

I’d be curious too about the performance of a database. In principle, I’d expect it to be very efficient on huge collections of data, not as much for smaller ones.

aa


December 15, 2010 | 12:54 am

I was using a coll because I can stash text, numbers, all kinds of things in there. I’m not sure that jit.matrix can do this? (Although of course there are ways around that.)

