How does coll search for symbols?

Oct 31, 2013 at 4:20pm

How does coll search for symbols?

If I use coll to store a lot of data where the indices are symbol names, how does coll search when I send it a symbol? Does it just go linearly or is there a way to do fast binary lookups?

I webscraped about 900 guitar chord tunings to use with a strummer object and I’m looking for the fastest way to retrieve chords by name.

Thanks,
D

#269800
Nov 1, 2013 at 4:02am

The code for coll used to be published with the SDK. As I recall, it was a linear search.

OTOH, with modern hardware, a linear search through 900 items shouldn’t be all that slow. Also note that symbol matching, itself, is highly optimized. Once a symbol is “generated”, it is represented by a 32-bit value, so matching is simply a comparison of 32-bit values (i.e., one subtraction and test for zero) rather than anything as ham-fisted as strcmp().

I would suggest trying with coll and seeing what performance is like. If it’s really not fast enough, you can think about a custom object in either C, Java, or JavaScript (or one of the other embedded languages). C is likely to be most efficient but also (for many people) the hardest to implement.

#269830
Nov 1, 2013 at 6:24am

I have it working with coll now and it does seem reasonable so far. On the other hand, I’m not doing much else yet.

Last time I looked at searching algorithms (a long time ago) the switch over point between linear and binary was about 60-70 items so having over 800 entries seems prime for binary searching. Obviously, the faster the computer, the bigger the list you can get away with.

Is the source for coll available anywhere? I’d hate to start from scratch but modifying ‘coll’ to have a binary search version shouldn’t be a big deal.

#269838
Nov 1, 2013 at 1:39pm

@Peter 32 or 64 bits ;-)

Coll uses linklist for it’s data structure, so the search is linear.

#269876
Nov 2, 2013 at 12:12pm

Why not use dict instead? Random access is hash-based, so the lookup is super fast…
aa

#269940
Nov 2, 2013 at 12:29pm

@ej: Point taken. Not that it makes much difference :-)

#269941
Nov 2, 2013 at 2:56pm

Hey, why don’t you C74 guys add the code of coll back to the SDK, so we developers might reimplement it using dictionaries.
Good idea, right? ;)

- Luigi

#269955
Nov 2, 2013 at 3:00pm

@andrea because unfortunately I’m still stuck with max 5 for production work

#269956
Nov 2, 2013 at 8:09pm

@DHJDHJDHJ but do you have performance issues?

#269980
Nov 3, 2013 at 6:40am

Not with a one-off experiment. In some basic tests I just did this morning, I’ve determined it takes

0.0005ms to find the first item and
0.0051ms to find the last item (there are only 827 items in this list)

So in the short term I’m probably safe.

#270005

You must be logged in to reply to this topic.