How does coll search for symbols?
If I use coll to store a lot of data where the indices are symbol names, how does coll search when I send it a symbol? Does it just go linearly or is there a way to do fast binary lookups?
I webscraped about 900 guitar chord tunings to use with a strummer object and I’m looking for the fastest way to retrieve chords by name.
The code for coll used to be published with the SDK. As I recall, it was a linear search.
OTOH, with modern hardware, a linear search through 900 items shouldn’t be all that slow. Also note that symbol matching, itself, is highly optimized. Once a symbol is "generated", it is represented by a 32-bit value, so matching is simply a comparison of 32-bit values (i.e., one subtraction and test for zero) rather than anything as ham-fisted as strcmp().
I have it working with coll now and it does seem reasonable so far. On the other hand, I’m not doing much else yet.
Last time I looked at searching algorithms (a long time ago) the switch over point between linear and binary was about 60-70 items so having over 800 entries seems prime for binary searching. Obviously, the faster the computer, the bigger the list you can get away with.
Is the source for coll available anywhere? I’d hate to start from scratch but modifying ‘coll’ to have a binary search version shouldn’t be a big deal.
Why not use dict instead? Random access is hash-based, so the lookup is super fast…
@ej: Point taken. Not that it makes much difference :-)
Hey, why don’t you C74 guys add the code of coll back to the SDK, so we developers might reimplement it using dictionaries.
Good idea, right? ;)
Not with a one-off experiment. In some basic tests I just did this morning, I’ve determined it takes
0.0005ms to find the first item and
0.0051ms to find the last item (there are only 827 items in this list)
So in the short term I’m probably safe.