is object_alloc thread-safe?
Oct 24, 2010 at 9:42pm
is object_alloc thread-safe?
I’m really clueless with a bug.
If I call object_alloc a real lot of times, banging simultaneously from a metro and a qmetro at a very high speed, eventually I get a crash. The overall picture is quite complex, so I can’t really describe the whole thing here, but the catch-all explaination would be object_alloc not being safe.
I guess I’m speaking non-sense, since posix-compliant malloc() must be safe, but can someone please reassure me about this?
Oct 25, 2010 at 11:50am
By convention, the object life-cycle is presumed to always occur in the main thread. By object life-cycle, I mean class registration, instantiation, and freeing.
While its true that we allocing memory, we also do a bit more than that. For example, we zero the memory, set the magic number, setup obex for the object (to support attributes etc), set the #P/#B symbol to the patcher, etc. So there is a little bit more going on.
It is not recommended that you alloc memory from the scheduler thread (e.g. in response to metro). Instead you should defer the call to get it back to the main thread. Allocating memory here has the potential significantly degrade the performance of the scheduler.
One strategy here is that you could pre-allocate a pool of whatever it is that you might need, and then in the scheduler-called method you simply grab some instances from your pool. Not sure if this helps, but hopefully it will get you on the right track.
Oct 25, 2010 at 8:10pm
Actually, this is an old model (Max 4 and earlier). All “box” objects should happen in the main thread, but otherwise object allocation and freeing of “nobox” objects is thread-safe. We allocate objects, and memory in various places in other threads besides the main thread without incident.
As Tim points out, you might want to avoid doing this extensively for performance reasons. However, it should not crash.
From what you mention (“eventually I get a crash”), are you watching to make sure you’re not hitting the memory limits of a 32-bit allocation? If so, perhaps you have a memory leak? If not, perhaps you have a crash log which might offer some insight into what sort of crash is occurring, together with some code fragments of the crashing function?
The next thing from what you describe (a qmetro and a metro running simultaneously), are you sure that whatever you are doing with the objects is threadsafe? It would need to be in such a case. For example the following is not threadsafe:
This type of memory access by two threads is a problem since there’s nothing stopping the threads from switching to one another between the two lines. This can result in:
1. double freeing an object and possibly crashing
You would want to do something like the following using either locks (easier to manage, especially in complex cases with many related operations):
This is an easy scenario to overlook. There are probably places in our own code (and the SDK) which make this kind of oversight.
Finally, if it is not a crash, but rather a “spinning wheel of death”, then, perhaps you have a deadlock scenario, which most often happens if you place an outlet call *inside* your own mutex lock (big no no). Otherwise, we’d need to know more about what you’re doing before we could offer much assistance.
Oct 25, 2010 at 9:02pm
just by curiosity (and because I’m also working now with multi-threaded stuff), is it better, worse, or just the same if instead of using the
Oct 25, 2010 at 9:04pm
Sorry, of course I was referring to
Oct 26, 2010 at 9:13am
I’ll try to make some things clearer.
Tim, I am aware that object_alloc does more than merely allocating memory – I just (wrongly) assumed that memory allocation was the only thread-risky part of the operation.
Anyway, the class I’m frantically instantiating is a nobox one.
The crash log doesn’t really help. The point is that my data are corrupted in a (very systematic) way that eventually causes a crash. What I remarked is that inside my data structure I unmistakably recognize some data belonging to a message box elsewhere in the patch – this is why I was wondering if there was a memory allocation problem, kind of the same block of memory allocated twice because of a thread-safety problem.
This thing of the message box is a quite intriguing one. The metro and qmetro are actually connected to a message box (containing 1 2 5), which in turns is connected to my object [reg]. Besides the toggles and number boxes for the metro and qmetro, there’s nothing more in the patch. When [reg] receives a list, this is a simplified version of what happens:
(As you can see, at the moment a new object is cached the previous one is freed; and the cloned object that is output is immediately destroyed thereafter.)
Now, deep inside my llll structure (not at the very beginning, though) I have two longs, one after another, that at the moment of the crash contain 2 and 5. This is not the immediate cause of the crash, but a clear symptom that something is wrong. If I change the contents of the message box, the two longs change accordingly. Now, this would suggest that someone stores the 2 and 5 inside a long or something like that – not in an atom, because in this case they would not be aligned to my two longs in the object structure, right? Since in my object the incoming message is only stored inside atoms, I’m guessing someone else has written inside my object structure. This is why I was wondering about object_alloc.
Finally, this thing of banging my object from a very fast metro and qmetro doesn’t belong to a real-world scenario – it’s just for testing. Though, I don’t like it not to work…
As soon as I get more detalis about the problem, I’ll let you know!
ps @ Siska: afaik, critical regions are basically recursive mutexes. This means that you can enter them more than once from the same thread (and you have to unlock them more than once, correspondingly). Besides the fact that you might not want this, criticals are much more expensive than regular mutexes – so in general you should avoid them when you don’t need them.
Oct 26, 2010 at 9:46pm
Yes, it sounds like something is trashing memory, or you’re using memory which is no longer valid (e.g. referencing atoms passed in as arguments after they’ve been freed or popped from the stack).
Your locking looks correct. I’ll assume that there’s some innocent typos in the above code, however.
Without seeing your llll constructor or your clone method, it’s a little hard to tell whether or not you’re dealing with your atoms and any allocated memory properly.
Let us know if you have more details and better examples for us to investigate.
@Siska: Yes, what Andrea mentions is correct. critical_new() is the same thing as systhread_mutex_new(&mutex,SYSTHREAD_MUTEX_RECURSIVE);
Oct 27, 2010 at 2:22pm
You must be logged in to reply to this topic.