You are all undoubtedly familiar with the 'synchronized' keyword, but at the risk of boring everyone, let me take a few moments to explain one possible way it can be implemented and how you could add it to your favourite language.
The keyword 'synchronized', at least in Java-land, is a method modifier that marks a method as one that must never run concurrently with the other synchronized methods. All methods marked this way are guaranteed to run non-concurrently on the same object (static synchronized methods share a per-class lock instead). In other words, you will have AT MOST 1 synchronized method executing on a given object at any time. It is the basic tool for maintaining thread-safety in Java's thread model.
In contrast, C++, which Java otherwise borrows heavily from, does not have such a keyword. To get the same result in C++, you would need to manually declare a mutex as a class member and make sure you acquire it before, and release it after, every call to one of the would-be synchronized methods.
Which brings me to today's exercise in obviousness: how do you implement such a feature? Well, as you can see, it's purely syntactic sugar. One way to implement 'synchronized' is as above: create a private mutex in the class, remove the synchronized keyword from all methods and mark them as private, then create a new public method that takes as input the name of the previously synchronized method the caller wants. This new method obtains the private mutex and only then dispatches to the private, previously-synchronized method, releasing the mutex when it returns. Essentially, you're wrapping the synchronized methods in an extra layer of abstraction, where you must hold the mutex and go through another function before you actually reach them.
This is not the most efficient way to do things, because you incur the extra cost of a function call plus a lookup from the method name to the method itself.
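To make the wrapper idea concrete, here is a rough sketch of what a translator might emit. Everything in it is made up for illustration: the class DispatchedCounter, its methods increment() and get(), and the public call() entry point are hypothetical names, and a ReentrantLock stands in for "the mutex".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.IntSupplier;

// Hypothetical translator output for a class whose methods increment()
// and get() were originally marked 'synchronized'.
class DispatchedCounter {
    private final ReentrantLock mutex = new ReentrantLock(); // the hidden mutex
    private final Map<String, IntSupplier> table = new HashMap<>();
    private int value = 0;

    DispatchedCounter() {
        // Name -> method table; this lookup is the extra cost discussed above.
        table.put("increment", this::increment);
        table.put("get", this::get);
    }

    // Formerly synchronized methods: now private and unaware of the lock.
    private int increment() { return ++value; }
    private int get() { return value; }

    // The single public entry point: look up the target, take the mutex,
    // dispatch, and release the mutex on the way out.
    public int call(String methodName) {
        IntSupplier target = table.get(methodName);
        if (target == null) {
            throw new IllegalArgumentException("no such method: " + methodName);
        }
        mutex.lock();
        try {
            return target.getAsInt();
        } finally {
            mutex.unlock();
        }
    }
}
```

A caller would write counter.call("increment") instead of counter.increment(), so every call pays for both the map lookup and the extra function call.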
So if you wanted to get fancy, you could post-process the code of the synchronized functions to insert the call to acquire the mutex as the first instruction and the release as the last instruction (on every exit path, including early returns and exceptions). The advantage is that you save a useless extra function call, but you do have to be somewhat smart about how you modify the source. If you change the source code at compilation/interpretation-time, you must account for it in your debugger support; it'll look 'weird' if the user ends up breaking on code you inserted dynamically and they never wrote. But again, it's certainly possible as long as you take that into account when designing your debugger.
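Here is roughly what the rewritten method could look like after such a post-processing pass. This is only a sketch: RewrittenAccount and deposit() are hypothetical names, and I'm assuming the rewriter uses try/finally so the release happens however the user's code leaves the method.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical output of the rewriting pass for a class with one method,
// deposit(), that the user marked 'synchronized'.
class RewrittenAccount {
    private final ReentrantLock mutex = new ReentrantLock(); // inserted by the rewriter
    private long balance = 0;

    public long deposit(long amount) {
        mutex.lock();                 // inserted: first instruction
        try {
            // --- user-written body starts here ---
            if (amount <= 0) {
                throw new IllegalArgumentException("amount must be positive");
            }
            balance += amount;
            return balance;
            // --- user-written body ends here ---
        } finally {
            mutex.unlock();           // inserted: runs on return and on exception
        }
    }
}
```

The lines marked "inserted" are exactly the ones a debugger would need to hide or step over, since the user never wrote them.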
The next question is then: where does the mutex go? If you place it in the namespace of the class, as a 'bonus' data member, you've now modified the size of the object the user wrote. That can be dangerous if the user expects the object to be a certain size and it's actually bigger. C++ has a similar issue: a class with virtual functions carries a hidden vtable pointer, so its objects are bigger than the sum of their declared members. So you must remember to take the mutex into account when designing sizeof() or its equivalent. If your language is high-level enough that the user won't be doing silly stuff like that, you're fine. But if the user must write serialization/deserialization code themselves, or generally be aware of the size of the objects they're creating, watch out! Basically, if it's not C, you probably won't have to worry, but you should be aware of the possibility of horrible, non-obvious breakages.
The other option is to place it on the heap, in the global namespace, and mangle the name sufficiently to be able to resolve it in all cases. Something like [class-name]-[process-id]-[hash-of-mutex] should do it. Placing it on the heap means that you now have memory you must keep track of that is associated with an object but stored separately from it. Your garbage-collection code will suffer, but I think this is still the best option. You will essentially have to check whether your object is still alive to figure out if you need its mutex anymore. If you've mangled the names sufficiently you'll be able to figure that out, but you're still paying for the extra complexity and lookup time.
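For a feel of the "mutex lives outside the object" option, here is a small sketch. Note that it swaps the mangled-name scheme for a weak-keyed map, simply because that is the easiest way to show the same idea in Java; MutexRegistry, mutexFor() and PlainPoint are all hypothetical names.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical registry that keeps each object's mutex outside the object.
final class MutexRegistry {
    // Weak keys: an entry can be dropped once its owner is unreachable,
    // which is the "is the object still alive?" check from the text.
    // Caveat: WeakHashMap compares keys with equals(), so this sketch
    // assumes owners do not override equals()/hashCode().
    private static final Map<Object, ReentrantLock> LOCKS =
        Collections.synchronizedMap(new WeakHashMap<>());

    private MutexRegistry() {}

    // Returns the mutex for this owner, creating it on first use.
    static ReentrantLock mutexFor(Object owner) {
        return LOCKS.computeIfAbsent(owner, k -> new ReentrantLock());
    }
}

// User-written class: its field layout is left untouched.
class PlainPoint {
    int x;

    // What a 'synchronized' moveRight() might be rewritten into.
    int moveRight() {
        ReentrantLock mutex = MutexRegistry.mutexFor(this);
        mutex.lock();
        try {
            return ++x;
        } finally {
            mutex.unlock();
        }
    }
}
```

The object stays the size the user expects, but every synchronized call now pays for a map lookup, which is the extra complexity and lookup time mentioned above.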
Finally, and perhaps most interestingly, you can see that Java implemented the keyword to mean that all the synchronized methods are protected by the same mutex. But does it have to be that way? Not really; that was probably chosen because it is the most common usage. If you're designing or extending your own language, you can afford to add even more syntactic sugar: there's no reason why you can't synchronize on several mutexes. For example, you could have 'synchronized_alpha' and 'synchronized_beta' keywords. Each one has its own mutex associated with it, so you're not restricted to just 1 mutex per class anymore. Now granted, the cases where this is useful are practically nil, but there it is.
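As a sketch of what the two-keyword version might expand to: the class and method names below (TwoGroupStats, recordHit(), recordMiss()) are invented, and I'm assuming each keyword simply maps to its own hidden lock field.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical expansion of a class that used two made-up keywords:
//   synchronized_alpha int recordHit()
//   synchronized_beta  int recordMiss()
class TwoGroupStats {
    private final ReentrantLock alphaMutex = new ReentrantLock(); // for synchronized_alpha
    private final ReentrantLock betaMutex = new ReentrantLock();  // for synchronized_beta
    private int hits = 0;
    private int misses = 0;

    public int recordHit() {
        alphaMutex.lock();
        try {
            return ++hits;
        } finally {
            alphaMutex.unlock();
        }
    }

    public int recordMiss() {
        betaMutex.lock();
        try {
            return ++misses;
        } finally {
            betaMutex.unlock();
        }
    }
}
```

With this layout a thread inside recordHit() never blocks a thread inside recordMiss(), which is only safe because the two groups touch disjoint state.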
A more realistic example is creating a synchronized_reader and a synchronized_writer for the many-readers, 1-writer problem. You want to allow more than 1 reader at a time but at most 1 writer, and no readers while a writer is active. Traditionally this is solved using a mutex and a counting semaphore. However, there's no reason why you can't implement that straight in the language using these constructs. The user labels the reader and writer methods as reader-synchronized and writer-synchronized, and you rewrite the code to acquire and release the associated locks at interpretation-time.
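Here is one way the rewritten reader/writer code could look. To keep it short, this sketch leans on Java's existing java.util.concurrent.locks.ReentrantReadWriteLock rather than building the lock from a mutex and a counting semaphore; SharedConfig, read() and write() are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical expansion of a class that used two made-up keywords:
//   synchronized_reader String read()
//   synchronized_writer void write(String)
class SharedConfig {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock(); // inserted
    private String value = "initial";

    public String read() {
        rw.readLock().lock();        // many readers may hold this at once
        try {
            return value;
        } finally {
            rw.readLock().unlock();
        }
    }

    public void write(String newValue) {
        rw.writeLock().lock();       // exclusive: waits for readers and writers to leave
        try {
            value = newValue;
        } finally {
            rw.writeLock().unlock();
        }
    }
}
```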
Easy-Peasy!