When I was hit with starvation of some of the update threads in BpmDj, I was a bit puzzled. After all, I did use a ReentrantReadWriteLock in fair mode. Simple profiling showed that certain transactions were substantially heavier (a write lock being held for, let's say, 10 seconds), while the database reads would merely take 1 second.
From that I concluded that because the write lock was held longer, the other threads did not get the opportunity to claim a fair share of lock time themselves. E.g.: the write lock is released, the longest-waiting read lock is granted; that transaction is done within a second, and the write lock is granted again, not to be released for another 10 seconds.
To test this, I set up a program that creates 10 reader threads and 1 writer thread. Each thread would acquire a lock, wait some time (to simulate the 'work' done in the locked section) and then release the lock. This would be repeated in a loop for about 10 seconds. Afterwards, we could measure how much time each thread spent inside the lock and compare that with the amount of work the thread wanted to perform.
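The benchmark can be sketched roughly as follows. This is a minimal reconstruction, not the original test program: the class and method names are made up, and the durations are scaled down for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockFairnessBench {

    /**
     * Runs `readers` reader threads plus one writer thread against `lock`
     * for roughly `runMillis`. Each iteration holds the lock for
     * `readMillis` / `writeMillis` to simulate the work done in the locked
     * section. Returns {nanos the readers held the lock in total,
     * nanos the writer held the lock}.
     */
    public static long[] run(ReentrantReadWriteLock lock, int readers,
                             long readMillis, long writeMillis, long runMillis)
            throws InterruptedException {
        AtomicLong readHeld = new AtomicLong();
        AtomicLong writeHeld = new AtomicLong();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(runMillis);

        Runnable readerTask = () -> {
            while (System.nanoTime() < deadline) {
                lock.readLock().lock();
                long t0 = System.nanoTime();
                try {
                    Thread.sleep(readMillis);        // simulated read 'work'
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    readHeld.addAndGet(System.nanoTime() - t0);
                    lock.readLock().unlock();
                }
            }
        };
        Runnable writerTask = () -> {
            while (System.nanoTime() < deadline) {
                lock.writeLock().lock();
                long t0 = System.nanoTime();
                try {
                    Thread.sleep(writeMillis);       // simulated write 'work'
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    writeHeld.addAndGet(System.nanoTime() - t0);
                    lock.writeLock().unlock();
                }
            }
        };

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < readers; i++) threads.add(new Thread(readerTask));
        threads.add(new Thread(writerTask));
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
        return new long[] { readHeld.get(), writeHeld.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        // Fair lock; the writer has 10x the per-iteration work of a reader.
        long[] held = run(new ReentrantReadWriteLock(true), 10, 1, 10, 2000);
        System.out.printf("readers held %d ms, writer held %d ms%n",
                held[0] / 1_000_000, held[1] / 1_000_000);
    }
}
```

Swapping the `ReentrantReadWriteLock(true)` for `ReentrantReadWriteLock(false)` (or prefixing the acquisitions with a tryLock variant) gives the different scenarios measured below.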
An unfair lock
These were the results:
The unfair lock behaved, as expected, rather unfairly. If the writer had 10 times less work than the reader threads, its lock time would still be 8 times higher. If the writer had the same amount of work, it would get 16 times more lock time, and if the writer had 10 times more work, it would be granted 50 times more lock time.
/10.0 => *7.944541604031417
1.0 => *15.917042652441687
*10.0 => *50.5366207048361
A fair lock
When we created the read/write lock in a fair fashion, the results were more in line with what we would expect:
/10.0 => /8.810361366979999
1.0 => /1.0021791947397343
*10.0 => *9.009623837637745
That is, when the writer has 10 times less work than the readers, it gets 8 times less lock time. If it has the same amount of work, it receives the same amount of lock time, and if it has 10 times more work, it is granted 9 times more lock time.
This is completely as expected, yet not necessarily something we want, because it allows heavy tasks to block the lighter ones.
A fair lock, prefixed with tryLock()
tryLock allows an app to test for a lock and, if the lock is not granted, to continue with something else. There are two tryLock variants: the first without parameters (tryLock()), the second with a timeout.
tryLock() screws up any scheduling that might have been in place and simply barges in on whatever the algorithm was planning to do.
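A minimal sketch of that barging behavior (the class and scenario are illustrative, not BpmDj code): even on a fair lock, the parameterless tryLock() grabs the lock if it happens to be free regardless of queued waiters, and otherwise fails immediately instead of waiting its turn.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class TryLockBargeDemo {

    /**
     * Attempts a parameterless writeLock().tryLock() while another thread
     * holds the read lock. It never succeeds here: instead of queuing like a
     * plain lock() would on a fair lock, tryLock() returns false at once.
     */
    public static boolean tryWriteWhileReadHeld() throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair
        CountDownLatch readHeld = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);

        Thread reader = new Thread(() -> {
            lock.readLock().lock();
            readHeld.countDown();
            try {
                done.await();                 // keep the read lock held
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });
        reader.start();
        readHeld.await();                     // the read lock is now held

        // tryLock() ignores the fairness setting: it acquires if the lock
        // happens to be free (barging past any waiters) and otherwise fails
        // immediately. A reader is inside, so the writer simply fails.
        boolean acquired = lock.writeLock().tryLock();
        if (acquired) lock.writeLock().unlock();

        done.countDown();
        reader.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("write tryLock() while read held: "
                + tryWriteWhileReadHeld());
    }
}
```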
/10.0 => /925.0224470413133
1.0 => /99.7994977890176
*10.0 => /11.036514077119893
In this scenario, the writer thread gets pretty much nothing done. Whether it performs 10 times less or 10 times more work, its lock time ranges from ~/1000 to /10. This is very bad: you might expect tryLock to make the lock behave unfairly in the writer's favor, yet the results are completely the opposite of those of the unfair lock (see above).
A fair lock, prefixed with tryLock(timeout)
There is a second variant of tryLock: one with a timeout, which can indeed be 0. Applying that variant brings the results back in line with the straightforward fair lock.
In BpmDj, we used tryLock() instead of tryLock(0), assuming that a timeout of 0 would make the two behave the same. As the results above show, it does not.
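A sketch of the timed variant (illustrative names, not BpmDj's actual code): tryLock(0, TimeUnit.SECONDS) honors the fairness setting, so even with a zero timeout the thread only succeeds when the lock is free and it is actually its turn, rather than barging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class TimedTryLockDemo {

    /**
     * Attempts the write lock with the timed tryLock and a zero timeout.
     * Unlike the parameterless tryLock(), this variant respects fairness:
     * it fails immediately if other threads are queued ahead of us, but it
     * does not barge past them when the lock momentarily frees up.
     */
    public static boolean tryWriteFairly(ReentrantReadWriteLock lock)
            throws InterruptedException {
        if (lock.writeLock().tryLock(0, TimeUnit.SECONDS)) {
            try {
                return true;   // the write-side work would go here
            } finally {
                lock.writeLock().unlock();
            }
        }
        return false;          // lock busy, or other threads queued before us
    }

    public static void main(String[] args) throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair
        System.out.println("uncontended acquire: " + tryWriteFairly(lock));
    }
}
```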