OUR CASE FOCUSES ON THE CREATION of a concurrency adaptor. When working in an open distributed system, multiple components will require and provide all kinds of different locking strategies. If we want to interface with other components, we need adaptors between the different synchronization approaches. The problem is that there are many possible synchronization approaches and that the number of combinations of different approaches is even larger. Hence we need an automatic concurrency adaptor. This chapter explains the concurrency problems we face in open distributed systems. We will gradually investigate a number of properties of concurrency guards in open distributed systems. This will enable us to identify a set of conflicts in chapter 6. The selection of these conflicts will be based on a number of variabilities of the different concurrency strategies presented; therefore we present these variabilities at the end of this chapter.
IN THIS CHAPTER WE INTRODUCE OUR CASE, which is concurrency. The motivation for choosing concurrency as a case lies in the fact that concurrency is a problem often overlooked in open systems. As we will explain, every component that can be used by third-party software needs to offer a concurrency strategy. When multiple components offer different concurrency interfaces, conflicts might arise that have nothing to do with the core functionality of the components involved. This makes our case interesting: if we can mediate these conflicts automatically, we have actually removed a non-functional concern for developers.
For practical purposes the problem of concurrency in distributed systems will be simplified to its bare essentials. Instead of using real database servers or transaction servers and real clients, we will create our own mini-version of the problems present in these software systems, so that the essential concurrency problems remain visible without the complexity of full-blown middleware.
THERE ARE TWO REASONS why distributed systems are concurrent systems. First, they run on multiple computers, and all those computers run in parallel while sharing resources. Second, distributed computations often serve multiple users and have to support sessions. Concurrency might lead to a number of problems.
AS A CASE THROUGHOUT THE THESIS we will use a whiteboard. A whiteboard is an application on which a number of different actors (users or other computer programs) can put and get elements. Normally a whiteboard is used as a support tool in group discussion systems. We will use it as a means to illustrate concurrency problems graphically. First we will discuss the whiteboard and its actors without any concurrency primitives. Afterward we will explain the problems and a number of standard solutions.
The whiteboard has a very rudimentary interface: an IsFree operation, which reports whether a given square is occupied, and a SetPosition operation, which assigns a color to a square or clears it.
The moving dot actor uses a simple algorithm to move over the whiteboard. It first checks whether the next position, left or right of the current position, is free. If it is, it will mark that position and clear the old one. If the position is not free, it will change its direction. In figure 5.1, the red, orange and yellow actors are moving dots. Algorithm 6 covers the details. Please note that in all implementations we give, we have to take care of the non-blocking nature of the component system. In fact, we cannot wait until IsFree or SetPosition returns. However, for the sake of simplicity we omitted the original complex non-blocking code and replaced it with a more readable blocking version.
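To make this concrete, the following sketch mimics the dot's simplified blocking loop in Python. It is only a sketch: the Whiteboard stub and the names is_free and set_position are stand-ins we introduce here, not the actual component interface.

```python
# A minimal stand-in for the whiteboard, reduced to one row of squares.
class Whiteboard:
    def __init__(self, width):
        self.cells = {}                  # position -> color
        self.width = width

    def is_free(self, pos):
        return 0 <= pos < self.width and pos not in self.cells

    def set_position(self, pos, color):
        if color is None:
            self.cells.pop(pos, None)    # clear the square
        else:
            self.cells[pos] = color

# The moving dot: check the next square, move if free, turn around otherwise.
def move_dot(board, pos, color, steps):
    direction = 1                        # +1 = right, -1 = left
    board.set_position(pos, color)
    for _ in range(steps):
        nxt = pos + direction
        if board.is_free(nxt):
            board.set_position(nxt, color)
            board.set_position(pos, None)
            pos = nxt
        else:
            direction = -direction       # bumped into something: turn
    return pos

board = Whiteboard(10)
move_dot(board, 5, "red", 20)
```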
This simple actor requires and provides a small interface, built around the whiteboard's IsFree and SetPosition operations.
Aside from this simple actor, there is a more interesting figure that can be moved around the board: the moving line. It is a line of 10 pixels high, with a trailing dot behind it. The standard algorithm of the line (algorithm 7) checks whether the next line is free (lines 9, 10, 11 in the algorithm). If it is, the old line will be removed (lines 17, 18), the trail will be drawn (line 19) and the line will be drawn at its next position (lines 20, 21). The line's trail goes up and down relative to the origin of the line (lines 14, 15, 16). If the line bumps into something, it will change direction (line 24). Before checking whether the next line is free, the trail is removed (line 7), otherwise the line would not be able to turn around. The interface required and provided by the line actor is exactly the same as the one for the moving dot actor.
The last actor we will discuss is the floodfill actor. This actor tries to fill the whiteboard by enlarging its own domain. The standard algorithm (algorithm 8) keeps a set of points that are owned by the flood actor, a set of border points and a set of seed points. A border point is a point owned by the floodfill actor, but with not all of its neighbors owned by the actor. A seed point is a possible candidate to fill; it is not yet owned by the actor. Normally a point starts in the seed set. If the point is free, it becomes a border point and all four neighboring points are added to the seed set. If a point is solely surrounded by points owned by the actor, it becomes an interior owned point. Points that become interior are cleared on the whiteboard but remain owned. This can be seen in figure 5.1, where the purple and turquoise actors are flood actors. The interface required and provided by the flood actor is exactly the same as the one for the moving dot actor.
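The sketch below shows one step of this seed/border/owned bookkeeping on a small two-dimensional board stub. As before, it is our own illustration of the description above, not the actual algorithm 8.

```python
# A tiny 2D board stub; is_free/set_position mirror the earlier sketch.
class Board:
    def __init__(self, w, h):
        self.cells, self.w, self.h = {}, w, h

    def is_free(self, p):
        x, y = p
        return 0 <= x < self.w and 0 <= y < self.h and p not in self.cells

    def set_position(self, p, color):
        if color is None:
            self.cells.pop(p, None)
        else:
            self.cells[p] = color

NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def flood_step(board, color, owned, border, seeds):
    # Take one candidate from the seed set; if it is free it becomes a
    # border point and its four neighbors become new candidates.
    if seeds:
        p = seeds.pop()
        if p not in owned and board.is_free(p):
            board.set_position(p, color)
            owned.add(p)
            border.add(p)
            for dx, dy in NEIGHBORS:
                q = (p[0] + dx, p[1] + dy)
                if q not in owned:
                    seeds.add(q)
    # Border points surrounded solely by owned points become interior:
    # cleared on the whiteboard, but still owned by the actor.
    for p in list(border):
        if all((p[0] + dx, p[1] + dy) in owned for dx, dy in NEIGHBORS):
            border.discard(p)
            board.set_position(p, None)

board = Board(8, 8)
owned, border, seeds = set(), set(), {(4, 4)}
for _ in range(100):
    flood_step(board, "purple", owned, border, seeds)
```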
WHEN WE LOOK AT THE INTENDED BEHAVIOR of all actors, we see that it is of vital importance that no actor crosses the boundaries of another. Dots should not go through each other, lines should bump, dots should not enter floodfill actors and floodfill actors should not cross each other either. Furthermore, a line should always have a trail that goes up and down relative to the origin of the line. In this section we will illustrate that none of these requirements is met if we don't use any concurrency primitives. Let us investigate some of the problems.
The behavior described above is typically called a race condition. It means that two (or more) concurrent processes try to reach the same resources and are actually 'racing' to be the first. To deal with this kind of problem, a number of alternative approaches exist.
NONE OF THE AFOREMENTIONED RACE CONDITIONS would exist if we modify our server a little bit. Instead of offering only an IsFree and SetPosition operator, we could add an extra operator to the whiteboard: an atomic move operation that checks whether the target square is free and, if so, clears the old position and marks the new one in a single step.
For most actors (the moving dot actor, the line actor and possibly others) this solution might work. However, it cannot guarantee that every possible use of the component can be expressed in such a way that no race conditions occur. For instance, with this operation it would be very difficult to make the floodfill actor work correctly, without race conditions. The biggest problem is that the floodfill actor in essence doesn't move; instead it takes new points if they border on the actor. As such, implementing a floodfill actor in a safe way would mean that we can only move dots away from a position that is bordered by three other flood-dots. This would require an initial seed of at least 5 positions and would require additional code to insert new blocks in the interior of the floodfill.
This solution illustrates that extending a server to support every possible critical section in one message is no solution at all, especially not in open distributed systems. Instead, it is a very local solution that only solves the problem of specific clients. We cannot end up modifying the server for every possible critical section; we need a better solution.
IN OPEN DISTRIBUTED SYSTEMS we need some form of critical sections, otherwise different actors can change the internal state of a component without taking other components into account. Placing all possible critical sections at the server is not good enough, because this would not allow unanticipated behavior from other components. Therefore we need a more abstract way to specify our critical sections. In fact, we need to specify the beginning and the ending of a critical section in such a way that no component allows access from another component unless it has obtained an atomic operation id. In such a scenario every component would provide a certain synchronization interface, for instance, defined as:
in enter(<Integer|Id>)
out enter_ok(<Integer|Id>)
out enter_fail(<Integer|Id>)
in leave(<Integer|Id>)
out leave_ok(<Integer|Id>)
out leave_fail(<Integer|Id>)
How the semantics are defined is a bit more difficult. Two orthogonal features need to be investigated: first, whether those atomic operations are reentrant or not; second, whether an atomic operation waits or not when entering. In total this gives four essentially different synchronization interfaces. We will explain two of them in more detail below; the other two are reserved for the next section. We will define the semantics of the different approaches by means of pseudo code.
Non-waiting, non-reentrant enter/leave synchronization semantics (Algorithm 9). This way of locking allows the interface requester to start an atomic operation on the interface provider by entering it. The enter takes an argument that should be a globally unique identifier. In return the interface provider will return EnterOk when the atomic operation could be started, EnterFail otherwise. At the moment a component has an atomic operation initiated on the server, it cannot re-enter the server, not even from within the same atomic operation. This is called non-reentrance.
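A minimal sketch of these semantics in Python. The message names follow the text; the class shape and method signatures are our own.

```python
# Non-waiting, non-reentrant enter/leave (cf. Algorithm 9).
class NonWaitingLock:
    def __init__(self):
        self.locked_by = None          # id of the running atomic operation

    def enter(self, op_id):
        if self.locked_by is None:     # free: the atomic operation starts
            self.locked_by = op_id
            return "EnterOk"
        return "EnterFail"             # busy, even for the same op_id

    def leave(self, op_id):
        if self.locked_by == op_id:
            self.locked_by = None
            return "LeaveOk"
        return "LeaveFail"

lock = NonWaitingLock()
assert lock.enter(42) == "EnterOk"
assert lock.enter(42) == "EnterFail"   # no re-entrance, not even by itself
assert lock.leave(42) == "LeaveOk"
```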
Non-waiting, reentrant enter/leave synchronization semantics (Algorithm 10). When such an interface is provided, the same interface client can start multiple nested atomic operations on the server. This is especially useful when working with recursive functions. The algorithm itself simply keeps a lock counter that is increased every time a client enters. When a client leaves an atomic operation, the lock counter is decreased and LeaveOk is sent back. At the moment the lock counter reaches zero, lockedby is emptied and another actor can enter an atomic operation. Currently the reentrant semantics return LeaveOk when an atomic operation is ended. This is not necessarily always the case; it would also make sense to differentiate here. For example, one could return a LeaveNested when there are still other atomic operations running and a LeaveOk when the last atomic operation has ended. In the same way, the non-reentrant semantics could be changed to return EnteredAlready when such an atomic operation has already been started.
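The reentrant variant in the same sketch style; again, only the message names come from the text.

```python
# Non-waiting, reentrant enter/leave (cf. Algorithm 10): a lock counter
# tracks the nesting depth of the holder's atomic operations.
class ReentrantLock:
    def __init__(self):
        self.locked_by = None
        self.count = 0

    def enter(self, op_id):
        if self.locked_by in (None, op_id):
            self.locked_by = op_id
            self.count += 1            # nested enter by the same client
            return "EnterOk"
        return "EnterFail"

    def leave(self, op_id):
        if self.locked_by != op_id:
            return "LeaveFail"
        self.count -= 1
        if self.count == 0:            # outermost leave: lockedby is emptied
            self.locked_by = None
        return "LeaveOk"
```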
The two algorithms presented here require the client to supply a globally unique identifier to the server. This is clearly not a realistic requirement. Normally the server will choose a number of its own and return that as a unique identifier. We will return to this in section 5.10.
The two above algorithms are both non-waiting algorithms. This means that if the interface provider returns an EnterFail message, the interface requester will need to retry at a later time. This is a source of trouble, because chances are high that the interface requester will immediately try to enter again, over and over, until it can finally start its atomic operation. This leads us to the following problems:
TO AVOID NETWORK CONGESTION and starvation of distant components, we will now investigate how the API defined earlier can be implemented with waiting semantics. We will again distinguish between reentrant and non-reentrant semantics.
Waiting, non-reentrant enter/leave locking semantics (Algorithm 11). These semantics are the same as the non-waiting locking semantics. The biggest difference shows when somebody wants to enter an atomic operation on the interface provider: if there is already an atomic operation running, the requester will be placed in a queue. As such, an EnterFail is never returned; instead, the possible EnterOk is held back for a later time. At the moment a leave request arrives with the correct id, the atomic operation is finished and the first in the waiting list is informed and can start with its set of atomic operations.
Waiting, reentrant enter/leave synchronization semantics (Algorithm 12). The reentrant version has a similar protocol behavior. The only difference is that this algorithm will increase and decrease a lock counter in response to Enter and Leave respectively. At the moment the lock counter reaches zero, the next waiting request is considered. Some variations on these semantics could exist.
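A sketch of the waiting, non-reentrant variant. Because the component system is non-blocking, the held-back EnterOk must be delivered as a message later; the notify callback below is our stand-in for that delivery mechanism. The reentrant variant would add the same lock counter as before.

```python
from collections import deque

# Waiting, non-reentrant enter/leave (cf. Algorithm 11): EnterFail is never
# sent; instead the request is queued and EnterOk is held back.
class WaitingLock:
    def __init__(self):
        self.locked_by = None
        self.queue = deque()                    # waiting (op_id, notify)

    def enter(self, op_id, notify):
        if self.locked_by is None:
            self.locked_by = op_id
            notify("EnterOk", op_id)
        else:
            self.queue.append((op_id, notify))  # held back for later

    def leave(self, op_id, notify):
        if self.locked_by != op_id:
            notify("LeaveFail", op_id)
            return
        notify("LeaveOk", op_id)
        if self.queue:                          # wake the first waiter
            self.locked_by, waiter = self.queue.popleft()
            waiter("EnterOk", self.locked_by)
        else:
            self.locked_by = None
```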
The non-waiting atomic operations had the problem of starvation and network flooding. In the same way, waiting locking semantics have their own problem: deadlocks. A deadlock is, technically speaking, a situation in which multiple processes are waiting for each other to do something, so they are all virtually dead. There are two interesting situations: a component that waits for a lock it already holds (which can happen with non-reentrant semantics), and a group of components that each hold a lock while waiting for a lock held by another.
There are three different approaches to solve the problem of deadlocks: preventing them by ordering the locks, avoiding them by acquiring all necessary locks at once, and detecting them and rolling back one of the participants.
IN THE ABOVE-MENTIONED SYNCHRONIZATION SOLUTIONS, we specify the start of an atomic operation on the server with some form of message. Once an atomic operation is started, no other actor can start an atomic operation. This is not always as efficient as it could be. Consider our whiteboard example. If a client wants to set the color of two squares, it has to start an atomic operation on the whole server. This is in essence the same as locking the full server. Suppose now that a second actor also wants to draw some pixels, but on the opposite side of the whiteboard. It is clear that those two operations could safely proceed in parallel, but they won't, because both need a server-wide lock.
A solution to this problem is to introduce a lock for every square on the whiteboard. In doing so, a client actor can request the server to lock a number of squares, which it can access afterward, still concurrently with other actors that have locked other positions on the board. The API for such a lock is extended somewhat: instead of declaring where we want to enter, we specify what position to lock (or unlock). The semantics can be implemented in the same way as specified before. We can have a combination of a waiting/non-waiting strategy with a reentrant/non-reentrant locking strategy. With respect to terminology, such locks are typically called semaphores. A reentrant lock is sometimes called a counting semaphore, while a non-reentrant lock is often called a binary semaphore [Lea00].
We introduced a lock for each square on the whiteboard. We could also introduce a lock per row or per column of the field, or a lock per block of four squares. We chose to map our resources to individual squares.
Using a waiting, non-reentrant locking strategy to solve the concurrency problems when moving dots, flooding the whiteboard or moving lines with their trail is easy. Algorithm 13 shows how this can be done. Before actually checking whether a position is free, we lock the current position as well as the next position. If the next position is free, we move and release both locks. If it is not free, we also release both locks, but we turn around. The same thing can be done for the moving line actor and the floodfill actor.
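A sketch of this per-square strategy, with Python's threading.Lock standing in for the whiteboard's waiting, non-reentrant square locks. Note that the two locks are simply taken in current-then-next order; as the problems below show, acquiring multiple locks this way is not yet safe.

```python
import threading

class LockedBoard:
    def __init__(self, width):
        self.cells = {}
        self.width = width
        self.locks = [threading.Lock() for _ in range(width)]

# One step of the dot (cf. Algorithm 13): lock current and next square,
# move if free, release both locks either way.
def step_dot(board, pos, direction, color):
    nxt = pos + direction
    if not 0 <= nxt < board.width:
        return pos, -direction                 # edge of the board: turn
    with board.locks[pos]:                     # lock the current square...
        with board.locks[nxt]:                 # ...then the next square
            if nxt not in board.cells:         # free: move the dot
                board.cells[nxt] = color
                board.cells.pop(pos, None)
                return nxt, direction
            return pos, -direction             # occupied: turn around
```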
With this solution there still exists a plethora of problems. Most notably, two actors that each hold one square lock while waiting for a square locked by the other will deadlock; and an actor that needs several squares may repeatedly lock and release them without ever holding all of them at once.
Looking at these problems, we see that there are a number of possible solutions when working with non-nested locks.
A typical solution to this kind of problem is called staircasing: locks must be acquired in a certain global order. In our example it would make sense to sort all squares from left to right and top to bottom. When doing so, deadlocks are impossible: when a lock is requested and cannot be assigned, we are sure that the actor holding it can continue, because it already has all the locks it needs with an order smaller than the one it is requesting. As such, we don't need to unlock any of our already acquired locks. When using staircasing it is important to actually acquire the locks in order.
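A sketch of staircasing on the per-square locks: every actor sorts the squares it needs and acquires them in that global order, so no cycle of waiting actors can form. The helper names are ours.

```python
# Acquire a set of square locks in the agreed global order (here simply
# the square index, i.e. left to right), which makes deadlock impossible.
def lock_in_order(board, positions):
    for pos in sorted(positions):
        board.locks[pos].acquire()

def unlock_all(board, positions):
    for pos in positions:
        board.locks[pos].release()

# Usage: both squares are locked in ascending order, whichever direction
# the dot is moving in.
#   lock_in_order(board, {pos, nxt}) ... unlock_all(board, {pos, nxt})
```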
Another possible solution is to make locking and unlocking itself an atomic operation, by extending the lock and unlock operations to take a set of locks to acquire or release (see algorithm 14). The two green lines start and stop the 'atomic locking operation'. Within this atomic operation two locks are acquired. Please note that there is no need to release the locks within a critical section. This way we are sure that we have all the locks we need at once, or none at all.
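A sketch of this all-or-nothing locking, with a meta-lock guarding the acquisition of the square locks; this is our illustration of the idea, not algorithm 14 itself.

```python
import threading

class AtomicLocker:
    def __init__(self, board):
        self.board = board
        self.meta = threading.Lock()   # guards the locking operation itself

    # Acquire every lock in 'positions', or none at all.
    def try_lock_all(self, positions):
        with self.meta:                # the 'atomic locking operation'
            taken = []
            for p in positions:
                if self.board.locks[p].acquire(blocking=False):
                    taken.append(p)
                else:                  # one lock missing: give back the rest
                    for q in taken:
                        self.board.locks[q].release()
                    return False
            return True
```

The meta-lock is exactly the layering described next: the square locks have themselves become resources whose access is guarded by yet another lock.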
In fact we are now solving a synchronization problem between locks, and no longer between resources. We are specifying locks themselves as resources where the access to the resource (the acquiring of locks) should be guarded by another lock. This layering is pictured in figure 5.8.
Using the above techniques (staircasing or making locking an atomic operation by itself) is difficult when the programmer expects nested locking semantics. Both techniques rely on the fact that at a certain point in time the program knows all the locks it will eventually need. When working with subroutines (or subprograms) that autonomously acquire locks this can be very difficult. One workaround is first to ask all subroutines what they will eventually need, collect all those locks and then acquire them within one operation.
However, if it is impractical to require that a process specifies all the resources it needs at once, we might need to resort to transactions, which we will explain in the next section.
A TRANSACTION IS AN ABSTRACT ATOMIC operation within which changes can be made on a component. When the transaction is finished it can either commit or abort. Committing a transaction means that all the changes made within its critical section take effect and become visible to other participants. Aborting a transaction means that all the changes made within the critical section are undone.
Transactions are necessary in systems where we cannot order locks, either because there is no obvious order, because we cannot force actors to respect that order, or because an actor cannot specify everything it needs at once. Another situation where we need transactions is when we think about failures. What happens when one of the actors on the whiteboard has locked certain positions and then dies? Can the server recover from such a situation? If the server supports transactions and it can detect client death (by means of a time-out, or by detecting a broken socket), it can roll back the transactions owned by that client.
Transactions can allow the nesting of operations. When, within one transaction, another is started and the locks required to execute the inner transaction cannot be obtained, the outer transaction might abort as well. From the programmer's point of view this is what we want. From the server's point of view transactions are also good, because the server will always remain in a valid state, something that cannot be guaranteed without transactions.
The problem of livelocks of course still remains. Fortunately, this can be remedied at the server side. The problem is no longer solely dependent on the network traffic or the behavior of the clients, but on the scheduling behavior of the server. For instance, a server may decide to make a lock request wait until the lock can be obtained, or it can return a lock failure.
Depending on the context in which transactions are discussed, there can be a difference between read locks, write locks, and the way locks are treated. If smaller locks are defined, there is greater flexibility to optimize the concurrent access to resources. Similar to the granularity of resources, we can consider a lock to be a resource of its own.
Implementing a simple transaction system can be done by keeping in memory which locks belong to which transaction. Even when a lock is released, it keeps belonging to the original transaction. Only when the transaction commits are all its locks released. When the transaction aborts, all its locks are released and all changes to the resources are rolled back. Algorithm 15 illustrates the rudimentary semantics of a non-waiting, non-recursive transaction synchronization interface. Such transactions are easy to use from a programmer's point of view. We illustrate this by making the line actor movable (algorithm 16). The reader interested in transactions can consult [OHE96,CDK94].
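The sketch below captures these rudimentary semantics: locks stay bound to their transaction until commit, and abort rolls every change back. It mirrors the description above; the names and shapes are our own.

```python
class TransactionalBoard:
    def __init__(self):
        self.cells = {}                 # position -> color
        self.lock_owner = {}            # position -> transaction id
        self.undo = {}                  # transaction id -> [(pos, old color)]

    def begin(self, txn):
        self.undo[txn] = []

    # Non-waiting lock: True if free or already ours, False otherwise.
    def lock(self, txn, pos):
        if self.lock_owner.get(pos) in (None, txn):
            self.lock_owner[pos] = txn
            return True
        return False

    def set_position(self, txn, pos, color):
        assert self.lock_owner.get(pos) == txn, "must lock before writing"
        self.undo[txn].append((pos, self.cells.get(pos)))
        if color is None:
            self.cells.pop(pos, None)
        else:
            self.cells[pos] = color

    # Locks are only released when the transaction ends.
    def commit(self, txn):
        del self.undo[txn]
        self._release(txn)

    def abort(self, txn):
        for pos, old in reversed(self.undo.pop(txn)):
            if old is None:
                self.cells.pop(pos, None)   # undo a draw
            else:
                self.cells[pos] = old       # restore the old color
        self._release(txn)

    def _release(self, txn):
        for pos in [p for p, t in self.lock_owner.items() if t == txn]:
            del self.lock_owner[pos]
```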
WITHIN OPEN PEER-TO-PEER NETWORKS concurrency control and guarding are fairly difficult, because in a peer-to-peer system several components together provide a certain global behavior. So if one wants to take this global behavior from one correct state to another correct state, one needs to take all participating components to this new state within one operation.
To do so we need a transaction that spans multiple components and can commit all components or abort all components. This is called a distributed transaction. Normally distributed transactions are provided by one server that contacts the necessary components and commits or rolls back transactions as necessary.
In figure 5.9 we see how a component needs to go through a transaction server before it can start certain actions on a certain set of components. The transaction server takes care of transmitting all incoming messages to the other communication partners, possibly mapping different transaction ids to the same number. In essence, all locking logic should go through this transaction server. There is one transaction server per group of resources needed per client, which means that all components taking part in a certain session need to go through the same transaction server. So in fact we have added yet another layer to solve concurrency problems between multiple components (pictured in figure 5.10).
The transaction manager per group is not difficult to conceive. It needs access to the transaction ports on all participating components and needs to maintain a mapping. This mapping ensures that when a beginTransaction comes in, the transactions started on all participating components will be represented by one transaction id. When a lock request comes in, the supplied transaction id should, for every component, be mapped to the correct effective transaction id. This constitutes a problem, because a component is normally not aware of any transaction manager. This means that all components need to agree to use the specified session transaction manager.
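A sketch of such a per-session manager and its id mapping. The begin/commit/abort methods on the participating components are assumed to exist with these names.

```python
class SessionTransactionManager:
    def __init__(self, components):
        self.components = components
        self.mapping = {}              # session id -> {component: local id}
        self.next_id = 0

    # One session-wide id represents a transaction started on every
    # participating component.
    def begin_transaction(self):
        self.next_id += 1
        session_id = self.next_id
        self.mapping[session_id] = {c: c.begin() for c in self.components}
        return session_id

    # Map the session id back to each component's effective transaction id.
    def commit(self, session_id):
        for component, local_id in self.mapping.pop(session_id).items():
            component.commit(local_id)   # may fail partway if a peer dies

    def abort(self, session_id):
        for component, local_id in self.mapping.pop(session_id).items():
            component.abort(local_id)
```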
Of course, there are problems with this setup. Other concurrency strategies worked well because their behavior was unambiguous. A transaction was either committed or not, a lock was obtained or not and so on. The problem we have here is that with a distributed transaction we can have transactions that are partially committed. We can have a situation where the transaction server starts committing all its transactions but fails to commit the last transaction because the involved component has died. In such a situation the resulting global state is invalid, even when the component recovers its old state. There is not much that we can do about this, except introducing extra redundancy. Distributed transactions are discussed in [CDK94].
WITHIN THE PREVIOUS SECTIONS we have introduced a number of concurrency strategies. All these different concurrency strategies can be described by a set of commonalities and variabilities. The common issue is the fact that we are talking about resources and about critical sections: how can we ensure a valid state transition? Below we present the set of variabilities which together can define a concurrency strategy.
- Syntax: the form in which the locking operations are offered (enter/leave, lock/unlock a position, begin/commit/abort a transaction).
- Control flow: whether a request waits until it can be honored or returns immediately with a failure.
- Reentrancy: whether the same client can nest atomic operations.
- Resources: the granularity of what is locked (the whole server, a square, a row, a set of squares).
- Transition: whether changes take effect immediately or can still be aborted and rolled back.
The above variabilities can be used to describe a concurrency strategy. We must now investigate how they relate to each other: which variabilities can be modified without impact on other variabilities, and which variabilities are influenced when another variability is shifted. If two variabilities do not correlate they are said to be orthogonal.
Control flow can be modified without any immediate impact on the reentrancy of the concurrency strategy, so these two variabilities are orthogonal. The resources covered by a concurrency strategy are also independent of how the locks are offered, so resources are orthogonal to control flow and reentrancy. The transition of locks, whether they can abort or are in effect immediately, is independent of the resources, control flow or reentrancy; therefore this criterion is also orthogonal to all others. Syntax changes when the resource granularity changes, when the control flow changes, or sometimes even when the reentrancy changes. Therefore syntax is partly defined by all other axes; it is not a purely orthogonal variability. Figure 5.11 illustrates this.
If we now take the different concurrency strategies into account, we must understand that layering influences everything. Syntax changes to support an extra layer, resources change since extra resources are added, and control flow changes because different layers interact with each other in different ways. Transition changes since lower layers need to be rolled back when an upper layer decides to roll back. Figure 5.12 illustrates this. The red arrow in the figure indicates an extra axis/dimension for layering. This axis is slanted in the other five dimensions, because it is impossible to change layering without interfering with the other properties. Conversely, layering cannot be expressed solely by means of syntax, control flow, transition, reentrancy and resources, because an extra layer needs to reason about other resources.
However, this view on the different variabilities should not be considered complete. It suffices for our goal, which is to select a set of concurrency conflicts. For instance, syntax does not necessarily need to be a separate variability, while a missing variability might be timing. We have limited ourselves to these because they capture a great number of possible concurrency strategies and because the resulting concurrency strategies occur in practice.
IN THIS CHAPTER WE HAVE IDENTIFIED COMMONLY occurring concurrency problems based on a whiteboard on which multiple different actors can draw any figure they want. We will continue to use the whiteboard as a running example throughout the thesis.
When concurrency is managed at a central place, with whatever technique available, and everybody adheres to the concurrency strategy, there cannot be many concurrency problems. The reason is that the whole program uses the same concurrency strategy, and this strategy, when well designed, is suitable for the program in question. Unfortunately, we saw that this requirement does not hold for open distributed systems.
The problem with concurrency is that it cannot be modularized. We cannot easily say where the concurrency management should be placed, nor can we offer one interface to the outside world which hides all our internal concurrency problems. When we place two deadlock-free programs in the same environment, it is possible to have a deadlock between both programs.
We investigated a number of problems with their solutions, and saw that these solutions in turn give rise to new problems. Without any concurrency primitives, race conditions occur. Placing every critical section on the server removes specific races but cannot anticipate the needs of all clients. Non-waiting locks lead to starvation and network flooding, while waiting locks lead to deadlocks. Fine-grained locks allow more concurrency but require staircasing or atomic lock acquisition to avoid deadlocks, both of which break down when locks are nested. Transactions handle nesting and failures, but distributed transactions can end up partially committed.