IN THIS CHAPTER we first summarize our work. Then we will elaborate on our scientific contributions as well as on the applicability of the thesis. We point out the limitations of our work, give guidelines how an adaptor for other conflict domains might be created and finish the dissertation by pointing out possible future work.
BASED ON THE CASE STUDY of conflicting concurrency strategies, we have shown how intelligent adaptors can be created automatically. We have illustrated this by creating such an intelligent adaptor as a sequence of three modules, where every module has different responsibilities.
The liveness module learns how to keep the components behind the concurrency strategy alive. For the concurrency adaptor as a whole, such a module is necessary to offer to the underlying components an optimal feedback behavior, as opposed to for example, returning a behavior that leads to endless live-locks such as lock-lockfalse-lock-lockfalse ad infinitum. Because it is still unknown whether liveness is a decidable property of Petri-nets, we have decided to use a reinforcement learning algorithm (Q-learning and an -greedy strategy) to approach the problem. Feedback for the learning algorithm comes from the underlying component by means of statically placed checkpoints. Every time a checkpoint is reached a reward is sent to the adaptor and the learner can, when necessary, modify its future behavior. The reinforcement learning algorithm itself has been very efficiently mapped onto the structure of colored Petri-nets, in such a way that storage requirements are minimal, without losing essential information. This optimization is mainly based on a) the explicit state information of the Petri-net already present and b) the creation of a system that can recognize certain situations and knows how to handle them. Our situation-recognition system is expressed as a Petri-net. During execution, new transitions are constantly added to the Petri-net. The learning algorithm automatically explores the new avenues offered by these transitions and, when suitable, these new transitions are kept alive and the new situation is remembered. The reason why we use Petri-net transitions as situation handlers has been explained thoroughly, and possible alternative representations have been investigated. To verify the suitability of our representation we used genetic algorithms.
The concurrency module of our concurrency adaptor inserts an appropriate concurrency strategy. To do so, it communicates with the two other modules about resources (for instance the possible squares in a whiteboard) and actions (for instance a setPosition). There is no explicit annotation of resources and actions on resources, both are implicitly represented within the Petri-net. This information is automatically deduced by the first module under the form of enabled transitions and equivalent states. This, together with extra information of the liveness module that describes which futures are possible and which futures the component would like, enables this module to make correct decisions about what to do. Once such a decision is taken the client component can perform its actions on the server component.
The enforce-action module bypasses the concurrency strategy of a component entirely. This allows for more freedom in the two other modules. The enforce-action module does this by offering a logic interface to the outside world while inserting the necessary synchronization messages at the appropriate times. To do so, the module needs to know how certain markings within the underlying Petri-net can be reached. In this dissertation we have achieved this by means of a reachability analysis in prolog.
In these three modules, the adaptor needs a) to analyze how to enable certain actions, b) to know how to reach a certain state and c) to know which resources are involved. In order to be able to know this the adaptor requires a formal documentation in the form of Petri-nets. Every component needs to provide a Petri-net describing its own required or provided concurrency strategy in a consistent and complete way. Consistent means that any action that is allowed by the Petri-net should be accepted by the component and complete means that no action that is allowed upon the component is undocumented.
WE STATED that in open distributed systems, one needs intelligent adaptors to mediate interface conflicts. We identified three reasons why one might need an intelligent adaptor: a) in open systems, it is very unlikely that one standard will be used by everybody, b) the highly volatile nature of peer to peer computational networks gives rise to interface conflicts and c) with the current advent of interconnected embedded systems and ambient intelligence the problem of conflicting interfaces will become even more prominent. We claimed that it is possible to create, for certain categories of concurrency interface conflicts, a concurrency-adaptor that automatically learns how to resolve the conflict in such a way that a) the required behavior of the involved components can be executed over the adapted interconnection and b) it is able to work on-line in certain open system. We have validated this by constructing such a concurrency-adaptor. Below we discuss the different requirements and how we have satisfied them.
To show the necessity for such adaptors, we have introduced our real-world case of conflicting concurrency strategies. We have investigated different concurrency strategies and explained why components in open distributed systems need a concurrency interface. Relying on this analysis we discussed the problem: not everybody will write the same concurrency interface for his components and thus concurrency interface conflicts will arise. We identified and explored in much detail a large number of practical conflicts. More specifically, we investigated conflicts that can easily occur when one or both of the concurrency interfaces is upgraded without prior notification and without full backward compatibility.
To show that we can create such an intelligent adaptor we have constructively developed one. The adaptor itself needs detailed information of the provided and required interfaces of all participating components. The adaptor works by means of three modules. It contains two modules that essentially bypass the provided or required interface and one module that actually implements a suitable concurrency strategy. To bypass a concurrency strategy we investigated two techniques. The first technique was a reachability analysis of a Petri-net, the second technique was a reinforcement learning algorithm. Finally we explain how resources and actions can be detected automatically within a Petri-net. This chain of three modules constitutes an intelligent adaptor that is able to set up a link between conflicting concurrency interfaces, for which the adaptor does not know the exact interface beforehand. The adaptor is restricted to client server architectures. In section 12.7.2 we have explained in detail what can go wrong with peer to peer applications. Essentially, the adaptor does not work in open peer to peer systems because not all communication between different parts of the application can be mediated by one adaptor.
We now reconsider the abstract requirements from the introductory section.
This requirement was necessary to make the adaptor useful in an open system. The performance estimates as well as the setup of the adaptor enables it to work on-line. The liveness module is an on-line prolog program that can deduce how a certain state can be enforced upon the server, without needing to 'test' the server's behavior. The concurrency module is a program that can be executed on-line and the liveness module has been specifically designed, by selecting Q-learning as a learning technique, to work on-line.
The adaptor is a central component that is placed solely on the connection between different components. Hence the only modifications it can introduce is on the messages between the different components. Therefore, this requirement has been satisfied.
The most important abstract requirement of an intelligent adaptor is based on the notion of an overall required behavior. In our case study the overall required behavior between the different conflicting concurrency strategies was to avoid actors crossing each others boundaries. As we have experimentally observed, our adaptor supports this implicit requirement. Two cases can occur:
THE MAIN CONTRIBUTION OF THIS THESIS is the process we have been following to create an adaptor. Instead of trying out one technique and verifying whether it works or not, we have kept on searching to find a (set of) suitable technique(s). To reduce the complexity of the problem we have restricted ourselves to the case of concurrency conflicts, which we have expressed in an event based model. When necessary we have invented new techniques (The extremely useful Petri-net documentation for describing interfaces and a new technique to verify the bias of computer generated programs), or combined existing techniques (Petri-nets & Reinforcement learning, the three different modules that constitute an adaptor). This has lead to the creation of a concurrency adaptor that mediates differences between conflicting concurrency strategies. Below we summarize the contributions.
IN THIS SECTION WE ELABORATE ON SOME LIMITATIONS OF THE PRESENTED APPROACH:
IN THIS SECTION WE discuss the application scope of our adaptor, first, without giving examples of such applications. We discuss why completely open systems are not feasible and why our adaptor is useful even in situations where exact requirements on the interface, but also on the usage of that interface, are necessary. Afterward we give some immediate practical applications of our adaptor.
Some conflicts are impossible to mediate. In section 6.1.2 we have given examples of such conflicts. A notable example of such a conflict was our line example. In this example we assumed that the line actor would always have a lock on a square somewhere on the whiteboard. Another client required the possibility to lock the entire whiteboard. If those two actors were to work together on the same whiteboard then the second actor (which wants to lock the entire field), would never have a chance to do something because it would never be able to lock the entire whiteboard (simply because the line actor will always keep one square locked). This conflict is notable because it demonstrates that even realistic locking strategies can easily lead to impossible to solve concurrency conflicts. In this example the conflict arises from a subtle difference within the usage scenario involved.
This indicates that truly open systems, such as advertised by the mobile multi agent scene, might be impossible to build: a small change to a component or to one of the involved usage scenarios might transform a 'possible to mediate conflict' to an 'impossible to mediate conflict'. For open systems this means that probably the only way these systems can work is if everybody agrees on exactly the same specification. This specification should be exact (formal) with respect to what kind of usage scenarios and what kind of interfaces are allowed. This is clearly not realistic for open systems.
|
The statement 'open systems require an agreement between all partners to follow exactly the same specification' might sound contradictory with our initial motivations. However an exact specification does not prohibit from allowing certain degrees of freedom. An adaptor might shift the problem of being compatible with a very narrow interface description to being compatible with a wide range of allowed interfaces (within certain degrees of freedom). We explain this by elaborating on the different possibilities (all pictured in figure 13.2)
|
We initially started our investigation trying to foresee what the Internet would be in the future. We tried to foresee what kind of problems will arise. We concluded that conflicting concurrency strategies would become a large problem. In the approach we have presented we tackle this problem by assuming that every component will offer a concurrency strategy and that these will need to be mediated. In doing so, the problem of 'writing correct adaptors' is shifted to the problem of 'specifying the required/provided concurrency behavior precisely'. The result of this approach is that the developer does not have to implement the adaptors directly, but that he instead has the responsibility of writing a complete and consistent Petri-net. The developer can use such a Petri-net to write down a strategy specifically suited for the problem at hand. Instead of sticking to one of the standard concurrency approaches he can now easily specify which resources need to be locked simultaneously. Depending on the required usage scenario it might be possible to require (or provide) better suited interfaces for the needed functionality. For instance it is possible to write a client component that specifies that it wants to lock an entire line and its trail immediately. This is pictured in figure 13.3 and would make writing our line actor more easy. On top of this is the fact that this Petri-net exactly specifies what is needed and thus makes mediating conflicts involving this concurrency strategy even more easy.
|
If we follow this line of thought we end up with concurrency servers on which different components can publish the resources they have and the behavior they need. This might become even more interesting if, instead of supplying such a server with a Petri-net of the behavior, we could supply such a server with an actual program, that represents our original component. This program can express even more behavioral information than our Petri-net descriptions. In these cases mobility of components would be very useful. This is pictured in figure 13.4.
Below, we describe some applications or application domains where our concurrency adaptor, as presented here, is useful. We first give a number of examples where the adaptor is useful in situations where the interface not necessarily changes. The applications we have been looking for were constrained by a number of requirements
A computer collaborative supported work environment often contacts a central server that offers some kind of whiteboard (in the broad sense of the word). Users can join such a whiteboard and through it communicate with each other, while still following some rules. In this case, the client program offers a user interface to the end-user, while the central server stores and manages the data. In such a multi-user environment it is necessary to lock operations on the whiteboard. Depending on the kind of locking strategy required by the client an adaptor might be necessary. This is a case where locking is necessary, the resource can be described statically, the clients might require a different concurrency strategy and it can be considered as an event based system. Therefore, our adaptor is useful in this context.
Databases are another area where our work might be useful. Databases are often centrally placed servers that allow different clients to update records. To guarantee that this happens in a consistent way, a locking interface needs to be present. If such a database would, together with the schema's of the different tables, also supply a Petri-net that describes the offered concurrency strategies, then it might be easier for clients to access this data because they in turn can define a more suitable required concurrency strategy. This can make implementing clients for databases much easier.
|
We now present an example of conflicting interfaces within a domain other than concurrency interfaces: socket interfaces. Later on we will use this example as an illustration of how (part of) the techniques used in this dissertation can be used to implement adaptors in other domains.
When we were developing Borg, the biggest problem of creating a working version for all platforms was, contrary to what might be expected, not the user interface, but rather the socket libraries. We have developed Borg for PalmOS, Windows, Linux and Macintosh. Every operating system offers its own version of a socket library. This has posed major problems because not every socket library is as easy to use as any other socket library. Table 13.1 contains the most important differences between Linux and windows.13.1 Between the two presented API's some important differences exist:
CONFLICTING INTERFACES REPRESENT A major problem that has to be dealt with. To approach this problem we have investigated a smaller, but more accessible problem. The solution presented in this dissertation is not a general solution in the sense that it can be straightforwardly applied to other domains. Nevertheless, in identifying solutions to a subset of a smaller problem, we have tried to keep both the solution and the subset of problems as general as possible. This allows us to identify guidelines which can help shaping new solutions to other problems of conflicting interfaces. We will now discuss these guidelines and explain them by giving an example of how the previous identified problem of conflicting socket libraries might be solved. This example will be presented in another font.
The first thing that needs to be done is to explore the environment. The best way to get grip on a certain domain is to explore existing real world conflicts and investigate what exactly causes the conflicts. In this dissertation we have done this by identifying a set of concurrency conflicts (chapter 6). The process of exploring the environment should result in a set of variabilities (that need not to be strictly orthogonal). In a later stage the variabilities are used as input for defining the requirements and possible solutions.
Once this is done, we should check whether it is necessary to represent these conflicts in a scaled down model. This might be necessary to reduce the technical problems involved and/or to speed up the process of testing conflicts (the prototyping cycle). In this dissertation we have created our own component model instead of using standard CORBA services or other available techniques (chapter 2).
Important questions:
Once the environment is explored, it is time to look for a suitable reference frame. A reference frame is part of the environment and is common to all involved conflicts. The reference frame should be chosen very carefully because, later on, it should allow for a measurement of the correct working of an adaptor. In this dissertation, we have used as a reference frame the assumption that the 'functional link' between the involved components was compatible (chapter 7). If we would have chosen a reference frame such as 'every component can send a message', then we would not have been able to define what a good working adaptor was exactly. Therefore, when choosing a reference frame, it is important to find good answers for the following questions:
Once a) the goal of the adaptor and b) a usable reference frame have been identified, it becomes necessary to explicitly state the requirements of our adaptor. Before we can state the requirements more formally, it is often necessary to introduce extra information. By analyzing the variabilities and the problems of expressing a requirement, it becomes clear what kind of extra information is needed. In a later stage we will need this information to create the requirements for the adaptor.
Some standard questions for every variability:
Once the necessary, but missing, extra information is identified, we should be able to state the requirements explicitly. To do this we can again rely on the variabilities we have identified earlier. Which of these variabilities does the adaptor need to understand ? Can we define what a good working adaptor is for such a variability based on the reference frame ? In this case it might be necessary to re-investigate the reference frame, however we consider this to be part of the prototyping cycle. When defining the requirements it is important to distinguish between different kinds of requirements. These are:
The missing information must come from somewhere. Either the user of the adaptor supplies it, or the implementors of the different interfaces have to supply it. Essentially where it comes from does not matter, as long as it is there. To motivate users to supply the missing information, it might be helpful to make it useful. In this dissertation we have done this with our Petri-nets, they not only help in creating an adaptor, but they also help in detecting all kinds of interesting features of an interface/component (see chapter 3).
Secondly, the process of writing down the missing information should itself not be more difficult than the process of writing an adaptor. Therefore, it is necessary to carefully create a language that is able to capture all essential information in such a way that it is intuitive and expressive enough. However, here it should be avoided to introduce a Turing-complete language because then a number of the requirements might suddenly become impossible to verify. So, keep an eye on how the requirements can be validated afterward.
Questions with respect to the missing information are:
Once the missing information is present and the requirements are known, it is time to look for solutions. This involves finding out which requirements can result in a compilation of an adaptor, and which requirements require the insertion of a general algorithm. In both cases it is important to decide what kind of internal representation will be used. In this dissertation the internal representation of the liveness-module and the enforce-action module was similar to the Petri-net representation, however, the representation of the concurrency module was slightly different and more tuned toward its efficient working.
When using learning algorithms, make sure to use an automatic process to verify whether the representation is good. In the end, a computer needs to use the representation to (re)act correctly to certain situations. Therefore it is important to have a representation that quickly leads to a correct behavior. In other words: the bias of the representation should be measured and should be optimal. In this dissertation we have verified our Petri-net representation of the liveness-module by means of a genetic algorithm. E.g, the representation of the liveness module.
Once solutions have been identified to match the different requirements investigate how these solutions can be modularized. Afterward investigate what kind of topology is necessary to make the entire adaptor work. Important issues:
The evaluation of these socket nets should be able to handle the notion of threads: creating ones when necessary, managing incoming threads and destroying others. This can be done by using the 'session' knowledge present within the different tokens.
The above guidelines of course should be embedded in a typical prototype/experiment cycle. With these guidelines it should be possible to create adaptors for domains other than concurrency conflicts in open client server architectures. Essentially , any place where one interface is fitted with multiple implementations is a candidate for automatic adaptation.
An area that is not investigated in this dissertation are meta-conflicts. These are conflicts that occur between the different interface-descriptions. We have already touched this issue in section 12.7.1. However, before the problem of meta-conflicts can be investigated, a large number of adaptors based on extra formal documentation needs to be available.
In our solution to the liveness problem we assumed that the programmer can easily specify checkpoints in the source code. However, if this is not possible, test-scenarios can be used. A test scenario specifies which traces are considered to be 'good'. Based on this information, rewards can be obtained. Illustration 13.6 shows two test-scenarios. Scenario 1 shows that the client will send out a Lock and expects a LockTrue in response. Afterward an Act will be sent out and an ActDone is expected to return. Finally an Unlock is sent out and an UnlockDone should return. The second scenario illustrates more or less the same, the only difference is that now two Act's are sent out. The second scenario can be used to teach a learning algorithm that after the first act message multiple other act messages might be sent out. Given these two test-scenarios one can easily implement a tracker in the Petri-net that will assign rewards when appropriate.
If we go even a step further, it might be possible to convert a number of test-scenarios automatically to the Petri-net description of a component [EK98]. However such a process heavily relies on a correct generalization. Given the two scenarios in figure 13.6, one might assume that the component will also work for 3 or more acts at the same time. However, nothing ensures this is the case. Depending on the algorithm another generalization might occur.
Instead of using Petri-nets and separate checkpoints, the developer of the Petri-nets could also tag arcs with a priority, which would resemble the rewards offered by the different components.
We have made the assumption that every future reward is statistically correlated to the current state of the Petri-net involved. This was necessary to demonstrate that the mapping of our liveness problem to a reinforcement learning problem was well defined. However, if we have made the assumption of a reward/marking correlation, we only did this from a practical point of view: for the programmer it is easy to specify checkpoints. In the conducted experiments we have observed that in general this did not form any problem, however a large scale investigation of this property would be very useful, especially to learn the boundaries of our approach.
We have already seen that if the reward is not entirely dependent on the locking strategy some oscillation in the values will occur, on the other hand if the reward is used to express a hidden property of the locking strategy it will not be able to learn it.
In this dissertation we did not fully investigate how peer to peer concurrency could be managed. All the examples we have presented work with only one server. As explained in section 12.7.2 there are some major problems involved with peer to peer concurrency. The biggest problem of such an environment is that not all communication can be mediated and that different components might communicate indirectly with other components. To solve this problem adaptors themselves should be able to coordinate their behavior and learn how to behave to support an implicit present overall behavior. Theoretically, game theory might be applicable.
Game theory [Nas50b,Nas50a,Nas51,Nas53] requires agents that choose a discrete action. Depending on the action they choose they can either win or lose. The outcome of a game not only depends on the action chosen by one agent but on the interaction between all the chosen actions (this is typically the situation in peer to peer networks). A game is typically characterized by 5 elements:
As explained in section 12.3, all investigated concurrency interfaces behave as simple finite state machines. Therefore it might be possible to deduce the behavior of an interface in an automatic way. For instance, a learning algorithm could, given an API of a component's interface, try out which actions are possible at which time and construct a Petri-net description of the interface automatically.
To do so a program , that is supposed to learn a Petri-net description of component , which offers only a syntactical interface description , could follow different strategies:
This dissertation focuses on the generation of intelligent adaptors. We made the assumption that we have the concurrency strategies specified as a Petri-net. From a pragmatic point of view, this is very useful because it offers the developer all kinds of interesting information. However, the essential reason why we need Petri-nets is to introduce determinism and avoid that the liveness module could bring the client components in an invalid state by sending out a wrong message. E.g., if the client component does not know that before the first setPosition, a joinActor needs to be sent, the client component might fail to work properly.
In the enforce-action module we also need a similar deterministic property. This module needs to know how it can bring the server component in a certain state. Here we assume that the server is well written and that every possible synchronization action that can be invoked upon the server can easily be undone. Of course, to avoid denial of service attacks and to increase its robustness, every good server offers this kind of functionality. However, in general, the concurrency module needs be sure that it can deduce deterministically what to do and that it will always be able to reach a certain state.
Within the concurrency module a similar deterministic problem occurs. However this problem cannot be solved easily. The concurrency module resolves race conditions. However, it is unable to resolve deadlocks when it detects them. The reason is that it is unable to anticipate a deadlock, because it has no knowledge of it, and that, once it has detected a deadlock it is unable to undo previous actions on the involved components.
From these three modules we see how required determinism poses problems. During this dissertation we have tried to generate adaptors that work without making mistakes that cannot be revised afterward. The reason why we imposed such requirement is that all real-world computation happens in a controlled non-deterministic (hence deterministic) way. If two components communicate, there are exactly two components and the functionality offered by those two components is exactly understood. If an adaptor wants to mediate the differences between those components it should be deterministically correct, otherwise it would certainly bring disaster upon one of both (if not both) components. This deterministic view on computation is an illusion, for in open distributed there are many opportunities when a component might fail; all too often the reasons lie within uncontrolled upgrades or unreliable infrastructure. Nevertheless developers still hold on to an illusion of determinism and construct programs that behave completely deterministically.
In this dissertation we made exactly the same assumption. However, one can also approach the problem from the opposite side. One might be tempted to embrace the non-determinism found in open distributed systems and investigate how different but similar working components might emerge behavior that satisfies certain requirements.
By exploiting non-determinism one might create programs, which act creatively and intelligently, because they would be truly able to find and apply clues in the environment which a deterministic process cannot see. This kind of research however is still barely started and certainly not available in the field of 'services' offered on the Internet. Some pointers might give a lead such as swarm intelligence [KRCE01], amorphous computing [HAAC+00] and cellular automata [TM87].