IN THIS CHAPTER we validate our approach by observing the experimental results. First, we briefly summarize our approach and the results. Second, we make some important observations based on the experiments. Third, we present some technical limitations of the presented work.
WE WILL NOW BRIEFLY summarize our approach and the experimental results. In our approach we created a general concurrency adaptor that mediates the differences between conflicting concurrency strategies in a setting with one server and multiple clients. As stated in the introduction (section 1.3.5), the adaptor we wanted to create should mediate concurrency conflicts in such a way that no conflict arises. We have defined a conflict as a situation in which the mutually agreed behavior between the communicating partners cannot be executed over the interconnection between them. In our case study, the whiteboard, the mutually agreed behavior between the different actors was not to cross each other's boundaries.
The adaptor itself contains three modules. The first is the enforce-action module, which bypasses the concurrency behavior of a server component; it makes use of a Prolog program. The second is the liveness module: for every client component present, a liveness module is created. Such a liveness module bypasses the required concurrency strategy in such a way that the client component can continue with its core functionality. The liveness module has been implemented by means of a reinforcement learning algorithm. Experimentally we observe that this approach works well if the client component returns correct rewards (that is, rewards that are statistically correlated to the state of the involved Petri-net). The third is the concurrency module, which ensures that no race conditions occur in the message order between the participating clients and the server. This module has been explained in chapter 10.
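The cooperation of the three modules can be pictured as a message pipeline from each client, through its liveness module and the shared concurrency module, to the enforce-action module in front of the server. The sketch below is purely illustrative: all class names, method names and the trivial pass-through behavior are assumptions for the sake of the picture, not the actual implementation.

```python
# Illustrative sketch of the adaptor pipeline; names and behavior are
# invented for this example, not taken from the real implementation.

class EnforceActionModule:
    """Sits in front of the server (Prolog-based in the real system)."""
    def __init__(self, server):
        self.server = server
    def forward(self, message):
        # The real module runs a reachability analysis to enable the
        # requested action; here we simply pass the message on.
        return self.server(message)

class ConcurrencyModule:
    """Shared by all clients; fixes the order in which requests proceed."""
    def __init__(self, enforce):
        self.enforce = enforce
        self.log = []              # order in which requests were granted
    def request(self, client_id, message):
        self.log.append(client_id)
        return self.enforce.forward(message)

class LivenessModule:
    """One instance per client component."""
    def __init__(self, client_id, concurrency):
        self.client_id = client_id
        self.concurrency = concurrency
    def send(self, message):
        return self.concurrency.request(self.client_id, message)

server = lambda msg: ("ok", msg)
enforce = EnforceActionModule(server)
concurrency = ConcurrencyModule(enforce)
alice = LivenessModule("alice", concurrency)
bob = LivenessModule("bob", concurrency)

print(alice.send("draw"))
print(concurrency.log)
```

The point of the sketch is only the wiring: every client message crosses three module boundaries before reaching the server, which matters later when counting messages.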
IN THE TABLE above we summarize the experimental results. We will now observe some of the behavior of the 22 adaptors.
AN OBSERVATION WE CAN MAKE about the current concurrency strategies is that they are essentially very simple. Even the rollback-able concurrency strategies, which are not necessarily easy to implement, are often nothing more than finite state machines. This statement is based on two observations. First, our reachability analysis can deduce very quickly how to bring a server component into a certain state. Second, our liveness module learns very quickly what to do in a certain situation. This is to be expected to a certain extent, because we have carefully created the representation of our situation-recognition system. However, this representation itself is also nothing more than a finite state machine.
If we think about interfaces (not only concurrency interfaces), it is not commonplace to hide features and functionalities within the API. On the contrary, most interfaces are designed to make it easy to access all the necessary operations and modify the state of an object/component.
From these two observations one might think that it would have been more appropriate to use finite state machines to describe concurrency strategies. This is not the case, because the concurrency strategies themselves often contain parallel resources that act independently, which is very difficult to express with standard finite state machines. In section 3.7 we have given two examples that demonstrate the need for and advantages of using Petri-nets.
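The difficulty of expressing independent parallel resources in a finite state machine can be made concrete with a small sketch. The minimal Petri-net below (the class and the two-resource example are invented for illustration) keeps one place per resource; locking one resource leaves the other enabled. An equivalent standard finite state machine would need one state per *combination* of resource states, so the state count multiplies with every independent resource.

```python
# Minimal Petri-net sketch (illustrative, not the dissertation's
# implementation): places hold token counts; a transition is enabled when
# all its input places hold at least one token, and firing moves tokens
# from its input places to its output places.

class PetriNet:
    def __init__(self, marking):
        self.marking = dict(marking)      # place -> token count
        self.transitions = {}             # name -> (inputs, outputs)
    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)
    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= 1 for p in inputs)
    def fire(self, name):
        assert self.enabled(name)
        inputs, outputs = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] = self.marking.get(p, 0) + 1

# Two independent resources in one net: locking A does not affect B.
net = PetriNet({"A_free": 1, "B_free": 1})
net.add_transition("lock_A", ["A_free"], ["A_held"])
net.add_transition("lock_B", ["B_free"], ["B_held"])
net.fire("lock_A")
# B is still available; a standard FSM would need a state for every
# combination (A free/held x B free/held) to express the same behavior.
print(net.enabled("lock_B"))
```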
IN THIS SECTION WE DISCUSS some technical limitations and observations we can make, based on the experiments we have performed.
A limitation of our work is the problem of late joiners. We did not investigate what happens when a new component joins a set of already communicating components; currently, this situation is not allowed. However, we feel this will not constitute an obstacle: a keep-alive module can be created for every joining component and the concurrency strategy can be extended to allow new components to join.
During our research we implemented a new component system. We conclude from our experiments that this component system offers a lot of advantages, especially with respect to gluing components together. Its high-level reified message representation, together with its disciplined non-blocking messaging system, allows for easy adaptor creation. This is an advantage in comparison with object-oriented languages, where the problem of gluing together different libraries and programs is well known.
During the experiments we have used an event-based component system. This allowed us to write adaptors easily and is a model representation of an open distributed system. However, open distributed systems are not the only area where conflicting interfaces occur, and not all applications are written using an event-based approach. If we look at applications that make use of object-oriented frameworks, we encounter similar problems, because not all upgrades of parts of an application are entirely backward compatible [LSMD96]. A small alteration in the provided behavior or a wrong estimate of the usage scenarios can result in a complete application breakdown. This also happens for applications that dynamically link to external libraries (DLLs offering extra functionality, or plugins). Typical for this kind of object-oriented application is its seemingly 'sequential' behavior across object boundaries. In fact, such applications are often written within a thread-based context and rely heavily on a shared memory (in the broadest sense of the word). However, the use of a shared memory as a communication medium between threads (or between different parts of an application) makes it very difficult to isolate the communicating partners and modify the flow of information between them. Since it is necessary to modify the information flow to resolve any conflicts, it might be very difficult to use an approach similar to the one we have presented. Appendix 13.8.6 contains an extensive explanation of the difficulties of using such a thread-based model.
Another observation that can be made is that a randomized testing procedure (such as the one implemented by different learning algorithms) is one of the best guarantees for writing well-tested and robust software. Indeed, our experiments took longer than expected because our learning algorithm kept creating unanticipated situations. This happened in more than one area, as discussed below.
THE USE OF PETRI-NETS AS AN INTERFACE DESCRIPTION LANGUAGE allows for a better software development process. Not only is the code documented in a readable and formal way, the formal properties of the documentation also allow for automatic testing of the underlying component. Given the Petri-net and a correct component, we can then automatically verify that the component behaves as its documentation describes.
ALTHOUGH THIS DISSERTATION MAKES NO claim about performance, we will briefly discuss the performance of the entire adaptor. To do so we will discuss three separate estimates: for the entire adaptor we will investigate the time requirements, the memory requirements and the number of messages sent.
Giving a performance estimate of the liveness module is difficult because the performance of this module largely depends on how well trained it is. Initially it might need a large number of messages and rewards to understand which behavior exactly is required by the client component. However, once a correct behavior is installed it will behave more or less constantly, depending on how much exploration is allowed. Here, we will only consider the case where the liveness module exploits correctly learned information.
To give an estimate of the time spent within the liveness module, we observe that the liveness module needs to a) find out which transitions are enabled and b) select from the enabled transitions an appropriate one by means of Q-learning. Step a) can be done in constant time by using the locality property of Petri-nets. Step b) can also be done in constant time, because the number of active 'added' transitions is kept constant. During execution of the liveness module, non-fit transitions are removed and new transitions are added. Removing a transition is a constant-time operation, and so is adding a new transition. This results in a constant time estimate for the liveness module.

The memory requirements of the liveness module are the memory requirements of the Petri-net describing the state of the client component plus those of the newly added transitions. Because these newly added transitions are bounded by a certain maximum, we consider them to be a constant. The size of the Petri-net is bounded by the size of the client's interface description. Hence the memory estimate of the liveness module is linear in the size of the client's Petri-net.

Later on, when giving a performance estimate of the entire adaptor, it should be taken into account that the concurrency adaptor contains multiple liveness modules. Therefore we will index the liveness measures per client component: one time estimate and one memory estimate for every client.
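The Q-learning selection performed by the liveness module can be sketched as follows. The state encoding, reward values and learning parameters below are illustrative assumptions, not those of the actual module; the sketch only shows the mechanism of restricting the epsilon-greedy choice to the currently enabled transitions.

```python
import random

# Toy sketch of Q-learning over the enabled transitions. All names,
# rewards and parameters are invented for illustration.

ALPHA, GAMMA = 0.5, 0.9

def choose(q, state, enabled, epsilon=0.0, rng=random):
    """Epsilon-greedy choice restricted to the enabled transitions."""
    if epsilon and rng.random() < epsilon:
        return rng.choice(enabled)            # exploration
    return max(enabled, key=lambda t: q.get((state, t), 0.0))

def update(q, state, transition, reward, next_state, next_enabled):
    """Standard one-step Q-learning update."""
    best_next = max((q.get((next_state, t), 0.0) for t in next_enabled),
                    default=0.0)
    old = q.get((state, transition), 0.0)
    q[(state, transition)] = old + ALPHA * (reward + GAMMA * best_next - old)

q = {}
# Reward the 'lock' transition in state 'idle' a few times, punish 'enter'...
for _ in range(10):
    update(q, "idle", "lock", 1.0, "locked", ["act", "unlock"])
    update(q, "idle", "enter", -1.0, "waiting", ["lock"])
# ...after which exploitation (epsilon = 0) selects 'lock'.
print(choose(q, "idle", ["lock", "enter"]))
```

Because the number of enabled 'added' transitions stays bounded, both the `choose` and the `update` step touch only a constant number of table entries per message.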
Giving a performance estimate of the enforce-action module is even more difficult. We currently rely on a Prolog implementation to decide how to enable a certain action, which makes it difficult to give correct estimates. Potentially, both the memory estimate and the time estimate are exponential in the necessary search depth. So the worst case time estimate is exponential in the search depth. The worst case memory estimate is exponential in the search depth plus the size of the server-side Petri-net, because we also need to take into account the memory requirements of the server-side Petri-net. However, as observed during the experiments, the search depth is always very small and most concurrency strategies offer the possibility to reach a certain state quickly. Therefore we will consider a practical time estimate that is constant and, similarly, a practical memory estimate that is linear in the size of the server-side Petri-net.
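What the enforce-action module asks of its Prolog program is essentially a reachability question: which firing sequence brings the server's Petri-net into a marking where the target transition is enabled? The Python sketch below (the net encoding and all names are assumptions for illustration, not the actual Prolog code) shows why the worst case is exponential in the search depth while small depths stay cheap.

```python
from collections import deque

# Sketch of the reachability question behind the enforce-action module:
# breadth-first search over markings for a firing sequence that enables
# a target transition. Encoding and names are invented for illustration.

def search(transitions, marking, target, max_depth=10):
    """transitions: name -> (input places, output places); marking: place -> tokens."""
    def enabled(m, name):
        return all(m.get(p, 0) >= 1 for p in transitions[name][0])
    def fire(m, name):
        inputs, outputs = transitions[name]
        m = dict(m)
        for p in inputs:
            m[p] -= 1
        for p in outputs:
            m[p] = m.get(p, 0) + 1
        return m
    queue, seen = deque([(marking, [])]), set()
    while queue:
        m, path = queue.popleft()
        if enabled(m, target):
            return path                 # firing sequence enabling the target
        if len(path) >= max_depth:
            continue                    # depth bound: worst case grows
        key = frozenset(m.items())      # exponentially with this depth
        if key in seen:
            continue
        seen.add(key)
        for name in transitions:
            if name != target and enabled(m, name):
                queue.append((fire(m, name), path + [name]))
    return None

# To enable 'act' we must first fire 'lock': a search of depth one.
net = {"lock": (["free"], ["held"]), "act": (["held"], ["held"])}
print(search(net, {"free": 1}, "act"))
```

In the experiments the required depth stayed very small, which is why the practical estimate is far below this worst case.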
The concurrency module needs to keep track of all the enabled transitions for every client component indirectly connected to it. For every client, a set of controlled transitions and managed transitions is kept in memory; the Petri-net of the server is not needed within this module. This gives a memory estimate that is linear in the number of clients and their transitions. The time spent within the concurrency module is constant: for every incoming request a cross-table is checked and a decision is taken immediately.
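The constant-time behavior of the concurrency module comes down to a single table lookup per request. The sketch below is a deliberately simplified illustration; the table contents, the verdict names and the `decide` helper are assumptions, not the module's actual data structures.

```python
# Sketch of the cross-table check: one dictionary lookup per incoming
# request, hence constant time. Contents are invented for illustration.

cross_table = {
    ("alice", "lock"):   "grant",
    ("alice", "unlock"): "grant",
    ("bob",   "lock"):   "defer",   # alice currently holds the resource
}

def decide(client, transition):
    # A single O(1) lookup decides the fate of the request immediately.
    return cross_table.get((client, transition), "deny")

print(decide("alice", "lock"))
print(decide("bob", "lock"))
```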
To determine how many messages are sent between the client components and the server component we consider two kinds of messages: functional messages and synchronization messages. Every functional message going from a client toward the server passes through several connections: from the client to its liveness module, from the liveness module to the concurrency module, and from the concurrency module through the enforce-action module to the server. Every hop constitutes an extra message send.
The behavior of synchronization messages is not as simple as that of functional messages. The problem comes from the fact that synchronization messages are not simply passed through from module to module: they are a) altered to represent the possible choices between the liveness module and the concurrency module, and b) not present between the concurrency module and the enforce-action module. However, this does not prohibit us from making a global observation about the synchronization communication. We know that the synchronization messages introduced by the adaptor are minimal to support all components: the enforce-action module will only send out the necessary synchronization messages, while the liveness module will have learned which synchronization messages are favored by the client. From this observation we know that the adaptor will not introduce useless synchronization messages. This already allows us to estimate that the number of synchronization messages is proportional to the number of synchronization messages required by the components themselves.
However, to estimate how many extra messages might be added, one can trace a synchronization request from client to server through the modules of the adaptor. From the results for the functional messages and the synchronization messages we can conclude that the adaptor multiplies the number of messages by a small constant factor: our current implementation requires roughly 3 times more messages than are posted by the client and server components together.
From the previous performance estimates we can now summarize the worst case performance: the time per message is constant in every module except the enforce-action module, whose worst case is exponential in the search depth, while the memory requirements are dominated by the sizes of the involved Petri-nets. If we consider, as explained before, that the time to verify a transition's state (enabled vs. disabled) is constant, and we assume that the Prolog implementation is able to deduce a correct path in constant time, then we can give an expected estimate.
This means that we expect that the adaptor needs as much memory as required by the different states of all involved components. Also, the time necessary to handle one message is constant.
IN THIS SECTION WE WILL DISCUSS our initial motivations and compare them with the solution we have presented. Our main focus is to investigate how our prototype could be made to work within the context of open distributed systems, multi-agent systems and embedded systems.
Our adaptor as it is now does not run automatically in open distributed systems. Below we discuss two remaining obstacles: adaptor deployment (who decides to run an adaptor) and the problem of meta-conflicts (conflicts between the involved Petri-net descriptions).
A very simple fact about our adaptor is that, before it can be used, somebody has to insert it somewhere. We will now discuss the implications of this.
Typically, the client is the first one to observe any conflicts with the server. This means that the client will be the first to want to introduce an adaptor, which is not necessarily easy.
One of the motivations behind this research was the fact that standards in open distributed systems are sometimes merely used as a means to stay in control of how software is used, instead of being a means to communicate with other software. If the intention of a company is to be incompatible in order to force partners to make certain decisions, then nothing will prohibit such a company from doing so, not even Petri-nets. However, this behavior was only used as a motivating example and not as the problem we wanted to solve.
On the other hand, companies that want to be compatible with other software will find that our proposal of using Petri-nets offers a big advantage that renders their usage worthwhile. The strength of the documentation format comes from automatically checking completeness and consistency: completeness meaning that all messages a component understands and uses are described within the Petri-net; consistency meaning that the documentation is compatible with the behavior of the component. Both properties can be checked automatically. In other words, Petri-nets help the developer in the software development process by offering better and more accurate documentation.
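The completeness check in particular reduces to a simple set comparison between the messages a component handles and the transition labels of its Petri-net description. The sketch below illustrates this; the message sets and function name are invented for illustration, and a real check would of course extract both sets automatically from the component and its description.

```python
# Sketch of the completeness check: every message a component understands
# or uses must appear as a transition label in its Petri-net description.
# The sets below are invented example data.

def completeness_gaps(component_messages, petri_net_labels):
    """Return the messages missing from the interface description."""
    return sorted(set(component_messages) - set(petri_net_labels))

component_messages = {"join", "lock", "draw", "unlock"}
petri_net_labels = {"join", "lock", "unlock"}

print(completeness_gaps(component_messages, petri_net_labels))
```

An empty result means the description is complete with respect to the component's message vocabulary; anything else points the developer directly at the undocumented messages.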
In section 1.1.1 we argued that defining one standard in open distributed systems is unrealistic, because competitors will always try to adopt the standard and unexpected incompatibilities will arise. This problem could also occur between the different formal interface descriptions. Such conflicts are meta-conflicts. The problem of meta-conflicts is wide-ranging: from slight discrepancies between the Petri-net descriptions to conflicts with entirely new, possibly better, formal descriptions.
If the formal interface descriptions are entirely compatible, then it might be possible to create a super interface description that can express all details of the different formal descriptions. The only thing that needs to be done then is to implement converters from the different interface description languages to this super interface description language. However, it could also be that information required in one interface description language is simply not present in another, which would complicate this method. Before this kind of meta-conflict can be investigated, however, a number of realistic conflicts between interface description languages need to be studied; only then can this problem be approached pragmatically. This remains future work.
If there are only slight differences between the Petri-net formats used, or between the terms used within the Petri-nets, then other approaches can help.
A second motivation for doing this work was the problem of concurrency management in mobile multi-agent systems. Because communication happens in a peer-to-peer manner between agents, we now discuss how our solution relates to the problems of peer-to-peer systems. First, we discuss how our adaptor preserves the required behavior if no central locking server is used. Second, we show how our approach fails to offer a solution if a central locking server is used to coordinate the behavior of a group.
In this section we illustrate how our adaptor correctly mediates concurrency problems in a peer-to-peer application that makes no use of a central coordination server.
In a mobile multi-agent system there is at first sight no real difference between client components and server components. However, if we look at the behavior they require or provide (in other words, which agents implement a certain functionality), then we can distinguish between a client component and a server component: a client agent expects a certain functionality from another agent, while a server agent provides a certain functionality. Contrary to a typical client-server setting, however, every agent can be a server as well as a client. To support such a model easily we will assume that the server behavior of an agent is offered on one port, while the client behavior is offered on another port. This allows us to introduce adaptors on every agent (peer) that offers a service. The resulting interconnections are pictured in figure 12.2.
Because every agent can be a client as well as a server, it becomes difficult to identify a 'session'. In a typical client-server architecture, the client manages a session and tries to get things done on the server. Within a peer-to-peer system, sessions hop from one agent to another and can no longer be localized in one agent. This might lead to deadlocks between communicating peers, without the possibility for the adaptors to do something about it, as illustrated in figure 12.3. The adaptors cannot resolve such a deadlock because they are part of the waiting loop. However, this problem has little to do with the ability of the adaptor to 'mediate' the conflicting behavior: if the application is programmed in such a way that it relies on concurrency strategies that lead to unwanted behavior, then this unwanted behavior will also occur with adaptors; otherwise it will not.
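The waiting loop of figure 12.3 can be made concrete with a small sketch: each peer is blocked waiting on the next one, and following the wait-for edges leads back to the starting peer. The peer names and the helper function below are invented for illustration.

```python
# Sketch of the waiting loop behind figure 12.3: each peer waits for a
# lock the next one holds, and the adaptors sit inside the cycle.
# Names are invented for illustration.

def find_waiting_loop(waits_for, start):
    """Follow wait-for edges from `start`; return the loop if one is found."""
    path, node = [], start
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    return path[path.index(node):] if node in path else None

# Three peers, each blocked on a lock held by the next one.
waits_for = {"peer1": "peer2", "peer2": "peer3", "peer3": "peer1"}
print(find_waiting_loop(waits_for, "peer1"))
```

An external observer can detect such a loop, but an adaptor that is itself one of the waiting parties cannot break it without violating the behavior the application relies on.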
In a peer-to-peer application there can also be situations where our adaptor fails to work. The example we present to illustrate this uses a central locking server that all components use to identify sessions and control access to the application's resources.
In peer-to-peer systems it is often necessary to bring multiple peers from one consistent state to another. This requires from every peer the ability to obtain a number of locks, distributed over different peers. Once all the locks are held, the session can apply the changes on the involved resources and unlock them again. Often this is done by passing session ids from one peer to another, such that receiving peers can verify whether the resources they hold can be accessed by the incoming session. However, the session ids themselves are often provided by components other than the receiving peer itself. For instance, consider a transaction server such as pictured in figure 12.4. The transaction server will be used by any peer that needs a transaction id (this will be our session id). This transaction id will then be passed from peer to peer (without going through the transaction component).
In this setup, multiple concurrency strategies can be present: first, the concurrency strategies offered by one peer toward another peer; second, the concurrency strategies offered by one peer toward the transaction server; and third, the concurrency strategy offered by the transaction server toward the peers. We will assume that we have mediated the conflicts by placing a concurrency adaptor at every server interface. This might lead to problems.
One of the motivations behind this dissertation was the occurrence of conflicts between connected embedded systems. The rise of smaller and mobile embedded devices increases the chance of encountering interface conflicts. In this section we investigate how suitable our solution would be for this domain. First, we must note that the biggest difference between desktop software and embedded software lies in the constraints put on the latter: among others, there can be memory constraints, speed constraints, bandwidth/latency constraints and real-time constraints. Second, aside from these constraints, there is also an entire process involved in creating embedded systems. Typically, first a prototype is developed, which is, as the project continues, scaled down into a much smaller artifact. During the entire process, quality control is of utmost importance because, especially for consumer devices, it is not always possible to fix a bug once the device has been sold [SEE99]. From a high-level point of view this provides an opportunity for our adaptor, because it decreases the likelihood of bugs introduced by concurrency strategy conflicts. However, as embedded systems are developed pragmatically, we now discuss how the requirements of our adaptor could be reduced in such a way that it fits within certain constraints.
Very often embedded systems are limited in their memory. There are two approaches that might help in reducing the memory requirements of the adaptor.
Certain embedded systems have low-bandwidth constraints, which means that they can only send and/or receive a small number of bytes per second. As stated in section 12.6.4, our current implementation of the concurrency adaptor requires 3 times more messages than are posted by the client and server components together. This could weigh heavily in a low-bandwidth environment. Therefore we now investigate how we could reduce the bandwidth used by the adaptor, either by decreasing the number of extra message sends between the modules or by making the penalty of sending them less severe.
There are two distinct approaches one can use to make the adaptor more efficient with respect to speed.
Real time constraints often require a much better quality control process [SEE99]. This process should be extended to our adaptor as well.