

2. Event Based Models for Distributed Systems

SINCE THE GOAL OF THIS DISSERTATION is to write an intelligent adaptor between conflicting interfaces, we need to determine what is required to write such an adaptor. The first requirement is the ability to intercept all communication to and from a component, effectively isolating it from its environment. If we cannot intercept all of a component's communication with the outside world, it may be impossible to adapt its behavior. Because open distributed systems are closely tied to the underlying technology, writing components and adaptors is closely tied to it as well. We therefore need to choose a technology that enables us to write adaptors easily.

Figure 2.1: This picture shows how difficult it can be to isolate a component from its environment when all kinds of different communication technologies are used. The red blocks show where we need to intercept a connection.
\includegraphics[%
width=0.70\textwidth]{FluffyComponents.eps}

Figure 2.2: This picture shows how one can isolate a component from its environment when it is loosely coupled with its environment. The only links with other components are by means of a (socket) connection.
\includegraphics[%
width=0.70\columnwidth]{ModularComponents.eps}

Depending on the technology used, components may share state or communicate with each other through sockets, RMI calls, tuple spaces or shared disks (see figure 2.1). In contrast, we can have a model in which components communicate only over a single link (see figure 2.2). Writing an adaptor for the first kind of technology clearly requires a serious amount of code to intercept all behavior and modify it as necessary, while the second example only requires us to intercept one or two sockets. How we can intercept such a connection, and how we can understand what is sent over it, is investigated in this chapter. We will use the SEESCOA component model [SEE99], which is implemented as an event based system that allows us to place adaptors on connections easily. This chapter discusses this model and relates it to open distributed systems. We will talk about the history of the model, introduce the basic concepts, and explain how services are found, how connections are set up, how communication takes place, how sessions are managed, how the system behaves with respect to concurrency, and finally how one can write adaptors with this model.

2.1 History

THIS CHAPTER USES IDEAS from two event based systems. The first is the mobile multi agent system Borg, the second is the SEESCOA component model.

2.1.1 Borg

Borg [BFDV01] was developed from 1997 to 2002 by the author of this dissertation. The original goal of Borg was to provide a platform that can run Borg components on all computers running the Borg virtual machine, providing strong migration and location transparent routing. The system itself is an extension of the Pico [D'H95] virtual machine. Pico is accessible via an extremely simple language, yet its expressiveness is very high, comparable to e.g. Scheme [SJ75]. Pico's semantics are defined by a set of 9 evaluation functions that are supported by a storage model and a computational model. The storage model features full storage management and reclamation; the computation model is based on a pushdown automaton that manages expressions and continuations on a double stack. The Borg virtual machine is written in C and in Borg itself. The user interfaces accompanying the virtual machine run on all kinds of platforms: for Linux users there are a KDE 1.1.2 interface and a GNU readline based command line interface; for Macintosh users there is a legacy MacOS 9 version; for Windows there is a user interface based on the Cygwin libraries; and finally there is a version running on the Palm Pilot (PalmOS 3.5).

2.1.2 SEESCOA

The second event based system is the component system made for the SEESCOA [SEE99] project. SEESCOA is a project funded by the IWT and 6 industrial partners. The project itself is a cooperation between the University of Gent (UGent), Katholieke Universiteit Leuven (KUL), Limburgs Universitair Centrum (LUC) and the Vrije Universiteit Brussel (VUB). The industrial partners are Philips, Agfa-Gevaert, Alcatel, Barco, Imec and Siemens. SEESCOA stands for Software Engineering for Embedded Systems using a Component Oriented Approach. The component model used in the SEESCOA project was developed by the same team that developed the Borg virtual machine. This component model was written entirely from scratch, using the experience gained from Borg. This allowed us to introduce some necessary semantics that were difficult to capture in Borg due to implementation issues. The SEESCOA component model is written entirely in Java and focuses on re-usability by means of pluggable adaptors. The fact that the project aims at embedded systems does not weaken the component model: embedded systems increasingly need a higher degree of connectivity, and as such an embedded system itself becomes a distributed system as well. As said above, the remainder of this chapter focuses on the SEESCOA component model.

2.2 The Model

THE SEESCOA COMPONENT MODEL is largely based on the Borg mobile multi agent system. The model itself contains four important concepts, which we explain below: components, ports, connections and messages.

Figure 2.3: The component model as used in this dissertation: the left side of the picture is one host, the right side is another host. The bottom half of the picture shows that we run a full component system on every participating host. The upper half are the components, as executed by the component system. Every component can have ports, ports can be connected with each other.
\includegraphics[%
width=1.0\textwidth]{ComponentSystemOverview.eps}

The above model takes ideas from Actor systems [AMST97], the π-calculus [Mil99] and ROOM [SGP94]. The most important difference with Actor systems is that actors are connectionless, while we do have the concept of connections, which will become very useful, as explained below.


2.3 Naming & Finding Services

AN IMPORTANT PROPERTY of open distributed systems is their highly dynamic nature. In contrast with standard object oriented technology, where a linker glues together all objects of an application before it is started, here the programmer has to set up the links explicitly. This is usually done by specifying what kinds of components one is looking for, finding compliant components, and connecting to these components. This raises two subproblems: first, how can one component reference another component, so that they can make contact and communicate; and second, how can a component find out which other components offer a certain service.

Before we can contact a component, we need the ability to reference it, just as identifiers in an object oriented language are used to refer to and contact objects at runtime. However, we now have to take into account that multiple applications share one global data space. This means that we should be able to refer to components by globally unique identifiers. Such an identifier should have the same meaning for every component in the system. In other words, we need an identifier that we can use to send messages to the correct place in the network. The most basic idea would be to use the IP address and port number of the machine hosting our component, together with a local component reference. However, this requires the programmer to write down and hard-code frequently changing external information: the unique identifier of another component. Clearly, to ease development, the system should abstract away from such frequently changing information. Therefore we use local ports, which can be bound at runtime by the system. We call addressing components through these local ports implicit addressing.

Setting up the link between different ports is done by the underlying system; however, in open distributed systems in general, the problem of finding the correct component that offers the required service still remains. To solve this problem, directory services (such as JINI [Edw99]) have been implemented. They offer a central point where service providers can announce themselves and where service requesters can look up providers. This shifts the problem from supplying the correct service to supplying the correct description and looking up services by their description, which is less work because the descriptions can be managed at a central place.

As this is not a relevant issue for this thesis, we will not address this subject further, and we will assume that the components already know with whom they will communicate.

2.4 Connecting & Deployment

AS SAID BEFORE, the underlying component system connects different ports to each other using an application specific connection broker. This broker incorporates simple lookup and name-to-address translation services, as well as finding services by specifying their properties. When deploying an application, this broker receives an input file that describes the connections between all the components and the links to the external environment.

Connections always take place between exactly two endpoints: we do not support connections between more than two ports. When setting up a connection, the component system asks both parties to offer a port, based on a description of the required properties of the component. The component normally sends a portid back to the system, which in turn uses both portids to connect the ports. Figure 2.4 illustrates how a broker component can request the component system to set up a connection.

Figure 2.4: Message flow when connecting two components.
\includegraphics[%
width=0.80\textwidth]{StandardConnectionPhase.eps}

From the software development point of view this way of working is very convenient: the application programmer points to components and the system wires them together. It also gives us the opportunity to place adaptors on the connections between components.

The problem with this (and other point-and-click methods) is its very static nature. Sometimes we need the ability to receive messages from unexpected components. Consider for example a web server: at component composition time we cannot foresee how many clients, i.e. other components, will join.

Since the base system only allows one connection per port, we could otherwise only allow a fixed number of clients to join. To address this we have added the possibility to use multi-ports. A multi-port represents a collection of ports: one multi-port can be connected to a number of normal ports. If we send a message to the multi-port, it is sent to all the ports in the collection, implementing a multicast.
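As an illustration, a chat server broadcasting to all joined clients could be sketched as follows, using the component syntax introduced in section 2.5.3. The multiport keyword and the message names are hypothetical; the system's actual multi-port declaration may differ:

component ChatServer
  {
  multiport clients;   // hypothetical: one multi-port, many connected ports
  message Say()
    {
    // a single send on the multi-port reaches every connected client
    clients..Print(<Text: <Text>>);
    }
  }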

2.5 Communication

IN THIS SECTION WE EXPLAIN how communication between components takes place. We explain how messages are sent and received, how messages are explicitly represented, and why this representation allows for easy adaptor creation.

2.5.1 Sending and Receiving Messages

At component creation time a component makes a number of ports available for communication. At a certain point the system will connect these ports to ports of other components. Sending a message to another component is performed by offering a message to a local port (which is connected to the port of another component). In the model we use, this is done by invoking the sendMessage method upon a port, which returns immediately. When a message arrives on a port, handleMessage is invoked on the component. The standard handleMessage behavior is to immediately invoke the method corresponding to the message. E.g., when a message foo comes in, handleMessage is called first; if that method does not handle the message, the method foo is called. The only way in which components can communicate is through disciplined communication: sending and receiving messages using sendMessage and handleMessage.

2.5.2 Message Representation

Every message in the system is explicitly represented as an association list of parameters and arguments. In the remainder of the text, we call the key/value pairs in the association list fields. Fields can be written or read by using the putField and getField methods of the Message class. One predefined field is always present in every message: Invoke. This field names the message and, in the default implementation, is used to invoke the correct method upon the receiver. This representation of messages, together with the putField and getField methods, allows an adaptor to handle messages without needing to know their full content. Messages are deep copied entirely upon sending: a copy of the parameter strings and a deep copy of all the arguments is made.2.1
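A minimal sketch of such an association-list message is given below. This is not the actual SEESCOA Message class: the copyForSend helper name is ours, and where the real system makes a deep copy of all arguments, the sketch copies only the field table for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal Message: an association list of fields looked up
// by name, sketching the putField/getField interface described in the text.
class Message {
    private final Map<String, Object> fields = new HashMap<>();

    public void putField(String name, Object value) {
        fields.put(name, value);
    }

    public Object getField(String name) {
        return fields.get(name);
    }

    // The system copies a message when it is sent, so sender and receiver
    // never share state. (The real system deep copies the arguments;
    // this sketch only copies the field table.)
    public Message copyForSend() {
        Message copy = new Message();
        copy.fields.putAll(this.fields);
        return copy;
    }
}
```

Because an adaptor only touches individual fields through getField and putField, it can inspect or rewrite a message without knowing anything about the remaining fields.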

The example below illustrates how one can create a message and insert fields. The example also illustrates that the standard handleMessage behavior is to invoke the method named by the Invoke field. Hence, ShowIt() will receive the message and can retrieve the Text field.

class MyComponent extends Component
  {
  public Port a;
  public MyComponent()
    {
    a = createPort("a");
    Message msg = new Message();
    msg.putField("Invoke", "ShowIt");
    msg.putField("Text", "some text to show");
    a.sendMessage(msg);
    }
  public void ShowIt(Message msg)
    {
    System.out.println(msg.getField("Text"));
    }
  }

2.5.3 Syntactical Annotation

For clarity, during the rest of this dissertation we will resort to a simpler syntax for communication2.2. Specifying a component is done with the component keyword, while declaring a port is done with the port keyword. To make sure that an incoming message is immediately invoked upon the component, invoke on this can be placed after the port declaration. If a message needs to be handled explicitly by the component, handle on this should be used.

To designate a message handler we use the message keyword, and to create a field we use a < and > syntax. To read the value of a field we name the field between < and >; to set a field we use a colon: before the colon we name the field to be set, after the colon we place the value to be assigned to it. To send a message we use the .. syntax. The first word following .. is always automatically bound to the Invoke field. E.g.:

component MyComponent
  {
  port a;
  message Init()
    {
    a..ShowIt(<Text: "text to show">);
    }
  message ShowIt()
    {
    System.out.println(<Text>);
    }
  }

When programming with such explicit messages, often a type cast is needed to make fields within the message accessible. To help with this, the notation <type|field> can be used. This simply expands to (type)<field>.
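For example, a handler reading a numeric field could be written as follows (a hypothetical handler; the Count field name is illustrative):

message Increase()
  {
  int n = <int|Count>;   // <int|Count> expands to (int)<Count>
  System.out.println(n + 1);
  }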

2.5.4 Motivation

This explicit way of sending, receiving and handling messages gives us greater flexibility when writing adaptors. It allows us to receive all possible messages and handle them without knowing the full message internals. For instance, to write an adaptor we can simply override handleMessage, ignore the content of the message, and still pass it through. For example, a component placed between two other components which simply prints the messages and passes them along can be written as follows:

component Logger
  {
  port left, right;
  public void handleMessage(Port cameover, Message msg)
    {
    System.out.println("Message " + msg + " from " + cameover);
    if (cameover == left) right..msg;
    else if (cameover == right) left..msg;
    }
  }

This simple logger component can be placed between any possible two components, without needing to rewrite the Logger component to support new interfaces as they come along.

A second observation about this style of messaging is that the system is truly peer to peer. Any component can send messages to other components, and every component can receive messages. There is no distinction between server components and client components: all components are both server and client at the same time. Moreover, it is not required to specify at compile time with which partners we will connect; this is done purely at runtime.

2.6 Sessions

IN THIS SECTION WE INVESTIGATE ONE OF THE CONSEQUENCES OF NON-BLOCKING MESSAGE SENDS: in an extended conversation between two components, we need a way to keep track of what point in the conversation we have reached; we need to remember session information. Due to the non-blocking nature of the communication, such session information must be managed explicitly. To allow this, we introduce a mechanism which easily associates messages with sessions.

2.6.1 Non-Blocking

As said before, the message send is non-blocking, a model clearly different from standard object calling conventions. However, this non-blocking model supports open distributed systems very well. Open distributed systems can have long latency times and variable network speeds. Sending a message can be instantaneous or can take an extremely long time; therefore a component working in a blocking way wastes precious time waiting for an answer to return. Moreover, since the network is unreliable, we have no guarantee that a reply will ever arrive, and we might therefore wait indefinitely.

A non-blocking model has none of these drawbacks; however, programming in a non-blocking way is not easy. One can no longer simply ask another component something, wait for the reply and continue afterward. Instead, one needs to remember what requests have been posted to other components and continue within the correct session when an answer to one of the previous requests arrives. To illustrate the difficulties of such non-blocking sends, consider for example a program that calls 3 components in sequence, where the result of one component is passed to the next. Assuming that a blocking send were available, this could be written in a synchronous way as follows:

component Foo
  {
  port a1;
  port a2;
  port a3;
  message Init()
    {
    System.out.println(a3.call(a2.call(a1.call())));
    }
  }

On the other hand, if one wants to write this with a non-blocking sending primitive, one should write

component Foo
  {
  port a1, a2, a3;
  message Init()
    {
    a1..Call();
    }
  message Result()
    {
    if (port == a1)
      a2..Call(<Value: <Value>>);
    else if (port == a2)
      a3..Call(<Value: <Value>>);
    else if (port == a3)
      System.out.println(<Value>);
    }
  }

In this program, the port field in the Result message handler designates the port over which the Result message arrived. Clearly, the second program is far less readable than the first one. In larger programs the problem of managing different sessions becomes even more difficult.

2.6.2 Session Tracking

To address this, a component should be able to map a message to a certain session, and to remember the state of certain values within that session. To do so, we pass hidden fields along with every message. These fields are carried over automatically when a message is handled and a new message is sent out. Any component can use these hidden fields to mark a message and to identify messages when they return. During the rest of the dissertation we will use the > and < notation (instead of '<' and '>') as a syntax for hidden fields. Using hidden fields is still more complicated than working with blocking primitives, but it cleanly separates the session tracking from the application logic. In the example below we see how a session counter (the >Time< field) is increased, representing a notion of time.

component Foo
  {
  port a1, a2, a3;
  message Init()
    {
    a1..Call(>Time: 0<);
    }
  message Result()
    {
    >Time: >Time< + 1<;
    switch (>Time<)
      {
      case 1 : a2..Call(<Value: <Value>>); break;
      case 2 : a3..Call(<Value: <Value>>); break;
      case 3 : System.out.println(<Value>);
      }
    }
  }


2.7 Concurrency

THE DESCRIBED EVENT MODEL uses messages to communicate between components. Because components are single threaded, concurrency problems within the components themselves are avoided, which minimizes the number of places in which such problems can occur. Concurrency problems no longer arise from the ordering of statements within the components, but only from the order in which messages arrive: the overall application behavior is uniquely defined by the message sequences. This makes the model very suitable for experiments with concurrency management.

Figure 2.5: Bank accounting example of a concurrency problem
\includegraphics[%
height=4cm]{BankAccounting.eps}

However, some message sequences can still give rise to race conditions, deadlocks and other kinds of unwanted behavior. A well known example is the bank accounting example. Suppose we have three components. The first is a server component which offers two methods: Read and Write. The second and third components both try to increase the same value at the server component. They do this by reading the value, increasing it and storing it again. As shown in figure 2.5, the order in which the messages arrive is critical for the correctness of the value.
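In the component syntax, one of the racing client components might be sketched as follows. The Read and Write message names follow the text; the ReadResult reply and the Value field are assumptions:

component IncreasingClient
  {
  port account;
  message Init()
    {
    account..Read();
    }
  message ReadResult()
    {
    // Between the Read above and the Write below, the other client's
    // Read and Write may interleave, so one of the two increments is lost.
    account..Write(<Value: <int|Value> + 1>);
    }
  }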

Classical solutions from object based systems, such as synchronized methods and thread based mutexes, are not applicable in our situation because ours is neither a thread based nor an object oriented model. Moreover, these classical solutions often lead to more problems: they are difficult to understand, difficult to debug and give rise to a large number of inheritance anomalies.

As we will explain in more detail in chapter 5, an important observation is that the only place where we can solve concurrency problems is within the component itself: the component should offer locks for the values that can be updated. In fact, since the component already needs to perform some kind of session management when it is accessed from different points, it should also perform concurrency management at the same time. This implies that a component actually offers dual interfaces: an interface for its functionality and an interface for its concurrency strategy.


2.8 Writing Adaptors

WE ALREADY ARGUED that the explicit messaging offered by the component system gives us greater flexibility to write adaptors. Above, we illustrated how one can implement a simple logger adaptor that can be placed on any connection between components. We now show further how adaptors can be written by giving two examples: first we illustrate how to implement a flow-control component, which can be placed on any connection where it is needed; second we show how the setup and connections of components can be modified dynamically by means of a connection adaptor.

2.8.1 Flow Control by Means of Adaptors

Figure 2.6: A producer that produces data faster than the consumer can consume. This results in overloaded queues at sender side (pictured as the long port at the left side).
\includegraphics[%
height=1.5cm]{ProducerConsumerStandard.eps}

The setup in which we demonstrate our first two adaptors consists of a producer component on one machine and a consumer component on another. The producer simply grabs images from a camera and sends them to the receiver (see figure 2.6). This link goes over a network, so the producer may produce images faster than the network can handle. This typically results in a producer with overloaded sending queues and eventually an out-of-memory error on the sending machine.

Figure 2.7: A Producer and consumer pair regulated by two adaptors: a sending regulator and receiving regulator. Both regulators communicate over a separate channel.
\includegraphics[%
height=2cm]{ProducerConsumerControlled.eps}

To solve this problem we need a regulator on the sending side that communicates with a regulator at the receiving side to agree on dropping a certain number of messages. Of course, this can only be done when the sending side knows how fast the receiving side is consuming messages. This turns out to be tricky, because we cannot use the same communication channel we use to send the data: this would place the control messages in the same, overflowing, queue as the images themselves, making regulation substantially more difficult. Therefore we opt for a control channel with a separate queue. A further advantage is that messages on the sending side can simply be passed through to the receiving side, and vice versa, without the need to intercept specific control messages.


Algorithm 1: Sending Regulator

The sending regulator is given in code in algorithm 1. The sending regulator keeps track of how many messages have been sent and how many have already been received. If the difference is too large, new messages are simply dropped. Note that the sending regulator has a method handleMessage(), which is used to pass incoming requests from sender to receiver when there is not too much lag. The component also understands FlowReceived messages, which arrive on the flow_control port. All messages incoming on the flow_control port are automatically invoked, and therefore do not pass through the handleMessage routine. This makes writing the adaptor more straightforward, as the programmer does not need to differentiate between flow control messages and data messages.
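The sending regulator described above can be sketched in the component syntax as follows. This is a reconstruction, not the original listing: the counter names and the lag threshold of 8 are illustrative assumptions, and only the sender-to-receiver data direction is shown.

component SendingRegulator
  {
  port in, out;                      // data path from sender to receiver
  port flow_control invoke on this;  // control channel: messages auto-invoked
  int sent = 0;
  int received = 0;
  public void handleMessage(Port cameover, Message msg)
    {
    if (sent - received < 8)         // illustrative maximum lag
      {
      sent = sent + 1;
      out..msg;                      // pass the data message through
      }
    // otherwise the data message is simply dropped
    }
  message FlowReceived()
    {
    received = <int|Count>;          // receiver reports how many arrived
    }
  }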


Algorithm 2: Receiving Regulator
The receiving regulator (algorithm 2) is similar: for every four incoming messages, it sends a flow control message describing how many data messages have arrived. Messages coming from the sending side are simply passed on to the receiving side, and messages coming from the receiving side are passed along to the sending side.
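The receiving regulator described above can be sketched in the same style. Again this is a reconstruction, not the original listing; the Count field name is an assumption, and only the data direction towards the consumer is shown:

component ReceivingRegulator
  {
  port in, out;           // data path towards the consumer
  port flow_control;      // control channel back to the sending regulator
  int arrived = 0;
  public void handleMessage(Port cameover, Message msg)
    {
    arrived = arrived + 1;
    if (arrived % 4 == 0) // one control message per four data messages
      flow_control..FlowReceived(<Count: arrived>);
    out..msg;             // pass the data message through
    }
  }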

To relate this work to existing technologies, consider how these adaptors would have to be implemented in Java RMI. For RMI, both adaptors would have to implement the interface of the image receiver. This has two important drawbacks. Firstly, the adaptors are no longer generic, because they only work with the camera components. Secondly, for each method declared in the interface, a pass-through implementation must be provided, which is tedious work.


2.8.2 Placing the control flow regulators at runtime

Figure 2.8: How a regulator generator adaptor can set up adaptors dynamically.
\includegraphics[%
width=1.0\textwidth]{RegulatorGenerator.eps}

One of the problems often encountered with such setups is that components are created and added at runtime. Assume that the camera is always connected to a component receiver, which creates an image decoder when, for example, a new output window of the camera is opened. The component receiver then automatically sets up a connection between the camera and the image decoder. The component receiver uses the connection broker to create new components and set up connections at runtime, and therefore has a connection to the broker. The problem now is placing the two regulator adaptors on these dynamically created connections.


Algorithm 3: The regulator generator adaptor

As shown in chapter 3, this can be done straightforwardly by placing an adaptor, which we will call the regulator generator, between the connection broker and the component receiver. We then not only change the messages sent between the camera and the decoder, but also the messages sent between the component receiver and the connection broker. When a request to generate a decoder arrives at the adaptor (the regulator generator), it creates three new components: the requested decoder, a sending regulator and a receiving regulator. The single connection request from the component receiver, which follows the creation request, is replaced by a setup of connections between the camera, the two regulators and the decoder.
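As an illustration only, the rewiring could be sketched as below. The broker protocol is not specified in this chapter, so every message name, field name and endpoint label here is hypothetical:

component RegulatorGenerator
  {
  port receiver, broker;
  message CreateDecoder()
    {
    // create the requested decoder plus the two regulators
    broker..Create(<Type: "Decoder">);
    broker..Create(<Type: "SendingRegulator">);
    broker..Create(<Type: "ReceivingRegulator">);
    }
  message Connect()
    {
    // replace the single camera-to-decoder connection by a chain
    broker..Connect(<From: "Camera.out">, <To: "SendingRegulator.in">);
    broker..Connect(<From: "SendingRegulator.out">, <To: "ReceivingRegulator.in">);
    broker..Connect(<From: "ReceivingRegulator.out">, <To: "Decoder.in">);
    }
  }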

Looking again at Java RMI: there is no standard way to create components at a remote location, and no standard way to make connections between components. Therefore it is simply impossible to write the regulator generator in a generic way. If we wanted to implement such functionality in Java RMI, we would need to define our own standards for component creation and component linking; however, such modifications would require the existing components to conform to these standards.

2.9 Summary

IN THIS CHAPTER we have introduced event based systems by means of the SEESCOA component model. We first talked about the history of the model and introduced the basic concepts: components, connections, ports and messages. Second, we explained why an implicit addressing scheme is required in a dynamically changing environment, such as open distributed systems. Third, we have shown how the system sets up connections between any two components and how communication between these components takes place. The main ideas here are that messages are represented explicitly and that communication takes place in a disciplined way. Fourth, we explained that the model does not support blocking sends because of the large latency times of open distributed systems, and as a result we asserted that sessions must be managed explicitly. Fifth, we discussed the concurrency behavior of the system, and finally we showed by means of two real life examples that the system allows for greater flexibility when writing adaptors.

Because of the flexibility for writing adaptors, the component model presented here is used as the underlying architecture for all our experiments.



Footnotes

...2.1
Since the ``Message'' class is the basis for all messages and its standard behavior is to offer an association list, it is perfectly possible to optimize local communication by implementing copy-on-write within the message. Doing so makes local communication roughly 3 times faster.
... communication2.2
We use a precompiler to translate this extended Java notation to standard Java source code.
Werner 2004-03-07