Actors as a Model of Mobile Computing.
 
Paul Mackay
pjm2@doc.ic.ac.uk
 
Pradeep Loganathan
pl5@doc.ic.ac.uk
 
Information Systems Engineering (ISE II)
Department of Computing
Imperial College of Science, Technology and Medicine
 
 
Contents
 
Introduction
 
The actor model
 
Mobile computing as a distributed system
 
Distributed algorithms for mobile computing networks
 
The challenges of wireless communication
The dynamic nature of actors
 
Message passing
 
Asynchronous and synchronous communication
The pure actor model
Optimisation
 
Java
 
Conclusion
 
References
 
 
There is nothing either good or bad, but thinking makes it so.
[Hamlet,. II.ii]
 
Introduction
 
Since the start of the trend in computer programming to use high level languages (developed shortly after the invention of computers able to store programs) there have been many advances in the field of software. Many approaches or models created for different applications have been proposed. As the field has expanded, the breadth of applications has grown, and with it the models and languages created to simplify and enhance these horizons. Today, as the acceptance of computers and their uses becomes ever more widespread, it is worth considering some of the older models or paradigms, and investigating whether they can be adapted to today’s applications.
 
In this report, and survey as a whole, the model of actors is considered with respect to the field of mobile computing, and how well it might adapt to this field. Every model has its advantages and disadvantages but the question is to see if the advantages lend themselves to this case. The particular field of mobile computing is one which has only begun to grow appreciably within the past 10 years or so, as the technology to build smaller and lighter computers has improved, coupled with the advances and acceptance of wireless telecommunications. As time goes by the number of wireless computers is expanding to cover the globe, and coupled with the statically linked portions of the Internet, this creates a distributed network the like of which has never been seen before.
 
 
The actor model
 
Before considering the feasibility of the actor model with regard to mobile computing, it is necessary to define the actor model. It is a programming paradigm, as is the object-orientated paradigm or imperative paradigm. All of these are ways of thinking about the underlying model of the software (or language used) in a system. The actor model was proposed by Carl Hewitt in the early 1970’s, as a model which would be particularly suited to parallel and AI computing applications. It has never gained the widespread recognition which other methods of programming have attained, such as object-orientated programming which is used throughout the industry.
 
A system of actors is defined by the actors and the tasks remaining for the actors to perform. An actor is a self contained active entity which performs one of more of the following actions:
  • Specify a new behaviour for itself. This it must do. 
  • Create other actors. 
  • Send a communication to one or more actors.
 
The actor Xn must define its new behaviour Xn+1, and may also create tasks and other actors.
 
 
What the actor executes is defined by the actor’s current behaviour. It will only perform these actions upon receiving a communication from another actor. The model is extremely concurrent. All actors operate concurrently with each other and all independent operations within an actor’s behaviour are also concurrent.
 
The tasks within an actor system drive the computation which takes place. An actor system with no tasks will have completed its execution. A task is defined by three bits of information:
  • A unique identifier. 
  • A communication containing the target actor’s required information. 
  • The mail address of the target actor.
Communications are in the form of messages. Each actor has a mail queue or mailbox associated with it, and the mail address of an actor may be freely distributed amongst other actors. All mail operations within an actor system are asynchronous, so the order of arrival of messages is random, but a form of weak fairness is assumed by guaranteeing the arrival of a message eventually. Any actor which receives a message will guarantee to process it eventually. This can be used to prevent deadlock and other associated problems (which often occur in distributed systems).
 
The mail addresses of other actors known by an actor are its acquaintances. An actor may obtain these addresses from one of three sources:
 
  • It was known to the actor from the start, when the actor was created. 
  • It was contained within a message sent to the actor. 
  • It was defined by the actor when it created another actor.
 
An actor system may communicate with the outside world by sending messages to externals, actors outside the system. The system itself will have a list of receptionists, which are actors which may receive (and buffer) messages from the outside world.
 
 
Mobile computing as a distributed system
 
Distributed systems are a natural outgrowth of the development of computer networks. Computer networking focuses on having a number of processors communicate with one another, as well as on the utilisation of such systems by remote users, whether the "user" was a terminal or another computer. A distributed data processing system is characterised by having both the processor and storage facilities physically dispersed and interconnected by data communication facilities. There are three activities of a data processing system that may be distributed :
  • Processing functions.
  • Storage of database(s).
  • System control.
Mobile computing can be thought of as widely distributed systems. In general, mobility of hosts brings a new set of issues in distributed systems. Mobility substantially affects data placement. The network is likely to be a WAN (Wide Area Network) having some complex topology that provides moderately reliable inter-site message delivery services with moderate bandwidth (although highly variable) and low signalling rates. There may be a significant delay between sending a message and receiving a reply even when no processing is required to generate the reply.
 
Features expected from distributed systems:
  • High system performance- fast response.
  • Sufficient throughput.
  • High system reliability (high system availability).
  • Reduced network transmission costs.
  • Ease of modular and incremental growth.
  • Configuration flexibility.
  • Automatic resource sharing.
  • Automatic load distribution.
  • High adaptability to changes in workload.
  • Easy expansion in both capacity and function.
  • Good response to temporary overloads.
A distributed system consists of many sequential processes communicating solely by messages. The behaviour of each process is controlled by a local algorithm (in this case based on the Actor model) which determines the sequence of actions and the reaction of the process to incoming messages. The concurrent (and co-ordinated) execution of all local algorithms forms a distributed computation. In an abstract setting, a distributed computation is determined by the types and the relative order of atomic actions (or events) occurring within the processes. These atomic actions could usually be classified into three types: send events, receive events, and internal events. The actor model is well suited for distributed computations because it models these three very basic events elegantly.
 
 
Distributed algorithms for mobile computing networks
 
"To the extent possible, computation and communication costs of an algorithm is borne by the static portion of the network. The core objective of the algorithm is achieved through a distributed execution amongst the fixed hosts while performing only those operations at the mobile hosts that are necessary for the desired overall functionality."   
[Two tier principle by B.R.Badrinath, Arup Acharya and Tomasz Imielinski.]
 
The design of distributed algorithms and protocols has traditionally been based on an underlying network architecture consisting of static hosts. Consequently, the absence of site and link failures, the connectivity amongst hosts in the network remains fixed. Distributed algorithms thus assume a model comprising of a set of processes executing on static hosts, that communicate through message passing over point-to-point logical channels. The channels may span multiple physical links of the network; this set of links and the hosts at the endpoints of the channel does not change with time. This conventional model fails to capture the features and constraints of mobile hosts and distributed algorithms designed for this new extended model, and therefore needs to be restructured to tackle host mobility. 
 
To facilitate continuous network coverage for mobile hosts, a static network is extended with Mobile Support Stations (MSS) that are each capable of directly communicating with Mobile Hosts (MH) within a limited geographical area (or cell), via a low-bandwidth wireless medium.
 
MHs are thereby able to connect to the static segment of the network from different locations at different times. Consequently, the overall network topology changes dynamically as MHs move from one cell to another. Hence a mobile host must first be located before a message can be sent to it. Further, as hosts change their locations, the physical connectivity of the network changes. A typical mobile distributed system would have a structure such as that shown in the figure below. 
 
 
 
mh
Mobile host
MSS
Mobile Support Station (has a wireless interface) 
Fixed Host
Stationary computer with no wireless interface
 
The structure of a mobile distributed system
 
 
The challenges of wireless communication
 
Wireless computers, in general, have less resources relative to stationary computers. This is because wireless computers are required to be smaller, lighter and consume less power than stationary computers. Wireless communication is more difficult to implement than wired communication because of the interaction of the surrounding environment with the message signal. Problems caused by the environment include blocked signal paths, echoes and noise. Hence wireless connections are more error prone, have much lower bandwidths, and have frequent spurious disconnections when compared to wired connections. These factors can increase communication latencies due to error control checks, retransmissions, time-out delays and brief disconnections.
 
 
Disconnected operation
 
A major problem associated with wireless communications is frequent disconnections. Resources can be allocated to handle disconnections more elegantly, or to try and prevent those disconnections from happening altogether. In environments with frequent disconnections it is better for the mobile computer to act as a stand-alone unit rather than a mobile terminal (i.e. splitting the application and the user interface across the network). The MSSs can provide commonly used application software, so that a mobile user can download the software from the closest MSS for execution or alternatively execute it remotely on the MSS. For wide-area networks, round-trip RPC delays will tend to be expensive in terms of wasted processor clock cycles, hence round-trip latencies and brief disconnections can be made less expensive by operating asynchronously. Caching techniques could be used to enhance the performance of weakly-connected and disconnected operation, but preserving cache coherence under weak connectivity can be expensive.
 
 
Low bandwidth and bandwidth variability
 
Wireless networks deliver lower bandwidth than wired networks, hence mobile computing designs need to be concerned about bandwidth consumption. The deliverable bandwidth per user depends on the number of users sharing a cell. The deliverable bandwidth can be significantly improved in two ways: 
  • Maintain multiple cells at different frequencies. This technique, although more flexible, is limited by the range of frequencies of the electromagnetic spectrum available for public consumption.
  • Limiting transmission ranges so that more cells can fit in a given area. This preferred approach is simpler, reduces power requirements, and may decrease corruption of the signal. It is also known that transceivers covering less area can achieve higher bandwidths. 
Compression, ‘logging’ (making large requests out of several short ones), pre-fetching (guessing which files will be needed soon), and write-back caching can help cope with low bandwidth availability. System performance can be further enhanced by scheduling communications intelligently. 
 
Mobile computing designs must cope with much greater variations in network bandwidth than traditional designs. A good design would be able to adapt to the currently available resources, providing the user with a variable level of quality. As a mobile element leaves the range of one network transceiver it switches to another, and there may also be places where they can access multiple transceivers on different frequencies. There may be a need to change access protocols, for example when switching from cellular coverage to satellite coverage.
 
 
Security concerns
 
Security and privacy is a major controversial issue in mobile computing. The security of wireless communication is more easily compromised than wired communication, especially for wide area networks. Being accessible at any location and at any time creates great concern about privacy issues among potential users. Users should be able to specify who, when and where is authorised to reach them. Secure communication over wireless networks is accomplished by encrypting the sensitive data.  With mobile users, the problem of authentication and security will be a global problem and distributed services to support authentication across administration domains will be necessary. 
 
 
The dynamic nature of actors
 
One of the most important features of actors is their ability to behave dynamically and adjust the characteristics of the overall system to match the current requirements. Actors exhibit dynamic behaviour in two key ways. They are able to redefine their behaviour, depending on the contents of communications received. This allows an actor to perform several roles. They are also able to create new actors, allowing modification of the system in response to changing needs. As the computation required increases, more actors can be formed and operate concurrently with each other to exploit the maximum resources of the distributed system.
 
 
The actor A creates a customer actor to which the reply from B is sent.
 
 
In a system such as a mobile network, frequent connections and disconnection to and from the network may occur. When the system is first initialised, the address or even existence of some of the nodes on the network may not be known by the receptionists in the static portion of the network which receive information from the outside. New addresses can be received within a communication, as the system is running. With a static topology there would be no mechanism for replying to the new users. With a dynamic topology, an actor may change its behaviour to deal with another client, or (perhaps more likely) create a new actor to which messages from the new user are passed on.
 
 
Message passing
 
Messages, their transmission modes, and the semantics of communication play an important role in distributed systems since messages are the only means by which processes can exchange data and can synchronise their actions. Running a copy of the same program on more than one node at a time can be a useful technique. The two nodes may be running the same program, but they are different copies of the same program. They will have the same variables, but they will each have their own copy of the variables. If these two nodes both have a variable called ‘A’ the two ‘A’s are distinct except in name. For instance, if node 1 set its ‘A’ to some value, node 2 does not know the value of node 1's ‘A’. If node 2 needs to know the value of node 1's ‘A’, it will have to be told. This is why communication is necessary in the form of message passing. Node 1 could send its value of ‘A’ to node 2 in a message. 
 
Communications in distributed systems are not reliable, messages can be lost because of communication failures, duplicated because of retransmissions, or their contents can be garbled or destroyed. Sophisticated techniques and protocols have been devised to cope with these failures and guarantee reliable message delivery to the application.
 
Messages may be transmitted synchronously or asynchronously, and messages might be received in the order that they were sent or received out of sequence. 
 
 
Asynchronous and synchronous communication
 
Usually, non-blocking communication mechanisms are called asynchronous, whereas blocking communication mechanisms are called synchronous. Synchronous group communication is certain, since it simplifies replicated programming by providing every replica with the same accurate and up-to-date knowledge of the system state. On the contrary, asynchronous group communication is characterised by uncertainty, it does not guarantee bounded reaction time. Asynchronous programming is actually a label for a variety of programming paradigms differing in the system models they imply, even if they are all variants of the same asynchronous system model. The underlying system assumed by the actor model is always asynchronous in the sense that we do not assume the existence of synchronised clocks or fixed communication delays.
 
It is widely accepted that neither of the two communication modes "synchronous" and "asynchronous" is generally superior to the other. While asynchronous communication is less prone to deadlocks and often allows a higher degree of parallelism (since the sender can proceed while the message is still being delivered), its implementation requires complex buffer management and flow control mechanisms. Furthermore, algorithms making use of asynchronous communications are more difficult to develop and to verify than algorithms working in a synchronous environment. It is, however, possible to simulate one mode with the other. In synchronous mode, explicit buffers or intervening buffer processes can be used to decouple the sender of a message from the receiver, thus simulating asynchronous mode (although true simulation requires an unbounded amount of buffer storage). In asynchronous mode, synchronous communication can be simulated by waiting for an explicit acknowledgement immediately after asynchronously sending a message. 
 
 
Motivation for synchronous communication
 
If the sending process needs to know that the message has been received by the receiving process, then both processes may use synchronous communication.
 
If a process executing a blocking synchronous send is "ahead" of the process executing the matching receive, then it will be idle until the receiving process catches up. Similarly, if the sending process is executing a non-blocking synchronous send, the completion test will not succeed until the receiving process catches up. Synchronous mode is a safe method of communication because the communication network can never become overloaded with undeliverable messages. It has the advantage of being predictable: a synchronous send always synchronises the sender and receiver. This makes the behaviour of a program more deterministic. Debugging is also easier because messages cannot lie undelivered and "invisible" in the network.
 
 
Motivation for asynchronous communication
 
In asynchronous communication, the sender does not block waiting for the receiver to acknowledge receipt of the message. This does mean that if there is a problem with the underlying network, the sender may still think that everything is working fine. This can pose a huge problem especially in wireless communication where frequent disconnections can occur due to unreliable communication links. We therefore need a set of tools based on asynchrony to provide a guarantee of message delivery. 
 
Distributed processes may have local clocks that proceed at different rates. This may be due to the clock speeds of the different processors, or the way that different logical processors are scheduled on the physical processors. Also, there may be varying time delays in communication between different processors, hence leading to unpredictable ordering of events. Physical components from which we construct distributed systems are often synchronous. However, as we bring layers of software to multiplex these physical resources to create abstractions such as processes and reliable communication channels, the resulting system may be better classified as asynchronous.
 
There are more reasons as to why asynchronous communication is better suited for the implementation of wireless distributed systems. The more significant ones are as follows:
  • Since communication takes time, communication intended for a process may arrive at the same time or scattered with communication from another source. Hence the system has to be able to sort out parts of a communication, which is easier to do with asynchronous communication. 
  • The sending process may run quicker than the receiving process, these situations could be handled easier with asynchronous communication. 
  • Synchronous communication does not allow a process to send itself a message. A recursive computation may require this facility.
  • Asynchronous communication is important for real-time applications where the component may be performing a time-critical activity and hence cannot waste time waiting for confirmation of successful transmission. 
Communications may get lost with a system built on the asynchronous model without buffers. Buffered asynchronous communication provides efficiency in execution by not arbitrarily delaying a computation which need not wait for another process to receive a its communication.
 
 
The pure actor model
 
One of the key concepts behind the actor model is that it spawns as many actors as is allowable within the system, to exploit the available parallelism most effectively. If an actor receives a communication its current behaviour is unable to cope with, it will probably create a customer (another actor) to deal with the problem, rather than alter its behaviour. Note that either option is available. It is worth mentioning again that the actor model was invented specifically for use in parallel and distributed systems. This was when the notion of a distributed system consisted of two or more independent computers connected together and able to share tasks. It therefore made sense to create lots of actors in these environments.
 
The consequences are that in an average system, many actors will be created, and the corresponding communications overhead, with all the actors passing messages, will increase with it. The nature of the actor model implies a greater level of communication than other models. This may be taken to extremes depending on the implementation of the model.
 
A particular form of the theoretical actor model is known as the universe of actors model. In this model, everything is represented as an actor. This includes all expressions, commands and communications. To take the simplest example, an integer is an actor, which replies with its own value upon receiving  a message. It is an example of a built-in actor, one which is used to bottom out a computation because it creates no further computation itself. With a extreme case of the model such as this, the likelihood of a high level of message passing is even greater.
 
This goes to show that the latency of message passing within the model is very important, and should at all times be kept to a minimum. The time for a computation to be carried out by a single actor is in general relatively short, especially when the system is split into many actors. The overall performance of the system is therefore heavily reliant on the time for messages to be sent and received. For an actor implementation to be efficient, certain issues must be considered. Balancing of the load amongst available resources to maximise their use. Locality of reference, to ensure frequent communications between certain actors does not occur over a distant, slow link. Process migration may be preferable, rather than sending constant communications.
 
An insensitive actor is one which buffers all communications until it receives a become communication, which describes what the actor’s new behaviour will be. The idea behind this is to provide for situations where a reply is required from another actor. The other actor would be instructed to send a become communication to the sending actor. This is an elegant theoretical solution to the problem. However, a practical implementation is inefficient. With asynchronous message passing, where there is no guarantee of time of delivery of a message, having to buffer all other messages will cause serious performance reductions, if the reply is coming from a remote mobile computer. 
 
 
Guarantee of message delivery in the actor model
 
Eventual delivery of a message is a convenient programming assumption within the actor model, which means mailboxes need not be infinite. However, it is not entirely feasible for a practical implementation to make this assumption. If a system did not eventually deliver a communication it was buffering, it would have to buffer the communication indefinitely, hence consuming storage. The guarantee of delivery does not assume that all processes are meaningfully processed. The actor model assumes that the underlying network is error free and hence guarantees eventual message delivery. For reasons mentioned above, this assumption cannot be made in a highly error prone environment such as wireless mobile computing. Hence a ‘loosely’ synchronous system may be needed to identify network errors and hence guarantee message delivery.
 
 
Optimisation
 
The pure actor model is the primary consideration in this report, but it must be noted that the performance of any model may be substantially improved using optimisation when implemented. Indeed, it is often only by certain optimisation techniques that the performance of a system can be improved to a point where it becomes usable. This may occur in both hardware and software.
 
In mobile computing the time required to establish a connection can be substantial. It can involve transmitting a signal over a region and waiting for it to be picked up by the recipient, and possibly searching across the network if the location of the recipient has moved. Therefore it is desirable to limit this overhead if possible. One way to do this is to store, or buffer the messages to be sent to a certain node and send them all at once. An actor could be created whose only function is to buffer the messages sent to it, until it accepts a communication which tells it to forward all its stored messages to another mail address. Similarly, the receptionists at the other end accepting messages from other systems can buffer messages until they are needed by their respective actors.
 
 
 
The actor X creates a new actor Y and an expression e.
 
 
As information required by a remote mobile node is almost certain to take longer than from an actor located somewhere within the static network, the concept of eager evaluation using futures can be employed. An actor will create an expression (the future) which requires information from a mobile source. It will also create another actor, which is given the address of the expression. The expression is told to evaluate itself. Meanwhile the newly created actor can continue with whatever processing it can perform without the information, and will only ask for it when it must have it. By then the expression will hopefully have obtained it. The expression becomes the actor it evaluates to, and it is an acquaintance of the first new actor. This method maximises the concurrency of the system.
 
 
Java
 
Java is a language, not a programming model, therefore it would be an unfair match to compare Java with the actor model. However, there are several points which may be addressed by looking at Java and its success. 
 
Java was first developed in 1991 by James Gosling at Sun Microsystems. Initially the idea behind Java was to create a language which could be used to generate small code for use in consumer electronic devices, such as mobile phones, televisions, etc. For this reason it was designed to run on all architectures, by using the technique of compiling intermediate code to run under a small interpreter on the particular architecture. Java was therefore universal, and could run anywhere, after the code had been written once. This is important in the distributed systems there are today (look at the extreme example of the Internet). Most systems are a mix of many architectures and computers. If all are using the common language of Java it makes the design of communication primitives easier. Java is interpreted by the machine it runs on, which means a speed reduction, but this disadvantage is often outweighed by the portability.
 
Sun based the language heavily on C++ because of their Unix background. This was important because C++ is already widely known throughout the world of computing, and so Java could be picked up by many programmers quickly. It has gained wide acceptance quickly because of this.
 
Java was designed from the start to be powerful and yet simple with regard to networks in general, and the Internet in particular. It is distributed, and has strong networking capabilities built in. It is dynamic to the extent that new libraries can easily be added, but it does not have the dynamic runtime capabilities of the actor model. Java allows Remote Procedure Calls, which reduce network traffic by sending only the information required for a procedure call.
 
Java has many features to enhance its security, for example not allowing memory overflow, or the fact that it cannot write from an applet to a client-side hard drive. Some of these, such as memory problems, come about because of some features of C++ dropped in Java, like the direct manipulation of pointers. There is also a high level of checking that goes on at compile time and run time to ensure the code is safe.
 
Despite considering what the actor model offers to the field of mobile computing, it has to be brought into context. There have been many object-orientated languages written, but not all would lend themselves to the same applications. If a pure implementation of the actor model was developed, it would need to incorporate some, if not all, the points addressed above. For it to be a success, it would need to be easy to understand and learn, available on a wide range of machines, and efficient. The global spread of mobile units is increasing rapidly, so it becomes ever more important that each unit encounters a standard, familiar language or interface, so that code will run identically everywhere.
 
Security would be an important consideration. One of the key problems with the actor model is that it assumes infinite space to store messages which have not been processed. This is important for the weak fairness of the model, as every message must be processed eventually. In a real system, it may be that available memory is limited, especially with mobile computers (e.g. PDA’s) where there are few hardware resources. Care must be taken to ensure memory overflow and other related problems cannot occur. Creating new actors might also exacerbate these difficulties, as extra memory must be found for them.
 
 
Conclusion
 
The field of mobile computing has expanded rapidly in the past 25 years, since the actor model was first proposed. Although the actor model was initially designed for parallel and distributed architectures, it can be seen that a mobile network or system may be abstractly modelled as a distributed system. For this reason, the underlying idea of the actor model is well suited to the field of mobile computing.
 
One of the fundamental properties of the actor model is that all actors communicate using message passing. Distributed systems have to operate using the technique of message passing, because it removes the confusion created by the sharing of variables over a distributed network. In the case of mobile networks, care must be taken to reduce the level of communication which takes place. The fact that messages and the actors which send and receive those messages are totally self contained, is a strong advantage in mobile networks.
 
The concept of a global clock is impractical in any distributed system. Asynchronous communication, which is the foundation of the actor model, exploits the inherent parallelism of a distributed system. The theoretical actor model uses the technique of mailboxes to provide asynchronous communication, but this assumes infinite resources for the storage of messages. Mobile computers are characterised by their constraints on resources.
 
The large latencies involved in mobile networks mean that parallel execution is paramount. As well as using asynchronous communication, the highly concurrent nature of actors leads to maximum utilisation of available resources. The dynamic nature of the actor model allows for adaptation of the network, as users connect and disconnect. Care must be taken to ensure that the majority of operations are performed on the static portions of the network. A full implementation of the actor model causes problems in this respect because it is designed to liberate the programmer from the decisions of explicitly defining parallel operations.
 
The general features of the actor model do lend themselves to mobile computing, but from past research it can be seen that an full implementation of the actor model is not sufficient. For efficient operation on a mobile network, refinements may be needed. When the actor model was first proposed, the advancements in the field were still in their early stages. Now, as further research into the actor model has led to the discovery of more efficient methods for its implementation, an implementation designed for use with mobile networks may be feasible.