|
|
Actors as a Model of Mobile Computing.
Paul Mackay
pjm2@doc.ic.ac.uk
Pradeep Loganathan
pl5@doc.ic.ac.uk
Information Systems Engineering (ISE II)
Department of Computing
Imperial College of Science, Technology and Medicine
Contents
Introduction
The actor
model
Mobile computing
as a distributed system
Distributed
algorithms for mobile computing networks
The challenges
of wireless communication
The dynamic
nature of actors
Message
passing
Asynchronous
and synchronous communication
The pure
actor model
Optimisation
Java
Conclusion
References
There is nothing either good or bad, but thinking makes
it so.
[Hamlet,. II.ii]
Introduction
Since the start of the trend in computer programming to use high level
languages (developed shortly after the invention of computers able to store
programs) there have been many advances in the field of software. Many
approaches or models created for different applications have been proposed.
As the field has expanded, the breadth of applications has grown, and with
it the models and languages created to simplify and enhance these horizons.
Today, as the acceptance of computers and their uses becomes ever more
widespread, it is worth considering some of the older models or paradigms,
and investigating whether they can be adapted to today’s applications.
In this report, and survey as a whole, the model of actors is considered
with respect to the field of mobile computing, and how well it might adapt
to this field. Every model has its advantages and disadvantages but the
question is to see if the advantages lend themselves to this case. The
particular field of mobile computing is one which has only begun to grow
appreciably within the past 10 years or so, as the technology to build
smaller and lighter computers has improved, coupled with the advances and
acceptance of wireless telecommunications. As time goes by the number of
wireless computers is expanding to cover the globe, and coupled with the
statically linked portions of the Internet, this creates a distributed
network the like of which has never been seen before.
The actor
model
Before considering the feasibility of the actor model with regard to
mobile computing, it is necessary to define the actor model. It is a programming
paradigm, as is the object-orientated paradigm or imperative paradigm.
All of these are ways of thinking about the underlying model of the software
(or language used) in a system. The actor model was proposed by Carl Hewitt
in the early 1970’s, as a model which would be particularly suited to parallel
and AI computing applications. It has never gained the widespread recognition
which other methods of programming have attained, such as object-orientated
programming which is used throughout the industry.
A system of actors is defined by the actors and the tasks remaining
for the actors to perform. An actor is a self contained active entity which
performs one of more of the following actions:
- Specify a new behaviour for itself. This it must do.
- Create other actors.
- Send a communication to one or more actors.
The actor Xn must define its new behaviour Xn+1, and may
also create tasks and other actors.
What the actor executes is defined by the actor’s current behaviour.
It will only perform these actions upon receiving a communication from
another actor. The model is extremely concurrent. All actors operate concurrently
with each other and all independent operations within an actor’s behaviour
are also concurrent.
The tasks within an actor system drive the computation which takes
place. An actor system with no tasks will have completed its execution.
A task is defined by three bits of information:
- A unique identifier.
- A communication containing the target actor’s required information.
- The mail address of the target actor.
Communications are in the form of messages. Each actor has a mail queue
or mailbox associated with it, and the mail address of an actor may be
freely distributed amongst other actors. All mail operations within an
actor system are asynchronous, so the order of arrival of messages is random,
but a form of weak fairness is assumed by guaranteeing the arrival of a
message eventually. Any actor which receives a message will guarantee to
process it eventually. This can be used to prevent deadlock and other associated
problems (which often occur in distributed systems).
The mail addresses of other actors known by an actor are its acquaintances.
An actor may obtain these addresses from one of three sources:
- It was known to the actor from the start, when the actor was created.
- It was contained within a message sent to the actor.
- It was defined by the actor when it created another actor.
An actor system may communicate with the outside world by sending messages
to externals, actors outside the system. The system itself will have a
list of receptionists, which are actors which may receive (and buffer)
messages from the outside world.
Mobile
computing as a distributed system
Distributed systems are a natural outgrowth of the development of computer
networks. Computer networking focuses on having a number of processors
communicate with one another, as well as on the utilisation of such systems
by remote users, whether the "user" was a terminal or another
computer. A distributed data processing system is characterised by having
both the processor and storage facilities physically dispersed and interconnected
by data communication facilities. There are three activities of a data
processing system that may be distributed :
- Processing functions.
- Storage of database(s).
- System control.
Mobile computing can be thought of as widely distributed systems. In
general, mobility of hosts brings a new set of issues in distributed systems.
Mobility substantially affects data placement. The network is likely to
be a WAN (Wide Area Network) having some complex topology that provides
moderately reliable inter-site message delivery services with moderate
bandwidth (although highly variable) and low signalling rates. There may
be a significant delay between sending a message and receiving a reply
even when no processing is required to generate the reply.
Features expected from distributed systems:
- High system performance- fast response.
- Sufficient throughput.
- High system reliability (high system availability).
- Reduced network transmission costs.
- Ease of modular and incremental growth.
- Configuration flexibility.
- Automatic resource sharing.
- Automatic load distribution.
- High adaptability to changes in workload.
- Easy expansion in both capacity and function.
- Good response to temporary overloads.
A distributed system consists of many sequential processes communicating
solely by messages. The behaviour of each process is controlled by a local
algorithm (in this case based on the Actor model) which determines the
sequence of actions and the reaction of the process to incoming messages.
The concurrent (and co-ordinated) execution of all local algorithms forms
a distributed computation. In an abstract setting, a distributed computation
is determined by the types and the relative order of atomic actions (or
events) occurring within the processes. These atomic actions could usually
be classified into three types: send events, receive events, and internal
events. The actor model is well suited for distributed computations because
it models these three very basic events elegantly.
Distributed
algorithms for mobile computing networks
"To the extent possible, computation and communication costs
of an algorithm is borne by the static portion of the network. The core
objective of the algorithm is achieved through a distributed execution
amongst the fixed hosts while performing only those operations at the mobile
hosts that are necessary for the desired overall functionality."
[Two tier principle by B.R.Badrinath,
Arup Acharya and Tomasz Imielinski.]
The design of distributed algorithms and protocols has traditionally
been based on an underlying network architecture consisting of static hosts.
Consequently, the absence of site and link failures, the connectivity amongst
hosts in the network remains fixed. Distributed algorithms thus assume
a model comprising of a set of processes executing on static hosts, that
communicate through message passing over point-to-point logical channels.
The channels may span multiple physical links of the network; this set
of links and the hosts at the endpoints of the channel does not change
with time. This conventional model fails to capture the features and constraints
of mobile hosts and distributed algorithms designed for this new extended
model, and therefore needs to be restructured to tackle host mobility.
To facilitate continuous network coverage for mobile hosts, a static
network is extended with Mobile Support Stations (MSS) that are each capable
of directly communicating with Mobile Hosts (MH) within a limited geographical
area (or cell), via a low-bandwidth wireless medium.
MHs are thereby able to connect to the static segment of the network
from different locations at different times. Consequently, the overall
network topology changes dynamically as MHs move from one cell to another.
Hence a mobile host must first be located before a message can be sent
to it. Further, as hosts change their locations, the physical connectivity
of the network changes. A typical mobile distributed system would have
a structure such as that shown in the figure below.
|
mh
|
Mobile host
|
|
MSS
|
Mobile Support Station (has a wireless interface)
|
|
Fixed Host
|
Stationary computer with no wireless interface
|
The structure of a mobile distributed system
The challenges
of wireless communication
Wireless computers, in general, have less resources relative to stationary
computers. This is because wireless computers are required to be smaller,
lighter and consume less power than stationary computers. Wireless communication
is more difficult to implement than wired communication because of the
interaction of the surrounding environment with the message signal. Problems
caused by the environment include blocked signal paths, echoes and noise.
Hence wireless connections are more error prone, have much lower bandwidths,
and have frequent spurious disconnections when compared to wired connections.
These factors can increase communication latencies due to error control
checks, retransmissions, time-out delays and brief disconnections.
Disconnected operation
A major problem associated with wireless communications is frequent
disconnections. Resources can be allocated to handle disconnections more
elegantly, or to try and prevent those disconnections from happening altogether.
In environments with frequent disconnections it is better for the mobile
computer to act as a stand-alone unit rather than a mobile terminal (i.e.
splitting the application and the user interface across the network). The
MSSs can provide commonly used application software, so that a mobile user
can download the software from the closest MSS for execution or alternatively
execute it remotely on the MSS. For wide-area networks, round-trip RPC
delays will tend to be expensive in terms of wasted processor clock cycles,
hence round-trip latencies and brief disconnections can be made less expensive
by operating asynchronously. Caching techniques could be used to enhance
the performance of weakly-connected and disconnected operation, but preserving
cache coherence under weak connectivity can be expensive.
Low bandwidth and bandwidth
variability
Wireless networks deliver lower bandwidth than wired networks, hence
mobile computing designs need to be concerned about bandwidth consumption.
The deliverable bandwidth per user depends on the number of users sharing
a cell. The deliverable bandwidth can be significantly improved in two
ways:
- Maintain multiple cells at different frequencies. This technique, although
more flexible, is limited by the range of frequencies of the electromagnetic
spectrum available for public consumption.
- Limiting transmission ranges so that more cells can fit in a given
area. This preferred approach is simpler, reduces power requirements, and
may decrease corruption of the signal. It is also known that transceivers
covering less area can achieve higher bandwidths.
Compression, ‘logging’ (making large requests out of several short
ones), pre-fetching (guessing which files will be needed soon), and write-back
caching can help cope with low bandwidth availability. System performance
can be further enhanced by scheduling communications intelligently.
Mobile computing designs must cope with much greater variations in
network bandwidth than traditional designs. A good design would be able
to adapt to the currently available resources, providing the user with
a variable level of quality. As a mobile element leaves the range of one
network transceiver it switches to another, and there may also be places
where they can access multiple transceivers on different frequencies. There
may be a need to change access protocols, for example when switching from
cellular coverage to satellite coverage.
Security concerns
Security and privacy is a major controversial issue in mobile computing.
The security of wireless communication is more easily compromised than
wired communication, especially for wide area networks. Being accessible
at any location and at any time creates great concern about privacy issues
among potential users. Users should be able to specify who, when and where
is authorised to reach them. Secure communication over wireless networks
is accomplished by encrypting the sensitive data. With mobile users,
the problem of authentication and security will be a global problem and
distributed services to support authentication across administration domains
will be necessary.
The dynamic
nature of actors
One of the most important features of actors is their ability to behave
dynamically and adjust the characteristics of the overall system to match
the current requirements. Actors exhibit dynamic behaviour in two key ways.
They are able to redefine their behaviour, depending on the contents of
communications received. This allows an actor to perform several roles.
They are also able to create new actors, allowing modification of the system
in response to changing needs. As the computation required increases, more
actors can be formed and operate concurrently with each other to exploit
the maximum resources of the distributed system.
The actor A creates a customer actor to which the reply
from B is sent.
In a system such as a mobile network, frequent connections and disconnection
to and from the network may occur. When the system is first initialised,
the address or even existence of some of the nodes on the network may not
be known by the receptionists in the static portion of the network which
receive information from the outside. New addresses can be received within
a communication, as the system is running. With a static topology there
would be no mechanism for replying to the new users. With a dynamic topology,
an actor may change its behaviour to deal with another client, or (perhaps
more likely) create a new actor to which messages from the new user are
passed on.
Message
passing
Messages, their transmission modes, and the semantics of communication
play an important role in distributed systems since messages are the only
means by which processes can exchange data and can synchronise their actions.
Running a copy of the same program on more than one node at a time can
be a useful technique. The two nodes may be running the same program, but
they are different copies of the same program. They will have the same
variables, but they will each have their own copy of the variables. If
these two nodes both have a variable called ‘A’ the two ‘A’s are distinct
except in name. For instance, if node 1 set its ‘A’ to some value, node
2 does not know the value of node 1's ‘A’. If node 2 needs to know the
value of node 1's ‘A’, it will have to be told. This is why communication
is necessary in the form of message passing. Node 1 could send its value
of ‘A’ to node 2 in a message.
Communications in distributed systems are not reliable, messages can
be lost because of communication failures, duplicated because of retransmissions,
or their contents can be garbled or destroyed. Sophisticated techniques
and protocols have been devised to cope with these failures and guarantee
reliable message delivery to the application.
Messages may be transmitted synchronously or asynchronously, and messages
might be received in the order that they were sent or received out of sequence.
Asynchronous
and synchronous communication
Usually, non-blocking communication mechanisms are called asynchronous,
whereas blocking communication mechanisms are called synchronous. Synchronous
group communication is certain, since it simplifies replicated programming
by providing every replica with the same accurate and up-to-date knowledge
of the system state. On the contrary, asynchronous group communication
is characterised by uncertainty, it does not guarantee bounded reaction
time. Asynchronous programming is actually a label for a variety of programming
paradigms differing in the system models they imply, even if they are all
variants of the same asynchronous system model. The underlying system assumed
by the actor model is always asynchronous in the sense that we do not assume
the existence of synchronised clocks or fixed communication delays.
It is widely accepted that neither of the two communication modes "synchronous"
and "asynchronous" is generally superior to the other. While
asynchronous communication is less prone to deadlocks and often allows
a higher degree of parallelism (since the sender can proceed while the
message is still being delivered), its implementation requires complex
buffer management and flow control mechanisms. Furthermore, algorithms
making use of asynchronous communications are more difficult to develop
and to verify than algorithms working in a synchronous environment. It
is, however, possible to simulate one mode with the other. In synchronous
mode, explicit buffers or intervening buffer processes can be used to decouple
the sender of a message from the receiver, thus simulating asynchronous
mode (although true simulation requires an unbounded amount of buffer storage).
In asynchronous mode, synchronous communication can be simulated by waiting
for an explicit acknowledgement immediately after asynchronously sending
a message.
Motivation for synchronous
communication
If the sending process needs to know that the message has been received
by the receiving process, then both processes may use synchronous communication.
If a process executing a blocking synchronous send is "ahead"
of the process executing the matching receive, then it will be idle until
the receiving process catches up. Similarly, if the sending process is
executing a non-blocking synchronous send, the completion test will not
succeed until the receiving process catches up. Synchronous mode is a safe
method of communication because the communication network can never become
overloaded with undeliverable messages. It has the advantage of being predictable:
a synchronous send always synchronises the sender and receiver. This makes
the behaviour of a program more deterministic. Debugging is also easier
because messages cannot lie undelivered and "invisible" in the
network.
Motivation for asynchronous
communication
In asynchronous communication, the sender does not block waiting for
the receiver to acknowledge receipt of the message. This does mean that
if there is a problem with the underlying network, the sender may still
think that everything is working fine. This can pose a huge problem especially
in wireless communication where frequent disconnections can occur due to
unreliable communication links. We therefore need a set of tools based
on asynchrony to provide a guarantee of message delivery.
Distributed processes may have local clocks that proceed at different
rates. This may be due to the clock speeds of the different processors,
or the way that different logical processors are scheduled on the physical
processors. Also, there may be varying time delays in communication between
different processors, hence leading to unpredictable ordering of events.
Physical components from which we construct distributed systems are often
synchronous. However, as we bring layers of software to multiplex these
physical resources to create abstractions such as processes and reliable
communication channels, the resulting system may be better classified as
asynchronous.
There are more reasons as to why asynchronous communication is better
suited for the implementation of wireless distributed systems. The more
significant ones are as follows:
- Since communication takes time, communication intended for a process
may arrive at the same time or scattered with communication from another
source. Hence the system has to be able to sort out parts of a communication,
which is easier to do with asynchronous communication.
- The sending process may run quicker than the receiving process, these
situations could be handled easier with asynchronous communication.
- Synchronous communication does not allow a process to send itself a
message. A recursive computation may require this facility.
- Asynchronous communication is important for real-time applications
where the component may be performing a time-critical activity and hence
cannot waste time waiting for confirmation of successful transmission.
Communications may get lost with a system built on the asynchronous
model without buffers. Buffered asynchronous communication provides efficiency
in execution by not arbitrarily delaying a computation which need not wait
for another process to receive a its communication.
The
pure actor model
One of the key concepts behind the actor model is that it spawns as
many actors as is allowable within the system, to exploit the available
parallelism most effectively. If an actor receives a communication its
current behaviour is unable to cope with, it will probably create a customer
(another actor) to deal with the problem, rather than alter its behaviour.
Note that either option is available. It is worth mentioning again that
the actor model was invented specifically for use in parallel and distributed
systems. This was when the notion of a distributed system consisted of
two or more independent computers connected together and able to share
tasks. It therefore made sense to create lots of actors in these environments.
The consequences are that in an average system, many actors will be
created, and the corresponding communications overhead, with all the actors
passing messages, will increase with it. The nature of the actor model
implies a greater level of communication than other models. This may be
taken to extremes depending on the implementation of the model.
A particular form of the theoretical actor model is known as the universe
of actors model. In this model, everything is represented as an actor.
This includes all expressions, commands and communications. To take the
simplest example, an integer is an actor, which replies with its own value
upon receiving a message. It is an example of a built-in actor, one
which is used to bottom out a computation because it creates no further
computation itself. With a extreme case of the model such as this, the
likelihood of a high level of message passing is even greater.
This goes to show that the latency of message passing within the model
is very important, and should at all times be kept to a minimum. The time
for a computation to be carried out by a single actor is in general relatively
short, especially when the system is split into many actors. The overall
performance of the system is therefore heavily reliant on the time for
messages to be sent and received. For an actor implementation to be efficient,
certain issues must be considered. Balancing of the load amongst available
resources to maximise their use. Locality of reference, to ensure frequent
communications between certain actors does not occur over a distant, slow
link. Process migration may be preferable, rather than sending constant
communications.
An insensitive actor is one which buffers all communications until
it receives a become communication, which describes what the actor’s new
behaviour will be. The idea behind this is to provide for situations where
a reply is required from another actor. The other actor would be instructed
to send a become communication to the sending actor. This is an elegant
theoretical solution to the problem. However, a practical implementation
is inefficient. With asynchronous message passing, where there is no guarantee
of time of delivery of a message, having to buffer all other messages will
cause serious performance reductions, if the reply is coming from a remote
mobile computer.
Guarantee of message
delivery in the actor model
Eventual delivery of a message is a convenient programming assumption
within the actor model, which means mailboxes need not be infinite. However,
it is not entirely feasible for a practical implementation to make this
assumption. If a system did not eventually deliver a communication it was
buffering, it would have to buffer the communication indefinitely, hence
consuming storage. The guarantee of delivery does not assume that all processes
are meaningfully processed. The actor model assumes that the underlying
network is error free and hence guarantees eventual message delivery. For
reasons mentioned above, this assumption cannot be made in a highly error
prone environment such as wireless mobile computing. Hence a ‘loosely’
synchronous system may be needed to identify network errors and hence guarantee
message delivery.
Optimisation
The pure actor model is the primary consideration in this report, but
it must be noted that the performance of any model may be substantially
improved using optimisation when implemented. Indeed, it is often only
by certain optimisation techniques that the performance of a system can
be improved to a point where it becomes usable. This may occur in both
hardware and software.
In mobile computing the time required to establish a connection can
be substantial. It can involve transmitting a signal over a region and
waiting for it to be picked up by the recipient, and possibly searching
across the network if the location of the recipient has moved. Therefore
it is desirable to limit this overhead if possible. One way to do this
is to store, or buffer the messages to be sent to a certain node and send
them all at once. An actor could be created whose only function is to buffer
the messages sent to it, until it accepts a communication which tells it
to forward all its stored messages to another mail address. Similarly,
the receptionists at the other end accepting messages from other systems
can buffer messages until they are needed by their respective actors.
The actor X creates a new actor Y and an expression e.
As information required by a remote mobile node is almost certain to
take longer than from an actor located somewhere within the static network,
the concept of eager evaluation using futures can be employed. An actor
will create an expression (the future) which requires information from
a mobile source. It will also create another actor, which is given the
address of the expression. The expression is told to evaluate itself. Meanwhile
the newly created actor can continue with whatever processing it can perform
without the information, and will only ask for it when it must have it.
By then the expression will hopefully have obtained it. The expression
becomes the actor it evaluates to, and it is an acquaintance of the first
new actor. This method maximises the concurrency of the system.
Java
Java is a language, not a programming model, therefore it would be
an unfair match to compare Java with the actor model. However, there are
several points which may be addressed by looking at Java and its success.
Java was first developed in 1991 by James Gosling at Sun Microsystems.
Initially the idea behind Java was to create a language which could be
used to generate small code for use in consumer electronic devices, such
as mobile phones, televisions, etc. For this reason it was designed to
run on all architectures, by using the technique of compiling intermediate
code to run under a small interpreter on the particular architecture. Java
was therefore universal, and could run anywhere, after the code had been
written once. This is important in the distributed systems there are today
(look at the extreme example of the Internet). Most systems are a mix of
many architectures and computers. If all are using the common language
of Java it makes the design of communication primitives easier. Java is
interpreted by the machine it runs on, which means a speed reduction, but
this disadvantage is often outweighed by the portability.
Sun based the language heavily on C++ because of their Unix background.
This was important because C++ is already widely known throughout the world
of computing, and so Java could be picked up by many programmers quickly.
It has gained wide acceptance quickly because of this.
Java was designed from the start to be powerful and yet simple with
regard to networks in general, and the Internet in particular. It is distributed,
and has strong networking capabilities built in. It is dynamic to the extent
that new libraries can easily be added, but it does not have the dynamic
runtime capabilities of the actor model. Java allows Remote Procedure Calls,
which reduce network traffic by sending only the information required for
a procedure call.
Java has many features to enhance its security, for example not allowing
memory overflow, or the fact that it cannot write from an applet to a client-side
hard drive. Some of these, such as memory problems, come about because
of some features of C++ dropped in Java, like the direct manipulation of
pointers. There is also a high level of checking that goes on at compile
time and run time to ensure the code is safe.
Despite considering what the actor model offers to the field of mobile
computing, it has to be brought into context. There have been many object-orientated
languages written, but not all would lend themselves to the same applications.
If a pure implementation of the actor model was developed, it would need
to incorporate some, if not all, the points addressed above. For it to
be a success, it would need to be easy to understand and learn, available
on a wide range of machines, and efficient. The global spread of mobile
units is increasing rapidly, so it becomes ever more important that each
unit encounters a standard, familiar language or interface, so that code
will run identically everywhere.
Security would be an important consideration. One of the key problems
with the actor model is that it assumes infinite space to store messages
which have not been processed. This is important for the weak fairness
of the model, as every message must be processed eventually. In a real
system, it may be that available memory is limited, especially with mobile
computers (e.g. PDA’s) where there are few hardware resources. Care must
be taken to ensure memory overflow and other related problems cannot occur.
Creating new actors might also exacerbate these difficulties, as extra
memory must be found for them.
Conclusion
The field of mobile computing has expanded rapidly in the past 25 years,
since the actor model was first proposed. Although the actor model was
initially designed for parallel and distributed architectures, it can be
seen that a mobile network or system may be abstractly modelled as a distributed
system. For this reason, the underlying idea of the actor model is well
suited to the field of mobile computing.
One of the fundamental properties of the actor model is that all actors
communicate using message passing. Distributed systems have to operate
using the technique of message passing, because it removes the confusion
created by the sharing of variables over a distributed network. In the
case of mobile networks, care must be taken to reduce the level of communication
which takes place. The fact that messages and the actors which send and
receive those messages are totally self contained, is a strong advantage
in mobile networks.
The concept of a global clock is impractical in any distributed system.
Asynchronous communication, which is the foundation of the actor model,
exploits the inherent parallelism of a distributed system. The theoretical
actor model uses the technique of mailboxes to provide asynchronous communication,
but this assumes infinite resources for the storage of messages. Mobile
computers are characterised by their constraints on resources.
The large latencies involved in mobile networks mean that parallel
execution is paramount. As well as using asynchronous communication, the
highly concurrent nature of actors leads to maximum utilisation of available
resources. The dynamic nature of the actor model allows for adaptation
of the network, as users connect and disconnect. Care must be taken to
ensure that the majority of operations are performed on the static portions
of the network. A full implementation of the actor model causes problems
in this respect because it is designed to liberate the programmer from
the decisions of explicitly defining parallel operations.
The general features of the actor model do lend themselves to mobile
computing, but from past research it can be seen that an full implementation
of the actor model is not sufficient. For efficient operation on a mobile
network, refinements may be needed. When the actor model was first proposed,
the advancements in the field were still in their early stages. Now, as
further research into the actor model has led to the discovery of more
efficient methods for its implementation, an implementation designed for
use with mobile networks may be feasible.
|