Chapter 3 The Louse and the Mars Explorer

 

Logical Extremism, which views life as all thought and no action, has given Logic a bad name. It has overshadowed its near relation, Logical Moderation, which recognises that Logic is only one way of thinking, and that thinking isn’t everything.

 

The antithesis of Logical Extremism is Extreme Behaviourism, which denies any Life of the Mind and views Life instead entirely in behavioural terms. Extreme Behaviourism, in turn, is easily confused with the condition-action rule model of thinking.

Behaviourism

If you were analysing the behaviour of a thermostat, which regulates the temperature of a room by turning on the heat when it is too cold and turning it off when it is too hot, you might describe the thermostat’s input-output behaviour in condition-action terms:

 

            If current temperature is T degrees and target temperature is T’ degrees and T < T’ - 2°
            then the thermostat turns on the heat.

            If current temperature is T degrees and target temperature is T’ degrees and T > T’ + 2°
            then the thermostat turns off the heat.

 

But you wouldn’t attribute to the thermostat a mind that manipulates such descriptions to generate its behaviour.
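For concreteness, the input-output description above can even be written down as a few lines of code. The sketch below is only an illustration (the 2° margin comes from the rules above; the names are assumptions): it describes the thermostat’s observable behaviour and says nothing about any inner life.

    def thermostat(current_temp, target_temp, heat_is_on):
        """One step of the thermostat's observable input-output behaviour."""
        if current_temp < target_temp - 2:
            return True                      # turn on the heat
        if current_temp > target_temp + 2:
            return False                     # turn off the heat
        return heat_is_on                    # otherwise leave the heater as it is

    print(thermostat(17, 21, False))         # True: too cold, so the heat goes on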

 

In the same way that you would view the thermostat, the behaviourist views agents in general.

 

Thus, in the story of the fox and the crow, a behaviourist, unable to examine the fox’s internal, mental state, would view the behaviour of the fox in the same way that the fox views the behaviour of the crow:

 

If the fox sees that the crow has cheese, then the fox praises the crow.

If the fox is near the cheese, then the fox picks up the cheese.

 

The behaviourist’s description of the fox begins and ends with the fox’s externally observable behaviour. The behaviourist justifies her refusal to attribute any internal, mental activity to the fox by the fact that it is impossible to verify such attributions by the scientific method of observation and experimentation.

 

According to the behaviourist, the fox is a purely reactive agent, simply responding to changes in the world around her. If, in the course of reacting to these changes, the fox gets the cheese, then this result is merely an indirect, emergent effect, rather than one that the fox deliberately brings about by proactive, goal-oriented reasoning.

 

The behaviourist also sees no reason to distinguish between the behaviour of a thermostat and that of a human. The behaviourist might use an implication:

 

                                    If a passenger observes an emergency on the underground,

                                    then the passenger presses the alarm signal button.

 

to describe the behaviour of a passenger on the underground. But the use of such an implication says nothing about how the passenger actually generates that behaviour. As far as the behaviourist is concerned, pressing the alarm signal button whenever there is an emergency might be only an instinct, of whose purpose the passenger is entirely unaware.

 

Behaviourism is indirectly supported by Darwinism, which holds that organisms evolve by adapting to the environment, rather than by a goal-oriented process of self-improvement.

 

Behaviourism also shares with condition-action rules its focus on modelling behaviour as reactions to changes in the environment. However, whereas behaviourism restricts its attention to descriptions of behaviour, condition-action rules are used in production systems to generate behaviour.

Production Systems

 

Few psychologists subscribe today even to moderate versions of behaviourism. Most adhere instead to the cognitive science view that intelligent agents engage in some form of thinking that can usefully be understood as the application of computational procedures to mental representations of the world.

 

Paul Thagard states in his book, Mind: Introduction to Cognitive Science, that, among the various models of thinking investigated in cognitive science, production systems have “the most psychological applications” (page 51). Steven Pinker in How the Mind Works also uses production systems as his main example of a computational model of the mind (page 69).

 

A production system is a collection of condition-action rules incorporated in the thinking component of an agent’s observation-thought-decision-action cycle.

 

Condition-action rules (also called production rules) are similar to the behaviourist’s descriptions of behaviour. However, because an agent uses them internally to generate its behaviour, their conclusions can be expressed in the imperative, rather than the declarative, mood:

 

If conditions then do actions.

 

Production systems were invented in the 1930s by the logician Emil Post, but were proposed as a computational model of human intelligence by Allen Newell.

The Production System Cycle

Production systems embed condition-action rules in an observation-thought-decision-action agent cycle:

 

To cycle,

            observe the world,
            think,
            decide what actions to perform,
            act,
            cycle again.

 

Thinking is a form of forward reasoning, initiated by an observation matching one of the conditions of a condition-action rule. In such a case, the observation is said to trigger the condition-action rule. As in logic, the remaining conditions of the rule are verified and the conclusion is derived.

 

In logic, the conclusion is an inescapable consequence of the conditions. However, in production systems, it is only a recommendation to perform the actions that are the conclusion. If only one rule is triggered by the observations, then the recommendation is, in effect, an unequivocal command. However, if more than one is triggered, then the agent needs to choose between the different recommendations, to decide which actions to perform. This decision is called conflict resolution, because the different recommendations may conflict with one another.

 

For example:

 

            If someone attacks me, then attack them back.

            If someone attacks me, then get help.

            If someone attacks me, then try to escape.

 

Deciding what to do, when there is a conflict between different recommendations, can be harder than generating the recommendations in the first place. We will come back to the decision problem later.
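In programming terms, the cycle together with priority-based conflict resolution might be sketched as follows. This is only an illustration: the representation of rules, the use of numerical priorities and all of the names are assumptions made for the example, not a definitive account.

    from typing import Callable, NamedTuple

    class Rule(NamedTuple):
        condition: Callable[[set], bool]    # is the rule triggered by the current observations?
        action: str                         # the action the rule recommends
        priority: int                       # used to resolve conflicts between triggered rules

    def production_cycle(rules, observe, act, steps=10):
        """A sketch of the observation-thought-decision-action cycle."""
        for _ in range(steps):
            observations = observe()                                      # observe the world
            triggered = [r for r in rules if r.condition(observations)]   # think
            if triggered:
                chosen = max(triggered, key=lambda r: r.priority)         # decide (conflict resolution)
                act(chosen.action)                                        # act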

Production Systems without any representation of the world

In the simplest case, an agent’s mental state might consist of production rules alone, without any mental representation of the world. The conditions of a rule are then verified simply by matching them against the agent’s current observations. In this case, it can be said (and has been said) that the world serves as its own representation: if you want to find out about the world, don’t think about it, just look and see!

 

Observing the current state of the world is a lot easier than trying to predict it from past observations and from assumptions about the persistence of past states of affairs. It is also a lot more reliable, because persistence assumptions can easily go wrong, especially when there are other agents around, changing the world to suit themselves. We will return to this issue later, when we look more closely at what is involved in reasoning about the persistence of states over time.

What it’s like to be a louse

 

To see what a production system without any representation of the world might be like, imagine that you are a wood louse and that your entire life’s behaviour can be summed up in the following three rules:

 

                                    If it’s clear ahead, then move forward.

                                    If there’s an obstacle ahead, then turn right.

                                    If I am tired, then stop.

 

Because you are such a low form of life, you can sense only the fragment of the world that is directly in front of you. You can also sense when you are tired. Thus, your body is a part of the world, external to your mind. Like other external objects, your body generates observations, such as being tired or being hungry, which have to be attended to by your mind.

 

It doesn’t matter where the rules come from, whether they arose by a process of evolution or were gifted to you by some Grand Designer. The important thing is that, now that you have them, they govern and regulate your life.

 

Suppose, for the purpose of illustration, that you experience the following stream of observations:

 

                                    Clear ahead.

                                    Clear ahead.

                                    Obstacle ahead.

                                    Clear ahead and tired.

 

Matching the observations, in sequence, against the conditions of your rules results in the following interleaved sequence of observations and actions:

 

                                    Observe:            Clear ahead.
                                    Do:                 Move forward.

                                    Observe:            Clear ahead.
                                    Do:                 Move forward.

                                    Observe:            Obstacle ahead.
                                    Do:                 Turn right.

                                    Observe:            Clear ahead and tired.

 

At this point, your current observations trigger two different rules, and their corresponding actions are in conflict. You can’t move forward and stop at the same time. Some method of conflict resolution is needed, to decide what to do.

 

Many different conflict resolution strategies are possible. But, in this as in many other cases, the conflict can be resolved simply[1] by assigning different priorities to the different rules, and selecting the action generated by the rule with the highest priority. In this case it is obvious that the third rule should have higher priority than the second. So the appropriate action is:

 

                                    Do:   Stop.

 

Once a louse has learned its rules, its internal state is fixed. Observations come and go and the associated actions are performed without being recorded or remembered. The price for this simplicity is that a louse lives only in the here and now and has no idea of the great wide world around it. But, for a louse, this is probably a small price to pay for enjoying the simple life.
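To make the louse concrete, here is a small sketch of its production system in the same style as before. The three rules, the stream of observations and the priorities come from the example above; the representation of observations as sets of strings is an assumption for illustration.

    # Each rule is (condition on the current observations, action, priority);
    # the higher priority wins when more than one rule is triggered.
    RULES = [
        (lambda obs: "clear ahead" in obs,    "move forward", 1),
        (lambda obs: "obstacle ahead" in obs, "turn right",   1),
        (lambda obs: "tired" in obs,          "stop",         2),
    ]

    STREAM = [
        {"clear ahead"},
        {"clear ahead"},
        {"obstacle ahead"},
        {"clear ahead", "tired"},
    ]

    for observations in STREAM:
        triggered = [(action, priority)
                     for condition, action, priority in RULES
                     if condition(observations)]
        action, _ = max(triggered, key=lambda pair: pair[1])    # conflict resolution
        print("Observe:", ", ".join(sorted(observations)), "-> Do:", action)

Running the sketch reproduces the interleaved sequence of observations and actions above, with the conflict in the final step resolved in favour of stopping.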

Production Systems with Memory

Although the simple life may seem attractive, most people would probably prefer something more exciting. For this, you need at least a production system with an internal memory. The memory can be used, not only to record observations of the current state of the world, but also to store a historical record of past observations.

 

Typically, an individual observation has the form of an atomic sentence[2], so called because it contains no proper subpart that is also a sentence. Thus, the logical form of an observation contains none of the logical connectives, “and”, “or”, “if” and “not”, which turn simpler sentences into more complex ones. An atomic sentence is also called an atom.

 

In a production system with memory, a rule is triggered by a current observation that matches one of the conditions of the rule. Any remaining conditions are then verified by testing them against records of current or past observations, and the actions of the rule are derived as candidates for execution.

What it’s like to be a Mars Explorer

To imagine what a production system with memory might be like, suppose that your life as a louse is finished and you have been reincarnated as a robot sent on a mission to look for life on Mars.

 

Fortunately, your former life as a louse gives you a good idea of how to get started. Moreover, being a robot, you never get tired and never have to rest. However, there are two new problems you have to solve: how do you recognise life when you see it, and how do you avoid going around in circles?

 

For the first problem, your designers have equipped you with a life recognition module, which allows you to recognise signs of life, and with a transmitter to inform mission control of any discoveries. For the second problem, you need a memory to recognise when you have been to a place before, so that you can avoid going to the same place again.

 

A production system with memory, which is a refinement of the production system of a louse, might look something like this:

 

                                    If the place ahead is clear
                                    and I haven’t gone to the place before,
                                    then go to the place.

                                    If the place ahead is clear
                                    and I have gone to the place before,
                                    then turn right.

                                    If there’s an obstacle ahead
                                    and it doesn’t show signs of life,
                                    then turn right.

                                    If there’s an obstacle ahead
                                    and it shows signs of life,
                                    then report it to mission control
                                    and turn right.

 

To recognise whether you have been to a place before, you need to make a map of the terrain. You can do this, for example, by dividing the terrain into little squares and naming each square by a co-ordinate, (E, N), where E is the distance of the centre of the square East of the origin, N is its distance North of the origin, and the origin (0, 0) is the square where you start.

 

For this to work, each square should be the same size as the step you take when you move one step forward. Assuming that you know the co-ordinates of your current location, you can then use simple arithmetic to compute the co-ordinates of both the square ahead of you and the square to the right of you, and therefore the co-ordinates of your next location.
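The arithmetic involved is straightforward. Here is a sketch of it, assuming, purely for illustration, that the direction you are facing is stored as a unit vector, so that East is (1, 0), South is (0, -1), and so on:

    def square_ahead(position, heading):
        """The co-ordinates of the square one step ahead."""
        (e, n), (de, dn) = position, heading
        return (e + de, n + dn)

    def turn_right(heading):
        """The new heading after a right turn: East -> South -> West -> North -> East."""
        de, dn = heading
        return (dn, -de)

    def square_to_the_right(position, heading):
        """The co-ordinates of the square to your right."""
        return square_ahead(position, turn_right(heading))

    # Starting at the origin facing East:
    assert square_ahead((0, 0), (1, 0)) == (1, 0)
    assert turn_right((1, 0)) == (0, -1)              # now facing South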

 

Every time you go to a square, you record your observation of the square together with its co-ordinates. Then, to find out whether you have gone to a place before, you just consult your memory of past observations.

 

Suppose, for example, that you are at the origin, pointed in an Easterly direction. Suppose also that the following atomic sentences describe part of the world around you:

 

                                                Life at (2, 1)

                                                Clear at (1, 0)

                                                Clear at (2, 0)

                                                Obstacle at (3, 0)

                                                Obstacle at (2, -1)

                                                Obstacle at (2, 1).

 

Although there is life in your vicinity, you can’t see it yet. So, when you start, the only thing you know about the world is that it is clear at (1, 0).

 

Assume also that, although it is your mission to look for life, you are the only thing that moves. So this description of the world applies to all states of the world you will encounter (assuming that, when you occupy a place, it is still considered clear).
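Putting the rules, the map and the memory together, the explorer can be sketched as a short program. The world description and the rules come from the text; the dictionary representation, the heading convention of the earlier sketch, and the six-cycle loop are illustrative assumptions.

    # What is at each of the described squares; undescribed squares are treated as clear.
    WORLD = {(1, 0): "clear", (2, 0): "clear",
             (3, 0): "obstacle", (2, -1): "obstacle", (2, 1): "obstacle"}
    LIFE = {(2, 1)}

    position, heading = (0, 0), (1, 0)      # at the origin, facing East
    visited = {position}                    # memory of where you have been

    def turn_right(h):
        de, dn = h
        return (dn, -de)

    for _ in range(6):                      # six cycles reproduce the trace below
        ahead = (position[0] + heading[0], position[1] + heading[1])
        if WORLD.get(ahead, "clear") == "clear" and ahead not in visited:
            print("Go to", ahead)
            position = ahead
            visited.add(position)           # remember that you have gone there
        elif WORLD.get(ahead, "clear") == "clear":
            print("Turn right: gone to", ahead, "before")
            heading = turn_right(heading)
        else:
            if ahead in LIFE:
                print("Report life at", ahead, "to mission control")
            print("Turn right")
            heading = turn_right(heading)

Running it performs the same sequence of actions as the trace that follows.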

 

With these assumptions, you have no choice. Your behaviour is completely predetermined:

 

                                    Observe:            Clear at (1, 0)
                                    Do:                 Go to (1, 0)

                                    Observe:            Clear at (2, 0)
                                    Do:                 Go to (2, 0)

                                    Observe:            Obstacle at (3, 0)
                                    Do:                 Turn right

                                    Observe:            Obstacle at (2, -1)
                                    Do:                 Turn right

                                    Observe:            Clear at (1, 0)
                                    Remember:           Gone to (1, 0)
                                    Do:                 Turn right

                                    Observe:            Obstacle at (2, 1) and Life at (2, 1)
                                    Do:                 Report life at (2, 1) to mission control
                                    Do:                 Turn right.[3]

 

Notice that reporting your discovery of life to mission control is just another action, like moving forward or turning right. You have no idea that, for your designers, this is the ultimate goal of your existence.

 

Your designers have endowed you with a production system that achieves the goal of discovering life as an emergent property. Perhaps, for them, this goal is but a sub-goal of some higher-level goal, such as satisfying their own scientific curiosity. But for you, none of these goals or sub-goals is apparent.

The use of production systems to simulate goal-reduction

Production systems have been used, not only to construct computational models of intelligent agents, but also to build computer applications, most often in the form of expert systems. Many of these applications use condition-action rules to simulate goal-reduction explicitly, instead of relying on emergent properties to achieve higher-level goals implicitly.

                                                         

For example, the fox’s reduction of the goal of having cheese to the sub-goals of being near the cheese and picking it up can be simulated by the condition-action rule[4]:

 

                        If I want to have an object

                        then add to my beliefs that I want to be near the object

                        and pick up the object.

 

Here a goal Goal is represented in the system’s memory as a pseudo-belief of the form:

 

                        I want Goal.

 

The reduction of Goal to Sub-goals is simulated by a condition-action rule whose condition is the pseudo-belief I want Goal and whose actions are either pseudo-actions of the form:

 

                        add to my beliefs that I want Sub-goal

 

performed internally on the system’s memory or genuine actions performed externally.

 

The main problem with the simulation approach is that it loses the connection between goal-reduction and the belief that justifies it, in this case the belief:

                        An animal has an object

if the animal is near the object

and the animal picks up the object.

 

As we have seen, the connection is that goal-reduction is the same as reasoning backwards with the belief, which is the main idea of logic programming.
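The contrast can be seen in a small sketch. The first function below simulates the goal-reduction with a production rule that manipulates pseudo-beliefs in memory; the second obtains the same reduction by reasoning backwards with the belief itself, in logic-programming style. The representation choices are illustrative assumptions.

    # 1. Simulation with a production rule: the rule operates on pseudo-beliefs
    #    of the form "I want ...", held in the system's memory.
    def simulate_with_production_rule(memory):
        actions = []
        if "I want to have the object" in memory:            # condition
            memory.add("I want to be near the object")        # pseudo-action on memory
            actions.append("pick up the object")              # genuine external action
        return actions

    # 2. Backward reasoning with the belief itself:
    #    "an animal has an object if it is near the object and picks it up".
    CLAUSES = {"I have the object": ["I am near the object", "I pick up the object"]}

    def reduce_goal(goal):
        return CLAUSES.get(goal, [goal])

    print(simulate_with_production_rule({"I want to have the object"}))
    print(reduce_goal("I have the object"))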

 

Thagard (page 45) gives a similar example of a condition-action rule, but uses it to illustrate his claim that “unlike logic, rule-based systems can also easily represent strategic information about what to do”:

 

            If you want to go home and you have the bus fare,

            then you can catch a bus.

 

Forward reasoning with the rule reduces the goal (going home) to a sub-goal (catching a bus), and simulates backward reasoning with the belief[5]:

 

            You go home if you have the bus fare and you catch a bus.

 

Thus Thagard’s argument against logic can be viewed instead as an argument for logic programming, because it can “easily represent strategic information about what to do”.
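A small sketch makes the point. The strategic information is already in the belief, and backward reasoning extracts it; the clause representation below is an illustrative assumption in the same style as before.

    # The belief as a logic-programming clause: conclusion if conditions.
    CLAUSE = ("you go home", ["you have the bus fare", "you catch a bus"])

    BELIEFS = {"you have the bus fare"}      # what you already take to be true

    def backward(goal):
        """Reduce a goal to sub-goals by reasoning backwards with the clause."""
        conclusion, conditions = CLAUSE
        if goal != conclusion:
            return [goal]
        # Sub-goals that already hold are solved; the rest remain to be achieved.
        return [c for c in conditions if c not in BELIEFS]

    print(backward("you go home"))           # ['you catch a bus']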

 

Indeed, it seems that all of Thagard’s arguments for production rules are better understood as arguments for computational logic instead. This is because he confuses production rules:

 

                        If conditions then do actions.

 

with logical implications:

 

                        If conditions then conclusions.

 

An unfortunate confusion

 

This confusion is perhaps most apparent when Thagard writes (page 47) that “rules can be used to reason either forward or backward.” But this is not a true property of production rules, but rather a characteristic feature of logical implications.

 

Because conditions in production rules come first and actions happen later, true production rules can be used only in the forward direction, to derive candidate actions when their conditions hold. But because conclusions in logic are always expressed in the declarative mood, logical implications can be used to reason either forward or backward.

 

Thus Thagard mistakenly attributes to production rules a property that they don’t have, but that logical implications do, and then he uses this attribution to argue that “rules” are better than logic.

 

To be fair, it has to be recognised that Thagard is only reporting a generally held confusion. In this case, the rule that he uses as an example simulates goal-reduction, which is a special case of backward reasoning with a belief expressed in logical form. However, in the next chapter, we will see that true production rules are a special case of forward reasoning with goals expressed in logical form. We will also see how goal-reduction and production rules can be combined in a more general framework, which uses logic for both beliefs and goals.

 


Summary

 

The use of production systems to generate the behaviour of an intelligent agent, as seen in this chapter, can be pictured like this:

 

                        Observations   →   “Forward reasoning”   →   Actions



[1] An even simpler approach is to avoid conflict resolution altogether, by changing the rules and adding an extra condition “and I am not tired” to the first and second rules. A more complicated approach is to use Decision Theory to compare the different options and select the one with the highest expected benefit. But, no matter how it is done in this case, the result is likely to be the same – better to rest when you are tired than to forge ahead no matter what.

 

[2] This assumes that an agent’s experience of the world can be expressed in linguistic terms. This is certainly not true of ordinary natural language, but might, by some stretch of the imagination, apply to the “Language of Thought”. More about this later.

[3] I leave it to the reader to work out what happens next, and I apologise in advance.

[4] The rule can be paraphrased more naturally, although somewhat less precisely, in ordinary English: If I want to have an object, then I should be near the object and pick up the object. 

[5] In this form it is perhaps more obvious that the procedure will work only if the bus you catch is going past your home.