Lecture 17 Genetic Programming
The idea of a computer automatically programming itself is a very old, desirable
and elusive goal. Automatic Programming has, in the past, had a very bad
reputation. This is probably because, intuitively, it would seem that writing
software which is able to write software should be easier than writing software
able to prove theorems, or paint pictures, etc. However, computer programming is
a very difficult task, which involves intelligence, creativity, understanding,
cunning and guile. In short, it is as difficult to get a computer to program as it is
to get it to do anything else. So, when early attempts at automatic programming
largely failed to deliver what they promised, people began to avoid using the
term automatic programming, and people generally stayed away from the subject.
Many AI techniques disguise the fact that, to some extent, they are
performing automated programming. If we think about decision tree learning
techniques, for example,
the end product is a decision tree, which if it is to be used to actually make
decisions for us, has to be "executed", i.e., information about the system is
going to be given as input, and an answer will be computed. In this case, the
program doing the computing is fairly simple - decision trees can be easily
translated into a bunch of "if-then" statements. It's a similar situation with
Artificial Neural Networks. So, many AI techniques
are kind-of doing automated programming. The Genetic Programming (GP) people are
more explicit about this - they state clearly that their GP engines program
software automatically.
As you will have guessed from the name, the way Genetic Programming
engines generate programs is by following an evolutionary
approach. The general approach is as follows. The user specifies the
task to be undertaken (or problem to be solved) using an evaluation
function to express what the evolved programs should do. They also
specify what kind of things the programs will be able to use during
the computation, e.g., whether or not they will be able to multiply
two numbers together. Then, an initial population of programs is
generated at random. Each program is translated, compiled and executed
and how well it performs with respect to the task is assessed. This
enables the calculation of a fitness value for each of the programs,
and the best of them are chosen for reproduction. Programs are
combined or mutated into offspring, which are added to the next
generation of programs. This process repeats until a termination
condition is met.
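The general approach above can be sketched as a short loop. This is only a minimal sketch under stated assumptions: the names evolve, random_program, fitness and breed are hypothetical, the "programs" in the toy usage are just integers standing in for real programs, and keeping the better half of each generation as parents is one simple selection choice among many.

```python
import random

def evolve(random_program, fitness, breed, pop_size=50, generations=20, seed=0):
    """Generic generational loop; every argument is a hypothetical placeholder."""
    rng = random.Random(seed)
    # Initial population generated at random.
    population = [random_program(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Assess each program and keep the better half as parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, pop_size // 2)]
        # Offspring are produced by combining or mutating parents.
        population = [breed(rng.choice(parents), rng.choice(parents), rng)
                      for _ in range(pop_size)]
        # Remember the best individual seen in any generation.
        best = max(population + [best], key=fitness)
    return best

# Toy stand-ins (assumptions): a "program" is a single integer, the task is to output 42.
def random_program(rng):
    return rng.randint(0, 100)

def fitness(program):
    return -abs(program - 42)

def breed(a, b, rng):
    # "Combine" two parents by averaging, then "mutate" by a small random step.
    return (a + b) // 2 + rng.choice([-1, 0, 1])

best = evolve(random_program, fitness, breed)
```

Here the termination condition is simply a fixed number of generations, and the pieces passed in as arguments are exactly the parts a real GP system has to define.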
This leaves open the following questions, which we look at in this lecture:
- How does the user specify the problem to be solved by the evolved programs?
- How does the user specify the ingredients which make up the evolved
programs?
- How is an initial population generated?
- How are programs chosen to mate?
- How do we represent programs in a way amenable to reproduction?
- How are parent programs used to generate offspring programs?
- How does the GP engine know when to stop?
Obviously, there have been many answers to these questions, and some approaches
to genetic programming have failed where others have succeeded. A big decision, as
with most AI techniques, is the choice of representation, because many of the
questions above depend on the way in which programs are represented. Hence, we
will look at this question first.
17.1 A Graphical Representation of Programs
We look here at how to represent programs in a way that enables one program to be
combined with another. To do this, we will need to be able to remove parts
of a program in a meaningful way, and to add new parts to programs. We are used
to writing programs procedurally as a series of instructions: do this, then do
that, if this is true, then do this and that, etc. Those of us lucky enough to
have programmed in Prolog also know how to write programs declaratively, so that
we specify what we want, and the Prolog interpreter finds answers for us. In
both cases, the programs are just lines of code. Therefore, one possible way for
two programs to be combined would be to jumble up the lines of code. With
the bit strings in genetic algorithms, it made sense to keep regions of the
string intact, rather than jumble up the bits randomly, as the valuable parts
of the solution may be contained in those substrings. The same is true with our
programs: presumably we would want to pass on long regions of code to the
offspring program.
So, randomly combining lines is ruled out, and an approach
similar to the cross-over routines for GAs is required.
However, this approach is still problematic, as the combined
programs would often not make any sense and hence a compiler would not be able
to compile them. At the very least, we need our offspring programs to be compilable, and
our representation scheme needs to take this into account. For this reason,
most GP algorithms use a graphical representation of programs, where each
branch and sub-branch of the tree is syntactically self-contained, so that, if we
combine programs by chopping off and adding subtrees, the resulting
programs will still be syntactically valid.
The first thing we will do to find a good representation scheme is to say that
our programs are going to be thought of as functions: they take in a set of
values and output a single value. Such programs are called result producing
branches and will typically form part of larger program structures which we
will skip over for now. The next thing we will do is to specify that
our graphs will be trees (i.e., no cycles), and that each node of the graph
itself represents a function or an input to a function such as a variable or a
constant. Demonstrating the graphical representation is
easiest by example. Suppose we
wanted a function which added together two numbers, X and Y, then took the square root
of this sum, and, if the answer is less than 10, output X, otherwise output X
divided by Y. The following graph represents a program that would do this for
us:
Here, we see that there is an IFLTE node, which stands for "If-Less-Than-Else".
The first and second nodes below this (counting from left to right), are the
values which are tested: if the first is less than the second, then the IFLTE
function returns the value of the third node below it, and if the first is not
less than the second, the IFLTE function returns the value of the fourth node
below it. The four nodes below IFLTE themselves represent functions, for
instance the square root node takes a single input, which is the output of the
plus function. The plus function takes X and Y as input and outputs their sum.
We see that the second node is just the number 10 (because we are checking that
the root of X+Y is less than 10). Also, the third node below IFLTE is X, which
says that if the property is true, then output X, else output the division of X
by Y.
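As a rough sketch, the tree described above can be written down as nested tuples and executed by a small recursive evaluator. The tuple encoding and the function name evaluate are assumptions made for illustration; GP systems differ in how they store trees.

```python
import math

# Each internal node is a tuple (function_name, child, ...);
# leaves are variable names (strings) or numeric constants.
program = ("IFLTE", ("SQRT", ("+", "X", "Y")), 10, "X", ("/", "X", "Y"))

def evaluate(node, env):
    """Recursively evaluate a program tree, with variable values taken from env."""
    if isinstance(node, str):
        return env[node]              # a variable such as X or Y
    if not isinstance(node, tuple):
        return node                   # a numeric constant such as 10
    op, *args = node
    if op == "IFLTE":
        first, second, then_branch, else_branch = args
        # Return the third child if first < second, otherwise the fourth child.
        if evaluate(first, env) < evaluate(second, env):
            return evaluate(then_branch, env)
        return evaluate(else_branch, env)
    values = [evaluate(arg, env) for arg in args]
    if op == "+":
        return values[0] + values[1]
    if op == "/":
        return values[0] / values[1]
    if op == "SQRT":
        return math.sqrt(values[0])
    raise ValueError(f"unknown function: {op}")

small = evaluate(program, {"X": 9, "Y": 16})     # sqrt(25) = 5 < 10, so X is output
large = evaluate(program, {"X": 120, "Y": 24})   # sqrt(144) = 12, so X / Y is output
```

Because every subtree is itself a complete expression, any subtree can be cut out and replaced by another without breaking the syntax of the whole.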
As discussed below, the functions such as addition, multiplication, etc., and
the constants allowed in the evolved programs are defined in advance by the
user. Also, the programming constructs, such as if-then-else nodes, and for-loop
nodes are specified in advance. There is no agreed upon formalism for the
programming constructs, and how they are defined will affect the expressibility
of the programs containing them. For instance, below is an alternative
representation for the above function:
Here, we see that the IFLTE node has been replaced by a more general IF node,
which returns the value of the second node if the first node returns "true", else
it returns the value of the third node below it. We see how this can be more
expressive, because any boolean function, not just <, can be substituted into
the program. Higher expressivity means that more programs can be found as
potential solutions to the problem at hand. However, it also means that the
search space of programs gets bigger, so solutions may not be found as quickly.
Hence, we should think hard about what our evolved programs should do: if they
only need to check whether one number is less than another, then we can
perfectly well get by using just an IFLTE node. If, however, we need to check
for equality, divisibility, and lots of other things, then perhaps we should use
the second representation scheme.
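To make the difference concrete, here is a minimal sketch of the more general IF node, again using nested tuples. The encoding, the table FUNCTIONS and the symbols "<" and "=" for the boolean functions are assumptions for illustration only.

```python
import math
import operator

# Ordinary functions, including boolean tests that an IF node can use.
FUNCTIONS = {
    "+": operator.add,
    "/": operator.truediv,
    "SQRT": math.sqrt,
    "<": operator.lt,
    "=": operator.eq,
}

def evaluate(node, env):
    """Evaluate a tree in which IF takes (condition, then_branch, else_branch)."""
    if isinstance(node, str):
        return env[node]
    if not isinstance(node, tuple):
        return node
    op, *args = node
    if op == "IF":
        condition, then_branch, else_branch = args
        chosen = then_branch if evaluate(condition, env) else else_branch
        return evaluate(chosen, env)
    return FUNCTIONS[op](*(evaluate(arg, env) for arg in args))

# The same function as before, now with an explicit less-than test.
less_than_version = ("IF", ("<", ("SQRT", ("+", "X", "Y")), 10), "X", ("/", "X", "Y"))
# Any other boolean function can be slotted in, for example an equality test.
equality_version = ("IF", ("=", "X", "Y"), "X", ("/", "X", "Y"))
```

The second tree could not be expressed with IFLTE alone, which is the extra expressivity (and the extra search space) that the IF node buys.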
17.2 Specifying the Program Task
Now that we know the kind of representation scheme to use, we can look at the
question of how we will evolve programs in that representation. Remember that we
are going to try to evolve a program to undertake a task. So, we must specify
what that task is, in order for the fitness function to
determine which programs are doing well at the task, and for the GP engine to
know when to stop. As with Genetic Algorithms, this is often one of the hardest
parts of working with GPs. One possibility for specifying the task is simply to give a
set of input-output pairs. Then, a program will calculate an output for each of
the inputs, and will get a certain number of them right, i.e., the program
outputs the same as in the user-given input-output pair. The evaluation
function is then defined as the proportion of inputs for which the correct
output is generated.
For example, in the paper "Discovery of Understandable Math Formulas Using Genetic
Programming", the author evolves a series of programs which can find the highest common factor
(HCF) of two integers (the point of the research is to show how this can be done so
that the programs produced can be understood). To check how well the programs
are doing with respect to finding the HCF of two integers, a set of triples
(X,Y,Z) are supplied, where the HCF of X and Y is Z. Towards the end of the
evolutionary process, the programs were getting all the examples correct, and
when the programs were analysed, they did indeed calculate the HCF as per its
mathematical specification. Notice how similar this is to machine
learning programs being given sets of positive and negative examples to learn over
(which reinforces the fact that we can see evolutionary approaches to AI as
machine learning efforts).
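An evaluation function of this kind is easy to sketch. The triples below are made up for illustration (they are not taken from the paper), and the candidate "programs" are ordinary Python functions standing in for evolved trees.

```python
import math

# Triples (X, Y, Z) where Z is the HCF of X and Y. Illustrative values only.
TRIPLES = [(12, 18, 6), (7, 13, 1), (20, 8, 4), (9, 27, 9), (10, 10, 10)]

def evaluation(program, triples=TRIPLES):
    """Proportion of triples for which program(X, Y) gives the expected Z."""
    correct = sum(1 for x, y, z in triples if program(x, y) == z)
    return correct / len(triples)

# Hypothetical candidate programs of varying quality.
def always_one(x, y):
    return 1

def smaller(x, y):
    return min(x, y)

score_always_one = evaluation(always_one)   # right only when the HCF is 1
score_smaller = evaluation(smaller)         # right only when one number divides the other
score_gcd = evaluation(math.gcd)            # a correct program gets every triple right
```

A score of 1.0 only says that the program agrees with the supplied examples, which is why the evolved programs in the paper were also analysed against the mathematical specification.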
Genetic programming has been used for applications where the real reason to
evolve programs is to enjoy the output from the programs, i.e., the
artefacts produced by the programs are more interesting than the programs
themselves. For these kinds of applications, a more sophisticated fitness
function is often required. For example, in their paper "Learning to Colour
Greyscale Images", Penousal Machado, Andre Dias and Amilcar Cardoso used a GP
approach to generate programs able to take a greyscale image and colour it in.
They tried a variety of fitness functions to gauge how well the colouring in
process had done. These functions used information such as pixel hue and
intensity. For example, the following scary piece of mathematics was used as a fitness
function:
This is obviously very specific to the task at hand. As with Genetic
Algorithms, the specification of the evaluation function is nearly always
problem specific.
17.3 Other Specifications
In addition to giving details of what task the evolved programs are supposed to
undertake, the GP user must specify some more details before they can start a
session. These include:
The first is the function set: the set of functions, such as addition, multiplication, taking square
roots etc., which will be the component parts of the evolved programs.
As with the evaluation function, the set of functions will be hand-carved for
the particular task. For instance, if you are evolving a program to control a
robot as it tries to find its way out of a maze, the functions will include
things like turning left, turning right, going forward, etc. If you are evolving
functions to manipulate images (see the application to generating art below),
the functions will involve mathematical functions such as sine and cosine, and
pixel functions, such as finding pixel colours, hues and intensities, and setting
pixel colours, hues and intensities.
The function set also includes the set of programmatic functions such as
if-then-else and for-loops. It is often instructive to see if good programs can be
evolved with, for example, while-loops but no for-loops.
The terminal set contains all the variables and constants which will appear in
the evolved programs. This will typically include some numbers which may be
randomly generated, and, as with the function set, it will include problem
specific details. In a robot controlling scenario, it may be that movement functions are
parameterised by directions such as left, right, forward, backwards, which would
form part of the terminal set for that GP application. Similarly, in a graphics
application, constants such as pi might be put into the terminal set. The terminal set
is so called, because in the tree representations, the constants and variables
are found at the end of the branches.
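As a hedged illustration, a function set and a terminal set for a small arithmetic problem might be written as follows. The names FUNCTION_SET, make_terminal_set and protected_div are hypothetical, and protecting division and square root against invalid inputs is an assumed convention so that every evolved tree can be executed; it is not something the notes prescribe.

```python
import math
import operator
import random

def protected_div(a, b):
    """Division that returns 1.0 when the divisor is zero (an assumed convention)."""
    return a / b if b != 0 else 1.0

def protected_sqrt(a):
    """Square root of the absolute value, so negative inputs do not fail (assumed)."""
    return math.sqrt(abs(a))

# Function set: name -> (callable, number of inputs).
FUNCTION_SET = {
    "+": (operator.add, 2),
    "*": (operator.mul, 2),
    "/": (protected_div, 2),
    "SQRT": (protected_sqrt, 1),
}

def make_terminal_set(variables, n_random_constants, rng):
    """Terminal set: input variables, a fixed constant (pi), and random numbers."""
    constants = [math.pi]
    constants += [round(rng.uniform(-10, 10), 2) for _ in range(n_random_constants)]
    return list(variables) + constants

TERMINAL_SET = make_terminal_set(["X", "Y"], 3, random.Random(1))
```

Recording the number of inputs alongside each function is what lets the engine build trees in which every function node has the right number of children.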
There are many possibilities for how the search will proceed, and the user
should tweak various parameters to optimise the performance of the GP engine.
The main consideration will be the size of the population, as this will affect
the GP the most: larger populations will mean fewer generations in the time
available, but will mean larger diversity within the population of programs.
Given the programs will grow as they evolve, another important parameter will be
a cap on the length of the programs that can be produced. One criticism of GP
approaches is that the programs produced are too large and complicated to be
understood, so, if being able to understand the resulting programs is a
consideration, the length of the programs should be kept relatively small. Other
parameters will control various probabilities, including the probability that
each genetic operator (see later) will be employed.
The ways to specify when the GP engine should stop are very similar to those
for Genetic Algorithms. One possibility is to let the process run for a certain
amount of time, or until it has produced a certain number of generations, then
take the best individual produced in any generation. Many GP implementations
enable the user to monitor the process and click on the stop button when it
appears that the fitness of the individuals has reached a plateau.
Alternatively, the user may specify that populations are continually produced
until an individual which is above a certain fitness is produced.
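The automatic stopping rules can be combined into a single check. This is a sketch only: StopCriteria, should_stop and the default values are hypothetical, and the "user clicks stop" option is left out because it depends on the user interface.

```python
import time
from dataclasses import dataclass

@dataclass
class StopCriteria:
    max_generations: int = 50     # stop after this many generations
    max_seconds: float = 60.0     # stop after this much running time
    target_fitness: float = 1.0   # stop once an individual reaches this fitness

def should_stop(criteria, generation, started_at, best_fitness, now=None):
    """Return True as soon as any one of the termination conditions holds."""
    now = time.monotonic() if now is None else now
    return (
        generation >= criteria.max_generations
        or now - started_at >= criteria.max_seconds
        or best_fitness >= criteria.target_fitness
    )

criteria = StopCriteria()
keep_going = should_stop(criteria, generation=10, started_at=0.0, best_fitness=0.5, now=1.0)
```

The now argument exists only so that the check can be exercised without waiting for real time to pass.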
17.4 Evolving New Populations
Genetic Programming engines begin by generating an initial population
of programs randomly. Each tree is generated by randomly choosing a
function from the function set for every internal node in the tree,
and setting the inputs to this to be constants and variables as
terminals, again chosen randomly. Then, terminal points are chosen to
be altered by adding in functions, etc. The programs will have many
different shapes and sizes, subject to the maximum program size
parameter specified by the user. Care must be taken to make sure that
the input types to functions are correct. After the seeding of the
initial population, individuals are selected to produce offspring. The
production process uses one of a set of genetic operators, as
described below. The old population is killed off, and the process is
started again with the new population.
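One simple way to seed the population is to grow each tree recursively. The sketch below is an assumed variant, not necessarily the exact procedure described above: it uses a depth limit as a stand-in for the maximum program size, a fixed 30% chance of stopping early at a terminal, and hypothetical names FUNCTIONS, TERMINALS and random_tree. All inputs are treated as numbers, so type checking is not needed here.

```python
import random

FUNCTIONS = {"+": 2, "*": 2, "SQRT": 1, "IFLTE": 4}   # name -> number of inputs
TERMINALS = ["X", "Y", 1, 10]

def random_tree(rng, max_depth):
    """Grow a random tree; a terminal is forced once max_depth is used up."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    # Give the chosen function exactly as many children as it has inputs.
    children = tuple(random_tree(rng, max_depth - 1) for _ in range(FUNCTIONS[name]))
    return (name,) + children

def depth(tree):
    """Number of function nodes on the longest path from the root to a leaf."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree[1:])

rng = random.Random(42)
population = [random_tree(rng, max_depth=4) for _ in range(20)]
```

The resulting trees have many different shapes and sizes, but none is deeper than the limit.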
- Choosing Individuals to Reproduce
The user has specified an evaluation function which calculates a value
for each individual in the population. Different GP implementations
use this in different ways, but always do so in such a way that the
chance of an individual being chosen increases as the score it gets
from the evaluation function increases. As with GAs, individuals are
selected to go into an elite intermediate population (IP), from which
individuals will be chosen to produce offspring. One approach to doing
this is similar to the approach with Genetic Algorithms: the
evaluation function assigns a probability to each individual in a
mathematically principled way, and each one is allowed into the IP in
a probabilistic fashion. So, for example, if an individual is assigned
0.8 by the fitness function (using the evaluation function), then a
random number between 0 and 1 will be generated. If it is below 0.8, the
individual will be allowed to reproduce; if not, it will be unlucky.
Another approach is to apply tournament selection: pairs of individuals are
chosen at random and the most fit one of the two is chosen for reproduction.
This is meant to simulate the kind of competition that occurs in reproduction,
and means that if two fairly unfit individuals are paired against each other,
one of them is guaranteed to reproduce. A third way of choosing individuals is
to rank them using the evaluation function and choose the ones at the top of the
ranking, which means that only the best will be chosen for reproduction.
As with most AI applications, it's a question of trying out different approaches
to see which works for a particular problem.
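The three selection schemes can be sketched side by side. The function names and the toy scores are hypothetical; in the probabilistic version an individual is admitted when a random draw falls below its score, so that a higher score means a higher chance of being chosen.

```python
import random

def probabilistic_selection(population, score, rng):
    """Admit each individual with probability equal to its score in [0, 1]."""
    return [ind for ind in population if rng.random() < score(ind)]

def tournament_selection(population, score, rng, n_winners):
    """Repeatedly pick two individuals at random and keep the fitter one."""
    winners = []
    for _ in range(n_winners):
        a, b = rng.sample(population, 2)
        winners.append(a if score(a) >= score(b) else b)
    return winners

def rank_selection(population, score, n_best):
    """Keep only the n_best individuals at the top of the ranking."""
    return sorted(population, key=score, reverse=True)[:n_best]

# Toy population: individuals are just names with made-up scores.
SCORES = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1, "e": 0.0, "f": 1.0}
population = list(SCORES)

admitted = probabilistic_selection(population, SCORES.get, random.Random(0))
winners = tournament_selection(population, SCORES.get, random.Random(0), n_winners=4)
top_two = rank_selection(population, SCORES.get, n_best=2)
```

Note that tournament selection can let a weak individual such as "d" through when it happens to be paired with "e", whereas rank selection never will.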
Individuals are chosen from
the intermediate population and genetic operators are used to produce new
individuals from old ones. Which operator is used at a particular time is
chosen probabilistically, with the user specifying the probabilities.
There are two types of genetic operators:
ones which generate a new individual from a single parent, and ones which
generate a new individual from a pair of parents. The simplest operator is
called reproduction, which copies a single parent into the new
generation. This means that copies of individuals from the old generation
can make it into the new generation, which is why the original individuals in
the old generation can be killed off entirely.
The other genetic operator which produces offspring from a single individual is
mutation. As with genetic algorithms, this operation is performed sparingly, and serves to
help the population get down from local maxima. A point on the individual's tree
is chosen at random, and the subtree below that point is removed, to be replaced
by a randomly generated subtree, with the generation done in the same way as for
the initial population. Sometimes the mutation is constrained so that functions
in the tree can only be replaced by functions and terminal nodes can only be
replaced by other terminal nodes. Below is an example of a random mutation on an
individual program.
We see that the square root subtree has been removed and replaced, so that the
program calculates root(17)*root(x) instead of root(x+y) down the left hand
side of the tree.
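Subtree mutation can be sketched on the nested-tuple encoding of trees. All names here (points, replace_at, mutate, random_tree) and the small function and terminal sets are assumptions for illustration; the fixed replacement at the end reproduces the example just described, where the square root subtree is swapped for root(17)*root(x).

```python
import random

FUNCTIONS = {"+": 2, "*": 2, "SQRT": 1}   # name -> number of inputs
TERMINALS = ["X", "Y", 17]

def random_tree(rng, max_depth):
    """Grow a small random subtree, as when seeding the initial population."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name,) + tuple(random_tree(rng, max_depth - 1) for _ in range(FUNCTIONS[name]))

def points(tree, path=()):
    """Yield the path (a tuple of child indexes) to every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from points(child, path + (i,))

def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the subtree at path replaced by new_subtree."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def mutate(tree, rng, max_depth=2):
    """Pick a point at random and replace the subtree below it with a random one."""
    path = rng.choice(list(points(tree)))
    return replace_at(tree, path, random_tree(rng, max_depth))

parent = ("IFLTE", ("SQRT", ("+", "X", "Y")), 10, "X", ("/", "X", "Y"))
random_child = mutate(parent, random.Random(3))

# The specific mutation from the example: replace the first child of IFLTE.
example_child = replace_at(parent, (1,), ("*", ("SQRT", 17), ("SQRT", "X")))
```

Since trees are immutable tuples, the parent is left untouched and can still be copied or crossed over afterwards.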
Another genetic operator is called crossover. This takes two parent
individuals and chooses a point on the first and a point on the second at
random. It then swaps
the two subtrees which start at those points. The subtrees to be swapped are called
the crossover fragments. This produces two offspring: the first parent
with a fragment from the second, and the second parent with a fragment from the
first. The following highlights this process:
We see that two children have been generated from the two parents.
Note that the parents could be two copies of the same individual, in which case
the operator has to make sure that the point chosen on the first copy is
different to the point chosen on the second copy, otherwise the operator
simply produces two copies of the original.
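Crossover can be sketched in the same style. The helper names (points, subtree_at, replace_at, swap, crossover, size) and the two parent trees are assumptions for illustration.

```python
import random

def points(tree, path=()):
    """Yield the path (a tuple of child indexes) to every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from points(child, path + (i,))

def subtree_at(tree, path):
    """Return the subtree (crossover fragment) found by following path."""
    for i in path:
        tree = tree[i]
    return tree

def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the subtree at path replaced by new_subtree."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def swap(parent1, path1, parent2, path2):
    """Exchange the fragments at the two chosen points, giving two children."""
    fragment1 = subtree_at(parent1, path1)
    fragment2 = subtree_at(parent2, path2)
    return replace_at(parent1, path1, fragment2), replace_at(parent2, path2, fragment1)

def crossover(parent1, parent2, rng):
    """Choose one point on each parent at random and swap the subtrees there."""
    path1 = rng.choice(list(points(parent1)))
    path2 = rng.choice(list(points(parent2)))
    return swap(parent1, path1, parent2, path2)

def size(tree):
    """Number of nodes in the tree."""
    return sum(1 for _ in points(tree))

parent1 = ("IFLTE", ("SQRT", ("+", "X", "Y")), 10, "X", ("/", "X", "Y"))
parent2 = ("*", ("+", "X", 3), ("SQRT", "Y"))
child1, child2 = crossover(parent1, parent2, random.Random(7))

# Two copies of one parent, cut at the same point, just give the parent back.
same1, same2 = swap(parent1, (1,), parent1, (1,))
```

The last two lines show why an operator working on two copies of the same individual needs to pick different points: swapping identical fragments changes nothing.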
We have concentrated here on simple programs which consist of a single main
routine which produces the output (the whole program is called a result
producing branch). In more complicated programs, there will be more
complex structures such as iterations (for-loops) and subroutines.
A final set of genetic operators which operate on the more complicated
structures are called architecture-altering
operations. These change aspects of the subroutines, including deleting and
copying them, and altering the arguments passed to them.
17.5 Applications of Genetic Programming
- Designing Electronic Circuits
John Koza, a professor at Stanford and CEO of Genetic Programming Inc. is
perhaps the person most responsible for making GP more acceptable in the eyes of
the AI community. He and his team have successfully applied genetic programming
techniques to a variety of applications ranging from bioinformatics to
distributed systems. One of their most successful endeavours has been the
generation of electronic circuit designs. Here, the programs are actually all
about the flow of information around the circuits, so the function set contains functions which
mimic the actions of transistors, resistors, etc., on the flow of electricity.
According to the web site at Genetic Programming Inc:
"there are now 36 instances where genetic programming has automatically produced a
result that is competitive with human performance, including 15 instances where
genetic programming has created an entity that either infringes or duplicates the
functionality of a previously patented 20th-century invention, 6 instances where
genetic programming has done the same with respect to a 21st-century invention, and
2 instances where genetic programming has created a patentable new invention."
For more information about Koza's work, visit genetic-programming.com.
One of the most exciting and creative areas in which genetic
programming is being applied is evolutionary art. In contrast to
most GP applications, in evolutionary art, the user often acts
directly as the fitness function. That is, the GP engine generates a
set of programs which can produce images (JPGs etc.), either by
transforming a given image, or generating pixel data from scratch.
These images are then shown to the user, who performs the selection by
choosing those which they most like. The GP engine then generates a
population from the chosen images and selects from it images which
fairly closely resemble the ones chosen by the user, or which have
some properties similar to the chosen ones, e.g., colour
distribution. The user then selects those with most appeal again, and
the process continues until the user is so happy with the image that
they put it on their homepage. The evolutionary art community includes
many artists and computing professionals, and the artworks their
programs produce generate much interest (similar to how everyone was
amazed by fractal images when they first came out). Such an approach
was recently used to generate images for an ad-campaign by Absolut
Vodka, for example.
One evolutionary art program is called Nevar, which is written and
maintained by Penousal Machado of Coimbra University, Portugal (he is
also the person who researched how to colour in greyscale images - as
part of the Nevar project). The images given below were generated by
Nevar:
[Images generated by Nevar. © Penousal Machado]
Check out more images at the EvoArt web page.
© Simon Colton 2004