Papers presented at the workshop will be refereed for a special issue (2001-B1, 2001-B2) of the Machine Intelligence area of the ETAI (Electronic Transactions on Artificial Intelligence) Journal. The special issue will also include the referees' comments on the papers.
We would like to acknowledge the generous support of the Machine Intelligence 18 workshop by EPSRC Grant GR/R69655/01.
Programme for Machine Intelligence 18 (PDF - includes information about breaks and session chairs)
Contributors
Accommodation Information - Please contact Jane White (jane@cs.york.ac.uk) concerning booking of accommodation at the Dean Court and Minster Hotels.
Speaker: Kiyoshi Asai
Title: Finding Genes from DNA Sequences by Using Multi-stream Hidden Markov Models
Abstract:
It is necessary to integrate various types of information in
order to recognize the genes in DNA sequences. Such information
includes the four letters of the DNA sequence itself, the
frequencies of the corresponding codons, homology search scores,
and splice site signal strengths. We have developed a software
system of multi-stream Hidden Markov Models with which such
information can easily be integrated under a consistent
probabilistic measure.
The output symbols of HMMs are the signals that we can observe.
In the field of bioinformatics, the output symbols of
the HMMs are mostly the four letters of nucleic acids or the
twenty letters of amino acids.
However, the output symbols can be anything, as long as we can
attach probability distributions to them. They can be discrete
symbols, real values, real-valued vectors, or multiple streams of
such values. We propose gene annotation using HMMs with multiple
streams, which combine the sequence with other pre-processed
information.
An important feature of multi-stream HMMs is that the stream
weights can be optimized for each model. Multi-stream HMMs with
adjustable weights permit very flexible design of gene-finding
systems.
(URL: http://www.genedecoder.org/)
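As an illustration of the emission model described above, the sketch below combines per-stream probabilities under tunable stream weights. It is a minimal sketch only: the stream names, toy distributions and weights are assumptions made for illustration, not taken from the GeneDecoder system.

    import math

    def multi_stream_log_emission(state, observation, weights):
        """Weighted sum of per-stream log probabilities for one HMM state.

        state       -- stream name -> function mapping a symbol/value to a probability
        observation -- stream name -> observed symbol or value
        weights     -- stream name -> stream weight (tunable per model)
        """
        return sum(weights[s] * math.log(state[s](observation[s])) for s in weights)

    # Toy example: a DNA-letter stream combined with a real-valued splice-signal stream.
    dna_stream = {'A': 0.3, 'C': 0.2, 'G': 0.2, 'T': 0.3}.get
    splice_stream = lambda x: math.exp(-(x - 0.8) ** 2)   # unnormalised toy density
    state = {'dna': dna_stream, 'splice': splice_stream}
    observation = {'dna': 'A', 'splice': 0.7}
    print(multi_stream_log_emission(state, observation, {'dna': 1.0, 'splice': 0.5}))

Raising the weight of a stream amplifies its influence on the state's emission score, which is the kind of per-model tuning referred to above.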
Speaker: Michael Ashburner
Title: On the representation of "gene function and process" in genome databases: The Gene Ontology Consortium
Abstract:
Genes are expressed in temporally and spatially characteristic patterns.
Their products are (often) located in specific cellular compartments and
may be components of complexes. Gene products possess one or more
functions which may play a role in one or more biological processes. We
are developing a method to describe these attributes in a rigorous way that
will allow researchers to consistently annotate and explore the universe
of genomes.
This is a collaborative project between several model organism databases;
FlyBase for the fruitfly, the Saccharomyces Genome Database for yeast, Mouse
Genome Database for mouse, TAIR for Arabidopsis and WormBase for C. elegans.
GO is now being used quite extensively in industry.
The three organizing principles of GO are function, process and cellular component.
A gene product can have one or more functions and can be used in one or
more processes; a gene product may be a component of one or more supra-molecular
complexes. Function is what something does. It describes only what it can do
without specifying where or when this usage actually occurs. Process is a
biological objective. A process is accomplished via one or more ordered assemblies
of functions.
The model organism database curators assign controlled vocabulary terms to gene
products based on current knowledge. This information describes the specific
function(s) of a gene product, the role(s) it plays in cellular processes and
its localization and associations. The structure of the "Gene Ontology" Database
allows both attribution and querying at different levels of granularity. The
Gene Ontology database provides annotation, definitions, and unique identifiers
for all terms. The cooperation between the model organism databases ensures the
use of a common ontology and, therefore, facilitates queries across all cooperating
model organism databases.
Gene Ontology, therefore, allows users to make queries exemplified by:
"What mouse genes encode products playing a role in RAS signal transduction?";
"What fly genes encode products in this biological process?"; "What genes in
flies and yeast have products that are components of the mitochondrial inner
membrane and function as enzymes?".
GO is modelled as a directed acyclic graph, with two classes of relationship
between parent and child - instance (isa) and part-of. The semantics of these
relationships are not as well defined as we would wish. In principle, all GO
terms have formal definitions (in practice only 11% as I write). GO is
also being instantiated in DAML+OIL by the Manchester group.
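The sketch below is a minimal illustration (not the GO database schema) of the DAG structure just described: terms linked by typed 'isa' and 'part_of' edges, with a query that climbs from a term to all of its ancestors. The terms and edges shown are a tiny invented fragment.

    parents = {
        # child term -> list of (relation, parent term); toy fragment only
        'mitochondrial inner membrane': [('part_of', 'mitochondrial membrane')],
        'mitochondrial membrane': [('part_of', 'mitochondrion')],
        'mitochondrion': [('isa', 'intracellular organelle')],
    }

    def ancestors(term):
        """All terms reachable upward through isa/part_of edges."""
        seen, stack = set(), [term]
        while stack:
            for relation, parent in parents.get(stack.pop(), []):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    # A product annotated to the most specific term is implicitly annotated to
    # every ancestor, which is what supports queries at coarser granularity.
    print(ancestors('mitochondrial inner membrane'))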
The GO database is open to all without restraint. The database, together with a GO browser
and editor, is available from http://www.geneontology.org/.
We thank AstraZeneca & the NIH for financial support.
Speaker: Christopher H. Bryant
Title: Combining Inductive Logic Programming, Active Learning and Robotics to Discover the Function of Genes
Abstract:
We aim to partially automate some aspects of scientific
work, namely the processes of forming hypotheses, devising trials to
discriminate between these competing hypotheses, physically performing
these trials and then using the results of these trials to converge
upon an accurate hypothesis. We have developed ASE-Progol, an Active
Learning system which uses Inductive Logic Programming to construct
hypothesised first-order theories and uses a CART-like algorithm to
select trials for eliminating ILP-derived hypotheses. We have
developed a novel form of learning curve which, in contrast to the
form of learning curve normally used in Active Learning, allows one to
compare the costs incurred by different learning strategies.
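The loop below is a schematic sketch of the closed cycle just described, under the simplifying assumption that hypotheses are objects with a predicts(trial) method returning the predicted outcome; it is not the ASE-Progol implementation, and the split-based selection only gestures at its CART-like criterion.

    def active_learning_loop(hypotheses, candidate_trials, run_trial, max_trials=10):
        """Repeatedly choose a discriminating trial, run it, and prune hypotheses."""
        for _ in range(max_trials):
            if len(hypotheses) <= 1 or not candidate_trials:
                break
            # Prefer the trial whose outcome splits the surviving hypotheses most evenly.
            def imbalance(trial):
                agree = sum(1 for h in hypotheses if h.predicts(trial))
                return abs(2 * agree - len(hypotheses))
            trial = min(candidate_trials, key=imbalance)
            candidate_trials.remove(trial)
            outcome = run_trial(trial)   # in silico today, in vitro by robot later
            hypotheses = [h for h in hypotheses if h.predicts(trial) == outcome]
        return hypotheses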
We plan to combine ASE-Progol with a standard laboratory robot to
create a general automated approach to Functional Genomics. As a
first step towards this goal, we are using ASE-Progol to rediscover
how genes participate in the aromatic amino acid pathway of
Saccharomyces cerevisiae. Our approach involves auxotrophic mutant
trials. To date, ASE-Progol has conducted such trials in silico.
However we describe how they will be performed automatically in vitro
by a standard laboratory robot designed for these sorts of liquid
handling tasks, namely the Beckman/Coulter Biomek 2000.
Speaker: Jacques Cohen
Title: Classification of approaches used to study cell regulation: Search for a unified view using constraints and machine learning
Abstract:
Historically, the first attempts to study cell regulation were
based on establishing systems of nonlinear differential
equations that "simulate" the behavior of the variables
involved as a function of time. In that context the variables
correspond to genes capable of producing compounds (mRNA
and proteins) that in turn influence the production capacity
of other genes. This problem has been thoroughly studied
in the mathematical theory of dynamic systems. Graphs
expressing the interactions among genes are often used
as convenient abstractions enabling the formulation of
the appropriate systems of differential equations. The
nodes of those graphs correspond to the variables (genes)
and the directed edges -- labeled by plus or minus signs
-- express the positive or negative interactions among
the variables. In the case of cell regulation thousands
of genes may interact with each other.
The advent of micro-arrays urgently motivates the solution of
the inverse problem: given the curves expressing the
capacity of each gene to produce known compounds as
time progresses, attempt to infer the original system
of equations or equivalently the graph expressing the
interactions among genes. Presently biologists are capable
of generating huge sets of data expressing the amount
of production of proteins or mRNA as a function of time.
One then wishes to establish the corresponding gene interaction
graph. The inverse problem may be called "modelization"
in contrast with "simulation", the term used to describe
the direct problem of determining the behavior of the
variables in a system of differential equations.
There have been many attempts to discretize (render discrete)
the cell regulation problem. By considering Boolean variables
instead of continuous variables one can remarkably reduce the
complexity of the original problem. Several papers are now
available that describe solutions of both the direct and inverse
problems using the Boolean approach.
An important aspect of the inverse problem is that the data
obtained by biologists are not fully reliable, in the sense that
many experimental errors occur when dealing with thousands of
genes. It then becomes imperative to adopt a probabilistic
approach to the inverse problem. In other words, given imperfect
data expressing the behavior of the variables over time, can one
infer the corresponding gene interaction graph?
More recently this last problem has been simplified by having
biologists conjecture the interaction graph and posing the simpler
question: how likely is it that the hypothesized graph generated
the given data set?
The approach used in this paper is to first attempt to classify
the existing methods proposed in the literature to solve
the direct and inverse problems. An initial classification
involves three parameters and thus eight subclasses:
Probabilistic versus Nonprobabilistic
Direct versus Inverse
Continuous versus Discrete
In a recent article de Jong and Page propose an approach of the
Direct-Discrete-Nonprobabilistic type, which amounts to a
qualitative (a la Kuipers) simulation of the system of differential
equations representing a given influence graph. Even though their
algorithms are described in a standard programming language, the
authors make extensive use of constraints in formulating the
solution to that problem.
In this paper I first propose the use of constraint logic programming
as a paradigm for solving the discrete versions of the
cell regulation problem. Basically one considers regions
in an n-dimensional space, and transitions that specify
possible paths between the regions. Both regions and
transitions are specified by sets of constraints that
restrict the values of the variables being studied. A path
linking an initially specified region to subsequent regions via
the transitions expresses the behavior of the system. (That
description is inherent to the approach taken by de Jong and
Page.) The inverse problem amounts
to determining regions and transitions when given the
paths that may be derived from the data expressing the
behavior of the variables. From those regions and transitions
one should be able to determine the desired gene interaction
graph.
In the particular case of Boolean constraints -- a special case
among the discrete approaches -- the determination of
the regions and transitions amounts to the problem of
synthesizing Boolean circuits from input and output data.
The regions correspond to subsets of the input data and
the transitions correspond to Boolean formulas that allow
transforming the input data into output. Therefore the
problem using Boolean variables can be expressed in terms
of program synthesis or machine learning. Similarly,
it is hoped that the synthesis of constraints from data
should play a role in the discrete solution of the inverse
problem.
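As a concrete toy version of the Boolean special case just discussed, the sketch below simulates a small invented network (the direct problem) and then enumerates the Boolean update rules for one gene that are consistent with the observed transitions (a naive inverse step). The genes and rules are made up for illustration.

    from itertools import product

    # Direct problem: known Boolean update functions, one per gene.
    rules = {
        'a': lambda s: s['c'],                  # a is activated by c
        'b': lambda s: s['a'] and not s['c'],   # b activated by a, repressed by c
        'c': lambda s: not s['b'],              # c repressed by b
    }

    def step(state):
        return {g: rules[g](state) for g in rules}

    # Generate some "observed" transitions from an initial state.
    transitions, state = [], {'a': True, 'b': False, 'c': False}
    for _ in range(4):
        nxt = step(state)
        transitions.append((state, nxt))
        state = nxt

    # Inverse step for gene 'b': which truth tables over (a, c) fit the data?
    consistent = []
    for outputs in product([False, True], repeat=4):
        table = dict(zip(product([False, True], repeat=2), outputs))
        if all(table[(s['a'], s['c'])] == nxt['b'] for s, nxt in transitions):
            consistent.append(table)
    print(len(consistent), "candidate update rules for gene b fit the observations")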
An interesting payoff of the use of constraints is that they
may well pave the way for probabilistic solutions to
both the direct and inverse versions of the cell regulation
problem. The approach reinforces the importance of developing
probabilistic machine learning algorithms expressed in
logic programming and in constraint logic programming.
Speaker: Adrian Cootes
Title: Automatic determination of protein fold signatures from structural superpositions
Abstract:
It remains unclear what principles govern the adoption of a given
fold by a protein sequence/structure. Local properties such as the arrangement
of secondary structure elements adjacent in sequence
or global properties such as the total number of secondary
structure elements may act as a constraint on the type
of fold that a protein can adopt. Such constraints might
be considered "signatures" of a given fold and their
identification would be useful for the classification
of protein structure. Inductive Logic Programming (ILP)
has been applied to the problem of automatic identification
of structural signatures. The signatures generated by
ILP can then be both readily interpreted by a protein
structure expert and tested for their accuracy.
A previous application of ILP to this problem indicated that
large insertions/deletions in proteins are an obstacle
to learning rules that effectively discriminate between
positive and negative examples of a given fold. Here,
we apply an ILP learning scheme that reduces this problem
by employing the structural superposition of protein
domains with similar folds. This was done in three basic
steps. Firstly, a multiple alignment of domains was generated
for each type of fold studied. Secondly, the alignment
was used to determine the secondary structure elements
in each of those domains that can be considered equivalent
to one another (the "core" elements of that fold). Thirdly,
an ILP learning experiment was conducted to learn rules
defining a fold in terms of those core elements.
Speaker: Luc De Raedt
Title: Molecular Feature Mining in HIV Data
Abstract:
We present the application of Feature Mining techniques
to the Developmental Therapeutics Program's AIDS
antiviral screen database. The database consists of
43576 compounds, which were measured for their capability
to protect human cells from HIV-1 infection.
According to these measurements, the compounds were
classified as either active, moderately active or inactive.
The distribution of classes is extremely skewed: only 1.3%
of the molecules are known to be active, and 2.7% are
known to be moderately active.
Given this database, we were interested in molecular
substructures (i.e., features) that are frequent in
the active molecules, and infrequent in the inactives.
In data mining terms, we focused on features
with a minimum support in active compounds and a
maximum support in inactive compounds. We analyzed
the database using the level-wise version space algorithm
that forms the basis of the inductive query and database
system MolFea (Molecular Feature Miner). Within this framework,
it is possible to declaratively specify the features of interest
through constraints on the frequency of features in (possibly
different) datasets, as well as on their generality and syntax.
Assuming that the detected substructures are causally
related to biochemical mechanisms, it should in principle
be possible to facilitate the development of new
pharmaceuticals with improved activities.
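A much-simplified sketch of this kind of query is given below: it keeps the fragments whose support is at least min_sup among the actives and at most max_sup among the inactives. Here a "fragment" is just a substring of a SMILES-like string and the molecules are invented, whereas MolFea works with a richer feature language over the real screen data.

    def support(fragment, molecules):
        return sum(fragment in m for m in molecules) / len(molecules)

    def interesting_fragments(candidates, actives, inactives, min_sup=0.1, max_sup=0.01):
        return [f for f in candidates
                if support(f, actives) >= min_sup and support(f, inactives) <= max_sup]

    actives = ['N=[N+]=N', 'CC(=O)N=[N+]=N', 'c1ccccc1O']    # toy data only
    inactives = ['CCO', 'c1ccccc1', 'CC(=O)O']
    print(interesting_fragments(['N=[N+]=N', 'c1ccccc1', 'O'], actives, inactives))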
Speaker: Koichi Furukawa
Title: A Computational Model for Children's Language Acquisition using
Inductive Logic Programming
Abstract:
This paper describes our research activity on developing a
computational model for children's word acquisition using inductive
logic programming. Modeling human language acquisition is a very hard
problem. We focused on children's word acquisition because it is
the initial phase of human language acquisition as a whole, and
the problem is therefore rather easier than the entire learning
activity. We tried to incorporate cognitive biases that have
recently been proposed to explain the efficiency of children's
language acquisition. We also designed a co-evolution mechanism in
which concept definitions for words are acquired while a concept
hierarchy is developed. The concept hierarchy plays an important
role in defining contexts for later word-learning processes: a
context-switching mechanism selects the relevant set of attributes
for learning a word, depending on the category to which it
belongs, while, conversely, the concept hierarchy is developed as
word definitions are acquired. This mechanism is intended to solve
the problem of selecting the relevant sensory inputs for each
word-learning task. We developed an experimental language
acquisition system called WISDOM (Word Induction System for
Deriving Object Model) and conducted virtual experiments, or
simulations, on the acquisition of words in two different
categories. The experiments showed the feasibility of our
approach.
Speaker: Jean Hayes-Michie
Title: Real Time Skill: the role of semantic information in learning, performance and report
Abstract:
The literature on human skills has seen
the rise of two successive models described
below in simplified form:
(1) BEHAVIOURIST: skill is formed by pure
practice. Trial-and-error experience builds
a black box of stimulus-response bonds,
impervious to both instruction and introspection.
Subjects' post hoc reports are confabulations.
(2) COGNITIVIST: skill is first acquired in
'declarative' form, through either or both of
(a) construction of a conscious model
from the task's perceived behavioural semantics;
(b) tutorial exposition of the task's structure
and of applicable heuristics. Declarative forms
are then internally remoulded into procedural
black boxes, but also retained for purposes
of post hoc report.
EXPERIMENTAL FINDINGS
The learning by 51 male and female subjects
of a computer simulation of a well-known
real-time control problem supplied some
useful observations. Under certain
conditions of 'cognitive overload' an otherwise
unlearnable task became learnable only when
the screen presentation was descriptive of the
task's behavioural semantics. The cognitive
overload condition was the combination of a
substantially visual task with female gender.
It is well-attested that females are in general
at a disadvantage when dealing with highly visual
information, especially if it involves motion.
When a meaningful animated cartoon was substituted
for an abstract representation, female subjects
showed a capacity to learn, a capacity quite
absent when they were asked to use the simple
'instrument panel' variant.
This corroborates elements of the cognitivist model,
and is not explicable in terms of pure behaviourism.
But discovery by male subjects of key heuristics by
trial and error, regardless of representation, contradicts
certain elements of the cognitivist position.
A need is apparent for a cognitivist-behaviourist
synthesis. A new model is suggested.
Speaker: Ross King
Title: Representing molecules for induction using logic and quantum mechanics
Abstract:
The key to applying machine inference to scientific problems is to
have an appropriate representation of the problem. The most
successful applications of inductive logic programming (ILP) have been
in the field of learning molecular quantitative structure activity
relationships (QSARs). The basic representation used in these
applications has been based on atoms linked together by bonds. This
representation, although successful and intuitive, ignores the most
basic discovery of 20th century chemistry, that molecules are quantum
mechanical objects. Here we present a new representation using
quantum topology (StruQT) for machine inference in chemistry which is
firmly based on quantum mechanics. The new representation is based on
Richard F.W. Bader's quantum topological atoms in molecules (AIM)
theory. Central to this representation is the use of critical points
in the electron density distribution of the molecule. Other gradient
fields such as the Laplacian of the electron density can also be used.
We have demonstrated the utility of the use of AIM theory (in a
propositional form) in the regression problem of predicting the lowest
UV transition for a system of 18 anthocyanidins. The critical point
representation of fields can be easily mapped to a logic programming
representation, and is therefore well suited for ILP. We have
developed an ILP-StruQT method and are currently evaluating it for
QSAR problems.
Speaker: Jan Komorowski
Title: On Discovering Gene Function with Approximate Methods
Abstract:
Functional Genomics is the study of gene function on a large
scale, carried out by parallel analysis of gene expression for
thousands of genes. This research is still in an early phase but
has recently been accelerated by the advent of high-throughput
technologies such as microarrays. With the completion of several
genomes, attention is shifting towards discovering the function of
new genes. Critical to any scientific discovery is the ability to
re-use existing knowledge. Somewhat surprisingly, most if not all
work currently reporting on the analysis of gene expression
applies unsupervised approaches that do not really use the huge
repositories of biological knowledge.
In our recent work, we have taken a supervised approach to predicting
gene function from gene expression time-series profiles
and from background biological knowledge. To this end
we annotated known genes with terms from Gene Ontology.
Using rough sets, as implemented in the publicly available
ROSETTA toolkit, we then generated a predictive model of high
quality (AUC value 0.83) that provided, among other things,
function hypotheses for uncharacterized genes.
The method is robust in the following sense. Despite
missing annotations of some known genes, the model appeared
to be able to classify and re-classify several of the
known genes in a way that was new to biologists. In
other words, the model discovered biological knowledge
that was not present in the learning examples. These
discoveries, represented as false positive classifications,
were initially dismissed. After a literature-based
re-examination by biologists, a significant portion of
the false positives appeared to be correct. A considerable
number of the literature confirmations of false positive
classifications could be visualized by using PubGene
(http://www.pubgene.org/), a tool developed by us for
effective retrieval of literature-based knowledge about
relations between genes. Our recent publication in Nature
Genetics 28, 21-28 (2001) gives details about that work.
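The fragment below is only a minimal sketch of this supervised set-up: expression time-series profiles as features and GO terms as labels, with a plain nearest-neighbour rule standing in for the ROSETTA rough-set classifier. The profiles, gene names and GO labels are invented.

    def correlation(p, q):
        n = len(p)
        mp, mq = sum(p) / n, sum(q) / n
        cov = sum((a - mp) * (b - mq) for a, b in zip(p, q))
        var = (sum((a - mp) ** 2 for a in p) * sum((b - mq) ** 2 for b in q)) ** 0.5
        return cov / var if var else 0.0

    training = {   # gene -> (expression time series, GO annotation); toy data
        'geneA': ([0.1, 0.9, 1.8, 2.0], 'GO:0006260 DNA replication'),
        'geneB': ([2.0, 1.1, 0.3, 0.1], 'GO:0006412 translation'),
        'geneC': ([0.2, 1.0, 1.7, 1.9], 'GO:0006260 DNA replication'),
    }

    def predict(profile):
        """Label an uncharacterised gene by its most similar annotated gene."""
        best = max(training.values(), key=lambda tv: correlation(profile, tv[0]))
        return best[1]

    print(predict([0.0, 0.8, 1.6, 2.1]))   # expected: the DNA replication term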
Our work demonstrates that Gene Ontology annotations emulate
deep biological knowledge that may be functionally linked to gene
expression profiles, thereby aiding the discovery of new gene
functions.
We conclude that methods of approximate reasoning that are able
to deal with large amounts of conflicting and missing knowledge
will be instrumental to the further development of Functional
Genomics.
Speaker: Donald Michie
Title: Return of the "imitation game"
Abstract:
Very recently there has been an unexpected
rebirth of Turing's imitation game in the
context of commercial demand. To meet the
new requirements the following properties
of human conversation constitute a minimal
list of what must be simulated.
Real chat utterances are concerned with
associative exchange of mental images.
They are constrained by contextual relevance
rather than by logical or linguistic laws.
Time-bounds do not allow real-time construction
of reasoned arguments, but only the retrieval
of stock lines and rebuttals, assembled
Lego-like on the fly.
A human agent has a place of birth, age, sex,
nationality, job, family, friends, partners,
hobbies etc.
On meeting again with the same conversational
partner, a human agent is expected to recall
the gist of previous chats, as well as what
has passed in the present conversation so far.
A consistent personality emerges from a human's
expression of likes, dislikes, pet theories,
humour, stock arguments, superstitions, hopes,
fears, aspirations etc.
A human agent typically has, at each stage, a main goal:
fact provision, fact elicitation, wooing, selling,
"conning" etc.
A human agent remains ever-ready to maintain
or re-establish rapport by switching from
goal mode to chat mode.
Some mechanisms and tactics for simulating
these things will be illustrated from work
in progress.
Speaker: Satoru Miyano (represented by Hideo Bannai)
Title: HypothesisCreator: Concepts for Accelerating the Computational Knowledge Discovery Process
Abstract:
We summarize and discuss the work accomplished through the
HypothesisCreator (HC) project, an ongoing effort whose aim is to
develop systematic methods to accelerate the computational knowledge
discovery process.
A novel approach to describing the knowledge discovery process, focusing
on a generalized form of attribute called a view, is presented. It
is observed that the process of knowledge discovery can, in fact, be
modeled as the design, generation, use, and evaluation of views,
asserting that views are the fundamental building blocks in the
discovery process. Also, we note that the trial-and-error cycle
inherent in any knowledge discovery process can be formulated as the
composing and decomposing of views. To assist this cycle, a
programming language called VML (View Modeling Language) was designed,
providing facilities for this purpose. VML is designed as an
extension of the functional language ML with the added constructs 'view'
and 'vmatch', which are used for defining and decomposing
views. We describe, as VML programs, successful knowledge discovery
tasks which we have actually experienced, and show that computational
knowledge discovery experiments can be efficiently developed and
conducted using this language.
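To give a feel for the idea (in Python rather than VML), the sketch below treats a "view" as a generalised attribute, i.e. a function from raw objects to values, and builds a composite view from simpler ones. The particular views and the example sequence are invented for illustration.

    def compose(*views):
        """Build a new view by piping an object through simpler views in order."""
        def composed(obj):
            for view in views:
                obj = view(obj)
            return obj
        return composed

    # Elementary views on a protein sequence (illustrative only).
    window = lambda k: (lambda seq: seq[:k])                # first k residues
    hydrophobic_fraction = lambda seq: sum(c in 'AVLIMFWC' for c in seq) / len(seq)
    threshold = lambda t: (lambda x: x >= t)

    # A composite Boolean view: "is the N-terminal window mostly hydrophobic?"
    n_terminal_signal = compose(window(20), hydrophobic_fraction, threshold(0.5))
    print(n_terminal_signal('MALWMRLLPLLALLALWGPDPAAA'))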
The original idea for VML, however, was only informally described and
therefore contained subtle problems both in theory and in practice.
This was settled by Sumii and Bannai (2001), who gave the formal
foundation of VML by defining VMlambda, an extension of the standard
simply typed call-by-value lambda-calculus.
Speaker: Shinichi Morishita
Title: Enumerating Alternative Spliced Transcripts for Understanding Complexity of the Human Genome
Abstract:
According to several recent reports, the number of human genes is estimated
at fewer than 40,000, which is, for example, only about double the 19,000 of
Caenorhabditis elegans. This indicates that the correlation between the
complexity of a species and its gene number appears to be loose, motivating
studies on other sources of complexity such as alternative splicing and
cis-regulatory machinery. Machine learning would be useful for uncovering
the relevance of those mechanisms to biological consequences, but the first
step in this research direction is to list candidate alternatively spliced
transcripts and promoters. Given the release and ongoing updates of the
human genome, this enumeration can be achieved by aligning millions of
expressed sequence tags (ESTs, for short) to a draft human genome and then
organizing the alignments into related groups, which is, however,
computationally costly.
We have therefore developed an efficient algorithm for this purpose, taking
only one day to align millions of ESTs to a newly revised draft genome from
scratch. Analysis of 2.2 million alignments identifies about 8,000 groups of
alternatively spliced transcripts. A typical group contains tens of
alignments, demanding a method of selecting representatives. All the results
are accessible through the WWW at our Gene Resource Locator web site
"http://grl.gi.k.u-tokyo.ac.jp/".
Speaker: Stephen Muggleton
Title: Stochastic Search in Progol
Abstract:
Most search techniques within ILP are based on clause
refinement. Such searches are typically time-consuming, requiring the
testing of a large number of clauses. Acceptable clauses typically
need to be consistent, and are only found at the "fringe" of the
search. In this paper we show that on a range of learning problems
around 98% of the clauses considered by a system such as Progol are
inconsistent. A search approach is presented, based on a novel algorithm
called QG (Quick Generalisation). QG carries out a random-restart stochastic
bottom-up search which efficiently generates a consistent clause on the
fringe of the refinement graph search without needing to explore the
graph in detail. An experimental version of QG has been implemented.
Initial experiments with QG indicate that when the proportion of
consistent clauses is small, QG is far more efficient than standard
refinement-graph searches, while still generating similar solutions.
QG has been incorporated into an Analogical Prediction strategy in Progol4.6
and has been compared with Progol's standard generalisation search on
a beta-turn protein prediction problem. The resulting QG-based approach achieves
predictive accuracies comparable to those of standard Progol with a tenfold
reduction in total training and test time.
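The sketch below conveys the flavour of a random-restart, bottom-up generalisation of this kind; it is not the published QG algorithm. A clause is modelled abstractly as a set of literals, and the coverage test covers(clause, example) is assumed to be supplied by the caller.

    import random

    def generalise_once(bottom_clause, covers, negatives):
        """Greedily drop literals in random order while the clause stays consistent."""
        clause = set(bottom_clause)
        literals = list(clause)
        random.shuffle(literals)
        for literal in literals:
            candidate = clause - {literal}
            if not any(covers(candidate, e) for e in negatives):
                clause = candidate              # still consistent, so keep the drop
        return clause

    def stochastic_generalisation(bottom_clause, covers, positives, negatives, restarts=20):
        """Random restarts; return the consistent clause covering the most positives."""
        best_score, best_clause = -1, set(bottom_clause)
        for _ in range(restarts):
            clause = generalise_once(bottom_clause, covers, negatives)
            score = sum(covers(clause, e) for e in positives)
            if score > best_score:
                best_score, best_clause = score, clause
        return best_clause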
Speaker: Nils Nilsson
Title: Teleo-Reactive Programs and the Triple-Tower Architecture
Abstract:
I describe how a new programming format is used in a proposed
architecture for linking perception and action in a robot.
Along the way, I mention some analogies with control
mechanisms in animals. The programming format is called
'teleo-reactive' because it allows programs to react
dynamically to changing situations in ways that lead
inexorably toward their goals. The proposed architecture
consists of three "towers" of layered components. The
"perception tower" consists of rules that create ever-higher
descriptions of the environmental situation starting
with the primitive predicates produced by the robot's
sensory apparatus. These descriptions, themselves high-level
predicates, are deposited in a "model tower" which is
continuously kept faithful to the current environmental
situation by a "truth-maintenance" system. The predicates
in the model tower, in turn, evoke appropriate action-producing
programs in the "action tower." The action tower consists
of teleo-reactive programs that are organized more-or-less
hierarchically. Execution of the lowest-level action
programs causes the robot to take primitive actions in
its environment. The effects of the actions are sensed
by the robot's sensory mechanism, completing a sense-model-act
cycle that is quiet only at those times when the robot's
goal is satisfied. I demonstrate the operation of the
architecture using a simple block-stacking task.
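A minimal sketch of the teleo-reactive idea is given below: an ordered list of condition -> action rules is re-evaluated continuously against the current model, and the action of the first satisfied condition is executed. The block-stacking predicates and action names are illustrative assumptions only.

    def tr_step(program, model):
        """Return the action of the first rule whose condition holds in the model."""
        for condition, action in program:
            if condition(model):
                return action
        raise RuntimeError('no applicable rule')

    # A tiny program for the goal "block a is on block b".
    program = [
        (lambda m: m['on'] == ('a', 'b'), 'idle'),              # goal already satisfied
        (lambda m: m['holding'] == 'a',   'put a on b'),
        (lambda m: m['holding'] is None,  'pick up a'),
        (lambda m: True,                  'put down held block'),
    ]

    model = {'on': None, 'holding': None}
    print(tr_step(program, model))    # -> 'pick up a'

Run in a loop against a continuously updated model, the first rule's condition doubles as the goal test, which is what makes the behaviour goal-directed as well as reactive.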
Speaker: Philip Reiser
Title: Developing a Logical Model of Yeast Metabolism
Abstract: With the completion of the sequencing of genomes of an increasing
number of organisms, the focus of biology is moving to determining the
role of these genes (functional genomics). To this end it is useful
to view the cell as a biochemical machine: it consumes simple
molecules to manufacture more complex ones by chaining together
biochemical reactions into long sequences referred to as
metabolic pathways. Such metabolic pathways are not linear but often
intersect to form a complex network. Genes play a fundamental role in
this network by synthesising the enzymes that catalyse biochemical
reactions. Although developing a complete model of metabolism is of
fundamental importance to biology and medicine, the size and
complexity of the network has proven beyond the capacity of human
reasoning. This paper presents intermediate results in the Robot
Scientist research programme that aims to discover the function of
genes in the metabolism of the yeast Saccharomyces cerevisiae.
Results include: (1) the first logical model of metabolism; (2) a
method to predict phenotype by deductive inference; and (3) a method
to infer reactions and gene function by abductive inference. We
describe the in vivo experimental set-up which will allow these
in silico inferences to be automatically tested by a laboratory
robot.
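The toy sketch below illustrates the deductive step (2) above: reactions catalysed by gene products are chained forward from the growth medium, and a knockout is predicted to grow only if the needed end product remains derivable. The gene names, metabolites and reactions are invented; the paper's logical model is much richer and also supports the abductive step (3).

    reactions = [
        # (gene, substrates, product) -- an invented toy pathway
        ('g1', {'glucose'}, 'A'),
        ('g2', {'A'}, 'B'),
        ('g3', {'B'}, 'tryptophan'),
        ('g4', {'A'}, 'tryptophan'),   # an alternative route
    ]

    def reachable(medium, knocked_out=()):
        """Deduce all metabolites derivable when the given genes are deleted."""
        have = set(medium)
        changed = True
        while changed:
            changed = False
            for gene, substrates, product in reactions:
                if gene not in knocked_out and substrates <= have and product not in have:
                    have.add(product)
                    changed = True
        return have

    def grows(medium, knocked_out=()):
        return 'tryptophan' in reachable(medium, knocked_out)

    print(grows({'glucose'}, knocked_out={'g3'}))            # True: route via g4
    print(grows({'glucose'}, knocked_out={'g3', 'g4'}))      # False: predicted auxotroph
    print(grows({'glucose', 'tryptophan'}, {'g3', 'g4'}))    # True: supplemented medium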
Speaker: Zhanna Reznikova
Title: How to study ants' numerical competence using their
own communicative means and ideas of information theory
Abstract:
The main point of the proposed approach to studying ants'
cognitive abilities is that our experiments provide a situation
in which the insects have to transmit information that is
quantitatively known to the experimenter in order to obtain food.
One may estimate some properties of ant intelligence by measuring
the complexity of the tasks they solve in order to pass definite
pieces of information from scouts to foragers.
Our previous experiments, based on ideas from Information
Theory, have shown that ants are able to memorize and transmit
messages concerning a sequence of turns toward a trough of syrup
and to use the simplest regularities to compress the information.
To reveal counting and number-related skills, we required red
wood ants (Formica polyctena) to transmit information on the
number and coordinates of objects. One of the experimental
set-ups consisted of a "tree trunk" with branches that ended in
empty troughs, except for one which was filled with syrup.
Another set-up consisted of a lattice which simulated Cartesian
coordinates. The foragers of F. polyctena were separated into
teams of 5-8 individuals, each with one scout. All laboratory
ants were marked with coloured labels.
To start the experiment, an ant scout was placed at a randomly
chosen trough containing food and then returned to the nest on
its own. The duration of the contact between the foragers and the
scout was measured. Then we removed the scout and the foragers
had to search for the food by themselves. The experiments were
devised so as to eliminate all possible cues that might help the
ants to find the food,
except for distant homing. It turns out that the ants are able
to count within several tens and to transmit this information to
their nestmates. The analysis of the duration of the ants'
contacts enables us to form a hypothesis about how they use
numbers and coordinates in their communication.
We suppose that only a few highly social ant species
use such a complex communication system based on cognitive
processes. At the same time, we believe that the experimental
schemes described can be used to study the communication
systems and numerical competence of other animals.
Speaker: Boris Ryabko
Title: The nonprobabilistic approach to learning the best prediction
Abstract:
The problem of predicting a sequence x_1, x_2, ..., where each x_i belongs
to a finite alphabet A, is considered. Each letter x_{t+1} is predicted
using information on the word x_1, x_2, ..., x_t only. We use the
game-theoretical interpretation, which can be traced back to Laplace, in
which there is a gambler who tries to estimate probabilities for the letter
x_{t+1} in order to maximize his capital. The optimal method of prediction
is described for the case when the sequence x_1, x_2, ... is generated by
a stationary and ergodic source. It turns out that the optimal method is
based only on estimations of conditional probabilities. In particular, this
means that if we work in the framework of the ergodic and stationary source
model, we cannot consider pattern recognition and other complex and
interesting tools, because even the optimal method does not need them. That
is why we suggest a so-called nonprobabilistic approach which is not based
on the stationary and ergodic source model, and show that complex
algorithms of prediction can be considered in the framework of this
approach.
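As a point of reference for the probabilistic setting just described, the sketch below implements a plain Laplace-style estimate of the conditional probabilities of the next letter given a short context; the optimal predictor analysed in the paper is more refined, and the context length of one is an arbitrary choice.

    def predict_next(sequence, alphabet, context_len=1):
        """Laplace-style estimate of P(next letter | last context_len letters)."""
        context = sequence[-context_len:] if context_len else ''
        counts = {a: 1 for a in alphabet}      # Laplace: start every count at one
        for i in range(context_len, len(sequence)):
            if sequence[i - context_len:i] == context:
                counts[sequence[i]] += 1
        total = sum(counts.values())
        return {a: c / total for a, c in counts.items()}

    print(predict_next('abababab', alphabet='ab'))   # 'a' has always followed 'b'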
The new approach is to consider the set of all infinite sequences (over a
given finite alphabet) and to estimate the size of sets of predictable
sequences with the help of the Hausdorff dimension.
This approach enables us, first, to show that there exist large sets of
well-predictable sequences which have zero measure under every stationary
and ergodic measure. (In effect, this means that such sets are invisible
within the framework of the ergodic and stationary source model, which
shows the necessity of the new approach.)
Second, it is shown that there exist quite large sets of such sequences
that can be predicted well by complex algorithms which use more than
estimations of conditional probabilities.
Speaker: Claude Sammut
Title: Managing Context in a Conversational Agent
Abstract:
Traditional methods for Natural Language Processing have failed to
deliver the expected performance required in a Conversational Agent.
This is mainly due to the fact that speakers rarely use grammatically
complete and correct sentences in conversation. Furthermore, speakers
make significant assumptions about the background knowledge of the
listener. Thus, pragmatic knowledge about the context of the
conversation turns out to be a much more important factor in
understanding an utterance than traditional linguistic analysis. For
this reason, the majority of conversational systems being implemented
favour shallow parsing techniques. The challenge for such systems is in
managing information about the context of the conversation.
This paper describes a conversational agent, called "ProBot", that uses
a novel structure for handling context. The ProBot is implemented as a
rule-based system embedded in a Prolog interpreter. The rules consist of
patterns and responses, where each pattern matches a user's input
sentence and the response is an output sentence. Both patterns and
responses may have attached Prolog expressions that act as constraints
in the patterns and can invoke some action when used in the response.
Rules are grouped into sets, called contexts. There is always a
"current" context. This is the set of rules that is considered most
applicable to the current state of the conversation. A user's input
is processed by attempting to match the sentence with the pattern in
each rule. The first match causes the corresponding rule's response to
be invoked. In addition to outputting an appropriate response, the rule
can also invoke a Prolog program. One such program switches the current
context. We can represent the course of a conversation as the
traversal of a graph in which each node is a context. The rules of the
current context are used to carry on the conversation for some period;
a transition to a new context corresponds to the traversal of an arc
from one node in the graph to another.
While this scheme works well for a highly structured conversation, it
fails to take into account sentences that either change the topic of the
conversation or contain expressions that the rules of the current
context are not able to handle. To avoid these problems, we introduce
"filters" and "backups". A filter is a set of rules that
is prepended to the current context. That is, the filter rules are
examined for a match before any of the rules in the current context.
Typically, filter rules contain keywords that assist the system in
guessing that a user is changing the topic of the conversation. When a
filter rule matches, the usual response is to switch the current
context. Backup rules are appended to the current context. Their
function is to catch user inputs that do not match any rules in the
current context. Usually backup rules produce responses that temporise,
provoking the user into providing further input that may be recognised
by the system.
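The fragment below is a much-reduced sketch of this organisation: pattern -> response rules grouped into named contexts, with filter rules tried before and backup rules after the current context. The patterns here are bare regular expressions and the scripts are invented, unlike ProBot's pattern language and Prolog-attached rules.

    import re

    contexts = {
        'weather':  [(r'\brain\b', "Take an umbrella."),
                     (r'\bsunny\b', "Don't forget sunscreen.")],
        'football': [(r'\bscore\b', "It finished two-nil.")],
    }
    filters = [(r'\bfootball\b', ('switch', 'football'))]   # topic-change detectors
    backups = [(r'.*', "Really? Tell me more.")]            # catch-all temporiser

    current = 'weather'

    def respond(sentence):
        global current
        for pattern, response in filters + contexts[current] + backups:
            if re.search(pattern, sentence, re.IGNORECASE):
                if isinstance(response, tuple) and response[0] == 'switch':
                    current = response[1]                   # change the current context
                    return "OK, let's talk about " + current + "."
                return response

    print(respond("Will it rain today?"))        # matched by the current context
    print(respond("Did you see the football?"))  # filter rule switches context
    print(respond("What was the score?"))        # now handled by the football rules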
In this paper, we give a detailed discussion of the architecture of the
ProBot and explain the origins of this design. We also give some
indication of future work. In particular, writing scripts for
conversational agents is extremely laborious, but automating the
construction of scripts through Machine Learning is very challenging. We
discuss some of these challenges.
Speaker: Erik Sandewall
Title: On the Design of Software Individuals
Abstract:
We use the term 'software individuals' to designate aggregates of
programs and data that have the following properties:
- They exist in a population of similar, but not identical
individuals.
- Individuals are able to interact with their surrounding
environment, with each other, and/or with people. While doing
so they may modify their internal state.
- Each individual contains the safeguards that may be required
in order to select which influences to accommodate and which
ones to ignore.
- The aggregate of programs and data that define an individual,
and that in particular define its behavior, is a part of its
internal state and can therefore be modified as the result of
the interactions where the individual is involved.
- Individuals or small groups of individuals are able to create
new individuals that inherit the features of their parent(s).
- The program/data aggregate that defines an individual is
symbolic in character. The ability for knowledge representation
is designed into individuals from the start.
The last item distinguishes software individuals from the software
that is evolved by genetic programming.
In this article we address the question of design principles for
software individuals, and approach it as a software design issue.
This problem is of course an unconventional one from the point
of view of ordinary software engineering. The emphasis on
software self-modification runs counter to conventional software
design principles, where the inherent ability of the von Neumann
machine to modify its own programs is strongly restricted in
particular through the design of conventional programming
languages. In our approach, program self-modification is viewed
as an important possibility, and not as an aberration.
Our proposed Software Individual Architecture is characterized by
the following principles:
- The use of a Software Individual Kernel (SIK) which is a minimal
individual having the basic properties of self-replication,
self-modification, interaction, and ability to administrate
its own state.
- The use of a Knowledge Object Repository that is accessible
by, but external to software individuals, and that can be shared
by several individuals in a population. This repository is also
the basis for several of the user services that are built using
the architecture.
- Each Software Individual Kernel is set up as a directory
structure (directory and all its sub-directories) in the file
system of a particular computer host. A population may 'live'
within one host, or in a network of several hosts. Each individual
is self-contained in this structure and each carries with itself
(i.e., contains) a full set of the required programs and data
defining it.
- An individual can acquire additional features, besides those
present in its Kernel, by adopting additional program and data
modules, which either are sent to it by other individuals, or
by importing them from the Repository. A number of previously
developed knowledge management services have been ported so that
they can be acquired by the new Software Individual Kernel.
- We have worked on the Software Individual Kernel so as to achieve
maximal simplicity and generality.
The full paper will describe the design of the Software Individual
Kernel. It will also discuss how the design and use of this kind
of software differs from the design of conventional software.
Finally it will discuss the relevance of Software Individuals
as the basis for implementing A.I. systems.
Speaker: Steffen Schulze-Kremer
Title: Ontologies for Molecular Biology
Abstract:
Molecular biology has a communication problem. There are many databases
using their own labels and categories for storing data objects and others
using identical labels and categories but with a different meaning. One
prominent example is the concept "gene" which is used with different
semantics by major international genomic databases. Ontologies are one
means to provide a semantic repository to systematically order relevant
concepts in molecular biology and to bridge the different notions in
various databases by explicitly specifying the meaning of and relation
between the fundamental concepts in an application domain. Here, design
principles for ontology building, together with the upper level and a
database branch of a prospective ontology for molecular biology, are
presented and compared to other ontologies with respect to their
suitability for molecular biology
(http://igd.rz-berlin.mpg.de/~www/oe/mbo.html).