Index of /~shm/Software/ase_progol/version_1.0

Name                 Last modified     Size
CONDITIONS_ON_USE    2008-08-07 14:23  935
README.html          2008-08-07 09:51  8.0K
ase_progol           2008-08-07 09:51  16K
ask_oracle.pl        2008-08-07 09:51  925
classify.pl          2008-08-07 09:51  2.5K
fast_forward.pl      2008-08-07 09:51  2.1K
max_compressions.pl  2008-08-07 09:51  1.6K
one_line_clauses     2008-08-07 09:51  962
trial_generator.pl   2008-08-07 09:51  5.5K
trial_selector.pl    2008-08-07 09:51  9.3K

ASE-Progol Version 1.0

This system is described in the paper:

  Combining Inductive Logic Programming, Active Learning and Robotics to
  Discover the Function of Genes
  C.H. Bryant and S.H. Muggleton and S.G. Oliver and D.B. Kell and
  P. Reiser and R.D. King
  2001,
  Electronic Transactions on Artificial Intelligence, 5(B), pp1-36

INSTALLATION

Download the following files from the directory containing this README
file and place them in a single directory.

ase_progol
ask_oracle.pl
classify.pl
fast_forward.pl
max_compressions.pl
trial_generator.pl
trial_selector.pl
one_line_clauses

Edit the file ase_progol as follows:

Change the line near the top of this file beginning
     set code_dir = 
to 
     set code_dir = X
where X is the full pathname of the directory containing the above code.

Change the line near the top of this file beginning
     set progol = 
to 
     set progol = Y
where Y is the full pathname of the directory containing source code
for CProgol.
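
The two edits above can also be scripted. The following is a minimal
sketch, not part of the distribution: the function name
configure_ase_progol is hypothetical, and it assumes that each "set"
line sits on a line of its own and that the new pathnames contain no
"|", "&" or "\" characters.

```shell
# Hypothetical helper: rewrite the "set code_dir = " and "set progol = "
# lines of the script named by $1.
#   $1 = path to ase_progol
#   $2 = X, the directory containing the ASE-Progol code
#   $3 = Y, the directory containing the CProgol source code
configure_ase_progol() {
    script=$1
    code_dir=$2
    progol_dir=$3
    tmp="$script.tmp.$$"
    sed -e "s|^\([[:space:]]*set code_dir = \).*|\1$code_dir|" \
        -e "s|^\([[:space:]]*set progol = \).*|\1$progol_dir|" \
        "$script" > "$tmp" || return 1
    # Copy the contents back so the original file keeps its permissions.
    cat "$tmp" > "$script" && rm -f "$tmp"
}
```

For example, "configure_ase_progol ./ase_progol /home/me/ase_progol
/home/me/cprogol", where both pathnames are placeholders.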

To run ASE-Progol, execute the file "ase_progol". This is a Unix shell
script which makes calls to CProgol. The options described below may
be used by adding them to the Unix command line after "ase_progol".

OPTIONS

-d data_directory
	where data_directory is the pathname of the directory
	containing the domain files (see below). Either a full or
	relative pathname may be used. If a relative pathname is used
	then it should be relative to the directory from which
	ASE-Progol is called. If this option is not used then, by
	default, the data_directory is taken to be the directory from
	which ASE-Progol is called.

-scroll
	This option causes ASE-Progol to ask the user at the start of
	each cycle whether to continue the execution. (This allows the
	user to examine the detailed files /tmp/*log which are only
	kept for the current loop.)

-l max_iterations_of_CLML_cycle
	where max_iterations_of_CLML_cycle is the maximum number of
	iterations of the CLML cycle, in other words a limit on the
	number of trials which may be physically performed.

-c trial_costs_limit
	where trial_costs_limit is a limit on the experimental resources
	which may be consumed by ASE-Progol.

-robot
	This option causes ASE-Progol to call the robot to get the
	results of trials. If this option is not used then ASE-Progol
	asks an Oracle instead.

-s
	Size of unification stack used by CProgol.

-random
	This option causes ASE-Progol to select trials at random.

-naive
	This option causes ASE-Progol to naively select the cheapest
	trial from the set of candidate trials.

Normally ASE-Progol will instruct the robot to perform the trial whose
outcome will provide the highest discrimination between the candidate
hypotheses. The random and naive options can be used to obtain
benchmarks against which the normal performance of ASE-Progol can be
measured.

MODES

ASE-Progol can operate in any one of three modes, namely ua (unaided),
hs (head-start) or ff (fast-forward).

Head-start mode allows ASE-Progol to utilise a set of laboratory
results obtained prior to execution without the use of ASE-Progol. In
Unaided mode there is no such set of results.

Fast-forward mode allows ASE-Progol to utilise the results from a
previous execution. In fast-forward mode execution recommences at a
specified iteration of the CLML cycle. ASE-Progol enters the loop at
the point where a new example has just been created. Fast-forward mode
requires a trace of the examples and hypotheses generated during a
previous execution.

The option -m is used to specify the desired mode of operation. It is
used by adding one of the following to the Unix command line after
"ase_progol".

-m ua

-m hs hs_lab_results_file

	where hs_lab_results_file is a file which takes the same form
	as lab_results.pl (see below). (All these results must be
	assigned the loop number 0; this allows ASE-Progol to
	distinguish these results from those that will be obtained
	during the forthcoming execution.)

-m ff start_loop ff_lab_results_file ff_hyps_file

	where start_loop is the iteration of the CLML cycle of the
	previous execution at which execution recommences, and
	ff_lab_results_file and ff_hyps_file are files which take the
	same form as the lab_results.pl and hypotheses.pl files
	respectively (see below).
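
The three modes might be invoked as in the following sketch. It is an
illustration only: the function names to_head_start and
run_ase_progol, the file name hs_lab_results.pl and every path are
hypothetical. It assumes that ase_progol is in the current directory,
that -m may follow -d on the command line, and that each example/3
fact in a lab_results.pl-style file starts at the beginning of a line.
This README does not say whether the file arguments of -m are resolved
against the data directory; the sketch passes them relative to the
calling directory.

```shell
# Print a copy of the lab_results.pl-style file $1 in which every loop
# number is 0, as head-start mode requires.
to_head_start() {
    sed -e 's/^example([[:space:]]*[0-9][0-9]*[[:space:]]*,/example(0,/' "$1"
}

# Run ASE-Progol in one mode.
#   $1 = mode (ua, hs or ff)
#   $2 = data directory
#   $3 = hs: file of earlier laboratory results
#        ff: directory holding lab_results.pl and hypotheses.pl from a
#            previous execution
#   $4 = ff only: iteration at which to recommence
run_ase_progol() {
    mode=$1
    data_dir=$2
    case $mode in
        ua)
            ./ase_progol -d "$data_dir" -m ua
            ;;
        hs)
            to_head_start "$3" > hs_lab_results.pl &&
                ./ase_progol -d "$data_dir" -m hs hs_lab_results.pl
            ;;
        ff)
            ./ase_progol -d "$data_dir" -m ff "$4" \
                "$3/lab_results.pl" "$3/hypotheses.pl"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 2
            ;;
    esac
}
```

For example, "run_ase_progol ua ./my_domain" starts an unaided run and
"run_ase_progol ff ./my_domain ./previous_run 5" recommences at
iteration 5 of an earlier run.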

DOMAIN FILES

All the data for a particular domain must be kept in a single separate
directory.  This directory will contain the following files.

static_know.pl
	Prolog code representing static knowledge about the domain,
	including cost(Trial, Cost) definitions for the domain.

trials.pl
	trial/1 facts, where the argument is a ground unit clause
	representing a trial 
	e.g. trial(phenotypic_effect(gene_d, [m1, m2])).

slp.pl
	A Prolog program containing a definition of the predicate
	sample_trials(Quantity,List_of_trials,Trial_selection_method).
	The Trial Generator component calls this program on each
	iteration of the loop in order to generate a set of candidate
	trials.

	Values of the terms Quantity and Trial_selection_method are
	input to the program and the value of the term List_of_trials
	is output.  Quantity is the number of elements in
	List_of_trials.  Trial_selection_method takes the value
	'normal' or 'random'.

	The purpose of the term Trial_selection_method is to give the
	program the option of behaving differently according to which
	method of selecting trials is selected when ASE-Progol is
	executed. One possible use of this option would be to provide
	a definition of sample_trials(Quantity, List_of_trials,
	normal) which utilises domain knowledge and a definition of
	sample_trials(Quantity, List_of_trials, random) which does
	not.

	The program may use Stochastic Logic Programming. If so, and
	the program uses the system predicate sample/3, then the
	predicate randomseed/0 must be executed first. Otherwise,
	each time CProgol is restarted, each call to
	sample(trial_pred, Quantity, List_of_trials) generates the
	same list of trials, assuming that the definition of
	trial_pred remains constant.

lab_results.pl
	example(Loop_no, Trial, Class) facts, where Trial is a ground
	unit clause representing a trial, Loop_no is the iteration of
	the CLML loop in which the example was generated and Class is
	either positive or negative.

examples.pl
	Positive and negative instances of a predicate representing a
	trial.
	e.g. phenotypic_effect(gene_d, [m1, m2]).
	and
	     :- phenotypic_effect(gene_d, [m2]).
	NB These are represented in the usual CProgol syntax.

learning.pl
	CProgol mode and type declarations, constraints etc.

hypotheses.pl 
	hypoth(Loop_no, [Head,Body], Compression) facts, where the
	second term is a list with two elements representing a
	hypothesis, Loop_no is the iteration of the CLML loop
	in which the hypothesis was generated and the third term is
	the compression for that clause output by CProgol.
	e.g. hypoth(1, [codes(gene_d,enzyme_d), (true)], 9).

classifications.pl
	The output of the Classifier, i.e.
	 matrix_cell(H, Compression, Trial, Cell_value)
	facts which record how each trial is classified by each
	hypothesis.
	e.g. matrix_cell([codes(gene_d,enzyme_d), (true)], 9,
	                 phenotypic_effect(gene_d,[m1,m2]), 1).

oracle.pl
	This file is only needed when ASE-Progol is to ask an Oracle,
	as opposed to a robot, for the outcome of trials. The oracle
	takes the form of a file which should contain either an
	intensional or extensional definition of the observable
	predicate.  An extensional definition should represent the
	outcome for every possible trial by a set of positive and
	negative examples in the usual Progol syntax.
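
Before a run it may help to confirm that a data directory holds the
hand-written domain files. The following is a minimal sketch; the
function name check_domain_dir is hypothetical. This README does not
say which of the files listed above are inputs and which are produced
during a run, so the sketch assumes that static_know.pl, trials.pl,
slp.pl and learning.pl are the ones a user must supply. Adjust the
list if that assumption is wrong, and add oracle.pl when the -robot
option is not used.

```shell
# Hypothetical helper: report any assumed-required domain file that is
# missing from the directory $1. Returns 0 if all are present, 1 if not.
check_domain_dir() {
    dir=$1
    missing=0
    for f in static_know.pl trials.pl slp.pl learning.pl; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, "check_domain_dir ./my_domain && ./ase_progol -d
./my_domain", where ./my_domain is a placeholder.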

LIMITATIONS

Neither the cost of an individual trial nor the limit on the
experimental resources which may be consumed by ASE-Progol may exceed
2^31 - 1 (2147483647): this is the largest integer which can be
represented.
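
A value can be checked against this bound before it is passed to the
-c option or written into a cost/2 definition. The following is a
minimal sketch; the function name within_limit is hypothetical, and it
assumes that costs are written as plain non-negative decimal integers
without leading zeros.

```shell
MAX_INT=2147483647   # 2^31 - 1

# Hypothetical helper: succeed if $1 is a non-negative integer that
# does not exceed 2^31 - 1.
within_limit() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    # More than 10 digits cannot fit; rejecting them here also keeps
    # the numeric comparison below within the shell's own range.
    [ "${#1}" -le 10 ] || return 1
    [ "$1" -le "$MAX_INT" ]
}
```

For example, "within_limit 5000 && ./ase_progol -c 5000", where 5000
is an arbitrary illustrative limit.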