TopLog manual


1. Load TopLog

To run TopLog you need a Prolog interpreter. TopLog currently supports Yap (version 5.1.3), Sicstus (version 3.12.3) and SWI (version 5.6.46); other versions of these interpreters might work as well. Running TopLog in Yap is strongly recommended, as it is much more efficient there than in the other interpreters (roughly 10x over Sicstus and 100x over SWI).

Download and extract TopLog from http://www.doc.ic.ac.uk/~jcs06/TopLog/. This will create a directory structure with 3 sub-directories and a run.pl file in the root directory.

The run.pl file is a simple script that demonstrates how to run TopLog on the examples. You should edit run.pl and the example files to better understand how TopLog works. Basically, the run.pl script does the following:

 
:- ['source/toplog'].
:- p_consult('examples/mutagenesis/mutagenesis').
:- modelCV.

You can also execute run.pl directly from the command line with yap -l run.pl, sicstus -l run.pl or plcon -l run.pl.
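As a sketch of the kind of edit suggested above, the following variant of run.pl trains on all the data instead of running cross validation. The file name run_train.pl is an assumption for illustration; the paths and predicates are the ones already used in run.pl and documented in the Commands section.

```prolog
% run_train.pl: hypothetical variant of run.pl (the file name is an assumption).
:- ['source/toplog'].                               % load TopLog itself
:- p_consult('examples/mutagenesis/mutagenesis').   % load the example problem
:- problem_stats.                                   % print basic dataset statistics
:- modelTrain.                                      % build a model using all data as training
```

It would be started the same way as run.pl, for example with yap -l run_train.pl.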


2. Commands

p_consult(+Filename)

Consults the file defined in the first argument. This should be used instead of consult/1.

set(?Setting, ?Value)

Gets or sets the value of a TopLog setting. The following settings are available:

maximum_proof_depth

Maximum depth of a proof. The default setting is 10. A proof that requires a greater depth than this setting is treated as a failure. This setting applies when testing the proofs TopLog constructs, and its main purpose is to avoid infinite loops.

minimum_literals_in_hypothesis

Minimum number of literals in the body of a hypothesis. By default it is 0.

maximum_literals_in_hypothesis

Maximum number of literals in the body of a hypothesis. This is equivalent to Aleph's clauselength setting, except that Aleph's clauselength also counts the head as a literal. By default the value is 3 (Aleph's default clauselength of 4 has the same meaning). Notice that for some problems this value needs to be increased.
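The off-by-one relation to Aleph can be sketched in plain Prolog. The predicate aleph_clauselength/2 is a hypothetical helper written for this illustration; it is not part of TopLog or Aleph.

```prolog
% Hypothetical helper, not part of TopLog or Aleph.
% aleph_clauselength(+MaxBodyLiterals, -ClauseLength)
% Converts a TopLog maximum_literals_in_hypothesis value into the
% Aleph clauselength value with the same meaning.
aleph_clauselength(MaxBodyLiterals, ClauseLength) :-
    integer(MaxBodyLiterals),
    MaxBodyLiterals >= 0,
    ClauseLength is MaxBodyLiterals + 1.   % Aleph also counts the head literal
```

For the TopLog default of 3, the query aleph_clauselength(3, L) binds L to 4, which is Aleph's default.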

minimum_singletons_in_hypothesis

Sets the minimum number of singletons in a constructed hypothesis. By singletons we mean variables that appear only once in the hypothesis. By default the value is 0.

maximum_singletons_in_hypothesis

Sets the maximum number of singletons in a constructed hypothesis. By singletons we mean variables that appear only once in the hypothesis. By default the value is 0. Notice that for some problems this value needs to be increased. Having to increase this value significantly is often a sign that the background knowledge could be rewritten in a more compact way. Changing this value has a significant impact on the number of hypotheses generated.
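To make the notion of a singleton concrete, the sketch below counts the variables that occur exactly once in a clause. It is an illustration in plain Prolog, not TopLog's own code; singleton_count/2 and its helpers are hypothetical names, and the atm/bond clause in the usage note is made up for the example.

```prolog
% singleton_count(+Clause, -N)
% N is the number of variables that occur exactly once in Clause.
singleton_count(Clause, N) :-
    term_variables(Clause, Vars),
    count_singletons(Vars, Clause, 0, N).

count_singletons([], _, N, N).
count_singletons([V|Vs], Clause, Acc0, N) :-
    occurrences(Clause, V, Occ),
    ( Occ =:= 1 -> Acc1 is Acc0 + 1 ; Acc1 = Acc0 ),
    count_singletons(Vs, Clause, Acc1, N).

% occurrences(+Term, +Var, -Count)
% Count is the number of times the variable Var occurs in Term.
occurrences(Term, Var, Count) :-
    var(Term), !,
    ( Term == Var -> Count = 1 ; Count = 0 ).
occurrences(Term, Var, Count) :-
    Term =.. [_|Args],
    occurrences_list(Args, Var, 0, Count).

occurrences_list([], _, Count, Count).
occurrences_list([A|As], Var, Count0, Count) :-
    occurrences(A, Var, CountA),
    Count1 is Count0 + CountA,
    occurrences_list(As, Var, Count1, Count).
```

For the clause (active(M) :- atm(M, A, c, C), bond(M, A, B, 2)) the variables C and B each occur once, so singleton_count/2 returns 2. Under this reading, such a hypothesis would need maximum_singletons_in_hypothesis to be at least 2.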

maximum_extra_variables_unbound_in_hypothesis_construction

This is an advanced setting that most users should not need to change. It defines the maximum allowed number of extra temporary unbound variables during the hypothesis construction stage. By default this value is 1. You should have a very good reason to change this value. The total number of unbound variables at any time during the hypothesis construction stage is the number of initially unbound variables plus this setting.

maximum_hypothesis_interpretations

Maximum number of times an interpretation of a recursive hypothesis may succeed. For non-recursive hypotheses the user should use the recall of the mode body declaration instead. By default this value is 1.

maximum_hypothesis_per_example

Maximum number of hypotheses a positive example may yield. By default this value is 500. You may want to increase this value for certain problems, but you will also notice that for many problems this limit is never reached. A value of 0 means generate all possible hypotheses.

maximum_examples_to_generate_hypothesis

Maximum number of positive examples used to generate the hypotheses (currently the first N). By default this value is 0, which means all positive examples are used. Notice that this setting is different from sampling, because all examples are still used to compute the hypothesis coverage.

maximum_hypothesis_to_generate_theory

Maximum number of unique hypotheses to collect in order to generate the final theory. By default this value is 50000. A value of -1 means all hypotheses.

minimum_hypothesis_compression

Prunes hypotheses whose compression is below or equal to this value at the example covering stage. This means that as soon as the negative coverage is too high, the coverage computation stops. The compression of a hypothesis is defined as:

(Positive Score + Negative Score) / (Number of Literals in Hypothesis)

The positive score is the total sum of the weights of the positive examples the hypothesis covers. The negative score is the total sum of the weights of the negative examples it covers (a negative value). The number of literals in the hypothesis is 1 + the body length. Notice that this compression is computed over the whole dataset and does not consider the folds. Besides being more efficient, pruning may (although this is unlikely) affect the final model and make it less prone to overfitting: a hypothesis whose compression would look good on the training data alone can still be pruned, and then cannot take part in the final model. By default this value is 1.0. To ignore it, use the value 'disable'.
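The compression formula and the pruning test can be sketched in plain Prolog as follows. This is an illustration of the definition above, not TopLog's implementation; hypothesis_compression/4 and pruned/2 are hypothetical names.

```prolog
% hypothesis_compression(+PosScore, +NegScore, +BodyLength, -Compression)
% NegScore is expected to be zero or negative, as in the definition above.
hypothesis_compression(PosScore, NegScore, BodyLength, Compression) :-
    NegScore =< 0,
    NumLiterals is 1 + BodyLength,                 % head plus body literals
    Compression is (PosScore + NegScore) / float(NumLiterals).

% pruned(+Compression, +Threshold)
% Succeeds when the compression is below or equal to the threshold.
pruned(Compression, Threshold) :-
    Compression =< Threshold.
```

With a positive score of 10, a negative score of -2 and 3 body literals, the compression is 8/4 = 2.0, which survives the default threshold of 1.0. With scores 5 and -2 and 2 body literals, the compression is exactly 1.0, so the hypothesis is pruned.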

noise

Noise is defined as abs(sum negative weights covered)/(abs(sum negative weights covered) + (sum positive weights covered)). Noise varies between 0.0 and 1.0. Hypotheses whose noise is above this value are removed. This is computed on a per-fold basis. Although it is possible to set noise to values higher than 0.5, such values would never yield a valid hypothesis in a theory. The default value for this setting is 0.5.

minacc

Minimum accuracy eliminates rules where abs(sum positive weights covered)/(abs(sum negative weights covered) + (sum positive weights covered)) is below this value. Minacc varies between 0.0 and 1.0. This is computed on a per-fold basis. Although it is possible to set minacc to values smaller than 0.5, such values would never yield a valid hypothesis in a theory. The default value for this setting is 0.5.
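The noise and minacc ratios share the same denominator, so they can be computed together. The sketch below is plain Prolog written for illustration, with hypothetical predicate names; it follows the two formulas above and is not taken from the TopLog sources.

```prolog
% coverage_ratios(+PosSum, +NegSum, -Noise, -Accuracy)
% PosSum: sum of positive weights covered.
% NegSum: sum of negative weights covered (may be given as a negative value).
coverage_ratios(PosSum, NegSum, Noise, Accuracy) :-
    Total is abs(NegSum) + PosSum,
    Total > 0,                                  % avoid division by zero
    Noise is abs(NegSum) / float(Total),
    Accuracy is abs(PosSum) / float(Total).

% passes_filters(+PosSum, +NegSum, +MaxNoise, +MinAcc)
% Succeeds when the rule is kept under both settings.
passes_filters(PosSum, NegSum, MaxNoise, MinAcc) :-
    coverage_ratios(PosSum, NegSum, Noise, Accuracy),
    Noise =< MaxNoise,        % rules above the noise value are removed
    Accuracy >= MinAcc.       % rules below minacc are removed
```

For a positive sum of 8 and a negative sum of -2, the noise is 0.2 and the accuracy is 0.8, so the rule passes the default values of 0.5 for both settings. When the positive sum is positive the two ratios add up to 1, which is why 0.5 is the natural boundary for both.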

minpos

Minimum positive score a rule must have to be considered valid. This is computed on a per-fold basis. The default value is 2.

maxneg

Maximum negative score a rule may have and still be considered valid. This is computed on a per-fold basis. By default it is inf, meaning infinite.

sample

Instructs TopLog to use only a sample of the examples to build hypotheses and evaluate their coverage. By default it is 1.0, which means all examples are used. This setting must be >0.0 and <=1.0.

example_inflation

The example inflation value is multiplied by the weight of each individual example. This may be useful when we have few examples but still want to generate rules. If there are few examples it is possible no rules are generated because the number of literals in an example is higher than the positive minus negative coverage. By default this value is 1. Notice that if we set this value to a negative number the positive and negative examples are swapped.

cross_validation_folds

Number of folds for doing cross validation. By default it is 10.

verbose

Verbose setting. 0 shows minimal information (basically just the overall results). 1 shows percentage completion information, 2 shows example and hypothesis by hypothesis information and 3 gives even further detail. The default is 2.
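A minimal sketch of how such settings might be changed through set/2, for example in run.pl. It assumes that set/2 can be called as a directive once TopLog has been loaded; whether settings should be placed before or after p_consult/1 is not stated in this manual, so treat the placement as an assumption.

```prolog
% Assumes TopLog has already been loaded with :- ['source/toplog'].
:- set(cross_validation_folds, 5).     % use 5 folds instead of the default 10
:- set(verbose, 1).                    % show only percentage completion information
:- set(verbose, V), write(V), nl.      % with an unbound value, set/2 reads the setting back
```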

star_modeb_times

Defines the number of times that a star '*' in the modeb/modet definitions is taken to stand for. This setting exists only for compatibility with other ILP systems (e.g. Aleph and Progol) and is not particularly useful in TopLog.

evalfn

Defines the evaluation function of the final theory. Possible values are:

compression

The compression of a hypothesis is defined as 'positive score - negative score - size of hypothesis'. This measure generates few, general rules. It is the default and the best for classifying unseen data.

laplace

The Laplace measure is defined as (PosScore+1)/(PosScore+NegScore+2). This generates many rules and overfits as much as possible.

coverage

Coverage is simply PosScore-NegScore. It is identical to compression except that it does not take the hypothesis size into account. It is a middle ground between compression and laplace and overfits moderately.

The more general evaluation functions will also allow the optimization algorithm to be faster because fewer rules are needed in the final theory. The default value is compression.
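The three evaluation functions can be compared side by side with a small sketch in plain Prolog. The predicate evaluate/5 is a hypothetical name used only here, and the negative score is assumed to be given as a non-negative magnitude, since the formulas above subtract it.

```prolog
% evaluate(+EvalFn, +PosScore, +NegScore, +Size, -Value)
% NegScore is assumed to be a non-negative magnitude in this sketch.
evaluate(compression, Pos, Neg, Size, Value) :-
    Value is Pos - Neg - Size.
evaluate(laplace, Pos, Neg, _Size, Value) :-
    Value is (Pos + 1) / float(Pos + Neg + 2).
evaluate(coverage, Pos, Neg, _Size, Value) :-
    Value is Pos - Neg.
```

For a rule with positive score 10, negative score 2 and size 4, compression gives 4, coverage gives 8, and laplace gives 11/14, roughly 0.786. Only compression changes when the rule gets longer, which is what pushes it towards shorter, more general rules.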

modeb/4

Specifies the mode body declarations. It has 4 arguments. The first is the number of times the predicate may appear in the body of a hypothesis (default 1). The second is the signature of the predicate, in the format predName(?Type1,..,?TypeN), where ? is one of the symbols '+' for input, '-' for output or '#' for constant. The third is the argument mode, one of 'no_vars_rep' (default), 'vars_rep' or 'commutative': 'no_vars_rep' means input variables may not be repeated, and 'commutative' means the order of the input variables is irrelevant. The fourth is the recall, which is the number of times the predicate may succeed (default 1). Note that all arguments except the signature may be omitted. An example modeb declaration is modeb(atm(+molecule,-atomid,#element,-charge)).
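The short and the full form of a declaration might look as follows in a problem file. This is a sketch based only on the argument order described above; it assumes the '+', '-' and '#' prefixes are made available as operators once TopLog is loaded, and same_ring/2 is a hypothetical background predicate invented for the example.

```prolog
% Short form: only the signature, all other arguments take their defaults.
:- modeb(atm(+molecule,-atomid,#element,-charge)).

% Full form with the defaults spelled out for another predicate:
% appear at most once, no repeated input variables, recall 1.
% The charge/2 predicate is illustrative only.
:- modeb(1, charge(+atomid,-charge), no_vars_rep, 1).

% same_ring/2 is hypothetical: the order of its two inputs does not matter,
% so it is declared commutative and allowed to appear up to twice.
:- modeb(2, same_ring(+atomid,+atomid), commutative, 1).
```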

modeh/1

Specifies the mode head declarations. It has only one argument: the signature of the target predicate, e.g. modeh(active(+molecule)).

modelTrain/0

Builds a model using all the data as training.

modelCV/0

Builds a model by cross validation, using the number of folds specified in the cross_validation_folds setting.

problem_stats/0

Shows some basic statistics like the number of positive and negative examples, the default accuracy and the size of the background knowledge.

help/0

Shows a minimal help screen.


Predicate Index

Index Entry    Section

C
compression (evalfn/2 option)    2. Commands
coverage (evalfn/2 option)    2. Commands
cross_validation_folds (set/2 option)    2. Commands

E
evalfn (set/2 option)    2. Commands
example_inflation (set/2 option)    2. Commands

H
help/0    2. Commands

L
laplace (evalfn/2 option)    2. Commands

M
maximum_examples_to_generate_hypothesis (set/2 option)    2. Commands
maximum_extra_variables_unbound_in_hypothesis_construction (set/2 option)    2. Commands
maximum_hypothesis_interpretations (set/2 option)    2. Commands
maximum_hypothesis_per_example (set/2 option)    2. Commands
maximum_hypothesis_to_generate_theory (set/2 option)    2. Commands
maximum_literals_in_hypothesis (set/2 option)    2. Commands
maximum_proof_depth (set/2 option)    2. Commands
maximum_singletons_in_hypothesis (set/2 option)    2. Commands
maxneg (set/2 option)    2. Commands
minacc (set/2 option)    2. Commands
minimum_hypothesis_compression (set/2 option)    2. Commands
minimum_literals_in_hypothesis (set/2 option)    2. Commands
minimum_singletons_in_hypothesis (set/2 option)    2. Commands
minpos (set/2 option)    2. Commands
modeb/4    2. Commands
modeh/1    2. Commands
modelCV/0    2. Commands
modelTrain/0    2. Commands

N
noise (set/2 option)    2. Commands

P
p_consult/1    2. Commands
problem_stats/0    2. Commands

S
sample (set/2 option)    2. Commands
set/2    2. Commands
star_modeb_times (set/2 option)    2. Commands

V
verbose (set/2 option)    2. Commands

About This Document

This document was generated by U-JoseCarlos-PC\Jose Carlos on May, 6 2008 using texi2html 1.78.
