| 1. Load TopLog | How to obtain and run TopLog | |
| 2. Commands | All available commands | |
| Predicate Index | Index of all predicates |
To run TopLog you need a Prolog interpreter. TopLog currently supports YAP (version 5.1.3), SICStus (version 3.12.3) and SWI-Prolog (version 5.6.46). Other versions of these interpreters might work as well. Running TopLog in YAP is strongly recommended, as it is much more efficient than the other interpreters (roughly 10x over SICStus and 100x over SWI-Prolog).
Download and extract TopLog from http://www.doc.ic.ac.uk/~jcs06/TopLog/. This will create a
directory structure with 3 sub-directories and a run.pl file in the root directory.
The run.pl file is a simple script that demonstrates how to run TopLog on the examples. To better understand how TopLog works, edit run.pl and the example files. In essence, run.pl does the following:
:- ['source/toplog'].
:- p_consult('examples/mutagenesis/mutagenesis').
:- modelCV.
You can also execute run.pl directly from the command line with yap -l run.pl,
sicstus -l run.pl or plcon -l run.pl (SWI-Prolog).
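As a sketch of how run.pl might be adapted, the variant below changes a few settings with set/2 before running cross-validation. The setting names and their defaults are taken from the command reference in this manual; the chosen values are illustrative, and placing the set/2 calls before p_consult is an assumption (an example file may set its own values).

```prolog
% Hypothetical variant of run.pl; the values are illustrative only.
:- ['source/toplog'].                               % load TopLog
:- set(maximum_literals_in_hypothesis, 4).          % default is 3
:- set(cross_validation_folds, 5).                  % default is 10
:- set(verbose, 1).                                 % percentage completion only
:- p_consult('examples/mutagenesis/mutagenesis').   % load the problem
:- modelCV.                                         % run cross-validation
```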
p_consult(+Filename): Consults the file given as its argument. Use this instead of consult/1.
set(?Setting, ?Value): Gets or sets the value of a TopLog setting. The following settings are available:
maximum_proof_depth: The maximum depth of a proof. The default is 10. A proof that requires a greater depth is treated as a failure. This setting applies when testing the proofs TopLog constructs; its main purpose is to avoid infinite loops.
minimum_literals_in_hypothesis: The minimum number of literals in the body of a hypothesis. The default is 0.
maximum_literals_in_hypothesis: The maximum number of literals in the body of a hypothesis. This corresponds to Aleph's clauselength setting, except that clauselength also counts the head as a literal. The default is 3 (Aleph's clauselength default of 4 has the same meaning). For some problems this value needs to be increased.
minimum_singletons_in_hypothesis: The minimum number of singletons in a constructed hypothesis. A singleton is a variable that appears only once in the hypothesis. The default is 0.
maximum_singletons_in_hypothesis: The maximum number of singletons in a constructed hypothesis. A singleton is a variable that appears only once in the hypothesis. The default is 0. For some problems this value needs to be increased, but having to increase it significantly is often a sign that the background knowledge could be rewritten more compactly. Changing this value has a significant impact on the number of hypotheses generated.
maximum_extra_variables_unbound_in_hypothesis_construction: An advanced setting that most users should not need to change. It defines the maximum number of extra temporary unbound variables allowed during the hypothesis construction stage. The default is 1, and you should have a very good reason to change it. The total number of unbound variables at any time during the hypothesis construction stage is the number of initially unbound variables plus this setting.
maximum_hypothesis_interpretations: The maximum number of times an interpretation of a recursive hypothesis may succeed. For non-recursive hypotheses, use the recall of the mode body declaration instead. The default is 1.
maximum_hypothesis_per_example: The maximum number of hypotheses a positive example may yield. The default is 500. You may want to increase this value for certain problems, although for many problems the limit is never reached. A value of 0 means generate all possible hypotheses.
maximum_examples_to_generate_hypothesis: The maximum number of positive examples used to generate the hypotheses (currently the first N). The default is 0, which means all positive examples are used. This setting differs from sampling because all examples are still used to compute the hypothesis coverage.
maximum_hypothesis_to_generate_theory: The maximum number of unique hypotheses to collect in order to generate the final theory. The default is 50000. A value of -1 means all hypotheses.
minimum_hypothesis_compression: Prunes hypotheses whose compression is less than or equal to this value at the example covering stage, so the coverage computation stops as soon as the negative coverage is too high. The compression of a hypothesis is defined as (Positive Score + Negative Score)/(Number of Literals in Hypothesis), where the positive score is the sum of the weights of the positive examples it covers, the negative score is the sum of the weights of the negative examples it covers (a negative value), and the number of literals in the hypothesis is 1 + the body length. This compression is computed over the whole dataset and does not consider the folds. Besides being more efficient, this may (although it is unlikely) affect the final model by making it less prone to overfitting: a hypothesis can be pruned, and thus cannot take part in the final model, even though its compression on the training data alone would be good. The default is 1.0. To ignore this setting, use the value 'disable'.
noise: Noise is defined as abs(sum negative weights covered)/(abs(sum negative weights covered) + (sum positive weights covered)) and varies between 0.0 and 1.0. Hypotheses whose noise is above this value are removed. This is computed on a per-fold basis. Although it is possible to set noise to values higher than 0.5, they would never yield a valid hypothesis in a theory. The default is 0.5.
minacc: Minimum accuracy. Rules for which abs(sum positive weights covered)/(abs(sum negative weights covered) + (sum positive weights covered)) is below this value are eliminated. Minacc varies between 0.0 and 1.0. This is computed on a per-fold basis. Although it is possible to set minacc to values smaller than 0.5, this would never yield a valid hypothesis in a theory. The default is 0.5.
minpos: The minimum positive score a rule must have to be considered valid. This is computed on a per-fold basis. The default is 2.
maxneg: The maximum negative score a rule may have and still be considered valid. This is computed on a per-fold basis. The default is inf, meaning infinite.
sample: Instructs TopLog to use only a sample of the examples to build hypotheses and evaluate their coverage. The default is 1.0, which means all examples are used. This setting must be > 0.0 and <= 1.0.
example_inflation: The weight of each individual example is multiplied by this value. This may be useful when there are few examples but rules should still be generated: with few examples it is possible that no rules are generated because the number of literals in an example is higher than the positive minus negative coverage. The default is 1. If this value is set to a negative number, the positive and negative examples are swapped.
cross_validation_folds: The number of folds used for cross-validation. The default is 10.
verbose: The verbosity level. 0 shows minimal information (basically just the overall results), 1 shows percentage completion information, 2 shows example-by-example and hypothesis-by-hypothesis information, and 3 gives even further detail. The default is 2.
star_modeb_times: Defines the number of times that a star '*' in the modeb/modet definitions stands for. This setting exists only for compatibility with other ILP systems (e.g. Aleph and Progol) and is not particularly useful in TopLog.
evalfn: Defines the evaluation function of the final theory. Possible values are:
compression: The compression of a hypothesis is defined as 'positive score - negative score - size of hypothesis'. This measure generates few, general rules. It is the default and the best for classifying unseen data.
laplace: The Laplace measure is defined as (PosScore+1)/(PosScore+NegScore+2). This generates many rules and overfits as much as possible.
coverage: Coverage is simply PosScore-NegScore. It is identical to compression except that it does not take the hypothesis size into account. It is a middle ground between compression and laplace and overfits moderately.
The more general evaluation functions also make the optimization algorithm faster, because fewer rules are needed in the final theory. The default value is compression.
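The formulas given for minimum_hypothesis_compression, noise, minacc and evalfn can be sketched in plain Prolog as follows. This is only an illustration of the arithmetic: none of these predicate names are part of TopLog. Two points are assumptions rather than statements from this manual: that NegScore in the evalfn formulas is the magnitude of the negative weight covered, and that the "size of hypothesis" equals 1 + the body length.

```prolog
% Illustrative helpers only; none of these predicates are part of TopLog.
% Pos:     summed weight of the covered positive examples (>= 0)
% Neg:     summed weight of the covered negative examples (=< 0)
% BodyLen: number of literals in the hypothesis body

% Compression used by minimum_hypothesis_compression.
pruning_compression(Pos, Neg, BodyLen, C) :-
    C is (Pos + Neg) / (1 + BodyLen).

% Ratio compared against the noise setting.
noise_ratio(Pos, Neg, N) :-
    N is abs(Neg) / (abs(Neg) + Pos).

% Ratio compared against the minacc setting.
accuracy_ratio(Pos, Neg, A) :-
    A is abs(Pos) / (abs(Neg) + Pos).

% evalfn scores. ASSUMPTION: NegScore is abs(Neg) and the size of a
% hypothesis is 1 + BodyLen.
evalfn_score(compression, Pos, Neg, BodyLen, S) :-
    S is Pos - abs(Neg) - (1 + BodyLen).
evalfn_score(laplace, Pos, Neg, _BodyLen, S) :-
    S is (Pos + 1) / (Pos + abs(Neg) + 2).
evalfn_score(coverage, Pos, Neg, _BodyLen, S) :-
    S is Pos - abs(Neg).
```

For a hypothesis with three body literals that covers positive weight 8.0 and negative weight -2.0, pruning_compression(8.0, -2.0, 3, C) gives 1.5, noise_ratio gives 0.2, accuracy_ratio gives 0.8, and the laplace score is 0.75. With the default settings (minimum_hypothesis_compression 1.0, noise 0.5, minacc 0.5) such a hypothesis would pass all three filters.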
modeb/4: Specifies the mode body declarations. It has 4 arguments. The first is the number of times the
predicate may appear in the body of a hypothesis (default 1). The second is the signature of the predicate,
in the format predName(?Type1,..,?TypeN), where ? is one of the symbols '+' for input, '-' for output
or '#' for constant. The third is the argument mode, which is one of 'no_vars_rep' (default), 'vars_rep' or
'commutative'. 'no_vars_rep' means input variables may not be repeated, and 'commutative' means the order of the input
variables is irrelevant. The fourth is the recall, which is the number of times the predicate may succeed
(default 1). All arguments except the signature may be omitted. An example modeb declaration is
modeb(atm(+molecule,-atomid,#element,-charge))
modeh/1: Specifies the mode head declaration. It has a single argument: the signature of the target predicate,
e.g. modeh(active(+molecule))
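A minimal sketch of how the two kinds of mode declaration might appear together in a problem file is shown below. The signatures are the ones used as examples in this manual. Writing the declarations as directives, and the exact four-argument call shown in the comment, are assumptions based only on the argument order described for modeb/4; check the shipped example files for the form TopLog actually expects.

```prolog
% Sketch of mode declarations for a mutagenesis-style problem.
% ASSUMPTION: declarations are written as directives in the problem file.
:- modeh(active(+molecule)).                         % target predicate
:- modeb(atm(+molecule,-atomid,#element,-charge)).   % signature only, defaults apply

% Assumed explicit form of the same modeb declaration, following the
% documented order (times, signature, argument mode, recall) with the
% documented defaults 1, no_vars_rep and 1. Use one form or the other.
% :- modeb(1, atm(+molecule,-atomid,#element,-charge), no_vars_rep, 1).
```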
modelTrain/0: Builds a model using all the data for training.
modelCV/0: Builds a model by cross-validation, using the number of folds specified in the cross_validation_folds setting.
problem_stats/0: Shows some basic statistics, such as the number of positive and negative examples, the default accuracy and the size of the background knowledge.
help/0: Shows a minimal help screen.
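The commands above could be combined in an interactive session along the following lines. This is a hypothetical session that assumes TopLog and a problem file have already been loaded; the setting value is illustrative, and no output is shown because it depends on the dataset.

```prolog
% Hypothetical interactive session; assumes TopLog and a problem are loaded.
?- set(verbose, V).     % query the current value (the documented default is 2)
?- set(minpos, 3).      % change a setting (illustrative value)
?- problem_stats.       % inspect the dataset before learning
?- modelTrain.          % build a model from all the data
```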
This document was generated by U-JoseCarlos-PC\Jose Carlos on May, 6 2008 using texi2html 1.78.