Inductive Logic Programming II

Abstract

Inductive logic programming (ILP) is a research area lying at the intersection of inductive machine learning and logic programming. The general aim of ILP is to develop theories, techniques and applications of inductive learning from observations and background knowledge in a first-order logical framework. This project continues the ILP1 project, in which several significant research results were obtained. To close the gap between conceptual work and applications, the consortium has identified four key areas where ILP technology has great potential: (1) natural language processing, (2) data mining and discovery, (3) design and configuration, and (4) database design. End-users active in these areas have been united in a users club and collaborate in the project by providing relevant test data. Based on an analysis of these four application domains, 14 scientific problems in need of substantial progress have been identified and organized around four themes:
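To make the learning setting concrete, the following minimal sketch in Python illustrates the core idea on made-up family data: given background knowledge (parent facts) and positive and negative examples of a target predicate (grandparent), a learner searches for a first-order clause that, together with the background knowledge, covers all positive examples and none of the negative ones. The predicate names, the toy data and the naive generate-and-test search are illustrative assumptions only; they stand in for the far richer hypothesis spaces and search strategies addressed by the themes below.

    # A minimal sketch of the ILP setting on toy, made-up data: background
    # knowledge (parent/2 facts) plus positive and negative examples of a
    # target predicate (grandparent/2). A naive generate-and-test learner
    # searches a tiny hypothesis space -- clause bodies chaining parent
    # literals through shared variables -- for a clause that covers every
    # positive example and no negative one.

    # Background knowledge: parent(X, Y) facts.
    parent = {("ann", "bob"), ("bob", "carl"), ("carl", "dee"), ("eve", "bob")}

    # Examples for the target predicate grandparent(X, Y).
    positives = {("ann", "carl"), ("eve", "carl"), ("bob", "dee")}
    negatives = {("ann", "bob"), ("carl", "ann"), ("dee", "dee")}

    def covered(steps, x, y):
        """True if y is reachable from x by exactly `steps` parent links."""
        frontier = {x}
        for _ in range(steps):
            frontier = {b for (a, b) in parent if a in frontier}
        return y in frontier

    # Hypothesis space: grandparent(X, Y) :- parent(X, Z1), ..., parent(Zn, Y),
    # for chain lengths 1..3.
    for n in range(1, 4):
        if (all(covered(n, x, y) for (x, y) in positives)
                and not any(covered(n, x, y) for (x, y) in negatives)):
            vars_ = ["X"] + [f"Z{i}" for i in range(1, n)] + ["Y"]
            body = ", ".join(f"parent({a}, {b})" for a, b in zip(vars_, vars_[1:]))
            print(f"grandparent(X, Y) :- {body}")  # induced clause
            break

Run as is, the sketch prints the familiar clause grandparent(X, Y) :- parent(X, Z1), parent(Z1, Y); real ILP systems induce such clauses over arbitrary predicates, which is precisely where the scalability and expressiveness problems listed next arise.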

Background knowledge: Techniques are needed that 1) can handle large numbers of background predicates (Relevance), 2) can update theories structured in many levels (Revision), 3) can carry out predicate invention in deep-structured theories (Invention).

Complex hypotheses: Techniques are needed that 4) can learn deep-structured theories and optimise the choice of a set of clauses for a single predicate (Multi-Clause), 5) can handle long chains of relevant literals, connected by shared variables (Deep), 6) can better handle recursive hypotheses (Recursion), 7) can search efficiently in the presence of structural concepts expressed in complex clauses (Structure).

Built-in semantics: Techniques are needed that 8) can better handle numbers (Numbers), 9) can express probabilistic constraints and definitions (Probabilities), 10) can learn and use constraints (Constraints), 11) can work more efficiently through the use of built-in predicates and algorithms (Built-in).

Sampling issues: Techniques are needed that 12) can learn from large data sets (Large Data), 13) can learn from small data sets (Small Data), and 14) offer some reliability guarantees (Reliability).

The main methodology applied will be 1) to study the scientific problem starting from the given application domains and data (provided by the end-user club), 2) to generalize away from the application, 3) to develop theory, techniques and implementations to cope with a specific problem, 4) to evaluate the developed framework on the application domains and data, and 5) to use the obtained feedback to re-iterate if necessary.

There is also a description of the international partners associated with the project.
