Probabilistic Inference


My work on classification has mainly been concerned with Bayesian methods and has produced several generally applicable results. The earliest was a methodology for designing Bayesian networks, on which I worked with research student Enrique Sucar and members of the Philosophy and Artificial Intelligence research group at King's College London. Bayesian networks allow inferences to be made about a set of variables: given measurements of a subset of the variables, they allow the computation of a probability distribution over the unknown variables. They have a structure representing the relationships between the variables, together with prior and conditional probabilities. However, the underlying theory on which the computations are based requires certain conditional independence properties to hold between the variables. Our methodology involves testing the structure of the network against the required conditional independence assumptions. Failure of a test initiates a modification to the network through node deletion, the addition of a new node, or the addition of a new arc. The methodology (Sucar, Gillies and Gillies 1993) was successfully demonstrated on a system for providing advice to a practitioner conducting colon endoscopy (Sucar and Gillies 1994).
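The kind of conditional independence test the methodology relies on can be sketched numerically. The following is a minimal illustration, not the published procedure: it uses a simple plug-in estimate of the conditional mutual information I(X;Y|Z), which should be near zero when X and Y are conditionally independent given Z. The variable names and the threshold choice are my own assumptions for the example.

```python
import numpy as np

def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits for discrete samples."""
    n = len(x)
    cmi = 0.0
    for zv in np.unique(z):
        m = z == zv
        pz = m.sum() / n
        xs, ys = x[m], y[m]
        for xv in np.unique(xs):
            for yv in np.unique(ys):
                pxy = np.mean((xs == xv) & (ys == yv))
                px, py = np.mean(xs == xv), np.mean(ys == yv)
                if pxy > 0:
                    cmi += pz * pxy * np.log2(pxy / (px * py))
    return cmi

# Z is a common cause of X and Y, so X and Y are dependent marginally
# but conditionally independent given Z.
rng = np.random.default_rng(0)
z = rng.integers(0, 2, 20000)
x = z ^ (rng.random(20000) < 0.1)   # X is Z flipped with probability 0.1
y = z ^ (rng.random(20000) < 0.1)   # Y likewise, with independent noise

print(conditional_mutual_information(x, y, z))   # near zero: no X-Y arc needed
print(conditional_mutual_information(x, y, np.zeros(20000, int)))  # clearly positive
```

In a network-design setting, a test result well above the near-zero baseline would flag a violated independence assumption and trigger one of the structural modifications described above.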

The next important result was an automatic method of adding a new (hidden) node to improve the accuracy of a Bayesian network. The idea of incorporating hidden nodes (so called because there is no measured data on them) was originally suggested by Judea Pearl. My work with Chee Keong Kwoh and Jung Wook Bang provided a numerical solution, based on gradient descent, for calculating the parameters of the unknown node from measured data (Kwoh and Gillies 1996). The results were highly successful, demonstrating improved accuracy in several applications including the analysis of colon endoscopy images, the morphometric classification of neural cells and the prognostic analysis of hepatitis C data.
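A toy version of this idea can be written in a few lines. The sketch below is my own simplification, not the published algorithm: it assumes two observed binary variables modelled as conditionally independent given one hidden binary node, and fits the hidden node's probability table by gradient descent on the negative log-likelihood, using finite-difference gradients for brevity.

```python
import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def neg_log_lik(theta, data):
    """Negative log-likelihood of two observed binary variables that are
    conditionally independent given a hidden binary node H."""
    p, a0, a1, b0, b1 = sigmoid(theta)  # P(H=1), P(X1=1|H=0), P(X1=1|H=1), P(X2=1|H=0), P(X2=1|H=1)
    x1, x2 = data[:, 0], data[:, 1]
    lik = ((1 - p) * a0**x1 * (1 - a0)**(1 - x1) * b0**x2 * (1 - b0)**(1 - x2)
           + p * a1**x1 * (1 - a1)**(1 - x1) * b1**x2 * (1 - b1)**(1 - x2))
    return -np.log(lik).sum()

def fit_hidden_node(data, steps=1000, lr=0.1, eps=1e-5):
    """Gradient descent on the likelihood; central-difference gradients."""
    theta = np.array([0.1, -0.2, 0.3, -0.1, 0.2])  # asymmetric start to break symmetry
    n = len(data)
    for _ in range(steps):
        grad = np.zeros(5)
        for i in range(5):
            d = np.zeros(5); d[i] = eps
            grad[i] = (neg_log_lik(theta + d, data) - neg_log_lik(theta - d, data)) / (2 * eps)
        theta = theta - lr * grad / n
    return theta

# Synthetic data generated from a genuine hidden cause H.
rng = np.random.default_rng(1)
h = rng.random(2000) < 0.6
x1 = np.where(h, rng.random(2000) < 0.9, rng.random(2000) < 0.2)
x2 = np.where(h, rng.random(2000) < 0.8, rng.random(2000) < 0.1)
data = np.column_stack([x1, x2]).astype(int)

theta_hat = fit_hidden_node(data)
print(sigmoid(theta_hat))  # recovered probability table (up to relabelling of H)
```

Note that the hidden node's two states can be recovered in either order (label swapping), which is why such methods compare fitted likelihoods rather than raw parameter values.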

Another important result in the field of Bayesian networks was a new method of assessing their accuracy. Previous methods all used the probability of a data set given a network; this measure allows different networks to be compared, but does not provide an absolute measure of accuracy. In my work with Alex Pappas, we devised a new method of assessing accuracy based on comparing the data dependencies in a data set with those encoded in the network (Pappas and Gillies 2002).
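The flavour of such a dependency-based measure can be illustrated as follows. This is only an illustrative sketch under my own assumptions, not the published method: it scores a network by the fraction of variable pairs whose measured dependence (estimated mutual information above a small threshold) agrees with the dependence the network structure implies.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits for discrete samples."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            px, py = np.mean(x == xv), np.mean(y == yv)
            if pxy > 0:
                mi += pxy * np.log2(pxy / (px * py))
    return mi

def dependency_agreement(data, implied, threshold=0.01):
    """Fraction of variable pairs where the data agree with the
    (in)dependence the network structure implies.
    `implied` maps a pair (i, j) to True if the structure says dependent."""
    agree = 0
    for (i, j), dep in implied.items():
        observed = mutual_information(data[:, i], data[:, j]) > threshold
        agree += (observed == dep)
    return agree / len(implied)

# Columns 0 and 1 are genuinely dependent; column 2 is independent noise.
rng = np.random.default_rng(2)
a = rng.integers(0, 2, 5000)
b = a ^ (rng.random(5000) < 0.1)
c = rng.integers(0, 2, 5000)
data = np.column_stack([a, b, c])

implied = {(0, 1): True, (0, 2): False, (1, 2): False}
print(dependency_agreement(data, implied))  # 1.0: structure matches the data
```

Unlike a likelihood score, a fraction of matched dependencies has a natural absolute scale: 1.0 means every implied relationship is borne out by the data.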

In addition to networks, I have recently worked on Bayesian classifiers with my research student Carlos Thomaz. Problems arise when the data set is small and the variance and covariance cannot be estimated accurately. Traditional approaches to this problem use either maximum likelihood or maximum accuracy methods to estimate the covariance. Our new method uses the principle of maximum entropy to provide the estimate with the highest information content. It consistently achieves equal or better accuracy than traditional methods at a fraction of the computational cost (Thomaz, Gillies and Feitosa 2004).
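One way the maximum-entropy idea can be realised is sketched below. The details here are my illustrative assumption rather than the published estimator: eigenvalues of the sample covariance that fall below the average eigenvalue are raised up to that average, which spreads the variance as evenly as possible (increasing entropy) while leaving well-estimated directions untouched, and makes the matrix invertible even when the sample covariance is singular.

```python
import numpy as np

def max_entropy_covariance(S):
    """Illustrative maximum-entropy-style fix for a poorly estimated
    covariance: raise eigenvalues below the average up to the average."""
    w, V = np.linalg.eigh(S)         # eigendecomposition of symmetric S
    w = np.maximum(w, w.mean())      # never shrink; only inflate weak directions
    return V @ np.diag(w) @ V.T

# With 5 samples in 10 dimensions the sample covariance is singular,
# so a naive Bayesian classifier could not invert it.
rng = np.random.default_rng(3)
X = rng.standard_normal((5, 10))
S = np.cov(X, rowvar=False)

S_me = max_entropy_covariance(S)
print(np.linalg.eigvalsh(S).min())     # effectively zero: rank deficient
print(np.linalg.eigvalsh(S_me).min())  # strictly positive: usable in a classifier
```

Because the fix is a single eigendecomposition with no cross-validation loop, it is also cheap, which is in keeping with the computational-cost advantage noted above.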