Abstracts of White Papers

The white paper 'Pragmatic Data Mining' (18 pages, 362K pdf, August 2006) describes an algorithm that extracts the distinguishing features from sets of rules and groups them into new rules. The first part is an introductory overview with some test results, the second part is a mathematical description with proofs, and the third part lists a 'prototype' implementation in the functional programming language Haskell. Note: the code listing in 'Pragmatic Data Mining' (PDM) is now outdated.

Key Terms: categorical data analysis, multivariate data analysis, rule based knowledge, data mining, pragmatism

The white paper 'Deriving Heuristic Rules from Facts' (21 pages, 210K pdf, January 2007) is a successor to 'Pragmatic Data Mining', but it can be read separately. The first section defines a fact model and a rule model in terms of partitions of a set. The second section treats the reduction algorithm and proves that it finds all possible shortest rules, so that it produces a normal form representation of the rule model. The third section shows how to obtain the partial order of reduced antecedents, if there is one, through abduction; entailment and overlap of the underlying sets may allow further rule simplification. The fourth section treats additions to a fact model (empirical induction) and shows that reduction also works when some rules are ambiguous. The effect of new rules on a reduced normal form is summarized in two general propositions. Soundness and completeness of reduced rules are defined through semantic extrapolation. All examples use a data table from a classic paper by J.R. Quinlan, and the reductions were obtained using the computer program listed in the earlier paper.

Key Terms: categorical data analysis, multivariate data analysis, data mining, rule based knowledge, machine learning
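
The idea of reducing a fact table to its shortest rules can be illustrated with a small brute-force sketch. It is not the reduction algorithm from the paper, nor its Haskell prototype. The table FACTS, the attribute names, and the functions matches and shortest_rules are all hypothetical. The sketch assumes that a 'shortest rule' is an antecedent (a set of attribute/value pairs) that matches at least one fact, whose matching facts all share one outcome, and that has no proper subset with the same property.

```python
from itertools import combinations

# Hypothetical toy fact table (not the Quinlan data used in the paper).
# Each fact pairs a dict of attribute values with an outcome.
FACTS = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]


def matches(antecedent, attrs):
    """True if every (attribute, value) pair of the antecedent holds in attrs."""
    return all(attrs.get(name) == value for name, value in antecedent)


def shortest_rules(facts):
    """Brute-force search for all minimal antecedents that fix one outcome.

    Candidates are tried in order of increasing size, so any candidate that
    contains an already accepted antecedent is redundant and is skipped.
    """
    found = []  # list of (frozenset of (attribute, value) pairs, outcome)
    max_size = max(len(attrs) for attrs, _ in facts)
    for size in range(1, max_size + 1):
        # Only combinations that occur in some fact can match a fact.
        candidates = set()
        for attrs, _ in facts:
            for combo in combinations(sorted(attrs.items()), size):
                candidates.add(frozenset(combo))
        accepted = []
        for cand in candidates:
            if any(prev <= cand for prev, _ in found):
                continue  # a shorter rule already covers this candidate
            outcomes = {out for attrs, out in facts if matches(cand, attrs)}
            if len(outcomes) == 1:
                accepted.append((cand, outcomes.pop()))
        found.extend(accepted)
    return found


if __name__ == "__main__":
    for antecedent, outcome in shortest_rules(FACTS):
        conditions = " and ".join(f"{a}={v}" for a, v in sorted(antecedent))
        print(f"if {conditions} then {outcome}")
```

On this toy table the sketch finds three rules: outlook=rainy gives stay, windy=yes gives stay, and outlook=sunny together with windy=no gives play. The exhaustive search grows exponentially with the number of attributes, so it only serves to make the notion of a shortest rule concrete.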