Title: Maximum consistency of incomplete data via non-invasive imputation
Authors: Günther Gediga, Institut für Evaluation und Marktanalysen; Brinkstr. 19; D-49143 Jeggen, Germany
Ivo Düntsch, Dept of Computer Science, Brock University, St. Catharines, Ontario, L2S 3A1, Canada
(Equal authorship implied)
Status: Artificial Intelligence Review, to appear
Abstract: In this paper we describe an algorithm that imputes missing values from the given data alone, without representational or other assumptions, and we analyse its performance. Our approach is based on non-numeric rule-based data analysis. In contrast to statistical procedures, such analysis offers no straightforward way to define a loss function or a likelihood function; these rest on statistical assumptions that are not available in rule-based data analysis. Therefore, other optimisation criteria must be used. A simple criterion is to demand that the rules of the system be maximally consistent: when we fill a missing entry with a value, the resulting rule should be consistent with the other rules of the system. Our algorithm imputes missing values in an attribute vector x by presenting a list of possible values drawn from the set of all vectors y which do not contradict x, i.e. which have the same entries as x wherever both are defined.
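
The non-contradiction relation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding of missing entries as None, the function names compatible and candidate_values, and the toy data are assumptions made for illustration only. The sketch covers just the candidate-listing step and does not attempt the consistency optimisation.

```python
from typing import Dict, Optional, Sequence, Set

# Assumption: a missing entry is encoded as None.
MISSING = None

Vector = Sequence[Optional[str]]


def compatible(x: Vector, y: Vector) -> bool:
    """Return True if x and y have the same entries wherever both are defined."""
    return all(a is MISSING or b is MISSING or a == b for a, b in zip(x, y))


def candidate_values(x: Vector, data: Sequence[Vector]) -> Dict[int, Set[str]]:
    """For each missing position of x, collect the values found at that
    position in vectors of `data` that do not contradict x."""
    candidates: Dict[int, Set[str]] = {
        i: set() for i, a in enumerate(x) if a is MISSING
    }
    for y in data:
        # Skip x itself and any vector that contradicts x.
        if y is x or not compatible(x, y):
            continue
        for i in candidates:
            if y[i] is not MISSING:
                candidates[i].add(y[i])
    return candidates


# Hypothetical toy data: four vectors over three attributes.
data = [
    ("a", "1", None),  # x: third attribute is missing
    ("a", "1", "p"),   # agrees with x where both are defined
    ("a", "2", "q"),   # contradicts x on the second attribute
    ("a", None, "r"),  # agrees with x where both are defined
]
result = candidate_values(data[0], data)
```

In this toy data the third vector disagrees with x on the second attribute, so only the values "p" and "r" are offered as candidates for the missing third entry.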
