Title: Classificatory Filtering in Decision Systems
Authors: Hui Wang, School of Information and Software Engineering, University of Ulster
Ivo Düntsch, Dept of Computer Science, Brock University, St Catherines, Ontario, L2S 3A1, Canada
Günther Gediga, Institut für Evaluation und Marktanalysen; Brinkstr. 19; D-49143 Jeggen; Germany
(Equal authorship implied)
Status: International Journal of Approximate Reasoning 23 (2000), 111-136
Abstract: Classificatory data filtering is concerned with reducing data in size while preserving classification information. Düntsch and Gediga presented a first approach to this problem; their technique collects values of a single feature into a single value. In this paper we present a novel approach to classificatory filtering, which can be regarded as a generalisation of their approach: it aims at collecting values of a set of features into a single value. We look at the problem abstractly in the context of lattices. We focus on hypergranules (arrays of sets) in a problem domain, and it turns out that the collection of all hypergranules can be made into a lattice. Our solution (the LM algorithm) is formulated to find a set of maximal elements for each class, which covers all elements in a given dataset and is consistent with the dataset. This is done through the lattice sum operation. In terms of decision systems, LM collects attribute values while preserving classification structure. To use the filtered data for classification, we present and justify two measures for the relationship between two hypergranules. Based on these measures, we propose an algorithm (C2) for classification. Both algorithms are evaluated on real-world datasets and compared with C4.5. The results are analysed using statistical tests, and it turns out that there is no statistically significant difference between the two. Regression analysis shows that the reduction ratio is a strong indicator of prediction success.
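
Illustration (not part of the published abstract): the idea of merging hypergranules through a lattice sum can be sketched as follows. This is a minimal sketch under stated assumptions, not the LM algorithm of the paper. It assumes that a hypergranule is a tuple of sets, that hypergranules are ordered by componentwise inclusion, and that the lattice sum is the componentwise union. The names `lift`, `lattice_sum`, `covers`, `is_consistent` and `greedy_merge` are hypothetical, and the greedy loop is a simple stand-in for the search for maximal consistent elements.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumption: a hypergranule is a tuple with one set of values per attribute.
Hypergranule = Tuple[FrozenSet[str], ...]


def lift(row: Sequence[str]) -> Hypergranule:
    """Turn an ordinary data row into a hypergranule of singleton sets."""
    return tuple(frozenset([value]) for value in row)


def lattice_sum(a: Hypergranule, b: Hypergranule) -> Hypergranule:
    """Assumed lattice sum: the componentwise union of two hypergranules."""
    return tuple(x | y for x, y in zip(a, b))


def covers(h: Hypergranule, g: Hypergranule) -> bool:
    """True if g lies below h in the componentwise inclusion order."""
    return all(y <= x for x, y in zip(h, g))


def is_consistent(h: Hypergranule, other_rows: Sequence[Sequence[str]]) -> bool:
    """A hypergranule is consistent if it covers no row of another class."""
    return not any(covers(h, lift(row)) for row in other_rows)


def greedy_merge(rows: Sequence[Sequence[str]],
                 other_rows: Sequence[Sequence[str]]) -> List[Hypergranule]:
    """Greedily sum rows of one class while the result stays consistent."""
    merged: List[Hypergranule] = []
    for row in rows:
        g = lift(row)
        for i, h in enumerate(merged):
            candidate = lattice_sum(h, g)
            if is_consistent(candidate, other_rows):
                merged[i] = candidate  # absorb the row into an existing hypergranule
                break
        else:
            merged.append(g)  # no consistent merge found: keep the row on its own
    return merged


# Toy decision system with two attributes (colour, size) and two classes.
class_a = [("red", "small"), ("red", "large")]
class_b = [("blue", "small")]
result = greedy_merge(class_a, class_b)
```

In this toy example the two rows of class A collapse into one hypergranule whose second component holds both sizes, since that sum does not cover the class B row. The data shrinks while the class of every original row can still be recovered.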
