CW 347

S. Verbaeten, T. Cardoen
Techniques for Identifying Mislabeled Training Examples in ILP Classification Problems

Abstract

We consider the problem of noisy training examples, and more precisely of mislabeled training examples, in the context of inductive logic programming (ILP) classification problems. We address this problem by pre-processing the training set, i.e., by identifying and removing outliers from it. We study a number of filtering techniques, some of which were proposed in the literature for attribute-value problems. We evaluate these techniques on a Bongard data set that we artificially corrupt with different levels of classification noise.
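The corruption step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact noise model, so the code below assumes binary class labels and flips each label independently with a fixed probability; the function name `corrupt_labels` and its parameters are hypothetical and not taken from the report.

```python
import random


def corrupt_labels(labels, noise_level, seed=0):
    """Return a noisy copy of binary labels plus the indices that were flipped.

    Assumption (not from the report): each label is flipped independently
    with probability `noise_level`.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be in [0, 1]")
    rng = random.Random(seed)  # fixed seed keeps experiments repeatable
    corrupted = []
    flipped = []
    for index, label in enumerate(labels):
        if rng.random() < noise_level:
            corrupted.append(not label)  # mislabel this example
            flipped.append(index)
        else:
            corrupted.append(label)
    return corrupted, flipped


if __name__ == "__main__":
    clean = [True, False, True, True, False, False, True, False]
    for level in (0.0, 0.1, 0.2, 0.4):
        noisy, flipped = corrupt_labels(clean, level, seed=42)
        print(level, flipped)
```

Keeping the list of flipped indices makes it possible to score a filter afterwards, for example by counting how many of the examples it removes were actually mislabeled.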
