
Hong, S. J., Hosking, J. R. M., and Winograd, S. (1996).
Use of randomization to normalize feature merits.
In Information, Statistics and Induction in Science,
Proceedings of the ISIS 96 conference,
eds. D. L. Dowe, K. B. Korb and J. J. Oliver, pp. 10-19.
World Scientific, Singapore.
Abstract.
Feature merits are used for feature selection in classification and
regression as well as for decision tree generation. Commonly used merit
functions exhibit a bias towards features that take a large variety of
values. We present a scheme based on randomization for neutralizing
this bias by normalizing the merits. The merit of a feature is
normalized by dividing it by the expected merit of a random-noise
feature that has the same distribution of values as the given
feature. The noise feature is obtained by randomly permuting the
values of the given feature. The scheme can be used for any merit
function including the Gini and entropy measures. We demonstrate its
effectiveness by applying it to the contextual merit defined by S. J. Hong
["Use of contextual information for feature ranking and discretization",
IBM Research Report RC19664, 1994].
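
The normalization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an entropy-based merit (information gain), estimates the expected noise merit by averaging over a fixed number of random permutations, and uses hypothetical names (`information_gain`, `normalized_merit`, `n_permutations`, `seed`) chosen only for this example. How the paper computes the expectation, and how it handles a zero expected merit, is not stated in the abstract; raising an error in that case is an assumption.

```python
import math
import random
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy-based merit: class entropy minus the weighted entropy
    of the classes within each distinct feature value."""
    groups = defaultdict(list)
    for value, label in zip(feature, labels):
        groups[value].append(label)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def normalized_merit(feature, labels, merit=information_gain,
                     n_permutations=200, seed=0):
    """Merit of `feature` divided by the average merit of randomly
    permuted copies of it (a Monte Carlo estimate of the expected
    merit of a noise feature with the same value distribution)."""
    rng = random.Random(seed)
    shuffled = list(feature)
    total = 0.0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # same values, association with labels destroyed
        total += merit(shuffled, labels)
    expected_noise = total / n_permutations
    if expected_noise == 0.0:
        # Assumption: the abstract does not say how this case is treated.
        raise ValueError("expected merit of the permuted feature is zero")
    return merit(feature, labels) / expected_noise


# Toy data: 40 examples, two balanced classes.
labels = [0, 1] * 20
binary_feature = list(labels)   # two values, perfectly predictive
id_feature = list(range(40))    # 40 distinct values, e.g. a record ID

raw_binary = information_gain(binary_feature, labels)
raw_id = information_gain(id_feature, labels)
norm_binary = normalized_merit(binary_feature, labels)
norm_id = normalized_merit(id_feature, labels)
```

On this toy data the raw information gain cannot tell the two features apart, because a feature with one distinct value per example separates the classes trivially. After normalization the ID-like feature scores a ratio of 1 (it does no better than its own permutations), while the genuinely predictive binary feature scores well above 1.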