
Algorithms for Feature Selection in Rank-Order Spaces

Slotta, Douglas J. and Vergara, John Paul and Ramakrishnan, Naren and Heath, Lenwood S. (2005) Algorithms for Feature Selection in Rank-Order Spaces. Technical Report TR-05-08, Computer Science, Virginia Tech.


Abstract

The problem of feature selection in supervised learning situations is considered, where all features are drawn from a common domain and are best interpreted via ordinal comparisons with other features, rather than as numerical values. In particular, each instance is a member of a space of ranked features. This problem is pertinent in electoral, financial, and bioinformatics contexts, where features denote assessments in terms of counts, ratings, or rankings. Four algorithms for feature selection in such rank-order spaces are presented; two are information-theoretic, and two are order-theoretic. These algorithms are empirically evaluated on both synthetic and real-world datasets. The main results of this paper are (i) characterization of relationships and equivalences between different feature selection strategies with respect to the spaces in which they operate, and the distributions they seek to approximate; (ii) identification of computationally simple and efficient strategies that perform surprisingly well; and (iii) a feasibility study of order-theoretic feature selection for large-scale datasets.
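To make the idea of a rank-order space concrete, the following is a minimal sketch and not any of the four algorithms from the report. It assumes each instance is a list of values over the same features, converts each instance to within-instance ranks, and scores every feature pair by the empirical mutual information between the ordinal comparison "feature i ranks above feature j" and the class label. The function names (`to_ranks`, `mutual_information`, `score_pairs`) and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def to_ranks(values):
    """Rank each feature within one instance (0 = smallest value).

    Ties are broken by feature index, since sorted() is stable.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, feature in enumerate(order):
        ranks[feature] = rank
    return ranks


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def score_pairs(instances, labels):
    """Score each feature pair (i, j) by how informative 'i ranks above j' is."""
    ranked = [to_ranks(inst) for inst in instances]
    n_features = len(instances[0])
    scores = {}
    for i, j in combinations(range(n_features), 2):
        comparison = [r[i] > r[j] for r in ranked]
        scores[(i, j)] = mutual_information(comparison, labels)
    return scores


# Toy data, made up for illustration: four instances, three features each.
instances = [[5, 1, 3], [9, 2, 10], [1, 7, 9], [2, 8, 1]]
labels = ["a", "a", "b", "b"]

scores = score_pairs(instances, labels)
best_pair = max(scores, key=scores.get)
```

In this toy data, the comparison between features 0 and 1 separates the two classes perfectly, while the comparison between features 0 and 2 carries no information about the label. Only the ordering of values within an instance matters here, so rescaling an instance by any monotone transformation leaves the scores unchanged.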

Item Type: Departmental Technical Report
Keywords: feature selection, rank-order, spoiler count
Subjects: Computer Science > Bioinformatics; Computer Science > Algorithms and Data Structure
ID Code: 714
Deposited By: Administrator, Eprints
Deposited On: 30 August 2005