Supervised feature selection: A tutorial

Samuel H. Huang

Abstract


Supervised feature selection research has a long history. Its popularity exploded in the past 30 years due to advances in information technology and the need to analyze high-dimensional data sets. Research papers published during these years came mostly from the machine learning and artificial intelligence community. The emphasis was largely on improving model accuracy using empirical methods, whereas the issue of feature relevance was somewhat overlooked. Feature selection methods were loosely classified as filters, wrappers, and embedded methods, with little attention paid to their intricate details. This paper provides a tutorial on supervised feature selection, based on a review of frequently cited papers in this area and a number of classical publications from the statistics community. The objective of feature selection (either to improve model predictive accuracy or to determine relevance for hypothesis generation) is presented and discussed in detail. Various supervised feature selection methods are classified using a detailed taxonomy. Guidelines for using feature selection methods in practice are provided based on a comprehensive review of the performance of these methods. Issues that require further attention are also discussed.


DOI: https://doi.org/10.5430/air.v4n2p22


Artificial Intelligence Research

ISSN 1927-6974 (Print)   ISSN 1927-6982 (Online)

Copyright © Sciedu Press 