Abstract

We propose a tree regularization framework that enables many tree models to perform feature selection efficiently. The key idea is to penalize the selection of a new feature for splitting when its gain (e.g., information gain) is similar to that of the features used in previous splits. Here the framework is applied to random forests and boosted trees, and it can easily be applied to other tree models. Experimental studies show that the regularized trees select high-quality feature subsets for both strong and weak classifiers. Because tree models naturally handle categorical and numerical variables, missing values, differences in scale between variables, interactions, and nonlinearities, the tree regularization framework provides an effective and efficient feature selection solution for many practical problems.
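The penalty described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact form of the penalty, so the multiplicative coefficient `lam` (assumed to lie in (0, 1]), the function names `regularized_gain` and `pick_split_feature`, and the gain values below are all assumptions made for illustration, not the authors' implementation.

```python
def regularized_gain(gain, feature, used, lam=0.8):
    """Return the raw gain for an already-used feature, else scale it by lam.

    Assumption: the penalty is multiplicative with 0 < lam <= 1.
    """
    return gain if feature in used else lam * gain


def pick_split_feature(gains, used, lam=0.8):
    """Pick the feature with the largest penalized gain at one tree node.

    gains: dict mapping feature name -> raw gain at this node (hypothetical values)
    used:  set of features chosen in earlier splits; updated in place
    """
    best = max(gains, key=lambda f: regularized_gain(gains[f], f, used, lam))
    if regularized_gain(gains[best], best, used, lam) <= 0:
        return None  # no split improves the node
    used.add(best)
    return best


used = set()
# Node 1: no feature has been used yet, so all are penalized equally.
first = pick_split_feature({"x1": 0.40, "x2": 0.38, "x3": 0.05}, used)
# Node 2: x2 has a slightly higher raw gain than x1, but x2 is new,
# so the penalty keeps the already-used x1.
second = pick_split_feature({"x1": 0.30, "x2": 0.32, "x3": 0.02}, used)
# Node 3: x3 is much better than the used feature, so it is still selected.
third = pick_split_feature({"x1": 0.05, "x2": 0.04, "x3": 0.50}, used)
print(first, second, third, sorted(used))
```

In this sketch the selected feature subset is simply the set `used` after the tree (or ensemble) has been grown. Setting `lam=1.0` removes the penalty and recovers the ordinary greedy split choice.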

Original language: English (US)
Title of host publication: 2012 International Joint Conference on Neural Networks, IJCNN 2012
State: Published - 2012
Event: 2012 Annual International Joint Conference on Neural Networks, IJCNN 2012, Part of the 2012 IEEE World Congress on Computational Intelligence, WCCI 2012 - Brisbane, QLD, Australia
Duration: Jun 10 2012 to Jun 15 2012

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Other

Other: 2012 Annual International Joint Conference on Neural Networks, IJCNN 2012, Part of the 2012 IEEE World Congress on Computational Intelligence, WCCI 2012
Country: Australia
City: Brisbane, QLD
Period: 6/10/12 to 6/15/12

Keywords

  • RBoost
  • RRF
  • regularized boosted trees
  • regularized random forest
  • tree regularization

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
