TY - GEN
T1 - Quantifying features using false nearest neighbors
T2 - 23rd IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2011
AU - Filho, Jose Augusto Andrade
AU - Carvalho, Andre C P L F
AU - Mello, Rodrigo F.
AU - Alelyani, Salem
AU - Liu, Huan
PY - 2011/12/1
Y1 - 2011/12/1
N2 - Real-world datasets commonly present high-dimensional data, which in principle carries more information. However, more features do not always improve the performance of learning techniques. Furthermore, some features may be correlated or add unexpected noise, thereby reducing data clustering performance. This has motivated the development of feature selection methods, which find the most relevant subset of features to describe the data. In this work, we focus on the problem of unsupervised feature selection. The main goal is to define a method that identifies how many features to select after they have been sorted according to some criterion. This is done by means of the False Nearest Neighbors technique, which is rooted in chaos theory. Results show that this technique gives a good approximation of the number of features to select. Compared to other techniques, in most of the analyzed cases it maintains the quality of the generated partitions while selecting fewer features.
AB - Real-world datasets commonly present high-dimensional data, which in principle carries more information. However, more features do not always improve the performance of learning techniques. Furthermore, some features may be correlated or add unexpected noise, thereby reducing data clustering performance. This has motivated the development of feature selection methods, which find the most relevant subset of features to describe the data. In this work, we focus on the problem of unsupervised feature selection. The main goal is to define a method that identifies how many features to select after they have been sorted according to some criterion. This is done by means of the False Nearest Neighbors technique, which is rooted in chaos theory. Results show that this technique gives a good approximation of the number of features to select. Compared to other techniques, in most of the analyzed cases it maintains the quality of the generated partitions while selecting fewer features.
KW - Chaos Theory
KW - Clustering
KW - Machine Learning
KW - Unsupervised Feature Selection
UR - http://www.scopus.com/inward/record.url?scp=84862925201&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862925201&partnerID=8YFLogxK
U2 - 10.1109/ICTAI.2011.170
DO - 10.1109/ICTAI.2011.170
M3 - Conference contribution
AN - SCOPUS:84862925201
SN - 9780769545967
T3 - Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
SP - 994
EP - 997
BT - Proceedings - 2011 23rd IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2011
Y2 - 7 November 2011 through 9 November 2011
ER -