Co-selection of features and instances for unsupervised rare category analysis

Jingrui He, Jaime Carbonell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Rare category analysis is of key importance both in theory and in practice. Previous research work focuses on supervised rare category analysis, such as rare category detection and rare category classification. In this paper, for the first time, we address the challenge of unsupervised rare category analysis, including feature selection and rare category selection. We propose to jointly deal with the two correlated tasks so that they can benefit from each other. To this end, we design an optimization framework which is able to co-select the relevant features and the examples from the rare category (a.k.a. the minority class). It is well justified theoretically. Furthermore, we develop the Partial Augmented Lagrangian Method (PALM) to solve the optimization problem. Experimental results on both synthetic and real data sets show the effectiveness of the proposed method.

Original language: English (US)
Title of host publication: Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010
Pages: 525-536
Number of pages: 12
State: Published - 2010
Externally published: Yes
Event: 10th SIAM International Conference on Data Mining, SDM 2010 - Columbus, OH, United States
Duration: Apr 29 2010 – May 1 2010

Fingerprint

Feature extraction
Design optimization

ASJC Scopus subject areas

  • Software

Cite this

He, J., & Carbonell, J. (2010). Co-selection of features and instances for unsupervised rare category analysis. In Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010 (pp. 525-536).

@inproceedings{fe94294dd24c4cffb1d9a33798593de1,
title = "Co-selection of features and instances for unsupervised rare category analysis",
abstract = "Rare category analysis is of key importance both in theory and in practice. Previous research work focuses on supervised rare category analysis, such as rare category detection and rare category classification. In this paper, for the first time, we address the challenge of unsupervised rare category analysis, including feature selection and rare category selection. We propose to jointly deal with the two correlated tasks so that they can benefit from each other. To this end, we design an optimization framework which is able to co-select the relevant features and the examples from the rare category (a.k.a. the minority class). It is well justified theoretically. Furthermore, we develop the Partial Augmented Lagrangian Method (PALM) to solve the optimization problem. Experimental results on both synthetic and real data sets show the effectiveness of the proposed method.",
author = "Jingrui He and Jaime Carbonell",
year = "2010",
language = "English (US)",
pages = "525--536",
booktitle = "Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010",

}

TY - GEN
T1 - Co-selection of features and instances for unsupervised rare category analysis
AU - He, Jingrui
AU - Carbonell, Jaime
PY - 2010
Y1 - 2010

AB - Rare category analysis is of key importance both in theory and in practice. Previous research work focuses on supervised rare category analysis, such as rare category detection and rare category classification. In this paper, for the first time, we address the challenge of unsupervised rare category analysis, including feature selection and rare category selection. We propose to jointly deal with the two correlated tasks so that they can benefit from each other. To this end, we design an optimization framework which is able to co-select the relevant features and the examples from the rare category (a.k.a. the minority class). It is well justified theoretically. Furthermore, we develop the Partial Augmented Lagrangian Method (PALM) to solve the optimization problem. Experimental results on both synthetic and real data sets show the effectiveness of the proposed method.

UR - http://www.scopus.com/inward/record.url?scp=84880100676&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880100676&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84880100676
SP - 525
EP - 536
BT - Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010
ER -