Coselection of features and instances for unsupervised rare category analysis

Jingrui He, Jaime Carbonell

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Rare category analysis is of key importance both in theory and in practice. Previous research work focuses on supervised rare category analysis, such as rare category detection and rare category classification. In this paper, for the first time, we address the challenge of unsupervised rare category analysis, including feature selection and rare category selection. We propose to jointly deal with the two correlated tasks, so that they can benefit from each other. To this end, we design an optimization framework which is able to coselect the relevant features and the examples from the rare category (a.k.a. the minority class). It is well justified theoretically. Furthermore, we develop the Partial Augmented Lagrangian Method (PALM) to solve the optimization problem. Experimental results on both synthetic and real data sets show the effectiveness of the proposed method.
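
The abstract names an augmented Lagrangian method as the solver but does not describe PALM's details, so the following is not the authors' algorithm. It is only a minimal sketch of the generic textbook augmented Lagrangian iteration on a toy equality-constrained problem (minimize (x0 - 2)^2 + (x1 - 1)^2 subject to x0 + x1 = 1). The function name, the toy objective, and the parameters `rho`, `outer_iters`, and `inner_iters` are all assumptions made for illustration.

```python
def augmented_lagrangian(rho=10.0, outer_iters=20, inner_iters=200):
    """Generic augmented Lagrangian loop on a toy problem (not PALM).

    Minimizes f(x) = (x0 - 2)^2 + (x1 - 1)^2 subject to h(x) = x0 + x1 - 1 = 0
    using L(x, lam) = f(x) + lam * h(x) + (rho / 2) * h(x)^2.
    """
    target = (2.0, 1.0)   # unconstrained minimizer of the toy objective
    x = [0.0, 0.0]        # primal variables
    lam = 0.0             # Lagrange multiplier estimate
    # The Hessian of L in x is 2*I + rho * ones * ones^T, whose largest
    # eigenvalue is 2 + 2*rho, so 1 / (2 + 2*rho) is a safe gradient step.
    step = 1.0 / (2.0 + 2.0 * rho)

    for _ in range(outer_iters):
        # Inner loop: approximately minimize L over x with lam held fixed.
        for _ in range(inner_iters):
            h = x[0] + x[1] - 1.0
            shared = lam + rho * h  # contribution of the constraint terms
            grad = [2.0 * (x[i] - target[i]) + shared for i in range(2)]
            x = [x[i] - step * grad[i] for i in range(2)]
        # Outer step: multiplier update using the remaining constraint violation.
        lam += rho * (x[0] + x[1] - 1.0)

    return x, lam


if __name__ == "__main__":
    x, lam = augmented_lagrangian()
    print("x =", x, "lambda =", lam)
```

On this toy problem the iterates should approach x = (1, 0) with multiplier 2, which is the projection of (2, 1) onto the constraint line. How PALM restricts or modifies this scheme is described in the paper itself, not here.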

Original language: English (US)
Pages (from-to): 417-430
Number of pages: 14
Journal: Statistical Analysis and Data Mining
Volume: 3
Issue number: 6
DOI: 10.1002/sam.10091
State: Published - Dec 2010
Externally published: Yes

Keywords

  • Augmented Lagrangian
  • Feature selection
  • Optimization
  • Rare category
  • Unsupervised

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Analysis

Cite this

Coselection of features and instances for unsupervised rare category analysis. / He, Jingrui; Carbonell, Jaime.

In: Statistical Analysis and Data Mining, Vol. 3, No. 6, 12.2010, p. 417-430.

@article{6992357769934da4a22982d0b9dacebb,
title = "Coselection of features and instances for unsupervised rare category analysis",
abstract = "Rare category analysis is of key importance both in theory and in practice. Previous research work focuses on supervised rare category analysis, such as rare category detection and rare category classification. In this paper, for the first time, we address the challenge of unsupervised rare category analysis, including feature selection and rare category selection. We propose to jointly deal with the two correlated tasks, so that they can benefit from each other. To this end, we design an optimization framework which is able to coselect the relevant features and the examples from the rare category (a.k.a. the minority class). It is well justified theoretically. Furthermore, we develop the Partial Augmented Lagrangian Method (PALM) to solve the optimization problem. Experimental results on both synthetic and real data sets show the effectiveness of the proposed method.",
keywords = "Augmented Lagrangian, Feature selection, Optimization, Rare category, Unsupervised",
author = "Jingrui He and Jaime Carbonell",
year = "2010",
month = "12",
doi = "10.1002/sam.10091",
language = "English (US)",
volume = "3",
pages = "417--430",
journal = "Statistical Analysis and Data Mining",
issn = "1932-1864",
publisher = "John Wiley and Sons Inc.",
number = "6",

}

TY - JOUR

T1 - Coselection of features and instances for unsupervised rare category analysis

AU - He, Jingrui

AU - Carbonell, Jaime

PY - 2010/12

Y1 - 2010/12

N2 - Rare category analysis is of key importance both in theory and in practice. Previous research work focuses on supervised rare category analysis, such as rare category detection and rare category classification. In this paper, for the first time, we address the challenge of unsupervised rare category analysis, including feature selection and rare category selection. We propose to jointly deal with the two correlated tasks, so that they can benefit from each other. To this end, we design an optimization framework which is able to coselect the relevant features and the examples from the rare category (a.k.a. the minority class). It is well justified theoretically. Furthermore, we develop the Partial Augmented Lagrangian Method (PALM) to solve the optimization problem. Experimental results on both synthetic and real data sets show the effectiveness of the proposed method.

KW - Augmented Lagrangian

KW - Feature selection

KW - Optimization

KW - Rare category

KW - Unsupervised

UR - http://www.scopus.com/inward/record.url?scp=78649930542&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78649930542&partnerID=8YFLogxK

U2 - 10.1002/sam.10091

DO - 10.1002/sam.10091

M3 - Article

VL - 3

SP - 417

EP - 430

JO - Statistical Analysis and Data Mining

JF - Statistical Analysis and Data Mining

SN - 1932-1864

IS - 6

ER -