Rare category characterization

Jingrui He, Hanghang Tong, Jaime Carbonell

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

23 Citations (Scopus)

Abstract

Rare categories abound, and their characterization has heretofore received little attention. Fraudulent banking transactions, network intrusions, and rare diseases are examples of rare classes whose detection and characterization are of high value. However, accurate characterization is challenging due to high skewness and non-separability from majority classes, e.g., fraudulent transactions masquerade as legitimate ones. This paper proposes the RACH algorithm by exploring the compactness property of the rare categories. It is based on an optimization framework which encloses the rare examples by a minimum-radius hyperball. The framework is then converted into a convex optimization problem, which is in turn effectively solved in its dual form by the projected subgradient method. RACH can be naturally kernelized. Experimental results validate the effectiveness of RACH.
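The abstract does not state RACH's actual objective or constraints, so the following is not RACH itself. It is a minimal sketch of the closest textbook problem: the plain minimum enclosing ball of a point set, solved in its dual over the probability simplex by projected gradient ascent (the dual here is smooth, so the gradient serves as the subgradient). The function names `project_simplex` and `enclosing_ball_dual`, the step-size rule, and the toy data are assumptions made for illustration. Because the dual only touches the Gram matrix `K`, swapping in a kernel matrix gives a kernelized variant, which mirrors the abstract's remark about kernelization.

```python
import numpy as np


def project_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u)
    ks = np.arange(1, v.size + 1)
    rho = ks[u - (css - 1.0) / ks > 0][-1]    # number of entries kept positive
    theta = (css[rho - 1] - 1.0) / rho        # shift that makes the sum equal 1
    return np.maximum(v - theta, 0.0)


def enclosing_ball_dual(K, n_iter=500):
    """Approximate the minimum enclosing ball from a Gram matrix K.

    Maximizes  sum_i alpha_i K_ii - alpha^T K alpha  over the simplex.
    Returns the weights alpha (center = sum_i alpha_i x_i) and the
    largest distance from that center to any point.
    """
    n = K.shape[0]
    diag = np.diag(K).copy()
    lam_max = np.linalg.eigvalsh(K)[-1]       # largest eigenvalue of K
    # Assumed step size: 1 / Lipschitz constant of the dual gradient.
    step = 1.0 / (2.0 * lam_max) if lam_max > 0 else 1.0
    alpha = np.full(n, 1.0 / n)               # start at the simplex center
    for _ in range(n_iter):
        grad = diag - 2.0 * K @ alpha         # gradient of the dual objective
        alpha = project_simplex(alpha + step * grad)
    Ka = K @ alpha
    dist2 = diag - 2.0 * Ka + alpha @ Ka      # squared distances to the center
    radius = np.sqrt(max(dist2.max(), 0.0))
    return alpha, radius


# Toy data (hypothetical): four points on the unit circle plus the origin.
X = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0], [0.0, 0.0]])
alpha, radius = enclosing_ball_dual(X @ X.T)  # linear kernel
center = alpha @ X
```

On this toy set the center lands at the origin with radius 1, and the interior point receives zero weight, which is the usual behavior of enclosing-ball duals: only boundary points support the ball.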

Original language: English (US)
Title of host publication: Proceedings - IEEE International Conference on Data Mining, ICDM
Pages: 226-235
Number of pages: 10
DOI: https://doi.org/10.1109/ICDM.2010.154
State: Published - 2010
Externally published: Yes
Event: 10th IEEE International Conference on Data Mining, ICDM 2010 - Sydney, NSW, Australia
Duration: Dec 14 2010 to Dec 17 2010

Fingerprint

Convex optimization

Keywords

  • Characterization
  • Compactness
  • Hyperball
  • Minority class
  • Optimization
  • Rare category
  • Subgradient

ASJC Scopus subject areas

  • Engineering(all)

Cite this

He, J., Tong, H., & Carbonell, J. (2010). Rare category characterization. In Proceedings - IEEE International Conference on Data Mining, ICDM (pp. 226-235). [5693976] https://doi.org/10.1109/ICDM.2010.154

@inproceedings{bcc4ed152a9447f7888394ece96ab4d8,
title = "Rare category characterization",
abstract = "Rare categories abound and their characterization has heretofore received little attention. Fraudulent banking transactions, network intrusions, and rare diseases are examples of rare classes whose detection and characterization are of high value. However, accurate characterization is challenging due to high-skewness and non-separability from majority classes, e.g., fraudulent transactions masquerade as legitimate ones. This paper proposes the RACH algorithm by exploring the compactness property of the rare categories. It is based on an optimization framework which encloses the rare examples by a minimum-radius hyperball. The framework is then converted into a convex optimization problem, which is in turn effectively solved in its dual form by the projected subgradient method. RACH can be naturally kernelized. Experimental results validate the effectiveness of RACH.",
keywords = "Characterization, Compactness, Hyperball, Minority class, Optimization, Rare category, Subgradient",
author = "Jingrui He and Hanghang Tong and Jaime Carbonell",
year = "2010",
doi = "10.1109/ICDM.2010.154",
language = "English (US)",
isbn = "9780769542560",
pages = "226--235",
booktitle = "Proceedings - IEEE International Conference on Data Mining, ICDM",

}

TY - GEN

T1 - Rare category characterization

AU - He, Jingrui

AU - Tong, Hanghang

AU - Carbonell, Jaime

PY - 2010

Y1 - 2010

AB - Rare categories abound and their characterization has heretofore received little attention. Fraudulent banking transactions, network intrusions, and rare diseases are examples of rare classes whose detection and characterization are of high value. However, accurate characterization is challenging due to high-skewness and non-separability from majority classes, e.g., fraudulent transactions masquerade as legitimate ones. This paper proposes the RACH algorithm by exploring the compactness property of the rare categories. It is based on an optimization framework which encloses the rare examples by a minimum-radius hyperball. The framework is then converted into a convex optimization problem, which is in turn effectively solved in its dual form by the projected subgradient method. RACH can be naturally kernelized. Experimental results validate the effectiveness of RACH.

KW - Characterization

KW - Compactness

KW - Hyperball

KW - Minority class

KW - Optimization

KW - Rare category

KW - Subgradient

UR - http://www.scopus.com/inward/record.url?scp=79951766273&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79951766273&partnerID=8YFLogxK

U2 - 10.1109/ICDM.2010.154

DO - 10.1109/ICDM.2010.154

M3 - Conference contribution

SN - 9780769542560

SP - 226

EP - 235

BT - Proceedings - IEEE International Conference on Data Mining, ICDM

ER -