Abstract

Active learning algorithms automatically identify the salient and exemplar instances from large amounts of unlabeled data and thus reduce human annotation effort in inducing a classification model. More recently, Batch Mode Active Learning (BMAL) techniques have been proposed, where a batch of data samples is selected simultaneously from an unlabeled set. Most active learning algorithms assume a flat label space, that is, they consider the class labels to be independent. However, in many applications, the set of class labels is organized in a hierarchical tree structure, with the leaf nodes as outputs and the internal nodes as clusters of outputs at multiple levels of granularity. In this paper, we propose a novel BMAL algorithm (BatchRank) for hierarchical classification. The sample selection is posed as an NP-hard integer quadratic programming problem and a convex relaxation (based on linear programming) is derived, whose solution is further improved by an iterative truncated power method. Finally, a deterministic bound is established on the quality of the solution. Our empirical results on several challenging, real-world datasets from multiple domains corroborate the potential of the proposed framework for real-world hierarchical classification applications.
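The refinement step mentioned in the abstract can be illustrated with a generic truncated power iteration. The sketch below is not the paper's BatchRank formulation, which the abstract does not spell out. It assumes a hypothetical symmetric score matrix `Q` and a batch size `k`, and looks for a binary indicator vector `x` with exactly `k` ones that gives a large value of `x^T Q x`. The function name, the diagonal-based starting point, and the tie-breaking rule are all illustrative assumptions; in the paper the starting point would instead come from the linear programming relaxation.

```python
import numpy as np


def truncated_power_select(Q, k, x0=None, max_iter=100):
    """Pick k indices by a truncated power iteration (illustrative sketch).

    Q  : (n, n) symmetric score matrix (hypothetical; not the paper's matrix).
    k  : batch size, 1 <= k <= n.
    x0 : optional 0/1 starting vector with k ones, e.g. a rounded relaxation.
    """
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")

    if x0 is None:
        # Assumed starting point: the k largest diagonal entries of Q.
        x = np.zeros(n)
        x[np.argsort(-np.diag(Q), kind="stable")[:k]] = 1.0
    else:
        x = np.asarray(x0, dtype=float).copy()

    for _ in range(max_iter):
        scores = Q @ x                                  # power step
        top = np.argsort(-scores, kind="stable")[:k]    # truncation to k entries
        x_new = np.zeros(n)
        x_new[top] = 1.0
        if np.array_equal(x_new, x):                    # fixed point reached
            break
        x = x_new

    return np.flatnonzero(x)                            # selected indices, ascending


# Usage on a small hypothetical positive semidefinite matrix.
Q_demo = np.array([
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.9, 0.0],
    [0.0, 0.9, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
])
batch = truncated_power_select(Q_demo, k=2)
```

When `Q` is positive semidefinite, each iteration cannot decrease `x^T Q x`, but the method only reaches a local solution, so the quality of the starting vector matters. That is consistent with the abstract's description of the power method as a refinement of a relaxed solution rather than a standalone solver.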

Original language: English (US)
Title of host publication: KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 99-108
Number of pages: 10
Volume: 2015-August
ISBN (Electronic): 9781450336642
DOI: https://doi.org/10.1145/2783258.2783298
State: Published - Aug 10 2015
Event: 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015 - Sydney, Australia
Duration: Aug 10 2015 to Aug 13 2015

Keywords

  • Active learning
  • Hierarchical classification
  • Optimization

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Chakraborty, S., Balasubramanian, V., Sankar, A. R., Panchanathan, S., & Ye, J. (2015). BatchRank: A novel batch mode active learning framework for hierarchical classification. In KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (Vol. 2015-August, pp. 99-108). Association for Computing Machinery. https://doi.org/10.1145/2783258.2783298

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

@inproceedings{40a9d5c889614f12b1b45075fdfd889d,
title = "BatchRank: A novel batch mode active learning framework for hierarchical classification",
keywords = "Active learning, Hierarchical classification, Optimization",
author = "Shayok Chakraborty and Vineeth Balasubramanian and Sankar, {Adepu Ravi} and Sethuraman Panchanathan and Jieping Ye",
year = "2015",
month = "8",
day = "10",
doi = "10.1145/2783258.2783298",
language = "English (US)",
volume = "2015-August",
pages = "99--108",
booktitle = "KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining",
publisher = "Association for Computing Machinery",

}

TY - GEN

T1 - BatchRank

T2 - A novel batch mode active learning framework for hierarchical classification

AU - Chakraborty, Shayok

AU - Balasubramanian, Vineeth

AU - Sankar, Adepu Ravi

AU - Panchanathan, Sethuraman

AU - Ye, Jieping

PY - 2015/8/10

Y1 - 2015/8/10

KW - Active learning

KW - Hierarchical classification

KW - Optimization

UR - http://www.scopus.com/inward/record.url?scp=84954096472&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84954096472&partnerID=8YFLogxK

U2 - 10.1145/2783258.2783298

DO - 10.1145/2783258.2783298

M3 - Conference contribution

AN - SCOPUS:84954096472

VL - 2015-August

SP - 99

EP - 108

BT - KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining

PB - Association for Computing Machinery

ER -