Abstract

Active learning techniques have gained popularity to reduce human effort in labeling data instances for inducing a classifier. When faced with large amounts of unlabeled data, such algorithms automatically identify the exemplar instances for manual annotation. More recently, there have been attempts towards a batch mode form of active learning, where a batch of data points is simultaneously selected from an unlabeled set. In this paper, we propose two novel batch mode active learning (BMAL) algorithms: BatchRank and BatchRand. We first formulate the batch selection task as an NP-hard optimization problem; we then propose two convex relaxations, one based on linear programming and the other based on semi-definite programming, to solve the batch selection problem. Finally, a deterministic bound is derived on the solution quality for the first relaxation and a probabilistic bound for the second. To the best of our knowledge, this is the first research effort to derive mathematical guarantees on the solution quality of the BMAL problem. Our extensive empirical studies on 15 challenging binary, multi-class and multi-label datasets corroborate that the proposed algorithms perform on par with the state-of-the-art techniques, deliver high-quality solutions and are robust to real-world issues like label noise and class imbalance.
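The relax-and-solve idea behind the paper can be illustrated with a deliberately simplified sketch. Note this is not the paper's actual BatchRank or BatchRand procedure: the entropy score and the plain cardinality constraint are assumptions made here for illustration. In this toy setting, the linear-programming relaxation of the 0/1 selection problem has an integral optimum, so the relaxed solution reduces to ranking and needs no rounding step.

```python
import math

def entropy(probs):
    # Shannon entropy of a predicted class distribution,
    # used here as a simple uncertainty score.
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_batch(pred_probs, k):
    """Pick k unlabeled points for annotation.

    The exact 0/1 selection problem (choose the k-subset maximizing a
    score coupling informativeness and diversity) is NP-hard in general.
    For the linear relaxation
        max  sum_i s_i * x_i   s.t.  sum_i x_i = k,  0 <= x_i <= 1,
    the optimum is attained at a 0/1 vertex that simply ranks points
    by score, so top-k selection solves the relaxed problem exactly.
    """
    scores = [(entropy(p), i) for i, p in enumerate(pred_probs)]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]

# Toy pool of predicted class probabilities from a current classifier.
pool = [
    [0.98, 0.02],   # confident prediction -> low entropy
    [0.55, 0.45],   # uncertain -> high entropy
    [0.50, 0.50],   # maximally uncertain
    [0.90, 0.10],
]
print(select_batch(pool, 2))  # → [2, 1], the two most uncertain points
```

The paper's actual objective couples uncertainty with redundancy between the selected points, which is what makes the problem NP-hard and the LP/SDP relaxations (with their deterministic and probabilistic bounds) non-trivial.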

Original language: English (US)
Article number: 7006697
Pages (from-to): 1945-1958
Number of pages: 14
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 37
Issue number: 10
DOI: 10.1109/TPAMI.2015.2389848
State: Published - Oct 1 2015

Keywords

  • Batch Mode Active Learning
  • Optimization

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Software
  • Computational Theory and Mathematics
  • Applied Mathematics
  • Medicine (all)

Cite this

Active Batch Selection via Convex Relaxations with Guaranteed Solution Bounds. / Chakraborty, Shayok; Balasubramanian, Vineeth; Sun, Qian; Panchanathan, Sethuraman; Ye, Jieping.

In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 37, No. 10, 7006697, 01.10.2015, p. 1945-1958.

Research output: Contribution to journal › Article

Chakraborty, Shayok ; Balasubramanian, Vineeth ; Sun, Qian ; Panchanathan, Sethuraman ; Ye, Jieping. / Active Batch Selection via Convex Relaxations with Guaranteed Solution Bounds. In: IEEE Transactions on Pattern Analysis and Machine Intelligence. 2015 ; Vol. 37, No. 10. pp. 1945-1958.
@article{6532009a40f44ff5b70094219ce2f2ce,
title = "Active Batch Selection via Convex Relaxations with Guaranteed Solution Bounds",
abstract = "Active learning techniques have gained popularity to reduce human effort in labeling data instances for inducing a classifier. When faced with large amounts of unlabeled data, such algorithms automatically identify the exemplar instances for manual annotation. More recently, there have been attempts towards a batch mode form of active learning, where a batch of data points is simultaneously selected from an unlabeled set. In this paper, we propose two novel batch mode active learning (BMAL) algorithms: BatchRank and BatchRand. We first formulate the batch selection task as an NP-hard optimization problem; we then propose two convex relaxations, one based on linear programming and the other based on semi-definite programming, to solve the batch selection problem. Finally, a deterministic bound is derived on the solution quality for the first relaxation and a probabilistic bound for the second. To the best of our knowledge, this is the first research effort to derive mathematical guarantees on the solution quality of the BMAL problem. Our extensive empirical studies on 15 challenging binary, multi-class and multi-label datasets corroborate that the proposed algorithms perform on par with the state-of-the-art techniques, deliver high-quality solutions and are robust to real-world issues like label noise and class imbalance.",
keywords = "Batch Mode Active Learning, Optimization",
author = "Shayok Chakraborty and Vineeth Balasubramanian and Qian Sun and Sethuraman Panchanathan and Jieping Ye",
year = "2015",
month = "10",
day = "1",
doi = "10.1109/TPAMI.2015.2389848",
language = "English (US)",
volume = "37",
pages = "1945--1958",
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
issn = "0162-8828",
publisher = "IEEE Computer Society",
number = "10",

}

TY - JOUR

T1 - Active Batch Selection via Convex Relaxations with Guaranteed Solution Bounds

AU - Chakraborty, Shayok

AU - Balasubramanian, Vineeth

AU - Sun, Qian

AU - Panchanathan, Sethuraman

AU - Ye, Jieping

PY - 2015/10/1

Y1 - 2015/10/1

N2 - Active learning techniques have gained popularity to reduce human effort in labeling data instances for inducing a classifier. When faced with large amounts of unlabeled data, such algorithms automatically identify the exemplar instances for manual annotation. More recently, there have been attempts towards a batch mode form of active learning, where a batch of data points is simultaneously selected from an unlabeled set. In this paper, we propose two novel batch mode active learning (BMAL) algorithms: BatchRank and BatchRand. We first formulate the batch selection task as an NP-hard optimization problem; we then propose two convex relaxations, one based on linear programming and the other based on semi-definite programming, to solve the batch selection problem. Finally, a deterministic bound is derived on the solution quality for the first relaxation and a probabilistic bound for the second. To the best of our knowledge, this is the first research effort to derive mathematical guarantees on the solution quality of the BMAL problem. Our extensive empirical studies on 15 challenging binary, multi-class and multi-label datasets corroborate that the proposed algorithms perform on par with the state-of-the-art techniques, deliver high-quality solutions and are robust to real-world issues like label noise and class imbalance.

AB - Active learning techniques have gained popularity to reduce human effort in labeling data instances for inducing a classifier. When faced with large amounts of unlabeled data, such algorithms automatically identify the exemplar instances for manual annotation. More recently, there have been attempts towards a batch mode form of active learning, where a batch of data points is simultaneously selected from an unlabeled set. In this paper, we propose two novel batch mode active learning (BMAL) algorithms: BatchRank and BatchRand. We first formulate the batch selection task as an NP-hard optimization problem; we then propose two convex relaxations, one based on linear programming and the other based on semi-definite programming, to solve the batch selection problem. Finally, a deterministic bound is derived on the solution quality for the first relaxation and a probabilistic bound for the second. To the best of our knowledge, this is the first research effort to derive mathematical guarantees on the solution quality of the BMAL problem. Our extensive empirical studies on 15 challenging binary, multi-class and multi-label datasets corroborate that the proposed algorithms perform on par with the state-of-the-art techniques, deliver high-quality solutions and are robust to real-world issues like label noise and class imbalance.

KW - Batch Mode Active Learning

KW - Optimization

UR - http://www.scopus.com/inward/record.url?scp=84941194771&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84941194771&partnerID=8YFLogxK

U2 - 10.1109/TPAMI.2015.2389848

DO - 10.1109/TPAMI.2015.2389848

M3 - Article

C2 - 26353181

AN - SCOPUS:84941194771

VL - 37

SP - 1945

EP - 1958

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

SN - 0162-8828

IS - 10

M1 - 7006697

ER -