TY - GEN
T1 - Optimal batch selection for active learning in multi-label classification
AU - Chakraborty, Shayok
AU - Balasubramanian, Vineeth
AU - Panchanathan, Sethuraman
PY - 2011
Y1 - 2011
AB - Multi-label classification is a generalization of conventional classification, where it is possible for a single data point to have multiple labels. Manual annotation of a multi-label data point requires a human oracle to consider the presence/absence of every possible class separately, which involves significant labor. Active learning techniques are effective in reducing human labeling effort to induce a classification model. When exposed to large quantities of unlabeled data, such algorithms automatically select the salient and representative instances for manual annotation. Further, to address the high redundancy in data such as image or video sequences as well as the availability of multiple labeling agents, there have been recent attempts towards a batch mode form of active learning, where a batch of data points is selected simultaneously from an unlabeled set. In this work, we propose a novel optimization based batch mode active learning strategy to minimize human labeling effort in multi-label classification problems. To the best of our knowledge, this is the first attempt to develop such a scheme primarily intended for the multi-label context. The proposed framework is computationally simple, easy to implement and can be suitably modified to perform batch mode active learning in other formulations, such as single-label classification or problems involving hierarchical label spaces. Our results corroborate the efficacy of the proposed algorithm and certify the potential of the framework in being used for real world applications.
KW - Algorithms
KW - Theory
UR - http://www.scopus.com/inward/record.url?scp=84455168979&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84455168979&partnerID=8YFLogxK
U2 - 10.1145/2072298.2072028
DO - 10.1145/2072298.2072028
M3 - Conference contribution
AN - SCOPUS:84455168979
SN - 9781450306164
T3 - MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops
SP - 1413
EP - 1416
BT - MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops
T2 - 19th ACM International Conference on Multimedia, ACM Multimedia 2011, MM'11
Y2 - 28 November 2011 through 1 December 2011
ER -