TY - GEN
T1 - Extracting shared subspace for multi-label classification
AU - Ji, Shuiwang
AU - Tang, Lei
AU - Yu, Shipeng
AU - Ye, Jieping
N1 - Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2008
Y1 - 2008
N2 - Multi-label problems arise in various domains such as multi-topic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared structures in multi-label classification. In this framework, a common subspace is assumed to be shared among multiple labels. We show that the optimal solution to the proposed formulation can be obtained by solving a generalized eigenvalue problem, though the problem is non-convex. For high-dimensional problems, direct computation of the solution is expensive, and we develop an efficient algorithm for this case. One appealing feature of the proposed framework is that it includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships. We have conducted extensive experiments on eleven multi-topic web page categorization tasks, and results demonstrate the effectiveness of the proposed formulation in comparison with several representative algorithms.
KW - Least squares
KW - Multi-label classification
KW - Shared subspace
UR - http://www.scopus.com/inward/record.url?scp=65449189832&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=65449189832&partnerID=8YFLogxK
U2 - 10.1145/1401890.1401939
DO - 10.1145/1401890.1401939
M3 - Conference contribution
AN - SCOPUS:65449189832
SN - 9781605581934
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 381
EP - 389
BT - KDD 2008 - Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
T2 - 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008
Y2 - 24 August 2008 through 27 August 2008
ER -