Extracting shared subspace for multi-label classification

Shuiwang Ji; Lei Tang; Shipeng Yu; Jieping Ye

doi:10.1145/1401890.1401939

Computing and Augmented Intelligence, School of (IAFSE-SCAI)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

195 Scopus citations

Abstract

Multi-label problems arise in various domains such as multi-topic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared structures in multi-label classification. In this framework, a common subspace is assumed to be shared among multiple labels. We show that the optimal solution to the proposed formulation can be obtained by solving a generalized eigenvalue problem, though the problem is non-convex. For high-dimensional problems, direct computation of the solution is expensive, and we develop an efficient algorithm for this case. One appealing feature of the proposed framework is that it includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships. We have conducted extensive experiments on eleven multi-topic web page categorization tasks, and results demonstrate the effectiveness of the proposed formulation in comparison with several representative algorithms.
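
As a rough illustration of the baseline the abstract describes (one independent binary classifier per label), the sketch below fits a ridge-regularized least-squares classifier for each label column. It is not the shared-subspace formulation of the paper. The function names, the regularization parameter `reg`, and the synthetic data are assumptions made for this example only.

```python
import numpy as np

def fit_independent_least_squares(X, Y, reg=1.0):
    """Fit one ridge-regularized least-squares classifier per label.

    X: (n, d) feature matrix; Y: (n, m) label matrix with entries in {-1, +1}.
    Returns W of shape (d, m); column j depends only on label j, so the
    m problems are solved independently (no label correlation is used).
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ Y)

def predict_labels(X, W):
    """Threshold the linear scores at zero to obtain {-1, +1} predictions."""
    return np.where(X @ W >= 0, 1, -1)

# Synthetic data (hypothetical): 200 samples, 5 features, 3 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
W_true = rng.normal(size=(5, 3))
Y = np.where(X @ W_true >= 0, 1, -1)

W = fit_independent_least_squares(X, Y, reg=1e-3)
accuracy = (predict_labels(X, W) == Y).mean()
```

Because each column of `W` is computed from its own label column alone, any correlation between labels is ignored, which is the limitation the abstract says the shared-subspace framework is meant to address.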

Original language	English (US)
Title of host publication	KDD 2008 - Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages	381-389
Number of pages	9
DOIs	https://doi.org/10.1145/1401890.1401939
State	Published - 2008
Event	14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008 - Las Vegas, NV, United States
Duration	Aug 24 2008 → Aug 27 2008

Publication series

Name	Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Other

Other	14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008
Country/Territory	United States
City	Las Vegas, NV
Period	8/24/08 → 8/27/08

Keywords

Least squares
Multi-label classification
Shared subspace

ASJC Scopus subject areas

Software
Information Systems

Cite this

Extracting shared subspace for multi-label classification. / Ji, Shuiwang; Tang, Lei; Yu, Shipeng et al.
KDD 2008 - Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2008. p. 381-389 (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).

Ji, S, Tang, L, Yu, S & Ye, J 2008, Extracting shared subspace for multi-label classification. in KDD 2008 - Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 381-389, 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008, Las Vegas, NV, United States, 8/24/08. https://doi.org/10.1145/1401890.1401939

@inproceedings{7e896bb56e72412e89d1aeca7ce7964b,

title = "Extracting shared subspace for multi-label classification",

abstract = "Multi-label problems arise in various domains such as multi-topic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared structures in multi-label classification. In this framework, a common subspace is assumed to be shared among multiple labels. We show that the optimal solution to the proposed formulation can be obtained by solving a generalized eigenvalue problem, though the problem is non-convex. For high-dimensional problems, direct computation of the solution is expensive, and we develop an efficient algorithm for this case. One appealing feature of the proposed framework is that it includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships. We have conducted extensive experiments on eleven multi-topic web page categorization tasks, and results demonstrate the effectiveness of the proposed formulation in comparison with several representative algorithms.",

keywords = "Least squares, Multi-label classification, Shared subspace",

author = "Shuiwang Ji and Lei Tang and Shipeng Yu and Jieping Ye",

note = "Copyright: Copyright 2012 Elsevier B.V., All rights reserved.; 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008 ; Conference date: 24-08-2008 Through 27-08-2008",

year = "2008",

doi = "10.1145/1401890.1401939",

language = "English (US)",

isbn = "9781605581934",

series = "Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",

pages = "381--389",

booktitle = "KDD 2008 - Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",

}

TY - GEN

T1 - Extracting shared subspace for multi-label classification

AU - Ji, Shuiwang

AU - Tang, Lei

AU - Yu, Shipeng

AU - Ye, Jieping

PY - 2008

Y1 - 2008

N2 - Multi-label problems arise in various domains such as multi-topic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared structures in multi-label classification. In this framework, a common subspace is assumed to be shared among multiple labels. We show that the optimal solution to the proposed formulation can be obtained by solving a generalized eigenvalue problem, though the problem is non-convex. For high-dimensional problems, direct computation of the solution is expensive, and we develop an efficient algorithm for this case. One appealing feature of the proposed framework is that it includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships. We have conducted extensive experiments on eleven multi-topic web page categorization tasks, and results demonstrate the effectiveness of the proposed formulation in comparison with several representative algorithms.

KW - Least squares

KW - Multi-label classification

KW - Shared subspace

UR - http://www.scopus.com/inward/record.url?scp=65449189832&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=65449189832&partnerID=8YFLogxK

U2 - 10.1145/1401890.1401939

DO - 10.1145/1401890.1401939

M3 - Conference contribution

AN - SCOPUS:65449189832

SN - 9781605581934

T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

SP - 381

EP - 389

BT - KDD 2008 - Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

T2 - 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008

Y2 - 24 August 2008 through 27 August 2008

ER -
