Robust multi-task feature learning

Pinghua Gong, Jieping Ye, Changshui Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

168 Citations (Scopus)

Abstract

Multi-task learning (MTL) aims to improve the performance of multiple related tasks by exploiting the intrinsic relationships among them. Recently, multi-task feature learning algorithms have received increasing attention and have been successfully applied to many applications involving high-dimensional data. However, they assume that all tasks share a common set of features, which is too restrictive and may not hold in real-world applications, since outlier tasks often exist. In this paper, we propose a Robust Multi-Task Feature Learning algorithm (rMTFL) which simultaneously captures a common set of features among relevant tasks and identifies outlier tasks. Specifically, we decompose the weight (model) matrix for all tasks into two components. We impose the well-known group Lasso penalty on row groups of the first component to capture the shared features among relevant tasks. To simultaneously identify the outlier tasks, we impose the same group Lasso penalty but on column groups of the second component. We propose to employ accelerated gradient descent to efficiently solve the optimization problem in rMTFL, and show that the proposed algorithm is scalable to large-scale problems. In addition, we provide a detailed theoretical analysis of the proposed rMTFL formulation. Specifically, we present a theoretical bound to measure how well our proposed rMTFL approximates the true evaluation, and provide bounds on the error between the estimated weights of rMTFL and the underlying true weights. Moreover, by assuming that the underlying true weights are above the noise level, we present a sound theoretical result showing how to obtain the underlying true shared features and outlier tasks (sparsity patterns). Empirical studies on both synthetic and real-world data demonstrate that our proposed rMTFL is capable of simultaneously capturing shared features among tasks and identifying outlier tasks.
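The decomposition described in the abstract (a weight matrix W split as P + Q, with a group Lasso penalty on the rows of P for shared features and on the columns of Q for outlier tasks, optimized by accelerated proximal gradient descent) can be sketched as follows. This is an illustrative reimplementation, not the authors' code: it assumes a squared loss and a single design matrix shared by all tasks (the paper's formulation allows per-task data), and the names `rmtfl_fista`, `lam1`, and `lam2` are chosen here for exposition.

```python
import numpy as np

def row_group_soft_threshold(M, tau):
    """Proximal operator of tau * sum_i ||M[i, :]||_2 (group Lasso over rows):
    shrink each row's l2 norm by tau, zeroing rows whose norm is below tau."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * M

def rmtfl_fista(X, Y, lam1, lam2, n_iter=300):
    """Hypothetical sketch of the rMTFL decomposition W = P + Q.

    X: (n, d) design matrix shared by all m tasks; Y: (n, m) targets.
    P is encouraged to be row-sparse (features shared across tasks),
    Q to be column-sparse (nonzero columns flag outlier tasks).
    Optimized with FISTA-style accelerated proximal gradient on (P, Q);
    the penalty is separable, so each component gets its own prox step.
    """
    n, d = X.shape
    m = Y.shape[1]
    # Lipschitz constant of the gradient of 1/(2n)||X(P+Q) - Y||_F^2
    # with respect to the stacked variable (P, Q) is 2 * ||X||_2^2 / n.
    step = n / (2.0 * np.linalg.norm(X, 2) ** 2)
    P = np.zeros((d, m)); Q = np.zeros((d, m))
    Pa, Qa = P.copy(), Q.copy()   # momentum (extrapolated) points
    t = 1.0
    for _ in range(n_iter):
        G = X.T @ (X @ (Pa + Qa) - Y) / n   # shared smooth-loss gradient
        # Row-wise prox for P; the same operator applied to Q^T gives the
        # column-wise group Lasso prox for Q.
        P_new = row_group_soft_threshold(Pa - step * G, step * lam1)
        Q_new = row_group_soft_threshold((Qa - step * G).T, step * lam2).T
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Pa = P_new + ((t - 1.0) / t_new) * (P_new - P)
        Qa = Q_new + ((t - 1.0) / t_new) * (Q_new - Q)
        P, Q, t = P_new, Q_new, t_new
    return P, Q
```

On synthetic data where only a few features carry signal, the recovered P + Q should approximate the true weights, with the zero rows of P indicating features unused by the relevant tasks; the sparsity thresholds depend on how lam1 and lam2 are tuned.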

Original language: English (US)
Title of host publication: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages: 895-903
Number of pages: 9
ISBN: 9781450314626
DOI: 10.1145/2339530.2339672
State: Published - 2012
Event: 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012 - Beijing, China
Duration: Aug 12, 2012 - Aug 16, 2012



Keywords

  • feature selection
  • multi-task learning
  • outlier tasks detection

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Gong, P., Ye, J., & Zhang, C. (2012). Robust multi-task feature learning. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 895-903) https://doi.org/10.1145/2339530.2339672

