Deep multimodality model for multi-task multi-view learning

Lecheng Zheng, Yu Cheng, Jingrui He

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Many real-world problems exhibit the coexistence of multiple types of heterogeneity, such as view heterogeneity (i.e., the multi-view property) and task heterogeneity (i.e., the multi-task property). For example, in an image classification problem containing multiple poses of the same object, each pose can be considered one view, and the detection of each type of object can be treated as one task. Furthermore, in some problems, the data types of the views may differ. In a web classification problem, for instance, we might be given a mixed image-and-text data set, where each web page is characterized by both images and text. A common strategy for solving this kind of problem is to leverage the consistency of views and the relatedness of tasks to build the prediction model. In the context of deep neural networks, multi-task relatedness is usually realized by grouping tasks at each layer, while multi-view consistency is usually enforced by maximizing the correlation coefficient between views. However, no existing deep learning algorithm jointly models task and view dual heterogeneity, particularly for data sets with multiple modalities (e.g., mixed text-and-image or text-and-video data sets). In this paper, we bridge this gap by proposing a deep multi-task multi-view learning framework that learns a deep representation for such dual-heterogeneity problems. Empirical studies on multiple real-world data sets demonstrate the effectiveness of our proposed Deep-MTMV algorithm.
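The abstract notes that multi-view consistency is commonly enforced by maximizing the correlation coefficient between view representations (as in CCA-style objectives). As a rough illustration only, and not the paper's Deep-MTMV method, the sketch below computes the mean per-dimension Pearson correlation between paired embeddings from two views; a training loop would minimize the negative of this quantity as a consistency term. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def view_correlation(h1, h2, eps=1e-8):
    """Mean Pearson correlation between paired view embeddings.

    h1, h2: (n_samples, d) arrays of representations from two views.
    A multi-view consistency term would maximize this quantity
    (i.e., minimize its negative) during training.
    """
    h1c = h1 - h1.mean(axis=0)          # center each embedding dimension
    h2c = h2 - h2.mean(axis=0)
    num = (h1c * h2c).sum(axis=0)       # per-dimension covariance (unnormalized)
    den = np.sqrt((h1c ** 2).sum(axis=0) * (h2c ** 2).sum(axis=0)) + eps
    return (num / den).mean()           # average correlation across dimensions

# Toy usage: two noisy views of the same latent signal correlate highly.
rng = np.random.default_rng(0)
z = rng.normal(size=(100, 4))                # shared latent factors
view1 = z + 0.1 * rng.normal(size=z.shape)   # view 1 = latent + noise
view2 = z + 0.1 * rng.normal(size=z.shape)   # view 2 = latent + noise
print(view_correlation(view1, view2))        # close to 1.0
```

In a full dual-heterogeneity model, such a consistency term would be combined with task-specific prediction losses from multiple output heads, so that both view agreement and task relatedness shape the shared representation.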

Original language: English (US)
Title of host publication: SIAM International Conference on Data Mining, SDM 2019
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 10-16
Number of pages: 7
ISBN (Electronic): 9781611975673
State: Published - Jan 1 2019
Event: 19th SIAM International Conference on Data Mining, SDM 2019 - Calgary, Canada
Duration: May 2 2019 - May 4 2019

Publication series

Name: SIAM International Conference on Data Mining, SDM 2019

Conference

Conference: 19th SIAM International Conference on Data Mining, SDM 2019
Country: Canada
City: Calgary
Period: 5/2/19 - 5/4/19

Fingerprint

  • Image classification
  • Learning algorithms
  • Websites
  • Deep neural networks
  • Deep learning

Keywords

  • Deep learning
  • Multi-task learning
  • Multi-view learning

ASJC Scopus subject areas

  • Software

Cite this

Zheng, L., Cheng, Y., & He, J. (2019). Deep multimodality model for multi-task multi-view learning. In SIAM International Conference on Data Mining, SDM 2019 (pp. 10-16). (SIAM International Conference on Data Mining, SDM 2019). Society for Industrial and Applied Mathematics Publications.
