Abstract

Data heterogeneity is an intrinsic property of many high-impact applications, such as insider threat detection, traffic prediction, brain image analysis, and quality control in manufacturing processes. Furthermore, multiple types of heterogeneity (e.g., task/view/instance heterogeneity) often co-exist in these applications, thus posing new challenges to existing techniques, most of which are tailored to one or two types of heterogeneity. To address this problem, we propose a novel graph-based hybrid approach that simultaneously models multiple types of heterogeneity in a principled framework. The objective is to maximize the smoothness consistency of neighboring nodes, the bag-instance correlation, and the task relatedness on the hybrid graphs, while simultaneously minimizing the empirical classification loss. Furthermore, we analyze its performance based on Rademacher complexity, which sheds light on the benefits of jointly modeling multiple types of heterogeneity. To solve the resulting non-convex, non-smooth problem, we propose an iterative algorithm named M3 Learning, which combines block coordinate descent and the bundle method for optimization. Experimental results on various data sets show the effectiveness of the proposed algorithm.

Original language: English (US)
Title of host publication: Proceedings - IEEE International Conference on Data Mining, ICDM
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1081-1086
Number of pages: 6
Volume: 2016-January
ISBN (Print): 9781467395038
DOI: https://doi.org/10.1109/ICDM.2015.109
State: Published - Jan 5 2016
Event: 15th IEEE International Conference on Data Mining, ICDM 2015 - Atlantic City, United States
Duration: Nov 14 2015 to Nov 17 2015

Other

Other: 15th IEEE International Conference on Data Mining, ICDM 2015
Country: United States
City: Atlantic City
Period: 11/14/15 to 11/17/15

Fingerprint

Image analysis
Quality control
Brain

Keywords

  • Heterogeneous learning
  • Multi-instance learning
  • Multi-task learning
  • Multi-view learning

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Yang, P., & He, J. (2016). A graph-based hybrid framework for modeling complex heterogeneity. In Proceedings - IEEE International Conference on Data Mining, ICDM (Vol. 2016-January, pp. 1081-1086). [7373439] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDM.2015.109

A graph-based hybrid framework for modeling complex heterogeneity. / Yang, Pei; He, Jingrui.

Proceedings - IEEE International Conference on Data Mining, ICDM. Vol. 2016-January Institute of Electrical and Electronics Engineers Inc., 2016. p. 1081-1086 7373439.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Yang, P & He, J 2016, A graph-based hybrid framework for modeling complex heterogeneity. in Proceedings - IEEE International Conference on Data Mining, ICDM. vol. 2016-January, 7373439, Institute of Electrical and Electronics Engineers Inc., pp. 1081-1086, 15th IEEE International Conference on Data Mining, ICDM 2015, Atlantic City, United States, 11/14/15. https://doi.org/10.1109/ICDM.2015.109
Yang P, He J. A graph-based hybrid framework for modeling complex heterogeneity. In Proceedings - IEEE International Conference on Data Mining, ICDM. Vol. 2016-January. Institute of Electrical and Electronics Engineers Inc. 2016. p. 1081-1086. 7373439 https://doi.org/10.1109/ICDM.2015.109
Yang, Pei ; He, Jingrui. / A graph-based hybrid framework for modeling complex heterogeneity. Proceedings - IEEE International Conference on Data Mining, ICDM. Vol. 2016-January Institute of Electrical and Electronics Engineers Inc., 2016. pp. 1081-1086
@inproceedings{845eeeed04954480be7a25074602ff97,
title = "A graph-based hybrid framework for modeling complex heterogeneity",
abstract = "Data heterogeneity is an intrinsic property of many high-impact applications, such as insider threat detection, traffic prediction, brain image analysis, and quality control in manufacturing processes. Furthermore, multiple types of heterogeneity (e.g., task/view/instance heterogeneity) often co-exist in these applications, thus posing new challenges to existing techniques, most of which are tailored to one or two types of heterogeneity. To address this problem, we propose a novel graph-based hybrid approach that simultaneously models multiple types of heterogeneity in a principled framework. The objective is to maximize the smoothness consistency of neighboring nodes, the bag-instance correlation, and the task relatedness on the hybrid graphs, while simultaneously minimizing the empirical classification loss. Furthermore, we analyze its performance based on Rademacher complexity, which sheds light on the benefits of jointly modeling multiple types of heterogeneity. To solve the resulting non-convex, non-smooth problem, we propose an iterative algorithm named M3 Learning, which combines block coordinate descent and the bundle method for optimization. Experimental results on various data sets show the effectiveness of the proposed algorithm.",
keywords = "Heterogeneous learning, Multi-instance learning, Multi-task learning, Multi-view learning",
author = "Pei Yang and Jingrui He",
year = "2016",
month = "1",
day = "5",
doi = "10.1109/ICDM.2015.109",
language = "English (US)",
isbn = "9781467395038",
volume = "2016-January",
pages = "1081--1086",
booktitle = "Proceedings - IEEE International Conference on Data Mining, ICDM",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - A graph-based hybrid framework for modeling complex heterogeneity

AU - Yang, Pei

AU - He, Jingrui

PY - 2016/1/5

Y1 - 2016/1/5

N2 - Data heterogeneity is an intrinsic property of many high-impact applications, such as insider threat detection, traffic prediction, brain image analysis, and quality control in manufacturing processes. Furthermore, multiple types of heterogeneity (e.g., task/view/instance heterogeneity) often co-exist in these applications, thus posing new challenges to existing techniques, most of which are tailored to one or two types of heterogeneity. To address this problem, we propose a novel graph-based hybrid approach that simultaneously models multiple types of heterogeneity in a principled framework. The objective is to maximize the smoothness consistency of neighboring nodes, the bag-instance correlation, and the task relatedness on the hybrid graphs, while simultaneously minimizing the empirical classification loss. Furthermore, we analyze its performance based on Rademacher complexity, which sheds light on the benefits of jointly modeling multiple types of heterogeneity. To solve the resulting non-convex, non-smooth problem, we propose an iterative algorithm named M3 Learning, which combines block coordinate descent and the bundle method for optimization. Experimental results on various data sets show the effectiveness of the proposed algorithm.

AB - Data heterogeneity is an intrinsic property of many high-impact applications, such as insider threat detection, traffic prediction, brain image analysis, and quality control in manufacturing processes. Furthermore, multiple types of heterogeneity (e.g., task/view/instance heterogeneity) often co-exist in these applications, thus posing new challenges to existing techniques, most of which are tailored to one or two types of heterogeneity. To address this problem, we propose a novel graph-based hybrid approach that simultaneously models multiple types of heterogeneity in a principled framework. The objective is to maximize the smoothness consistency of neighboring nodes, the bag-instance correlation, and the task relatedness on the hybrid graphs, while simultaneously minimizing the empirical classification loss. Furthermore, we analyze its performance based on Rademacher complexity, which sheds light on the benefits of jointly modeling multiple types of heterogeneity. To solve the resulting non-convex, non-smooth problem, we propose an iterative algorithm named M3 Learning, which combines block coordinate descent and the bundle method for optimization. Experimental results on various data sets show the effectiveness of the proposed algorithm.

KW - Heterogeneous learning

KW - Multi-instance learning

KW - Multi-task learning

KW - Multi-view learning

UR - http://www.scopus.com/inward/record.url?scp=84963522428&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84963522428&partnerID=8YFLogxK

U2 - 10.1109/ICDM.2015.109

DO - 10.1109/ICDM.2015.109

M3 - Conference contribution

SN - 9781467395038

VL - 2016-January

SP - 1081

EP - 1086

BT - Proceedings - IEEE International Conference on Data Mining, ICDM

PB - Institute of Electrical and Electronics Engineers Inc.

ER -