Bi-Level rare temporal pattern detection

Dawei Zhou; Jingrui He; Yu Cao; Jae-sun Seo

doi:10.1109/ICDM.2016.16

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Scopus citations

Abstract

Nowadays, temporal data is generated at an unprecedented speed from a variety of applications, such as wearable devices, sensor networks, wireless networks, etc. In contrast to such large amount of temporal data, it is usually the case that only a small portion of them contains information of interest. For example, for the ECG signals collected by wearable devices, most of them collected from healthy people are normal, and only a small number of them collected from people with certain heart diseases are abnormal. Furthermore, even for the abnormal temporal sequences, the abnormal patterns may only be present in a few time segments and are similar among themselves, forming a rare category of temporal patterns. For example, the ECG signal collected from an individual with a certain heart disease may be normal in most time segments, and abnormal in only a few time segments, exhibiting similar patterns. What is even more challenging is that such rare temporal patterns are often non-separable from the normal ones. Existing works on outlier detection for temporal data focus on detecting either the abnormal sequences as a whole, or the abnormal time segments directly, ignoring the relationship between abnormal sequences and abnormal time segments. Moreover, the abnormal patterns are typically treated as isolated outliers instead of a rare category with self-similarity. In this paper, for the first time, we propose a bi-level (sequence-level/segment-level) model for rare temporal pattern detection. It is based on an optimization framework that fully exploits the bi-level structure in the data, i.e., the relationship between abnormal sequences and abnormal time segments. Furthermore, it uses sequence-specific simple hidden Markov models to obtain segment-level labels, and leverages the similarity among abnormal time segments to estimate the model parameters. To solve the optimization framework, we propose the unsupervised algorithm BIRAD, and also the semi-supervised version BIRAD-K which learns from a single labeled example. Experimental results on both synthetic and real data sets demonstrate the performance of the proposed algorithms from multiple aspects, outperforming state-of-the-art techniques on both temporal outlier detection and rare category analysis.
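
The abstract mentions that segment-level labels come from simple hidden Markov models. The sketch below is only a generic illustration of that idea, not the BIRAD algorithm: it runs textbook Viterbi decoding on a two-state (normal/abnormal) HMM with one-dimensional Gaussian emissions. The function names, the scalar segment features, and every parameter value (means, standard deviations, `stay_prob`, `prior_abnormal`) are assumptions made for this example; the paper estimates its own model parameters through its optimization framework.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a one-dimensional Gaussian."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def viterbi_two_state(obs, means, stds, stay_prob=0.9, prior_abnormal=0.1):
    """Return the most likely label (0 = normal, 1 = abnormal) per segment.

    All parameters are hypothetical, hand-picked values for illustration.
    """
    if not obs:
        return []
    states = (0, 1)
    log_init = [math.log(1.0 - prior_abnormal), math.log(prior_abnormal)]
    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)
    log_trans = [[log_stay, log_switch], [log_switch, log_stay]]

    # Best log-probability of any path ending in each state at the first segment.
    score = [log_init[s] + gaussian_logpdf(obs[0], means[s], stds[s]) for s in states]
    backpointers = []

    for x in obs[1:]:
        new_score, pointers = [], []
        for s in states:
            candidates = [score[p] + log_trans[p][s] for p in states]
            best_prev = 0 if candidates[0] >= candidates[1] else 1
            pointers.append(best_prev)
            new_score.append(candidates[best_prev] + gaussian_logpdf(x, means[s], stds[s]))
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    labels = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        labels.append(state)
    labels.reverse()
    return labels


# Toy per-segment features: two segments deviate strongly from the rest.
segment_features = [0.1, -0.2, 0.0, 3.1, 2.9, 0.2, -0.1]
labels = viterbi_two_state(segment_features, means=(0.0, 3.0), stds=(0.5, 0.5))
print(labels)
```

In this toy setting the two deviating segments are labeled abnormal and the rest normal. A sequence-specific model in the sense of the abstract would use a separate parameter set per sequence rather than the fixed values chosen here.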

Original language	English (US)
Title of host publication	Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016
Editors	Francesco Bonchi, Josep Domingo-Ferrer, Ricardo Baeza-Yates, Zhi-Hua Zhou, Xindong Wu
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	719-728
Number of pages	10
ISBN (Electronic)	9781509054725
DOIs	https://doi.org/10.1109/ICDM.2016.16
State	Published - Jul 2 2016
Event	16th IEEE International Conference on Data Mining, ICDM 2016 - Barcelona, Catalonia, Spain Duration: Dec 12 2016 → Dec 15 2016

Publication series

Name	Proceedings - IEEE International Conference on Data Mining, ICDM
Volume	0
ISSN (Print)	1550-4786

Other

Other	16th IEEE International Conference on Data Mining, ICDM 2016
Country/Territory	Spain
City	Barcelona, Catalonia
Period	12/12/16 → 12/15/16

Keywords

Rare category detection
Temporal data mining
Time segments
Time series

ASJC Scopus subject areas

General Engineering

Access to Document

10.1109/ICDM.2016.16

Cite this

Zhou, D., He, J., Cao, Y., & Seo, J. (2016). Bi-Level rare temporal pattern detection. In F. Bonchi, J. Domingo-Ferrer, R. Baeza-Yates, Z.-H. Zhou, & X. Wu (Eds.), Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016 (pp. 719-728). Article 7837896 (Proceedings - IEEE International Conference on Data Mining, ICDM; Vol. 0). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDM.2016.16

Bi-Level rare temporal pattern detection. / Zhou, Dawei; He, Jingrui; Cao, Yu et al.
Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016. ed. / Francesco Bonchi; Josep Domingo-Ferrer; Ricardo Baeza-Yates; Zhi-Hua Zhou; Xindong Wu. Institute of Electrical and Electronics Engineers Inc., 2016. p. 719-728 7837896 (Proceedings - IEEE International Conference on Data Mining, ICDM; Vol. 0).

Zhou, D, He, J, Cao, Y & Seo, J 2016, Bi-Level rare temporal pattern detection. in F Bonchi, J Domingo-Ferrer, R Baeza-Yates, Z-H Zhou & X Wu (eds), Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016., 7837896, Proceedings - IEEE International Conference on Data Mining, ICDM, vol. 0, Institute of Electrical and Electronics Engineers Inc., pp. 719-728, 16th IEEE International Conference on Data Mining, ICDM 2016, Barcelona, Catalonia, Spain, 12/12/16. https://doi.org/10.1109/ICDM.2016.16

Zhou D, He J, Cao Y, Seo J. Bi-Level rare temporal pattern detection. In Bonchi F, Domingo-Ferrer J, Baeza-Yates R, Zhou ZH, Wu X, editors, Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016. Institute of Electrical and Electronics Engineers Inc. 2016. p. 719-728. 7837896. (Proceedings - IEEE International Conference on Data Mining, ICDM). doi: 10.1109/ICDM.2016.16

Zhou, Dawei ; He, Jingrui ; Cao, Yu et al. / Bi-Level rare temporal pattern detection. Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016. editor / Francesco Bonchi ; Josep Domingo-Ferrer ; Ricardo Baeza-Yates ; Zhi-Hua Zhou ; Xindong Wu. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 719-728 (Proceedings - IEEE International Conference on Data Mining, ICDM).

@inproceedings{7e458fb9fada46e5993c33661a8c25aa,

title = "Bi-Level rare temporal pattern detection",

abstract = "Nowadays, temporal data is generated at an unprecedented speed from a variety of applications, such as wearable devices, sensor networks, wireless networks, etc. In contrast to such large amount of temporal data, it is usually the case that only a small portion of them contains information of interest. For example, for the ECG signals collected by wearable devices, most of them collected from healthy people are normal, and only a small number of them collected from people with certain heart diseases are abnormal. Furthermore, even for the abnormal temporal sequences, the abnormal patterns may only be present in a few time segments and are similar among themselves, forming a rare category of temporal patterns. For example, the ECG signal collected from an individual with a certain heart disease may be normal in most time segments, and abnormal in only a few time segments, exhibiting similar patterns. What is even more challenging is that such rare temporal patterns are often non-separable from the normal ones. Existing works on outlier detection for temporal data focus on detecting either the abnormal sequences as a whole, or the abnormal time segments directly, ignoring the relationship between abnormal sequences and abnormal time segments. Moreover, the abnormal patterns are typically treated as isolated outliers instead of a rare category with self-similarity. In this paper, for the first time, we propose a bi-level (sequence-level/segment-level) model for rare temporal pattern detection. It is based on an optimization framework that fully exploits the bi-level structure in the data, i.e., the relationship between abnormal sequences and abnormal time segments. Furthermore, it uses sequence-specific simple hidden Markov models to obtain segment-level labels, and leverages the similarity among abnormal time segments to estimate the model parameters. To solve the optimization framework, we propose the unsupervised algorithm BIRAD, and also the semi-supervised version BIRAD-K which learns from a single labeled example. Experimental results on both synthetic and real data sets demonstrate the performance of the proposed algorithms from multiple aspects, outperforming state-of-the-art techniques on both temporal outlier detection and rare category analysis.",

keywords = "Rare category detection, Temporal data mining, Time segments, Time series",

author = "Dawei Zhou and Jingrui He and Yu Cao and Jae-sun Seo",

note = "Publisher Copyright: {\textcopyright} 2016 IEEE.; 16th IEEE International Conference on Data Mining, ICDM 2016 ; Conference date: 12-12-2016 Through 15-12-2016",

year = "2016",

month = jul,

day = "2",

doi = "10.1109/ICDM.2016.16",

language = "English (US)",

series = "Proceedings - IEEE International Conference on Data Mining, ICDM",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "719--728",

editor = "Francesco Bonchi and Josep Domingo-Ferrer and Ricardo Baeza-Yates and Zhi-Hua Zhou and Xindong Wu",

booktitle = "Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016",

}

TY - GEN

T1 - Bi-Level rare temporal pattern detection

AU - Zhou, Dawei

AU - He, Jingrui

AU - Cao, Yu

AU - Seo, Jae-sun

PY - 2016/7/2

Y1 - 2016/7/2

AB - Nowadays, temporal data is generated at an unprecedented speed from a variety of applications, such as wearable devices, sensor networks, wireless networks, etc. In contrast to such large amount of temporal data, it is usually the case that only a small portion of them contains information of interest. For example, for the ECG signals collected by wearable devices, most of them collected from healthy people are normal, and only a small number of them collected from people with certain heart diseases are abnormal. Furthermore, even for the abnormal temporal sequences, the abnormal patterns may only be present in a few time segments and are similar among themselves, forming a rare category of temporal patterns. For example, the ECG signal collected from an individual with a certain heart disease may be normal in most time segments, and abnormal in only a few time segments, exhibiting similar patterns. What is even more challenging is that such rare temporal patterns are often non-separable from the normal ones. Existing works on outlier detection for temporal data focus on detecting either the abnormal sequences as a whole, or the abnormal time segments directly, ignoring the relationship between abnormal sequences and abnormal time segments. Moreover, the abnormal patterns are typically treated as isolated outliers instead of a rare category with self-similarity. In this paper, for the first time, we propose a bi-level (sequence-level/segment-level) model for rare temporal pattern detection. It is based on an optimization framework that fully exploits the bi-level structure in the data, i.e., the relationship between abnormal sequences and abnormal time segments. Furthermore, it uses sequence-specific simple hidden Markov models to obtain segment-level labels, and leverages the similarity among abnormal time segments to estimate the model parameters. To solve the optimization framework, we propose the unsupervised algorithm BIRAD, and also the semi-supervised version BIRAD-K which learns from a single labeled example. Experimental results on both synthetic and real data sets demonstrate the performance of the proposed algorithms from multiple aspects, outperforming state-of-the-art techniques on both temporal outlier detection and rare category analysis.

KW - Rare category detection

KW - Temporal data mining

KW - Time segments

KW - Time series

UR - http://www.scopus.com/inward/record.url?scp=85014566739&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85014566739&partnerID=8YFLogxK

U2 - 10.1109/ICDM.2016.16

DO - 10.1109/ICDM.2016.16

M3 - Conference contribution

AN - SCOPUS:85014566739

T3 - Proceedings - IEEE International Conference on Data Mining, ICDM

SP - 719

EP - 728

BT - Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016

A2 - Bonchi, Francesco

A2 - Domingo-Ferrer, Josep

A2 - Baeza-Yates, Ricardo

A2 - Zhou, Zhi-Hua

A2 - Wu, Xindong

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 16th IEEE International Conference on Data Mining, ICDM 2016

Y2 - 12 December 2016 through 15 December 2016

ER -
