Supervised learning for change-point detection

Fang Li, George Runger, Eugene Tuv

Research output: Contribution to journal › Article

22 Citations (Scopus)

Abstract

The detection of changes in the distribution of process variables is referred to as the change-point problem. Existing methods focus on detecting a single change point (or a few) in a univariate (or low-dimensional) process. We consider the important high-dimensional multivariate case with multiple change points and without an assumed distribution. In this work, the problem is transformed into a supervised learning problem with time as the output response and the process variables as inputs. Our focus is to identify the subset of variables that change. This important, practical scenario is analysed through a supervised learner whose variable importance measure identifies the variables that change among hundreds of variables. Simulated cases are discussed in the paper to verify the proposed method. The same data sets are also analysed with a multivariate exponentially weighted moving average (MEWMA) control chart, and the advantages of the supervised learner are illustrated.
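The reformulation described in the abstract (time as the response, process variables as inputs, variable importance used to flag the variables that change) can be illustrated with a small sketch. This is not the authors' implementation: the keywords point to a random forest with out-of-bag estimates, whereas this dependency-free example swaps in a single-split regression stump per variable and uses the fraction of variance in the time index that the split removes as a stand-in importance score. The function names (`simulate`, `stump_importance`, `rank_variables`), the simulated mean shift, and all parameter values are assumptions made for illustration.

```python
import random


def simulate(n=200, p=20, changed=(0, 1, 2), change_at=100, shift=3.0, seed=0):
    """Time-ordered rows of p Gaussian variables (hypothetical test data).

    Columns listed in `changed` receive a mean shift from row `change_at` on.
    """
    rng = random.Random(seed)
    data = []
    for t in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        if t >= change_at:
            for j in changed:
                row[j] += shift
        data.append(row)
    return data


def stump_importance(x, y, min_leaf=10):
    """Largest fraction of the variance of y removed by one split on x."""
    n = len(y)
    order = sorted(range(n), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    total = sum(ys)
    mean = total / n
    sst = sum((v - mean) ** 2 for v in ys)  # total sum of squares of y
    best = 0.0
    left_sum = 0.0
    for k in range(1, n):
        left_sum += ys[k - 1]
        # Skip splits that leave a tiny leaf or fall between tied x values.
        if k < min_leaf or n - k < min_leaf or xs[k] == xs[k - 1]:
            continue
        right_sum = total - left_sum
        # Between-group sum of squares for a split after sorted position k.
        ssb = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
        best = max(best, ssb / sst)
    return best


def rank_variables(data):
    """Score each variable by how well it predicts the time index."""
    n, p = len(data), len(data[0])
    time_index = list(range(n))  # time is the supervised response
    scores = [
        stump_importance([row[j] for row in data], time_index)
        for j in range(p)
    ]
    ranking = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return ranking, scores


ranking, scores = rank_variables(simulate())
print("highest-scoring variables:", ranking[:3])
```

A variable whose distribution does not change carries no information about when an observation was taken, so its score stays near zero, while a shifted variable separates early from late observations and scores high. A forest-based learner, as the keywords suggest the paper uses, would additionally capture interactions among variables, which this per-variable stump cannot.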

Original language: English (US)
Pages (from-to): 2853-2868
Number of pages: 16
Journal: International Journal of Production Research
Volume: 44
Issue number: 14
DOI: 10.1080/00207540600669846
State: Published - Jul 15 2006

Fingerprint

Supervised learning
Control charts
Change point
Process variables

Keywords

  • Change point
  • Decision tree
  • Out-of-bag
  • Random forest
  • Statistical process control
  • Variable importance

ASJC Scopus subject areas

  • Industrial and Manufacturing Engineering
  • Management Science and Operations Research

Cite this

Supervised learning for change-point detection. / Li, Fang; Runger, George; Tuv, Eugene.

In: International Journal of Production Research, Vol. 44, No. 14, 15.07.2006, p. 2853-2868.

Li, Fang ; Runger, George ; Tuv, Eugene. / Supervised learning for change-point detection. In: International Journal of Production Research. 2006 ; Vol. 44, No. 14. pp. 2853-2868.
@article{e1386490238649598e4dc216379faf7c,
title = "Supervised learning for change-point detection",
abstract = "The detection of changes in the distribution of process variables is referred to as the change-point problem. Existing methods focus on detecting a single (or few) change point in a univariate (or low-dimensional) process. We consider the important high-dimensional multivariate case with multiple change points and without an assumed distribution. In this work the problem is transformed into a supervised learning problem with time as the output response and the process variables as inputs. Our focus is to identify the subset of variables that change. This important, practical scenario is analysed through a supervised learner with a variable importance measure that is used to identify the variables that change among hundreds of variables. Simulated cases are discussed in the paper to verify the proposed method. Moreover, the same data sets are compared with a multivariate exponentially weighted moving average control chart and the advantages of the supervised learner are illustrated.",
keywords = "Change point, Decision tree, Out-of-bag, Random forest, Statistical process control, Variable importance",
author = "Fang Li and George Runger and Eugene Tuv",
year = "2006",
month = "7",
day = "15",
doi = "10.1080/00207540600669846",
language = "English (US)",
volume = "44",
pages = "2853--2868",
journal = "International Journal of Production Research",
issn = "0020-7543",
publisher = "Taylor and Francis Ltd.",
number = "14",

}

TY - JOUR

T1 - Supervised learning for change-point detection

AU - Li, Fang

AU - Runger, George

AU - Tuv, Eugene

PY - 2006/7/15

Y1 - 2006/7/15

AB - The detection of changes in the distribution of process variables is referred to as the change-point problem. Existing methods focus on detecting a single (or few) change point in a univariate (or low-dimensional) process. We consider the important high-dimensional multivariate case with multiple change points and without an assumed distribution. In this work the problem is transformed into a supervised learning problem with time as the output response and the process variables as inputs. Our focus is to identify the subset of variables that change. This important, practical scenario is analysed through a supervised learner with a variable importance measure that is used to identify the variables that change among hundreds of variables. Simulated cases are discussed in the paper to verify the proposed method. Moreover, the same data sets are compared with a multivariate exponentially weighted moving average control chart and the advantages of the supervised learner are illustrated.

KW - Change point

KW - Decision tree

KW - Out-of-bag

KW - Random forest

KW - Statistical process control

KW - Variable importance

UR - http://www.scopus.com/inward/record.url?scp=33744977241&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33744977241&partnerID=8YFLogxK

U2 - 10.1080/00207540600669846

DO - 10.1080/00207540600669846

M3 - Article

AN - SCOPUS:33744977241

VL - 44

SP - 2853

EP - 2868

JO - International Journal of Production Research

JF - International Journal of Production Research

SN - 0020-7543

IS - 14

ER -