Process partitions from time-ordered clusters

Peter Harnish; B. E N Nelson; George Runger

doi:10.1080/00224065.2009.11917756

Research output: Contribution to journal › Article › peer-review

15 Scopus citations

Abstract

Statistical analysis of massive multivariate industrial data sets might not be effective unless the data is first partitioned into stable operating regions. Standard statistical methods that are based on global models, like principal component and regression analyses, can be ineffective if the process is not under statistical control. Here, constrained clustering is proposed as a practical, robust, general solution to detect change points and partition historical process data. The constraint is that only observations (or clusters) that are contiguous in time can be joined. This paper describes a method for partitioning data sets into stable regions by modifying agglomerative clustering algorithms to take into account the time order within the data set. A stopping criterion is proposed to evaluate the number of change points generated, and several empirical studies with simulated and actual data are presented. The result is a partitioned data set that is better suited for analysis by standard statistical methods.
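The contiguity constraint described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `merge_cost` and `time_ordered_partition`, the parameter `n_segments`, and the Ward-style merge cost are all assumptions made for illustration. The abstract does not specify the paper's stopping criterion, so the sketch simply stops at a caller-supplied number of segments.

```python
import numpy as np


def merge_cost(a, b):
    """Ward-style cost: increase in within-cluster sum of squares if a and b merge.

    Assumption: the paper may use a different linkage; this one is illustrative.
    """
    na, nb = len(a), len(b)
    diff = a.mean(axis=0) - b.mean(axis=0)
    return na * nb / (na + nb) * float(diff @ diff)


def time_ordered_partition(X, n_segments):
    """Agglomerative clustering in which only time-adjacent clusters may merge.

    X is an (n_obs, n_vars) array in time order (a 1-D array is also accepted).
    Returns the change points as the start index of every segment after the first.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]

    # Each cluster is a half-open index range [start, end); start with singletons.
    bounds = [(i, i + 1) for i in range(len(X))]

    # Hypothetical stopping rule: merge until n_segments clusters remain.
    while len(bounds) > n_segments:
        # Only neighbouring clusters in time are candidates for a merge.
        costs = [
            merge_cost(X[s0:e0], X[s1:e1])
            for (s0, e0), (s1, e1) in zip(bounds, bounds[1:])
        ]
        k = int(np.argmin(costs))
        bounds[k:k + 2] = [(bounds[k][0], bounds[k + 1][1])]

    return [start for start, _ in bounds[1:]]


if __name__ == "__main__":
    # Simulated process with a mean shift at observation 20.
    rng = np.random.default_rng(0)
    series = np.concatenate([rng.normal(0.0, 0.1, 20), rng.normal(5.0, 0.1, 20)])
    print(time_ordered_partition(series, n_segments=2))
```

Recomputing every adjacent cost on each pass keeps the sketch short but makes it quadratic in the number of observations; a practical version for massive data sets would update only the two costs affected by each merge.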

Original language	English (US)
Pages (from-to)	3-17
Number of pages	15
Journal	Journal of Quality Technology
Volume	41
Issue number	1
DOIs	https://doi.org/10.1080/00224065.2009.11917756
State	Published - Jan 2009

Keywords

Change point
Cluster analysis
Data mining
Data segmentation
Partitioning

ASJC Scopus subject areas

Safety, Risk, Reliability and Quality
Strategy and Management
Management Science and Operations Research
Industrial and Manufacturing Engineering

Cite this

@article{25452c5d40ec4aa8a0d0ff51edee32e3,
title = "Process partitions from time-ordered clusters",
abstract = "Statistical analysis of massive multivariate industrial data sets might not be effective unless the data is first partitioned into stable operating regions. Standard statistical methods that are based on global models, like principal component and regression analyses, can be ineffective if the process is not under statistical control. Here, constrained clustering is proposed as a practical, robust, general solution to detect change points and partition historical process data. The constraint is that only observations (or clusters) that are contiguous in time can be joined. This paper describes a method for partitioning data sets into stable regions by modifying agglomerative clustering algorithms to take into account the time order within the data set. A stopping criterion is proposed to evaluate the number of change points generated, and several empirical studies with simulated and actual data are presented. The result is a partitioned data set that is better suited for analysis by standard statistical methods.",
keywords = "Change point, Cluster analysis, Data mining, Data segmentation, Partitioning",
author = "Peter Harnish and Nelson, {B. E N} and George Runger",
year = "2009",
month = jan,
doi = "10.1080/00224065.2009.11917756",
language = "English (US)",
volume = "41",
pages = "3--17",
journal = "Journal of Quality Technology",
issn = "0022-4065",
publisher = "American Society for Quality",
number = "1",
}

TY - JOUR
T1 - Process partitions from time-ordered clusters
AU - Harnish, Peter
AU - Nelson, B. E N
AU - Runger, George
PY - 2009/1
Y1 - 2009/1
N2 - Statistical analysis of massive multivariate industrial data sets might not be effective unless the data is first partitioned into stable operating regions. Standard statistical methods that are based on global models, like principal component and regression analyses, can be ineffective if the process is not under statistical control. Here, constrained clustering is proposed as a practical, robust, general solution to detect change points and partition historical process data. The constraint is that only observations (or clusters) that are contiguous in time can be joined. This paper describes a method for partitioning data sets into stable regions by modifying agglomerative clustering algorithms to take into account the time order within the data set. A stopping criterion is proposed to evaluate the number of change points generated, and several empirical studies with simulated and actual data are presented. The result is a partitioned data set that is better suited for analysis by standard statistical methods.
AB - Statistical analysis of massive multivariate industrial data sets might not be effective unless the data is first partitioned into stable operating regions. Standard statistical methods that are based on global models, like principal component and regression analyses, can be ineffective if the process is not under statistical control. Here, constrained clustering is proposed as a practical, robust, general solution to detect change points and partition historical process data. The constraint is that only observations (or clusters) that are contiguous in time can be joined. This paper describes a method for partitioning data sets into stable regions by modifying agglomerative clustering algorithms to take into account the time order within the data set. A stopping criterion is proposed to evaluate the number of change points generated, and several empirical studies with simulated and actual data are presented. The result is a partitioned data set that is better suited for analysis by standard statistical methods.
KW - Change point
KW - Cluster analysis
KW - Data mining
KW - Data segmentation
KW - Partitioning
UR - http://www.scopus.com/inward/record.url?scp=62949121635&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=62949121635&partnerID=8YFLogxK
U2 - 10.1080/00224065.2009.11917756
DO - 10.1080/00224065.2009.11917756
M3 - Article
AN - SCOPUS:62949121635
SN - 0022-4065
VL - 41
SP - 3
EP - 17
JO - Journal of Quality Technology
JF - Journal of Quality Technology
IS - 1
ER -
