Abstract

As manufacturing transitions to real-time sensing, it becomes more important to handle multiple, high-dimensional (non-stationary) time series that generate thousands of measurements for each batch. Predictive models are often challenged by such high-dimensional data and it is important to reduce the dimensionality for better performance. With thousands of measurements, even wavelet coefficients do not reduce the dimensionality sufficiently. We propose a two-stage method that uses energy statistics from a discrete wavelet transform to identify process variables and appropriate resolutions of wavelet coefficients in an initial (screening) model. Variable importance scores from a modern random forest classifier are exploited in this stage. Coefficients that correspond to the identified variables and resolutions are then selected for a second-stage predictive model. The approach is shown to provide good performance, along with interpretable results, in an example where multiple time series are used to indicate the need for preventive maintenance. In general, the two-stage approach can handle high dimensionality and still provide interpretable features linked to the relevant process variables and wavelet resolutions that can be used for further analysis.
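The first-stage features described in the abstract, energy statistics per process variable and wavelet resolution, can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: a Haar wavelet, series lengths that are powers of two, and energy defined as the sum of squared coefficients at each level. The function names and the example batch are hypothetical, and the random forest screening step is not shown.

```python
import math


def haar_dwt(series):
    """Full Haar decomposition of a sequence whose length is a power of two.

    Returns (approx, details); details[0] holds the finest-resolution coefficients.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    approx = [float(v) for v in series]
    details = []
    root2 = math.sqrt(2.0)
    while len(approx) > 1:
        # Pair neighbouring samples; differences give details, sums give the
        # next, coarser approximation (orthonormal scaling by 1/sqrt(2)).
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / root2 for a, b in pairs])
        approx = [(a + b) / root2 for a, b in pairs]
    return approx, details


def level_energies(series):
    """Energy (sum of squared coefficients) per detail level, finest first,
    followed by the energy of the final approximation coefficient."""
    approx, details = haar_dwt(series)
    energies = [sum(c * c for c in level) for level in details]
    energies.append(sum(c * c for c in approx))
    return energies


def energy_features(batch):
    """Map {variable_name: series} to {(variable_name, level_index): energy}."""
    return {
        (name, level): energy
        for name, series in batch.items()
        for level, energy in enumerate(level_energies(series))
    }


# Hypothetical batch: two made-up process variables, eight samples each.
batch = {
    "pressure": [1, -1, 1, -1, 1, -1, 1, -1],      # fast oscillation
    "temperature": [1, 1, 1, 1, -1, -1, -1, -1],   # slow step change
}
features = energy_features(batch)
```

In this toy batch the energy of "pressure" concentrates at the finest detail level and that of "temperature" at the coarsest detail level. One small vector of per-variable, per-level energies like this would be the input to a screening classifier, whose importance scores would then point to the variables and resolutions whose coefficients are kept for the second-stage model.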

Original language: English (US)
Pages (from-to): 885-893
Number of pages: 9
Journal: Quality and Reliability Engineering International
Volume: 27
Issue number: 7
DOI: 10.1002/qre.1178
State: Published - Nov 2011

Fingerprint

Feature extraction
Time series
Preventive maintenance
Discrete wavelet transforms
Screening
Classifiers
Statistics
Wavelets
Coefficients
Dimensionality
Process variables
Wavelet transform
Batch
Energy use
Multiple time series
Non-stationary time series
Manufacturing
Classifier

Keywords

  • discrete wavelet transformation
  • preventive maintenance
  • random forest

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Management Science and Operations Research

Cite this

Feature extraction and classification models for high-dimensional profile data. / Shinde, Amit; Church, George; Janakiram, Mani; Runger, George.

In: Quality and Reliability Engineering International, Vol. 27, No. 7, 11.2011, p. 885-893.

Research output: Contribution to journal › Article

@article{e8c0d871580342709557f6142865e342,
title = "Feature extraction and classification models for high-dimensional profile data",
keywords = "discrete wavelet transformation, preventive maintenance, random forest",
author = "Amit Shinde and George Church and Mani Janakiram and George Runger",
year = "2011",
month = "11",
doi = "10.1002/qre.1178",
language = "English (US)",
volume = "27",
pages = "885--893",
journal = "Quality and Reliability Engineering International",
issn = "0748-8017",
publisher = "John Wiley and Sons Ltd",
number = "7",

}

TY - JOUR

T1 - Feature extraction and classification models for high-dimensional profile data

AU - Shinde, Amit

AU - Church, George

AU - Janakiram, Mani

AU - Runger, George

PY - 2011/11

Y1 - 2011/11

N2 - As manufacturing transitions to real-time sensing, it becomes more important to handle multiple, high-dimensional (non-stationary) time series that generate thousands of measurements for each batch. Predictive models are often challenged by such high-dimensional data and it is important to reduce the dimensionality for better performance. With thousands of measurements, even wavelet coefficients do not reduce the dimensionality sufficiently. We propose a two-stage method that uses energy statistics from a discrete wavelet transform to identify process variables and appropriate resolutions of wavelet coefficients in an initial (screening) model. Variable importance scores from a modern random forest classifier are exploited in this stage. Coefficients that correspond to the identified variables and resolutions are then selected for a second-stage predictive model. The approach is shown to provide good performance, along with interpretable results, in an example where multiple time series are used to indicate the need for preventive maintenance. In general, the two-stage approach can handle high dimensionality and still provide interpretable features linked to the relevant process variables and wavelet resolutions that can be used for further analysis.

KW - discrete wavelet transformation

KW - preventive maintenance

KW - random forest

UR - http://www.scopus.com/inward/record.url?scp=80054795464&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80054795464&partnerID=8YFLogxK

U2 - 10.1002/qre.1178

DO - 10.1002/qre.1178

M3 - Article

AN - SCOPUS:80054795464

VL - 27

SP - 885

EP - 893

JO - Quality and Reliability Engineering International

JF - Quality and Reliability Engineering International

SN - 0748-8017

IS - 7

ER -