Preimages for variation patterns from kernel PCA and bagging

Amit Shinde, Anshuman Sahu, Daniel Apley, George Runger

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Manufacturing industries collect massive amounts of multivariate measurement through automated inspection processes. Noisy measurements and high-dimensional, irrelevant features make it difficult to identify useful patterns in the data. Principal component analysis provides linear summaries of datasets with fewer latent variables. Kernel Principal Component Analysis (KPCA), however, identifies nonlinear patterns. One challenge in KPCA is to inverse map the denoised signal from a high-dimensional feature space into its preimage in input space to visualize the nonlinear variation sources. However, such an inverse map is not always defined. This article provides a new meta-method applicable to any KPCA algorithm to approximate the preimage. It improves upon previous work where a strong assumption was the availability of noise-free training data. This is problematic for applications such as manufacturing variation analysis. To attenuate noise in kernel subspace estimation the final preimage is estimated as the average from bagged samples drawn from the original dataset. The improvement is most pronounced when the parameters differ from those that minimize the error rate. Consequently, the proposed approach improves the robustness of any base KPCA algorithm. The usefulness of the proposed method is demonstrated by analyzing a classic handwritten digit dataset and a face dataset. Significant improvement over the existing methods is observed.
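The bagging step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a Gaussian (RBF) kernel and swaps in a kernel-ridge "learned preimage map" as the base KPCA preimage algorithm. The function names (`rbf`, `kpca_denoise`, `bagged_preimage`), the parameter values, and the toy circle data are all hypothetical. Only the outer loop, which draws bootstrap samples from the original dataset and averages the resulting preimages, reflects what the abstract states.

```python
import numpy as np


def rbf(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kpca_denoise(X_fit, X_new, n_components, gamma, ridge):
    """Base learner (an assumption, not from the article):
    centered RBF KPCA followed by a kernel-ridge map from scores to inputs."""
    n = X_fit.shape[0]
    K = rbf(X_fit, X_fit, gamma)
    col_mean, all_mean = K.mean(axis=0, keepdims=True), K.mean()
    Kc = K - K.mean(axis=1, keepdims=True) - col_mean + all_mean
    eigval, eigvec = np.linalg.eigh(Kc)              # eigenvalues in ascending order
    top = np.argsort(eigval)[::-1][:n_components]    # indices of the largest ones
    alphas = eigvec[:, top] / np.sqrt(np.maximum(eigval[top], 1e-12))
    scores_fit = Kc @ alphas                         # KPCA scores of the fitting data

    # Learn a scores -> input-space map by kernel ridge regression
    # (the same gamma is reused for this second kernel, for simplicity).
    dual = np.linalg.solve(rbf(scores_fit, scores_fit, gamma) + ridge * np.eye(n), X_fit)

    # Project the new points with the same centering, then map back to input space.
    K_new = rbf(X_new, X_fit, gamma)
    K_new_c = K_new - K_new.mean(axis=1, keepdims=True) - col_mean + all_mean
    scores_new = K_new_c @ alphas
    return rbf(scores_new, scores_fit, gamma) @ dual


def bagged_preimage(X_train, X_new, n_bags=25, n_components=2,
                    gamma=0.5, ridge=0.1, seed=0):
    """Average the base learner's preimages over bootstrap samples of X_train."""
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    total = np.zeros(X_new.shape, dtype=float)
    for _ in range(n_bags):
        idx = rng.integers(0, n, size=n)             # sample rows with replacement
        total += kpca_denoise(X_train[idx], X_new, n_components, gamma, ridge)
    return total / n_bags


# Toy data (hypothetical): noisy points around the unit circle.
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, 80)
X_train = np.c_[np.cos(theta), np.sin(theta)] + 0.1 * rng.normal(size=(80, 2))
X_hat = bagged_preimage(X_train, X_train[:5])        # estimated preimages, shape (5, 2)
```

Because the averaging wraps the base learner rather than modifying it, `kpca_denoise` could in principle be replaced by any other KPCA preimage routine with the same inputs and outputs, which is the sense in which the abstract calls the approach a meta-method.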

Original language: English (US)
Pages (from-to): 429-456
Number of pages: 28
Journal: IIE Transactions (Institute of Industrial Engineers)
Volume: 46
Issue number: 5
DOI: 10.1080/0740817X.2013.849836
State: Published - May 1 2014
Externally published: Yes

Fingerprint

Principal component analysis
Inspection
Availability
Industry

Keywords

  • denoise
  • ensembles
  • manufacturing variation analysis
  • Principal component analysis

ASJC Scopus subject areas

  • Industrial and Manufacturing Engineering

Cite this

Preimages for variation patterns from kernel PCA and bagging. / Shinde, Amit; Sahu, Anshuman; Apley, Daniel; Runger, George.

In: IIE Transactions (Institute of Industrial Engineers), Vol. 46, No. 5, 01.05.2014, p. 429-456.

@article{1af509cb172c434f86dce5ba336b716d,
title = "Preimages for variation patterns from kernel PCA and bagging",
keywords = "denoise, ensembles, manufacturing variation analysis, Principal component analysis",
author = "Amit Shinde and Anshuman Sahu and Daniel Apley and George Runger",
year = "2014",
month = "5",
day = "1",
doi = "10.1080/0740817X.2013.849836",
language = "English (US)",
volume = "46",
pages = "429--456",
journal = "IISE Transactions",
issn = "2472-5854",
publisher = "Taylor and Francis Ltd.",
number = "5",

}

TY - JOUR

T1 - Preimages for variation patterns from kernel PCA and bagging

AU - Shinde, Amit

AU - Sahu, Anshuman

AU - Apley, Daniel

AU - Runger, George

PY - 2014/5/1

Y1 - 2014/5/1

KW - denoise

KW - ensembles

KW - manufacturing variation analysis

KW - Principal component analysis

UR - http://www.scopus.com/inward/record.url?scp=84893920758&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893920758&partnerID=8YFLogxK

U2 - 10.1080/0740817X.2013.849836

DO - 10.1080/0740817X.2013.849836

M3 - Article

AN - SCOPUS:84893920758

VL - 46

SP - 429

EP - 456

JO - IISE Transactions

JF - IISE Transactions

SN - 2472-5854

IS - 5

ER -