10 Citations (Scopus)

Abstract

As the challenges and opportunities of using 'big data' expand, there is a need to explore different ways of analyzing large datasets. The semiconductor industry is a good example of a manufacturing process where many data are collected throughout the fabrication of the product. These massive datasets are used for various purposes, primarily to detect problems and determine root causes, control the process, and build models that predict yield. The yield predictions are used for process planning, optimization, and control. However, many current approaches to yield modeling are limited because the actual processes violate the model assumptions, limiting the power of the models' use. This paper explores the use of generalized linear mixed models (GLMMs) to predict semiconductor yield and to provide significant information about the process using a large semiconductor yield dataset. Both batch-specific and population-averaged GLMM approaches are used and compared. Differences in link functions, sample sizes, and levels of aggregation (die-level and wafer-level models) are also compared with each other and with the results from generalized linear models (GLMs). The results of this study show that GLMMs are a reasonable approach to analyzing large datasets by providing additional insight into the fabrication process while maintaining or even improving prediction power compared with GLMs and some prior yield models found in the literature. This paper also provides a modeling strategy through suggestions regarding level of aggregation, link function, and sample size that are appropriate for different research goals.

Original language: English (US)
Pages (from-to): 691-707
Number of pages: 17
Journal: Applied Stochastic Models in Business and Industry
Volume: 30
Issue number: 6
DOI: 10.1002/asmb.2074
State: Published - Nov 1 2014

Keywords

  • Big data
  • Generalized linear mixed models
  • Generalized linear models
  • GLM
  • GLMM
  • Semiconductor yield modeling

ASJC Scopus subject areas

  • Business, Management and Accounting(all)
  • Modeling and Simulation
  • Management Science and Operations Research

Cite this

Modeling and analyzing semiconductor yield with generalized linear mixed models. / Krueger, D. C.; Montgomery, Douglas.

In: Applied Stochastic Models in Business and Industry, Vol. 30, No. 6, 01.11.2014, p. 691-707.

Research output: Contribution to journal (Article)

@article{045f5abc9a644a61b680465b9f79caff,
title = "Modeling and analyzing semiconductor yield with generalized linear mixed models",
abstract = "As the challenges and opportunities of using 'big data' expand, there is a need to explore different ways of analyzing large datasets. The semiconductor industry is a good example of a manufacturing process where many data are collected throughout the fabrication of the product. These massive datasets are used for various purposes, primarily to detect problems and determine root causes, control the process, and build models that predict yield. The yield predictions are used for process planning, optimization, and control. However, many current approaches to yield modeling are limited because the actual processes violate the model assumptions, limiting the power of the models' use. This paper explores the use of generalized linear mixed models (GLMMs) to predict semiconductor yield and to provide significant information about the process using a large semiconductor yield dataset. Both batch-specific and population-averaged GLMM approaches are used and compared. Differences in link functions, sample sizes, and levels of aggregation (die-level and wafer-level models) are also compared with each other and with the results from generalized linear models (GLMs). The results of this study show that GLMMs are a reasonable approach to analyzing large datasets by providing additional insight into the fabrication process while maintaining or even improving prediction power compared with GLMs and some prior yield models found in the literature. This paper also provides a modeling strategy through suggestions regarding level of aggregation, link function, and sample size that are appropriate for different research goals.",
keywords = "Big data, Generalized linear mixed models, Generalized linear models, GLM, GLMM, Semiconductor yield modeling",
author = "Krueger, D. C. and Montgomery, Douglas",
year = "2014",
month = "11",
day = "1",
doi = "10.1002/asmb.2074",
language = "English (US)",
volume = "30",
pages = "691--707",
journal = "Applied Stochastic Models in Business and Industry",
issn = "1524-1904",
publisher = "John Wiley and Sons Ltd",
number = "6",

}

TY - JOUR

T1 - Modeling and analyzing semiconductor yield with generalized linear mixed models

AU - Krueger, D. C.

AU - Montgomery, Douglas

PY - 2014/11/1

Y1 - 2014/11/1

N2 - As the challenges and opportunities of using 'big data' expand, there is a need to explore different ways of analyzing large datasets. The semiconductor industry is a good example of a manufacturing process where many data are collected throughout the fabrication of the product. These massive datasets are used for various purposes, primarily to detect problems and determine root causes, control the process, and build models that predict yield. The yield predictions are used for process planning, optimization, and control. However, many current approaches to yield modeling are limited because the actual processes violate the model assumptions, limiting the power of the models' use. This paper explores the use of generalized linear mixed models (GLMMs) to predict semiconductor yield and to provide significant information about the process using a large semiconductor yield dataset. Both batch-specific and population-averaged GLMM approaches are used and compared. Differences in link functions, sample sizes, and levels of aggregation (die-level and wafer-level models) are also compared with each other and with the results from generalized linear models (GLMs). The results of this study show that GLMMs are a reasonable approach to analyzing large datasets by providing additional insight into the fabrication process while maintaining or even improving prediction power compared with GLMs and some prior yield models found in the literature. This paper also provides a modeling strategy through suggestions regarding level of aggregation, link function, and sample size that are appropriate for different research goals.

AB - As the challenges and opportunities of using 'big data' expand, there is a need to explore different ways of analyzing large datasets. The semiconductor industry is a good example of a manufacturing process where many data are collected throughout the fabrication of the product. These massive datasets are used for various purposes, primarily to detect problems and determine root causes, control the process, and build models that predict yield. The yield predictions are used for process planning, optimization, and control. However, many current approaches to yield modeling are limited because the actual processes violate the model assumptions, limiting the power of the models' use. This paper explores the use of generalized linear mixed models (GLMMs) to predict semiconductor yield and to provide significant information about the process using a large semiconductor yield dataset. Both batch-specific and population-averaged GLMM approaches are used and compared. Differences in link functions, sample sizes, and levels of aggregation (die-level and wafer-level models) are also compared with each other and with the results from generalized linear models (GLMs). The results of this study show that GLMMs are a reasonable approach to analyzing large datasets by providing additional insight into the fabrication process while maintaining or even improving prediction power compared with GLMs and some prior yield models found in the literature. This paper also provides a modeling strategy through suggestions regarding level of aggregation, link function, and sample size that are appropriate for different research goals.

KW - Big data

KW - Generalized linear mixed models

KW - Generalized linear models

KW - GLM

KW - GLMM

KW - Semiconductor yield modeling

UR - http://www.scopus.com/inward/record.url?scp=84919778898&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84919778898&partnerID=8YFLogxK

U2 - 10.1002/asmb.2074

DO - 10.1002/asmb.2074

M3 - Article

AN - SCOPUS:84919778898

VL - 30

SP - 691

EP - 707

JO - Applied Stochastic Models in Business and Industry

JF - Applied Stochastic Models in Business and Industry

SN - 1524-1904

IS - 6

ER -