Variable selection via Gibbs sampling

Edward I. George, Robert McCulloch

Research output: Contribution to journal › Article

1247 Citations (Scopus)

Abstract

A crucial problem in building a multiple regression model is the selection of predictors to include. The main thrust of this article is to propose and develop a procedure that uses probabilistic considerations for selecting promising subsets. This procedure entails embedding the regression setup in a hierarchical normal mixture model where latent variables are used to identify subset choices. In this framework the promising subsets of predictors can be identified as those with higher posterior probability. The computational burden is then alleviated by using the Gibbs sampler to indirectly sample from this multinomial posterior distribution on the set of possible subset choices. Those subsets with higher probability—the promising ones—can then be identified by their more frequent appearance in the Gibbs sample.
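The procedure described in the abstract (later known as stochastic search variable selection, SSVS) can be sketched as a two-block Gibbs sampler: each coefficient gets a two-component normal mixture prior indexed by a latent inclusion indicator, and the sampler alternates between drawing the coefficients given the indicators and drawing each indicator given its coefficient. The following is a minimal illustrative sketch, not the authors' implementation; the hyperparameter values (`tau`, `c`, `p`) and a known error variance `sigma2` are simplifying assumptions made here.

```python
import numpy as np

def ssvs_gibbs(X, y, n_iter=2000, burn=500, tau=0.1, c=10.0, p=0.5,
               sigma2=1.0, seed=0):
    """Illustrative SSVS sketch (hyperparameters are assumptions, not
    the paper's recommendations).

    Prior mixture for each coefficient beta_j, indexed by a latent
    indicator gamma_j:
        beta_j | gamma_j = 0  ~  N(0, tau^2)        (spike: ~excluded)
        beta_j | gamma_j = 1  ~  N(0, (c*tau)^2)    (slab:  included)
    The Gibbs sampler alternates beta | gamma (one multivariate normal
    draw) and gamma_j | beta_j (independent Bernoulli draws).
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    gamma = np.ones(k, dtype=int)
    XtX, Xty = X.T @ X, X.T @ y
    draws = []
    for t in range(n_iter):
        # beta | gamma: Gaussian with precision X'X/sigma2 + D_gamma^{-1}
        prior_var = np.where(gamma == 1, (c * tau) ** 2, tau ** 2)
        prec = XtX / sigma2 + np.diag(1.0 / prior_var)
        cov = np.linalg.inv(prec)
        mean = cov @ Xty / sigma2
        beta = rng.multivariate_normal(mean, cov)
        # gamma_j | beta_j: Bernoulli; log odds = log prior odds plus
        # the log ratio of the slab and spike normal densities at beta_j
        log_odds = (np.log(p) - np.log1p(-p) - np.log(c)
                    + 0.5 * beta**2 * (1 / tau**2 - 1 / (c * tau)**2))
        prob = 1.0 / (1.0 + np.exp(-log_odds))
        gamma = (rng.random(k) < prob).astype(int)
        if t >= burn:
            draws.append(gamma.copy())
    # Frequency of inclusion across the retained Gibbs draws: subsets
    # (indicator patterns) appearing often are the "promising" ones.
    return np.mean(draws, axis=0)
```

As the abstract notes, the sampler never enumerates the 2^k subsets: promising subsets are simply those indicator patterns that recur frequently in the Gibbs output, and the per-variable inclusion frequencies above are a common summary of that sample.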

Original language: English (US)
Pages (from-to): 881-889
Number of pages: 9
Journal: Journal of the American Statistical Association
Volume: 88
Issue number: 423
DOI: 10.1080/01621459.1993.10476353
State: Published - 1993
Externally published: Yes

Keywords

  • Data augmentation
  • Hierarchical Bayes
  • Latent variables
  • Mixture
  • Multiple regression

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Variable selection via Gibbs sampling. / George, Edward I.; McCulloch, Robert.

In: Journal of the American Statistical Association, Vol. 88, No. 423, 1993, p. 881-889.

Research output: Contribution to journal › Article

@article{48a65d1739914fe4a0e7f7cd67e1d3f5,
title = "Variable selection via Gibbs sampling",
abstract = "A crucial problem in building a multiple regression model is the selection of predictors to include. The main thrust of this article is to propose and develop a procedure that uses probabilistic considerations for selecting promising subsets. This procedure entails embedding the regression setup in a hierarchical normal mixture model where latent variables are used to identify subset choices. In this framework the promising subsets of predictors can be identified as those with higher posterior probability. The computational burden is then alleviated by using the Gibbs sampler to indirectly sample from this multinomial posterior distribution on the set of possible subset choices. Those subsets with higher probability—the promising ones—can then be identified by their more frequent appearance in the Gibbs sample.",
keywords = "Data augmentation, Hierarchical Bayes, Latent variables, Mixture, Multiple regression",
author = "George, {Edward I.} and Robert McCulloch",
year = "1993",
doi = "10.1080/01621459.1993.10476353",
language = "English (US)",
volume = "88",
pages = "881--889",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "423",
}

TY - JOUR

T1 - Variable selection via Gibbs sampling

AU - George, Edward I.

AU - McCulloch, Robert

PY - 1993

Y1 - 1993

N2 - A crucial problem in building a multiple regression model is the selection of predictors to include. The main thrust of this article is to propose and develop a procedure that uses probabilistic considerations for selecting promising subsets. This procedure entails embedding the regression setup in a hierarchical normal mixture model where latent variables are used to identify subset choices. In this framework the promising subsets of predictors can be identified as those with higher posterior probability. The computational burden is then alleviated by using the Gibbs sampler to indirectly sample from this multinomial posterior distribution on the set of possible subset choices. Those subsets with higher probability—the promising ones—can then be identified by their more frequent appearance in the Gibbs sample.

AB - A crucial problem in building a multiple regression model is the selection of predictors to include. The main thrust of this article is to propose and develop a procedure that uses probabilistic considerations for selecting promising subsets. This procedure entails embedding the regression setup in a hierarchical normal mixture model where latent variables are used to identify subset choices. In this framework the promising subsets of predictors can be identified as those with higher posterior probability. The computational burden is then alleviated by using the Gibbs sampler to indirectly sample from this multinomial posterior distribution on the set of possible subset choices. Those subsets with higher probability—the promising ones—can then be identified by their more frequent appearance in the Gibbs sample.

KW - Data augmentation

KW - Hierarchical Bayes

KW - Latent variables

KW - Mixture

KW - Multiple regression

UR - http://www.scopus.com/inward/record.url?scp=84893179575&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893179575&partnerID=8YFLogxK

U2 - 10.1080/01621459.1993.10476353

DO - 10.1080/01621459.1993.10476353

M3 - Article

AN - SCOPUS:84893179575

VL - 88

SP - 881

EP - 889

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

SN - 0162-1459

IS - 423

ER -