Regression Models Involving Nonlinear Effects With Missing Data: A Sequential Modeling Approach Using Bayesian Estimation

Oliver Lüdtke, Alexander Robitzsch, Stephen West

Research output: Contribution to journal › Article

Abstract

When estimating multiple regression models with incomplete predictor variables, it is necessary to specify a joint distribution for the predictor variables. A convenient assumption is that this distribution is a joint normal distribution, the default in many statistical software packages. This distribution will in general be misspecified if the predictors with missing data have nonlinear effects (e.g., x²) or are included in interaction terms (e.g., x · z). In the present article, we discuss a sequential modeling approach that can be applied to decompose the joint distribution of the variables into 2 parts: (a) a part that is due to the model of interest and (b) a part that is due to the model for the incomplete predictors. We demonstrate how the sequential modeling approach can be used to implement a multiple imputation strategy based on Bayesian estimation techniques that can accommodate rather complex substantive regression models with nonlinear effects and also allows a flexible treatment of auxiliary variables. In 4 simulation studies, we showed that the sequential modeling approach can be applied to estimate nonlinear effects in regression models with missing values on continuous, categorical, or skewed predictor variables under a broad range of conditions and investigated the robustness of the proposed approach against distributional misspecifications. We developed the R package mdmb, which facilitates a user-friendly application of the sequential modeling approach, and we present a real-data example that illustrates the flexibility of the software.
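
As a rough illustration of the decomposition described in the abstract, the joint distribution of an outcome y and an incomplete predictor x given a complete covariate z can be factored as f(y, x | z) = f(y | x, z) f(x | z), where the first factor is the model of interest and the second is the model for the incomplete predictor. The Python sketch below draws imputations for a missing x from the product of these two densities on a grid. It is not the mdmb implementation: the function names, the quadratic form of the substantive model, the normal predictor model, the grid sampler, and the fixed parameter values are all assumptions made for illustration. A full Bayesian procedure would also sample the parameters instead of holding them fixed.

```python
import numpy as np


def normal_logpdf(value, mean, sd):
    """Log density of a normal distribution (hypothetical helper)."""
    return -0.5 * np.log(2.0 * np.pi * sd**2) - 0.5 * ((value - mean) / sd) ** 2


def draw_missing_x(y, z, beta, sigma_y, gamma, sigma_x, rng, size=1, grid=None):
    """Draw imputations for a missing x given observed y and z.

    Assumed substantive model: y | x, z ~ N(b0 + b1*x + b2*x^2 + b3*z, sigma_y).
    Assumed predictor model:   x | z    ~ N(g0 + g1*z, sigma_x).
    Parameters are treated as known here, which is a simplification.
    """
    if grid is None:
        grid = np.linspace(-6.0, 6.0, 1201)  # candidate values for x
    mean_y = beta[0] + beta[1] * grid + beta[2] * grid**2 + beta[3] * z
    mean_x = gamma[0] + gamma[1] * z
    # Unnormalized log density of x given y and z: the sum of both log factors.
    log_w = normal_logpdf(y, mean_y, sigma_y) + normal_logpdf(grid, mean_x, sigma_x)
    weights = np.exp(log_w - log_w.max())  # subtract the maximum for stability
    weights /= weights.sum()
    return rng.choice(grid, size=size, p=weights)


rng = np.random.default_rng(0)
# With a purely quadratic effect, the conditional distribution of x is bimodal,
# which a joint normal imputation model cannot represent.
imputations = draw_missing_x(
    y=4.0, z=0.0, beta=(0.0, 0.0, 1.0, 0.0), sigma_y=0.5,
    gamma=(0.0, 0.0), sigma_x=1.0, rng=rng, size=5,
)
print(imputations)
```

Because the draw uses the substantive model directly, imputed values stay compatible with the nonlinear term. The grid sampler is only one simple way to sample from a nonstandard conditional density.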

Original language: English (US)
Journal: Psychological Methods
DOI: 10.1037/met0000233
State: Accepted/In press - Jan 1 2019

Fingerprint

Bayes Theorem
Software
Normal Distribution

Keywords

  • Interaction effects
  • Missing data
  • Multiple imputation
  • Multiple regression

ASJC Scopus subject areas

  • Psychology (miscellaneous)

Cite this

Regression Models Involving Nonlinear Effects With Missing Data: A Sequential Modeling Approach Using Bayesian Estimation. / Lüdtke, Oliver; Robitzsch, Alexander; West, Stephen.

In: Psychological Methods, 01.01.2019.

@article{24310f3acb014572a05eadfa1c675105,
title = "Regression Models Involving Nonlinear Effects With Missing Data: A Sequential Modeling Approach Using Bayesian Estimation",
abstract = "When estimating multiple regression models with incomplete predictor variables, it is necessary to specify a joint distribution for the predictor variables. A convenient assumption is that this distribution is a joint normal distribution, the default in many statistical software packages. This distribution will in general be misspecified if the predictors with missing data have nonlinear effects (e.g., $x^2$) or are included in interaction terms (e.g., $x \cdot z$). In the present article, we discuss a sequential modeling approach that can be applied to decompose the joint distribution of the variables into 2 parts: (a) a part that is due to the model of interest and (b) a part that is due to the model for the incomplete predictors. We demonstrate how the sequential modeling approach can be used to implement a multiple imputation strategy based on Bayesian estimation techniques that can accommodate rather complex substantive regression models with nonlinear effects and also allows a flexible treatment of auxiliary variables. In 4 simulation studies, we showed that the sequential modeling approach can be applied to estimate nonlinear effects in regression models with missing values on continuous, categorical, or skewed predictor variables under a broad range of conditions and investigated the robustness of the proposed approach against distributional misspecifications. We developed the R package mdmb, which facilitates a user-friendly application of the sequential modeling approach, and we present a real-data example that illustrates the flexibility of the software.",
keywords = "Interaction effects, Missing data, Multiple imputation, Multiple regression",
author = "Oliver L{\"u}dtke and Alexander Robitzsch and Stephen West",
year = "2019",
month = "1",
day = "1",
doi = "10.1037/met0000233",
language = "English (US)",
journal = "Psychological Methods",
issn = "1082-989X",
publisher = "American Psychological Association Inc.",

}

TY - JOUR

T1 - Regression Models Involving Nonlinear Effects With Missing Data

T2 - A Sequential Modeling Approach Using Bayesian Estimation

AU - Lüdtke, Oliver

AU - Robitzsch, Alexander

AU - West, Stephen

PY - 2019/1/1

Y1 - 2019/1/1

N2 - When estimating multiple regression models with incomplete predictor variables, it is necessary to specify a joint distribution for the predictor variables. A convenient assumption is that this distribution is a joint normal distribution, the default in many statistical software packages. This distribution will in general be misspecified if the predictors with missing data have nonlinear effects (e.g., x²) or are included in interaction terms (e.g., x · z). In the present article, we discuss a sequential modeling approach that can be applied to decompose the joint distribution of the variables into 2 parts: (a) a part that is due to the model of interest and (b) a part that is due to the model for the incomplete predictors. We demonstrate how the sequential modeling approach can be used to implement a multiple imputation strategy based on Bayesian estimation techniques that can accommodate rather complex substantive regression models with nonlinear effects and also allows a flexible treatment of auxiliary variables. In 4 simulation studies, we showed that the sequential modeling approach can be applied to estimate nonlinear effects in regression models with missing values on continuous, categorical, or skewed predictor variables under a broad range of conditions and investigated the robustness of the proposed approach against distributional misspecifications. We developed the R package mdmb, which facilitates a user-friendly application of the sequential modeling approach, and we present a real-data example that illustrates the flexibility of the software.

KW - Interaction effects

KW - Missing data

KW - Multiple imputation

KW - Multiple regression

UR - http://www.scopus.com/inward/record.url?scp=85072061722&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85072061722&partnerID=8YFLogxK

U2 - 10.1037/met0000233

DO - 10.1037/met0000233

M3 - Article

AN - SCOPUS:85072061722

JO - Psychological Methods

JF - Psychological Methods

SN - 1082-989X

ER -