TY - JOUR
T1 - Risk of bias: a simulation study of power to detect study-level moderator effects in meta-analysis
T2 - Systematic Reviews
AU - Hempel, Susanne
AU - Miles, Jeremy N.V.
AU - Booth, Marika J.
AU - Wang, Zhen
AU - Morton, Sally C.
AU - Shekelle, Paul G.
N1 - Funding Information:
We thank Breanne Johnsen, Tanja Perry, Aneesa Motala, Di Valentine, and Sydne Newberry for assistance with the data and manuscript. We thank Sally Hopewell and Ly-Mee Yu for providing the citations of RCTs published in 2006 and indexed in PubMed. Funding from the Department of Veterans Affairs (VA), 2011 Under Secretary’s Award in Health Services Research to Paul Shekelle; the Agency for Healthcare Research and Quality (AHRQ), Contract No. 290-2007-10062-I; and the RAND Corporation supported the collation of empirical datasets, Monte Carlo Simulations, and the preparation of the manuscript. SH, JNVM, and MJB were supported by RAND, the VA, and AHRQ; ZW and PS received funding from the VA and AHRQ; and SM did not receive any funding for contributions to the project. The funding agencies had no role in the design, the collection, analysis, and interpretation of the presented data, in the writing of the manuscript, or in the decision to submit this manuscript for publication.
PY - 2013
Y1 - 2013
N2 - There are both theoretical and empirical reasons to believe that design and execution factors are associated with bias in controlled trials. Statistically significant moderator effects, such as the effect of trial quality on treatment effect sizes, are rarely detected in individual meta-analyses, and evidence from meta-epidemiological datasets is inconsistent. The reasons for the disconnect between theory and empirical observation are unclear. The study objective was to explore the power to detect study-level moderator effects in meta-analyses. We generated meta-analyses using Monte Carlo simulations and investigated the effects of the number of trials, trial sample size, moderator effect size, heterogeneity, and moderator distribution on the power to detect moderator effects. The simulations provide a reference guide for investigators to estimate power when planning meta-regressions. The power to detect moderator effects in meta-analyses, for example, effects of study quality on effect sizes, is largely determined by the degree of residual heterogeneity present in the dataset (noise not explained by the moderator). Larger trial sample sizes increase power only when residual heterogeneity is low. A large number of trials or low residual heterogeneity is necessary to detect effects. When the moderator levels are unequally distributed (for example, 25% 'high quality' and 75% 'low quality' trials), 80% power was rarely achieved in the investigated scenarios. Application to an empirical meta-epidemiological dataset with substantial heterogeneity (I² = 92%, τ² = 0.285) estimated that more than 200 trials are needed for 80% power to show a statistically significant result, even for a substantial moderator effect (0.2), and the number of trials with the less common feature (for example, few 'high quality' studies) strongly affects power.
Although study characteristics, such as trial quality, may explain some proportion of heterogeneity across study results in meta-analyses, residual heterogeneity is a crucial factor in determining when associations between moderator variables and effect sizes can be statistically detected. Detecting moderator effects requires more powerful analyses than are employed in most published investigations; hence negative findings should not be considered evidence of a lack of effect, and investigations are not hypothesis-proving unless power calculations show sufficient ability to detect effects.
AB - There are both theoretical and empirical reasons to believe that design and execution factors are associated with bias in controlled trials. Statistically significant moderator effects, such as the effect of trial quality on treatment effect sizes, are rarely detected in individual meta-analyses, and evidence from meta-epidemiological datasets is inconsistent. The reasons for the disconnect between theory and empirical observation are unclear. The study objective was to explore the power to detect study-level moderator effects in meta-analyses. We generated meta-analyses using Monte Carlo simulations and investigated the effects of the number of trials, trial sample size, moderator effect size, heterogeneity, and moderator distribution on the power to detect moderator effects. The simulations provide a reference guide for investigators to estimate power when planning meta-regressions. The power to detect moderator effects in meta-analyses, for example, effects of study quality on effect sizes, is largely determined by the degree of residual heterogeneity present in the dataset (noise not explained by the moderator). Larger trial sample sizes increase power only when residual heterogeneity is low. A large number of trials or low residual heterogeneity is necessary to detect effects. When the moderator levels are unequally distributed (for example, 25% 'high quality' and 75% 'low quality' trials), 80% power was rarely achieved in the investigated scenarios. Application to an empirical meta-epidemiological dataset with substantial heterogeneity (I² = 92%, τ² = 0.285) estimated that more than 200 trials are needed for 80% power to show a statistically significant result, even for a substantial moderator effect (0.2), and the number of trials with the less common feature (for example, few 'high quality' studies) strongly affects power.
Although study characteristics, such as trial quality, may explain some proportion of heterogeneity across study results in meta-analyses, residual heterogeneity is a crucial factor in determining when associations between moderator variables and effect sizes can be statistically detected. Detecting moderator effects requires more powerful analyses than are employed in most published investigations; hence negative findings should not be considered evidence of a lack of effect, and investigations are not hypothesis-proving unless power calculations show sufficient ability to detect effects.
UR - http://www.scopus.com/inward/record.url?scp=84899497147&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84899497147&partnerID=8YFLogxK
U2 - 10.1186/2046-4053-2-107
DO - 10.1186/2046-4053-2-107
M3 - Article
C2 - 24286208
AN - SCOPUS:84899497147
SN - 2046-4053
VL - 2
SP - 107
JO - Systematic Reviews
JF - Systematic Reviews
M1 - 107
ER -