In this section, we will discuss a number of meta-analytic techniques. I will demonstrate how to perform each in R as we go along. We will primarily be using the metafor package to perform these exercises.
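If you want to follow along, here is a minimal setup sketch (the install step only needs to be run once):
# Install (once) and load the metafor package used throughout this section
install.packages("metafor")
library(metafor)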
The traditional way of conducting a research synthesis is the literature review. However, there are now also a number of procedural and statistical techniques for synthesizing the results of studies. In experiments, this is most commonly done by:
1.) Replicating experiments (i.e., conducting the same experiment while holding the treatment and outcome measures constant) with different subjects, ideally drawn from the same population.
2.) Pooling the findings from these experiments into a single across-study treatment effect. The resulting increase in effective sample size also yields an increase in precision.
From Gerber and Green: “The attraction of meta-analysis is that a series of small experiments may each be unable to speak to a hypothesis with precision, but when pooled together, these experiments may suggest a clear conclusion.”
Meta-analysis emerged in the late 1970s as a way to synthesize educational and psychological research, but has since expanded, particularly into the medical and social sciences.
In medicine, the Cochrane Collaboration was started in 1993, and today contains thousands of systematic reviews of medical interventions. This kind of research synthesis is considered by many to be the gold standard for determining the effectiveness of different health care interventions.
Examples of meta-analysis in political science:
Under fixed effects models, each study is assumed to differ from the others only because it draws a finite sample of observations from the total population. Observed studies therefore yield different effect sizes only because of sampling error. Fixed-effects models do not assume that the true effects are homogeneous (as is sometimes erroneously stated). In other words, fixed-effects models provide perfectly valid inferences under heterogeneity, as long as we restrict our inferences about the average effect size to the set of studies included in the meta-analysis.
Under random effects models, we don’t assume that there is just one population effect size. Instead, a distribution of population effect sizes exists that is generated by a distribution of possible study realizations. In other words, observed outcomes in studies would differ from each other not just because of sampling error, but also because they reflect these true, underlying population differences. In contrast to the fixed-effects model, random/mixed-effects models provide an unconditional inference about a larger set of studies from which the \(k\) studies included in the meta-analysis are assumed to be a random sample. We typically do not assume that this larger set consists only of studies that have actually been conducted, but instead envision a hypothetical population of studies that comprises studies that have been conducted, that could have been conducted, or that may be conducted in the future.
Which should you use?
Answer: Some argue that random effects models are preferred on conceptual grounds because they better reflect the inherent uncertainty in meta-analytic inference, and because they reduce to the fixed-effects model when the variance component is zero.
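To make the distinction concrete, here is a minimal sketch in R using metafor's built-in BCG vaccine dataset (dat.bcg); nothing here is specific to that dataset beyond the column names passed to escalc().
# Fixed- vs. random-effects fits in metafor (illustrative sketch)
library(metafor)
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg)  # log risk ratios
fe <- rma(yi, vi, data = dat, method = "FE")    # fixed-effects model
re <- rma(yi, vi, data = dat, method = "REML")  # random-effects model (REML estimate of tau^2)
summary(fe)
summary(re)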
\[y_i = \theta_i + e_i\] where \(y_i\) denotes the observed effect in the \(i\)th study, \(\theta_i\) is the corresponding (unknown) true effect, and \(e_i \sim N(0, v_i)\) is the sampling error.
A fixed effect model tells us: how large is the average true effect in the set of \(k\) studies included in the meta-analysis?
This is typically estimated using weighted least squares, where:
\[\hat{\theta} = \frac{\sum_{i=1}^k w_i \, y_i}{\sum_{i=1}^k w_i}\]
where \(w_i\) is the weight assigned to study \(i\). These weights are typically set to \(w_i = \frac{1}{v_i}\), the inverse of the within-study sampling variance (the square of the standard error) of the estimated effect. Note that this sampling variance is inversely proportional to the within-study sample size (for example, the sampling variance of a mean is \(\sigma^2 / n\)). Therefore, the larger the sample, the smaller the variance, and the more precise the estimate of the effect size. Hence, larger weights are assigned to effect sizes from studies with larger within-study sample sizes.
When all observed effect sizes (\(y_i\)) estimate a single population parameter, as is assumed under a fixed effects model, then \(\hat{\theta}\) is an unbiased estimate of that parameter.
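As a quick check on the formula, the sketch below computes the inverse-variance weighted average by hand and compares it to metafor's fixed-effects fit; it assumes observed effects yi and sampling variances vi as produced by escalc() (here again from the built-in dat.bcg data).
# Fixed-effects (inverse-variance) pooling by hand
library(metafor)
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg)
w      <- 1 / dat$vi                    # weights: inverse within-study variances
est    <- sum(w * dat$yi) / sum(w)      # weighted average of observed effects
se_est <- sqrt(1 / sum(w))              # standard error of the pooled estimate
c(estimate = est, se = se_est)
rma(yi, vi, data = dat, method = "FE")  # should match the hand calculation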
Same as above, but now \(\theta_i\) is not fixed, but is itself random and has its own distribution:
\[\theta_i = \mu + u_i\] where \(u_i \sim N(0,\tau^2)\). Here \(u_i\) is the study-specific deviation from the average true effect, and \(\tau^2\) is the between-studies variance. Therefore, the true effects are assumed to be normally distributed with mean \(\mu\) and variance \(\tau^2\).
The pooled estimate, now an estimate of \(\mu\), is still a weighted average:
\[\hat{\mu} = \frac{\sum_{i=1}^k w_i \, y_i}{\sum_{i=1}^k w_i}\]
But now \(w_i = \frac{1}{v_i + \hat{\tau}^2}\), where \(\hat{\tau}^2\) is an estimate of \(\tau^2\).
This implies that random-effects models follow a two-stage process: (1) estimate the amount of heterogeneity \(\tau^2\) using one of a number of proposed estimators (e.g., DerSimonian-Laird or REML), and (2) estimate \(\mu\) using WLS.
The true effects are therefore assumed to be normally distributed with mean \(\mu\) and variance \(\tau^2\). The goal is then to estimate \(\mu\), the average true effect, and \(\tau^2\), the (total) amount of heterogeneity among the true effects. If \(\tau^2 = 0\), then this implies homogeneity among the true effects (i.e., \(\theta_1 = \dots = \theta_k \equiv \theta\)), so that \(\mu = \theta\) then denotes the true effect.
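To make the two-stage logic explicit, here is a sketch that estimates \(\tau^2\) with the DerSimonian-Laird formula (one of the estimators metafor offers, method = "DL") and then re-weights by \(1/(v_i + \hat{\tau}^2)\); it again uses the built-in dat.bcg data purely for illustration.
# Random-effects pooling in two stages (DerSimonian-Laird sketch)
library(metafor)
dat    <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg)
w_fe   <- 1 / dat$vi
est_fe <- sum(w_fe * dat$yi) / sum(w_fe)
Q      <- sum(w_fe * (dat$yi - est_fe)^2)        # Cochran's Q statistic
k      <- length(dat$yi)
tau2   <- max(0, (Q - (k - 1)) / (sum(w_fe) - sum(w_fe^2) / sum(w_fe)))  # DL estimate of tau^2
w_re   <- 1 / (dat$vi + tau2)                    # random-effects weights
est_re <- sum(w_re * dat$yi) / sum(w_re)
c(tau2 = tau2, estimate = est_re, se = sqrt(1 / sum(w_re)))
rma(yi, vi, data = dat, method = "DL")           # should match the hand calculation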
What does this last statement imply about the difference between fixed and random effects meta-analysis under treatment effect homogeneity?
Answer: The two models will provide the same estimate, since \(\mu = \theta\) and the random-effects weights reduce to the fixed-effects weights when \(\hat{\tau}^2 = 0\).
Since the variance under random effects incorporates the same within-study error as under fixed effects plus an additional between-studies component, it cannot be less than the variance under the fixed-effects model. As long as the estimated between-studies variation is non-zero, the variance, standard error, and confidence interval will therefore always be larger under random effects.
A robust literature explores how to detect and correct for publication bias in meta-analysis. Which method is best in each circumstance remains a subject of active debate.
The following tests have been proposed to detect publication bias:
The following estimators have been proposed to correct for publication bias:
# PETPEESE R function
# Conditional PET-PEESE estimator: regress effect estimates on their standard
# errors (PET) or variances (PEESE), weighting by inverse variance; the
# intercept is the bias-corrected estimate of the effect at SE = 0.
petpeese <- function(dataset) {
  pet   <- lm(ate ~ se,  weights = 1/var, data = dataset)  # PET: effect on standard error
  peese <- lm(ate ~ var, weights = 1/var, data = dataset)  # PEESE: effect on variance
  int_pet   <- pet$coefficients[1]                # PET intercept
  se_pet    <- summary(pet)$coefficients[1, 2]    # SE of PET intercept
  int_peese <- peese$coefficients[1]              # PEESE intercept
  se_peese  <- summary(peese)$coefficients[1, 2]  # SE of PEESE intercept
  p_pet     <- summary(pet)$coefficients[1, 4]    # p-value for test that PET intercept = 0
  # If the PET intercept is not distinguishable from zero, report PET;
  # otherwise switch to the PEESE estimate (the usual conditional rule)
  petpeese_int <- ifelse(p_pet > .05, int_pet, int_peese)
  petpeese_se  <- ifelse(p_pet > .05, se_pet, se_peese)
  return(c(petpeese_int, petpeese_se))
}
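A hypothetical usage sketch for the function above; the column names ate, se, and var are the ones the function expects, and the data here are simulated purely for illustration.
# Simulated usage of petpeese() (illustrative only; not real study data)
set.seed(1)
k   <- 40
se  <- runif(k, 0.05, 0.5)            # hypothetical standard errors
ate <- rnorm(k, mean = 0.2, sd = se)  # hypothetical treatment effect estimates
sim <- data.frame(ate = ate, se = se, var = se^2)
petpeese(sim)                         # conditional PET-PEESE estimate and its SE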
Part of the heterogeneity across studies may be due to the influence of “moderators.” For example, in a medical trial, results might depend on important differences among subjects (e.g., gender or a pre-existing condition). If these “covariates” are known, we can account for them in our meta-analysis.
In practice, we then get a coefficient on this variable (a “meta-regression”) reflecting how strongly it is associated with the variation in results across studies. We can also recover an estimate of how much of the “total heterogeneity” across studies we have “accounted for” by including this covariate.
Let’s look at an example in R.
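Here is a minimal sketch of such a meta-regression with metafor, using absolute latitude (ablat) in the built-in dat.bcg data as the moderator; any study-level covariate in your own data would enter the same way.
# Mixed-effects meta-regression with a study-level moderator
library(metafor)
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg)
res <- rma(yi, vi, mods = ~ ablat, data = dat)  # absolute latitude as the moderator
summary(res)  # coefficient on ablat, plus R^2: amount of heterogeneity accounted for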
Upside: Higher likelihood of a publication or an impressive research synthesis in your dissertation.
Downside: Much more work, and there may not have been enough studies conducted on a question you are interested in to perform a meta-analysis.