What is one-sided non compliance?
Can non compliance exist in the control group?
What does the notation \(d_i(1) = 0\) mean?
How would you read the following potential outcomes?
\(E[Y_i(z=1,d=0)|d_i(1)=0]\)
\(d_i(1) = 0\) is shorthand for “Never Takers” under one-sided non-compliance. The expression gives the expected outcome of Never Takers assigned to the treatment group (who remain untreated, \(d=0\)).
If we can back out the outcome for compliers in the control group, why would we need to use a placebo design?
The excludability assumption states \(Y_i(z,d)=Y_i(d)\). Put another way: \(Y_i(z=1,d)=Y_i(z=0,d)=Y_i(d)\). Or, potential outcomes respond to treatments, not treatment assignments.
Look at the potential outcomes table below and identify the “types” that violate the excludability assumption (i.e. subjects whose potential outcomes respond to treatment assignment \(z\)):
Answer: Type “B” violates excludability because \(Y_i(z=1,d=0) \neq Y_i(z=0,d=0)\).
\[CACE = \frac{ITT_Y}{ITT_D}\]
Consider the following worked example (counts of subjects by compliance status and outcome):
Compliance and Outcome | Control | Treatment |
---|---|---|
Contacted and Voted | 0 | 700 |
Contacted and Abstained | 0 | 100 |
Not contacted and Voted | 1100 | 600 |
Not contacted and Abstained | 900 | 600 |
\(ITT_Y\) = proportion assigned to treatment who voted minus proportion assigned to control who voted =
\[ITT_Y = \frac{700 + 600}{2000} - \frac{1100}{2000} = 0.1\]
\(ITT_D\) = proportion assigned to treatment who received treatment minus proportion assigned to control who would have received treatment (the second term is zero because non-compliance is one-sided) =
\[ITT_D = \frac{700 + 100}{2000} - \frac{0 + 0}{2000} = 0.4\]
\[CACE = \frac{ITT_Y}{ITT_D} = \frac{0.1}{0.4} = 0.25\]
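As a quick check, the arithmetic above can be reproduced in a couple of lines of base R (counts taken directly from the table):

```r
n <- 2000  # subjects assigned to each arm (from the table above)

# ITT_Y: difference in voting rates between assignment groups
itt_y <- (700 + 600) / n - 1100 / n   # 0.65 - 0.55

# ITT_D: difference in contact (treatment receipt) rates;
# no one in control is contacted under one-sided non-compliance
itt_d <- (700 + 100) / n - 0 / n      # 0.40

cace <- itt_y / itt_d
cace  # 0.25
```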
Angrist and Pischke, Mostly Harmless Econometrics, Chapter 4: Instrumental Variables and Causality (p.85):
Examples of IV estimation in political science (non-experimental):
Angrist (1990): Draft lottery as an IV for military service (income as outcome)
Acemoglu et al (2001): settler mortality as an IV for institutional quality (GDP/capita as outcome)
Levitt (1997): being an election year as IV for police force size (crime as outcome)
Kern & Hainmueller (2009): having West German TV reception in East Berlin as an instrument for West German TV watching (outcome is support for the East German regime)
Nunn & Wantchekon (2011): historical distance of ethnic group to the coast as an instrument for the slave raiding of that ethnic group (outcome is trust attitudes today)
Acharya, Blackwell, Sen (2015): cotton suitability as IV for proportion slave in 1860 (outcome is white attitudes today)
Because IV estimation and CACE estimation are technically identical, we can use the same tools to estimate the CACE as used in instrumental variables, also known as two-stage least squares (2SLS).
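The equivalence is easiest to see through the Wald estimator: with a binary instrument \(Z\), the 2SLS coefficient on \(D\) reduces to a ratio of differences in means,

\[\widehat{CACE} = \frac{\widehat{Cov}(Y_i, Z_i)}{\widehat{Cov}(D_i, Z_i)} = \frac{E[Y_i|Z=1] - E[Y_i|Z=0]}{E[D_i|Z=1] - E[D_i|Z=0]} = \frac{\widehat{ITT}_Y}{\widehat{ITT}_D}\]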
Let’s import data with a treatment assignment \(Z\), treatment \(D\), and outcome \(Y\), and estimate the CACE three different ways.
# Load packages
library(dplyr)       # for rename() and %>%
library(knitr)       # for kable()
library(kableExtra)  # for kable_styling()
library(estimatr)    # for lm_robust() and iv_robust()

# Import data
dat5 <- read.csv("http://hdl.handle.net/10079/kh189dd")

# Rename variables
dat5 <- dat5 %>%
rename(Z = watch, D = watchdps, Y = ochange)
# Output table using kable
kable(head(dat5)) %>% kable_styling(position = "center")
Z | D | Y |
---|---|---|
0 | 0 | 0 |
0 | 0 | 0 |
1 | 0 | 0 |
1 | 1 | 1 |
1 | 1 | 0 |
0 | 0 | 1 |
First, as the ratio of (difference-in-means on the outcome) to (difference-in-means on treatment receipt).
# Difference in means by assignment group
itt <- mean(dat5$Y[dat5$Z == 1]) - mean(dat5$Y[dat5$Z == 0])
ittd <- mean(dat5$D[dat5$Z == 1]) - mean(dat5$D[dat5$Z == 0])
cace <- itt/ittd
cace
## [1] 0.2787494
Second, let’s do this with regression, building up to 2SLS. We use one regression to estimate the \(ITT_Y\) and another to estimate the \(ITT_D\), then take their ratio.
Calculate the \(ITT_Y\) using OLS.
model_itt <- lm_robust(Y ~ Z, data = dat5)
kable(tidy(model_itt)) %>% kable_styling(position = "center")
term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
---|---|---|---|---|---|---|---|---|
(Intercept) | 0.4181818 | 0.0221928 | 18.843102 | 0.0000000 | 0.3746318 | 0.4617318 | 998 | Y |
Z | 0.0570657 | 0.0314219 | 1.816111 | 0.0696533 | -0.0045949 | 0.1187263 | 998 | Y |
Next, let’s calculate the \(ITT_D\) using OLS.
model_ittd <- lm_robust(D ~ Z, data = dat5)
kable(tidy(model_ittd)) %>% kable_styling(position = "center")
term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
---|---|---|---|---|---|---|---|---|
(Intercept) | 0.1616162 | 0.0165615 | 9.758530 | 0 | 0.1291168 | 0.1941156 | 998 | D |
Z | 0.2047205 | 0.0271084 | 7.551908 | 0 | 0.1515244 | 0.2579166 | 998 | D |
Now let’s calculate the \(CACE\) as the ratio of the two.
# Calculate the CACE as the ratio of the two ITT coefficients
itt <- model_itt$coefficients[2]
itt_d <- model_ittd$coefficients[2]
itt / itt_d
## Z
## 0.2787494
Third, we can use iv_robust() from the estimatr package (part of the DeclareDesign family), or ivreg() from the AER package, to do 2SLS in one step.
model_cace <- iv_robust(Y ~ D | Z, data = dat5)
kable(tidy(model_cace)) %>% kable_styling(position = "center")
term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
---|---|---|---|---|---|---|---|---|
(Intercept) | 0.3731314 | 0.0435674 | 8.564455 | 0.0000000 | 0.2876371 | 0.4586257 | 998 | Y |
D | 0.2787494 | 0.1530055 | 1.821826 | 0.0687807 | -0.0215001 | 0.5789988 | 998 | Y |
Adding covariates: in iv_robust(), you must explicitly specify all exogenous covariates on both sides of the bar. If you include them only on the instrument side, they are treated as excluded instruments for \(D\) rather than as controls, and the estimate changes.
# Add simulated covariates
dat5$X1 <- rnorm(nrow(dat5), mean = 0, sd = 2)
dat5$X2 <- ifelse(dat5$Y == 1,
rbinom(nrow(dat5), size = 1, prob = 0.7),
rbinom(nrow(dat5), size = 1, prob = 0.3))
# IV robust with covariates: correct and incorrect specifications
iv_robust_cov_right <- iv_robust(Y ~ D + X1 + X2 | Z + X1 + X2, data = dat5)
iv_robust_cov_wrong <- iv_robust(Y ~ D | Z + X1 + X2, data = dat5)
# Output results
kable(tidy(iv_robust_cov_right)) %>% kable_styling(position = "center")
term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
---|---|---|---|---|---|---|---|---|
(Intercept) | 0.1849672 | 0.0395286 | 4.679328 | 0.0000033 | 0.1073983 | 0.2625360 | 996 | Y |
D | 0.2974963 | 0.1402557 | 2.121100 | 0.0341599 | 0.0222658 | 0.5727269 | 996 | Y |
X1 | -0.0175552 | 0.0074307 | -2.362528 | 0.0183422 | -0.0321368 | -0.0029736 | 996 | Y |
X2 | 0.3885044 | 0.0293627 | 13.231204 | 0.0000000 | 0.3308845 | 0.4461243 | 996 | Y |
kable(tidy(iv_robust_cov_wrong)) %>% kable_styling(position = "center")
term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
---|---|---|---|---|---|---|---|---|
(Intercept) | 0.3400422 | 0.0431174 | 7.886423 | 0.0000000 | 0.2554310 | 0.4246534 | 998 | Y |
D | 0.4036144 | 0.1513270 | 2.667168 | 0.0077732 | 0.1066589 | 0.7005699 | 998 | Y |
Suppose that we run an experiment in which some units are assigned to control \((Z=0)\), others to a condition in which minimal effort is made to treat units \((Z=1)\), and a third condition in which maximal effort is made to treat units \((Z=2)\). In essence, we are manipulating compliance rates, and want to compare treatment effects for different types of compliers. Under one-sided non-compliance, we can define types and estimands.
Hint: this should be helpful on question 9 of your problem set.
Type | Z = 0 | Z = 1 | Z = 2 |
---|---|---|---|
Never Takers | 0 | 0 | 0 |
Minimal Compliers | 0 | 1 | 1 |
Maximal Compliers | 0 | 0 | 1 |
1.) First, let’s estimate the treatment effect for all compliers.
\[E[Y_i|Z=2] − E[Y_i|Z=0] = \]
\[ E[Y_i(0)|Z=2,NT] \times π_{NT} + E[Y_i(1)|Z=2,Min] \times π_{Min} + E[Y_i(1)|Z=2,Max] \times π_{Max} \]
\[ − (E[Y_i(0)|Z=0,NT] \times π_{NT} + E[Y_i(0)|Z=0,Min] \times π_{Min} + E[Y_i(0)|Z=0,Max] \times π_{Max})\]
Because of random assignment, each experimental group is a “random subset” of the population with similar proportions of each type (\(π\)’s). Due to the excludability assumption, \(E[Y_i(0)|Z=2,NT]=E[Y_i(0)|Z=0,NT]\). This is because potential outcomes respond to \(d\), not \(z\), and \(d_i(z=2)=d_i(z=0)=0\) for Never Takers. The difference therefore simplifies to:
\[E[Y_i(Z=2,D=1)−Y_i(Z=0,D=0)|Min] \times π_{Min} + E[Y_i(Z=2,D=1) − Y_i(Z=0,D=0)|Max] \times π_{Max} = \]
\[E[Y_i(1)−Y_i(0)|Complier] \times π_{Min+Max} \]
So:
\[E[Y_i(1)−Y_i(0)|Complier] \times π_{Min+Max} = CACE \times π_{C}\]
\[CACE = \frac{E[Y_i|Z=2]−E[Y_i|Z=0]}{π_{C}}\]
2.) Now let’s estimate the treatment effect for minimal compliers. We will compare the first treatment group to the control group:
\[E[Y_i|Z=1] − E[Y_i|Z=0] = \] \[E[Y_i(0)|Z=1,NT] \times π_{NT} + E[Y_i(1)|Z=1,Min] \times π_{Min} + E[Y_i(0)|Z=1,Max] \times π_{Max} - \]
\[(E[Y_i(0)|Z=0,NT] \times π_{NT} + E[Y_i(0)|Z=0,Min] \times π_{Min} + E[Y_i(0)|Z=0,Max] \times π_{Max}) = \] \[E[Y_i(1) − Y_i(0)|Min] \times π_{Min}\]
So:
\[CACE_{Min} = \frac{E[Y_i | Z = 1] - E[Y_i | Z = 0]}{\pi_{Min}}\]
3.) To estimate the treatment effect on maximal compliers, compare the second and first treatment groups:
\[E[Y_i|Z=2] − E[Y_i|Z=1] = \] \[E[Y_i(0)|Z=2,NT] \times π_{NT} + E[Y_i(1)|Z=2,Min] \times π_{Min} + E[Y_i(1)|Z=2,Max] \times π_{Max} - \]
\[(E[Y_i(0)|Z=1,NT] \times π_{NT} + E[Y_i(1)|Z=1,Min] \times π_{Min} + E[Y_i(0)|Z=1,Max] \times π_{Max}) = \]
\[E[Y_i(1) − Y_i(0)|Max] \times π_{Max}\]
So:
\[CACE_{Max} = \frac{E[Y_i | Z = 2] - E[Y_i | Z = 1]}{\pi_{Max}} \]
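The compliance shares in these denominators are identified from observed treatment rates: under one-sided non-compliance, \(π_{Min}\) is the treated share in the \(Z=1\) arm, and \(π_{Max}\) is the treated share in the \(Z=2\) arm minus \(π_{Min}\). A minimal base-R sketch (all numbers below are hypothetical, chosen only for illustration):

```r
# Hypothetical observed quantities from a three-arm design
ybar_z0 <- 0.50   # mean outcome, control
ybar_z1 <- 0.53   # mean outcome, minimal-effort arm
ybar_z2 <- 0.58   # mean outcome, maximal-effort arm
dbar_z1 <- 0.30   # treated share under Z = 1  ->  pi_Min
dbar_z2 <- 0.70   # treated share under Z = 2  ->  pi_Min + pi_Max

pi_min <- dbar_z1
pi_max <- dbar_z2 - dbar_z1

cace_min <- (ybar_z1 - ybar_z0) / pi_min   # 0.03 / 0.3  = 0.100
cace_max <- (ybar_z2 - ybar_z1) / pi_max   # 0.05 / 0.4  = 0.125
cace_all <- (ybar_z2 - ybar_z0) / dbar_z2  # 0.08 / 0.7  ~ 0.114
```

Note that the overall CACE is the compliance-share-weighted average of the two subgroup effects, so heterogeneity between minimal and maximal compliers is averaged away in the pooled estimate.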
Let’s practice conducting a power analysis using DeclareDesign.
First, we will define our research design.
simple_design <-
# M: model
# a 100-unit population with an unobserved shock 'u'
declare_population(N = 100, u = rnorm(N)) +
# two potential outcomes, Y_Z_0 and Y_Z_1
# Y_Z_0 is the control potential outcome (what would happen if the unit is untreated)
# it is equal to the unobserved shock 'u'
# Y_Z_1 is the treated potential outcome
# it is equal to the control potential outcome plus a treatment effect of 0.25
declare_potential_outcomes(Y_Z_0 = u,
Y_Z_1 = Y_Z_0 + rnorm(N, mean = 0.25, sd = 0.1)) +
# I: inquiry
# we are interested in the average treatment effect in the population (PATE)
declare_estimand(PATE = mean(Y_Z_1 - Y_Z_0)) +
# D: data strategy
# sampling: we randomly sample 50 of the 100 units in the population
declare_sampling(n = 50) +
# assignment: we randomly assign half of the 50 sampled units to treatment (half to control)
declare_assignment(prob = 0.5) +
# reveal outcomes: construct outcomes from the potential outcomes named Y depending on
# the realized value of their assignment variable named Z
declare_reveal(outcome_variables = Y, assignment_variables = Z) +
# A: answer strategy
# calculate the difference-in-means of Y depending on Z
# we link this estimator to PATE because this is our estimate of our inquiry
declare_estimator(Y ~ Z, model = difference_in_means, estimand = "PATE")
Now let’s tell DeclareDesign what properties of our design we are interested in. We probably want to know power here more than anything.
Power: The probability over many iterations of a study of getting a statistically significant result (i.e. how often you reject a null hypothesis that you should reject).
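Under the hood, this is just Monte Carlo simulation: generate data from the model many times, run the test, and count rejections. A minimal base-R sketch (no DeclareDesign; a fixed effect of 0.25 and unit-variance noise, a simplification of the design above):

```r
set.seed(42)

# Simulate the experiment many times and record the rejection rate
power_sim <- function(n = 50, effect = 0.25, sims = 1000) {
  rejected <- replicate(sims, {
    z <- sample(rep(0:1, each = n / 2))   # random assignment, half treated
    y <- rnorm(n) + effect * z            # outcome: noise plus treatment effect
    t.test(y[z == 1], y[z == 0])$p.value < 0.05
  })
  mean(rejected)                          # share of significant results = power
}

power_sim()
```

With these parameters the analytic power is roughly 0.14, consistent with the diagnosis of simple_design below.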
# Select diagnosands
simple_design_diagnosands <-
declare_diagnosands(select = c(bias, rmse, power))
Now we ask DeclareDesign to return the properties we are interested in.
# Diagnose the design
simple_design_diagnosis <-
diagnose_design(simple_design,
diagnosands = simple_design_diagnosands,
sims = 1000)
simple_design_diagnosis
##
## Research design diagnosis based on 1000 simulations. Diagnosand estimates with bootstrapped standard errors in parentheses (100 replicates).
##
## Design Label Estimand Label Estimator Label Term N Sims Bias RMSE
## simple_design PATE estimator Z 1000 0.01 0.28
## (0.01) (0.01)
## Power
## 0.14
## (0.01)
Looks like our experiment is underpowered. Let’s rewrite the code above more compactly and try again with a larger sample size.
# Or more compactly
# Let's always assign half of our units to treatment
N = 1000
m = N/2
simple_design <-
declare_population(N = N, u = rnorm(N)) +
declare_potential_outcomes(Y_Z_0 = u, Y_Z_1 = Y_Z_0 + 0.25) +
declare_estimand(PATE = mean(Y_Z_1 - Y_Z_0)) +
declare_sampling(n = m) +
declare_assignment(prob = 0.5) +
declare_reveal(outcome_variables = Y, assignment_variables = Z) +
declare_estimator(Y ~ Z, model = difference_in_means, estimand = "PATE")
simple_design_diagnosands <-
declare_diagnosands(select = c(bias, rmse, power))
simple_design_diagnosis <-
diagnose_design(simple_design,
diagnosands = simple_design_diagnosands,
sims = 1000)
simple_design_diagnosis
##
## Research design diagnosis based on 1000 simulations. Diagnosand estimates with bootstrapped standard errors in parentheses (100 replicates).
##
## Design Label Estimand Label Estimator Label Term N Sims Bias RMSE
## simple_design PATE estimator Z 1000 0.00 0.09
## (0.00) (0.00)
## Power
## 0.81
## (0.01)
Candle experiment
# Candle experiment
N = 50
m = N/2
# Histogram
hist(rnorm(N, mean = 180, sd = 10))
candle_design <-
declare_population(N = N, u = rnorm(N, mean = 180, sd = 10)) +
declare_potential_outcomes(Y_Z_0 = u,
Y_Z_1 = Y_Z_0 + rnorm(N, mean = 60, sd = 5)) +
declare_estimand(PATE = mean(Y_Z_1 - Y_Z_0)) +
declare_sampling(n = m) +
declare_assignment(prob = 0.5) +
declare_reveal(outcome_variables = Y, assignment_variables = Z) +
declare_estimator(Y ~ Z, model = difference_in_means, estimand = "PATE")
candle_diagnosands <-
declare_diagnosands(select = c(bias, rmse, power))
candle_diagnosis <-
diagnose_design(candle_design,
diagnosands = candle_diagnosands,
sims = 1000)
candle_diagnosis
##
## Research design diagnosis based on 1000 simulations. Diagnosand estimates with bootstrapped standard errors in parentheses (100 replicates).
##
## Design Label Estimand Label Estimator Label Term N Sims Bias RMSE
## candle_design PATE estimator Z 1000 0.00 4.30
## (0.13) (0.10)
## Power
## 1.00
## (0.00)