# Causal Analysis in Theory and Practice

## June 1, 2019

### Graphical Models and Instrumental Variables

Filed under: Uncategorized — Judea Pearl @ 8:09 am

At the request of readers, we re-post below a previous comment from Bryant and Elias (2014) concerning the use of graphical models for determining whether a variable is a valid IV.

Following your exchange with Judea, we would like to present concrete examples of how graphical tools can help determine whether a variable qualifies as an instrument. We use the example of a job training program, which Imbens used in his paper on instrumental variables.

In this example, the goal is to estimate the effect of a training program (X) on earnings (Y). Imbens suggested proximity (Z) as a possible instrument to assess the effect of X on Y. He then noted that assuming Z is independent of the potential outcomes {Y_x} is a strong assumption, and that it can be made more plausible by conditioning on covariates.

To illustrate how graphical models can be used in determining the plausibility of the exclusion restriction, conditional on different covariates, let us consider the following scenarios.

Scenario 1. Suppose that the training program is located in the workplace. In this case, proximity (Z) may affect the number of hours employees spend at the office (W), since they spend less time commuting, and this, in turn, may affect their earnings (Y).

Scenario 2. Suppose further that the efficiency of the workers (unmeasured) affects both the number of hours (W) and their salary (Y). (This is represented in the graph through the inclusion of a bidirected arrow between W and Y.)

Scenario 3. Suppose even further that this is a high-tech industry and workers can easily work from home. In this case, the number of hours spent at the office (W) has no effect on earnings (Y). (This is represented in the graph through the removal of the directed arrow from W to Y.)

Scenario 4. Finally, suppose that worker efficiency also affects whether workers attend the program, because less efficient workers are more likely to benefit from training. (This is represented in the graph through the inclusion of a bidirected arrow between W and X.)

The following figures correspond to the scenarios discussed above. We like to work with graphs on such problems for two reasons: first, we can represent these scenarios clearly and unambiguously and, second, we can derive the answer in each scenario by inspecting the causal graph. Here are our answers. (We assume a linear model. For nonparametric models, use LATE.)

Scenario 1.
Is the effect of X on Y identifiable? Yes
How? Using Z as an instrument, conditioning on W; the effect is equal to r_{zy.w} / r_{zx.w}.
Testable implications? (W independent of X given Z)
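
As a rough numerical check of Scenario 1, the sketch below simulates one linear model compatible with the description and compares the plain and W-conditioned IV ratios. The coefficients, the unmeasured X-Y confounder `u`, and the helper `partial_slope` are illustrative assumptions, not part of the original example. It also assumes that r_{zy.w} denotes the coefficient on Z when Y is regressed on Z and W.

```python
import numpy as np

def partial_slope(target, regressor, control=None):
    """Least-squares coefficient on `regressor`, optionally adjusting for `control`."""
    columns = [np.ones_like(regressor), regressor]
    if control is not None:
        columns.append(control)
    coef, *_ = np.linalg.lstsq(np.column_stack(columns), target, rcond=None)
    return coef[1]

# Hypothetical linear model for Scenario 1 (all coefficients are assumptions).
rng = np.random.default_rng(0)
n = 200_000
true_effect = 2.0
u = rng.normal(size=n)                # assumed unmeasured confounder of X and Y
z = rng.normal(size=n)                # proximity
w = 0.8 * z + rng.normal(size=n)      # hours at the office: Z -> W
x = 0.7 * z + u + rng.normal(size=n)  # training: Z -> X
y = true_effect * x + 1.5 * w - u + rng.normal(size=n)  # X -> Y and W -> Y

plain_iv = partial_slope(y, z) / partial_slope(x, z)       # ignores Z -> W -> Y
cond_iv = partial_slope(y, z, w) / partial_slope(x, z, w)  # r_{zy.w} / r_{zx.w}
w_coef = partial_slope(x, w, z)  # testable implication: should be near zero

print(f"plain IV: {plain_iv:.2f}  conditional IV: {cond_iv:.2f}  W coef: {w_coef:.3f}")
```

Under these assumed coefficients the unconditioned ratio absorbs the Z -> W -> Y path and overshoots the simulated effect, while the conditioned ratio should land close to it.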

Scenario 2.
Is the effect of X on Y identifiable? No
How? n/a.
Testable implications? (W independent of X given Z)

Scenario 3.
Is the effect of X on Y identifiable? Yes
How? Using Z as an instrument; the effect is equal to r_{zy} / r_{zx}.
Remark. Conditioning on W disqualifies Z as an instrument.
Testable implications? (W independent of X given Z)
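
The remark for Scenario 3 can be illustrated with a small simulation: once W is a collider between Z and the unmeasured efficiency, adjusting for W should bias the IV ratio. This is a sketch under assumed coefficients; the variable `eff` (efficiency), the X-Y confounder `u`, and the helper `partial_slope` are hypothetical names introduced only for illustration.

```python
import numpy as np

def partial_slope(target, regressor, control=None):
    """Least-squares coefficient on `regressor`, optionally adjusting for `control`."""
    columns = [np.ones_like(regressor), regressor]
    if control is not None:
        columns.append(control)
    coef, *_ = np.linalg.lstsq(np.column_stack(columns), target, rcond=None)
    return coef[1]

# Hypothetical linear model for Scenario 3 (all coefficients are assumptions).
rng = np.random.default_rng(1)
n = 200_000
true_effect = 2.0
eff = rng.normal(size=n)                    # unmeasured efficiency: W <-> Y
u = rng.normal(size=n)                      # assumed unmeasured confounder: X <-> Y
z = rng.normal(size=n)                      # proximity
w = 0.8 * z + eff + rng.normal(size=n)      # hours at the office; no W -> Y arrow
x = 0.7 * z + u + rng.normal(size=n)        # training
y = true_effect * x + 1.5 * eff + u + rng.normal(size=n)  # earnings

plain_iv = partial_slope(y, z) / partial_slope(x, z)       # r_{zy} / r_{zx}
cond_iv = partial_slope(y, z, w) / partial_slope(x, z, w)  # adjusts for collider W

print(f"plain IV: {plain_iv:.2f}  conditional IV: {cond_iv:.2f}")
```

In this setup the plain ratio should sit near the simulated effect, whereas conditioning on W opens the path from Z through W to efficiency and on to Y, pulling the estimate away from it.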

Scenario 4.
Is the effect of X on Y identifiable? Yes
How? Using Z as an instrument; the effect is equal to r_{zy} / r_{zx}.
Remark. Conditioning on W disqualifies Z as an instrument.
Testable implications?

In summary, the examples demonstrate Imbens's point that judging whether a variable (Z) qualifies as an instrument hinges on substantive assumptions underlying the problem being studied. Naturally, these assumptions follow from the causal story about the phenomenon under study. We believe graphs are an attractive language for solving this type of problem for two reasons. First, they are a transparent representation in which researchers can express the causal story and discuss its plausibility. Second, as a formal representation of those assumptions, they allow us to apply mechanical procedures to evaluate the queries of interest: for example, whether a specific set Z qualifies as an instrument, whether there exists a set Z that qualifies as an instrument, and what the testable implications of the causal story are.
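
One way such a mechanical procedure might look is sketched below: a small d-separation test (via the moralized ancestral graph) and a graphical instrument check applied to the four scenarios. The encoding is an assumption on our part: bidirected arrows are written as latent parents, with `"E"` standing for worker efficiency and `"U"` for an assumed unmeasured X-Y confounder, and the function names are hypothetical. The check only covers the adjustment sets {} and {W}.

```python
from itertools import combinations

def d_separated(edges, a, b, given=()):
    """True if a and b are d-separated by `given` in the DAG defined by `edges`."""
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
    # Keep only a, b, the conditioning set, and their ancestors.
    keep, stack = set(), [a, b, *given]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))
    # Moralize: link each node to its parents and "marry" co-parents.
    adj = {node: set() for node in keep}
    for child in keep:
        pas = parents.get(child, set())
        for p in pas:
            adj[p].add(child)
            adj[child].add(p)
        for p, q in combinations(pas, 2):
            adj[p].add(q)
            adj[q].add(p)
    # d-separated iff no path joins a and b once `given` is removed.
    seen, stack = set(given), [a]
    while stack:
        node = stack.pop()
        if node == b:
            return False
        if node not in seen:
            seen.add(node)
            stack.extend(adj[node])
    return True

def is_instrument(edges, z, x, y, given=()):
    """Z is connected to X and, once X -> Y is cut, separated from Y (given the set)."""
    cut = [edge for edge in edges if edge != (x, y)]
    return (not d_separated(edges, z, x, given)) and d_separated(cut, z, y, given)

# Assumed encoding of the four scenarios; "U" and "E" are latent variables.
base = [("Z", "X"), ("X", "Y"), ("U", "X"), ("U", "Y"), ("Z", "W")]
scenarios = {
    1: base + [("W", "Y")],
    2: base + [("W", "Y"), ("E", "W"), ("E", "Y")],
    3: base + [("E", "W"), ("E", "Y")],
    4: base + [("E", "W"), ("E", "Y"), ("E", "X")],
}
for number, graph in scenarios.items():
    print(number,
          "IV:", is_instrument(graph, "Z", "X", "Y"),
          "IV given W:", is_instrument(graph, "Z", "X", "Y", ["W"]),
          "W indep. X given Z:", d_separated(graph, "W", "X", ["Z"]))
```

Cutting the X -> Y arrow before testing separation is what expresses the exclusion requirement: any remaining connection between Z and Y must then run through something other than X.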

We hope the examples illustrate these points.
Bryant and Elias