Causal Analysis in Theory and Practice

July 24, 2016

External Validity and Extrapolations

Filed under: Generalizability,Selection Bias — bryantc @ 7:52 pm

Author: Judea Pearl

The July issue of the Proceedings of the National Academy of Sciences contains several articles on Causal Analysis in the age of Big Data, among them our (Bareinboim and Pearl’s) paper on data fusion and external validity. Several nuances of this problem were covered earlier on this blog under titles such as transportability, generalizability, extrapolation and selection bias.

The PNAS paper has attracted the attention of the UCLA Newsroom, which issued a press release with a very accessible description of the problem and its solution. You can find it here:

A few remarks:
I consider the mathematical solution of the external validity problem to be one of the real gems of modern causal analysis. The problem has its roots in the writings of 18th-century demographers, and its more recent recognition is usually associated with the writings of Campbell (1957) and Cook and Campbell (1979) on quasi-experiments. Our formal treatment of the problem using do-calculus has reduced it to a puzzle in logic and graph theory. Bareinboim has further given this puzzle a complete algorithmic solution.
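
For readers who want to see the simplest instance of the puzzle in numbers, here is a minimal sketch. It assumes that a single measured variable Z (an age group, say) accounts for every difference between the study population and the target population. Under that assumption the experimental findings carry over by reweighting the Z-specific results with the target's distribution of Z: P*(y | do(x)) = Σ_z P(y | do(x), z) P*(z). All names and numbers below are made up for illustration; none come from the PNAS paper.

```python
# Illustrative numbers only (hypothetical, not taken from any study).
# Assumption: Z is the only factor whose distribution differs between
# the source (study) population and the target population.

# Z-specific experimental results from the source study: P(Y=1 | do(X=x), Z=z)
p_y_do_x_given_z = {
    ("treated", "young"): 0.30,
    ("treated", "old"): 0.60,
    ("control", "young"): 0.20,
    ("control", "old"): 0.30,
}

# Distribution of Z in each population.
p_z_source = {"young": 0.8, "old": 0.2}
p_z_target = {"young": 0.3, "old": 0.7}


def reweighted_effect(p_z):
    """Average the Z-specific results over a given distribution of Z
    and return P(Y=1 | do(treated)) - P(Y=1 | do(control))."""
    def p_y_do(x):
        return sum(p_y_do_x_given_z[(x, z)] * p_z[z] for z in p_z)
    return p_y_do("treated") - p_y_do("control")


source_effect = reweighted_effect(p_z_source)  # what the study itself reports
target_effect = reweighted_effect(p_z_target)  # the transported estimate

print(f"effect in source population: {source_effect:.2f}")
print(f"effect in target population: {target_effect:.2f}")
```

With these made-up numbers the effect is 0.14 in the study population and 0.24 in the target population, so copying the study's headline number to the target would be wrong even though every Z-specific result is valid there. The interesting problem instances are those in which no such simple reweighting is licensed and the graph has to be consulted.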

I said it is a gem because solving any problem instance gives me as much pleasure as solving a puzzle in ancient Greek geometry. It is in fact more fun than solving geometry problems, for two reasons.

First, when you stare at any external validity problem you do not have a clue whether it has a solution (i.e., whether an externally valid estimate exists or not). Yet after a few steps of analysis, Eureka, the answer shines at you with clarity and says: “How could you have missed me?” It is like communicating secretly with the oracle of Delphi, who whispers in your ear: “Trisecting an angle? Forget it. Trisecting a line segment? I will show you how.” A miracle!

Second, while geometrical construction problems reside in the province of recreational mathematics, external validity is a serious matter; it has practical ramifications in every branch of science.

My invitation to readers of this blog: Anyone with intellectual curiosity and a thrill for mathematical discovery, please join us in the excitement over the mathematical solution of the external validity problem. Try it, and please send us your impressions.

It is hard for me to predict when scientists who critically need solutions to real-life extrapolation problems will come to recognize that an elegant and complete solution now exists for them. Most of these scientists (e.g., Campbell’s disciples) do not read graphs and therefore cannot heed my invitation. Locked in a graph-deprived vocabulary, they are left to struggle with meta-analytic techniques or opaque re-calibration routines, waiting perhaps for a more appealing invitation to discover the availability of a solution to their problems.

It will be interesting to see how long it will take, in the age of the internet.

July 19, 2012

A note posted by Elias Bareinboim

In the past week, I have been engaged in a discussion with Andrew Gelman and his blog readers regarding causal inference, selection bias, confounding, and generalizability. I was trying to understand how his method, which he calls “hierarchical modeling,” would handle these issues and what guarantees it provides. Unfortunately, I could not reach an understanding of Gelman’s method (probably because no examples were provided).

Still, I think that this discussion, having touched on core issues of scientific methodology, would be of interest to readers of this blog. The link follows:

Previous discussions took place regarding the dispute between Rubin and Pearl. Here are some interesting links:

If anyone understands how “hierarchical modeling” can solve a simple toy problem (e.g., M-bias, control of confounding, mediation, generalizability), please share it with us.


January 8, 2012

The Match-Maker Paradox

Filed under: Discussion,Matching,Selection Bias — moderator @ 6:30 am

The following paradox was brought to our attention by Pablo Lardelli from Granada (Spain).

Pablo writes:

1. Imagine that you design a cohort study to assess the causal effect of X on Y, E[Y|do(X=x)]. Prior knowledge informs you that variable M is a possible confounder of the process X--->Y, which leads you to assume X<---M--->Y.

To adjust for the effect of this confounder, you decide to design a matched cohort study, matching non-exposed to exposed subjects on M. You know that matching breaks down the association between X and M in the sample.
The problem arises when you draw the DAG […] and realize that S is a collider on the path X--->S<---M and that, since we are conditioning on S (the study sample is restricted to S=1), we are in fact opening a non-causal path between X and Y (through M) in the sample. But this stands in contradiction to everything we are told by our textbooks. Click here for full discussion of matching in DAGs, persistent unfaithfulness and unit-to-unit interactions.
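
The textbook half of the paradox is easy to check numerically. The sketch below is only an illustration under assumed parameters (a binary M, a constant effect of X on Y equal to 1.0, and exposure probabilities chosen so that enough non-exposed subjects exist in each stratum of M). It simulates a population obeying X<---M--->Y with X--->Y, draws a 1:1 matched cohort, and compares the crude contrast in the population with the contrast in the matched sample. It does not settle the DAG argument; it merely shows what the matched sample looks like.

```python
import random
from statistics import mean

random.seed(0)       # fixed seed so the run is repeatable
TRUE_EFFECT = 1.0    # assumed causal effect of X on Y (hypothetical value)

# Simulate a population with M -> X, M -> Y and X -> Y.
population = []
for _ in range(20000):
    m = int(random.random() < 0.5)
    x = int(random.random() < (0.3 if m else 0.1))  # M raises the chance of exposure
    y = TRUE_EFFECT * x + 2.0 * m + random.gauss(0, 1)
    population.append((m, x, y))

exposed = [u for u in population if u[1] == 1]
unexposed = [u for u in population if u[1] == 0]

# Crude contrast in the whole population: confounded by M.
crude_diff = mean(u[2] for u in exposed) - mean(u[2] for u in unexposed)

# 1:1 exact matching on M: each exposed subject receives one randomly
# chosen non-exposed subject with the same value of M (without replacement).
pools = {
    0: [u for u in unexposed if u[0] == 0],
    1: [u for u in unexposed if u[0] == 1],
}
for pool in pools.values():
    random.shuffle(pool)
matched_controls = [pools[u[0]].pop() for u in exposed]

# In the matched sample (S=1) the distribution of M is identical in both arms.
m_exposed = mean(u[0] for u in exposed)
m_controls = mean(u[0] for u in matched_controls)
matched_diff = mean(u[2] for u in exposed) - mean(u[2] for u in matched_controls)

print(f"P(M=1 | exposed) = {m_exposed:.3f}, P(M=1 | matched controls) = {m_controls:.3f}")
print(f"crude difference in population: {crude_diff:.2f}")
print(f"difference in matched sample:   {matched_diff:.2f}")
```

In this simulation M is perfectly balanced between the two arms of the matched sample by construction, and the matched contrast lands near the assumed effect while the crude contrast does not. So the sample behaves as the textbooks say, which is exactly why the path opened by conditioning on S calls for an explanation.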
