Causal Analysis in Theory and Practice

January 9, 2019

Can causal inference be done in statistical vocabulary?

Filed under: Uncategorized — Judea Pearl @ 6:59 am

Andrew Gelman has just posted a review of The Book of Why; my answer to some of his comments follows below:


The hardest thing for people to snap out of is the bubble of their own language. You say: “I find it baffling that Pearl and his colleagues keep taking statistical problems and, to my mind, complicating them by wrapping them in a causal structure (see, for example, here).” 

No way! And again: No way! There is no way to answer causal questions without snapping out of statistical vocabulary. I have tried to demonstrate this to you over the past several years, but was not able to get you to solve ONE toy problem from beginning to end.

This will remain a perennial stumbling block until one of your readers tries honestly to solve ONE toy problem from beginning to end. No links to books or articles, no naming of fancy statistical techniques, no global economics problems, just a simple causal question whose answer we know in advance. (e.g. take Simpson’s paradox: Which data should be consulted? The aggregated or the disaggregated?) 

Even this group of 73 Editors found it impossible, and have issued the following guidelines for reporting observational studies:

To readers of your blog: Please try it. The late Dennis Lindley was the only statistician I met who had the courage to admit: “We need to enrich our language with a do-operator”. Try it, and you will see why he came to this conclusion, and perhaps you will also see why Andrew is unable to follow him.


In his response to my comment above, Andrew Gelman suggested that we agree to disagree, since science is full of disagreements and there is lots of room for progress using different methods. Unfortunately, the need to enrich statistics with new vocabulary is a mathematical fact, not an opinion. This need cannot be resolved by “there are many ways to skin a cat” without snapping out of traditional statistical language and enriching it with causal vocabulary. Neyman-Rubin’s potential outcomes vocabulary is an example of such enrichment, since it goes beyond joint distributions of observed variables.

Andrew further refers us to three chapters in his book (with Jennifer Hill) on causal inference. I am craving instead one toy problem, solved from assumptions to conclusions, so that we can follow precisely the role played by the extra-statistical vocabulary, and why it is absolutely needed. The Book of Why presents dozens of such examples, but readers would do well to choose their own.
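
As a minimal sketch of what such a toy problem looks like, the Simpson's paradox question can be worked through in a few lines of Python. The counts below are illustrative and are not taken from this post. The causal assumption that gender is a pre-treatment confounder (it influences both who takes the drug and who recovers, and is not affected by the drug) is also an assumption of the sketch; it cannot be read off the table itself.

```python
# Illustrative (hypothetical) counts: (recovered, total) per gender and arm.
counts = {
    ("male", "drug"): (81, 87),
    ("male", "no_drug"): (234, 270),
    ("female", "drug"): (192, 263),
    ("female", "no_drug"): (55, 80),
}


def aggregated_rate(arm):
    """P(recovered | arm), computed from the pooled table (plain conditioning)."""
    recovered = sum(r for (g, a), (r, n) in counts.items() if a == arm)
    total = sum(n for (g, a), (r, n) in counts.items() if a == arm)
    return recovered / total


def adjusted_rate(arm):
    """Adjustment for gender: sum over g of P(recovered | arm, g) * P(g).

    This equals P(recovered | do(arm)) only under the ASSUMED causal story
    in which gender is the sole confounder and is not affected by the drug.
    """
    population = sum(n for (_, n) in counts.values())
    result = 0.0
    for gender in ("male", "female"):
        n_gender = sum(n for (g, a), (r, n) in counts.items() if g == gender)
        recovered, n = counts[(gender, arm)]
        result += (recovered / n) * (n_gender / population)
    return result


for arm in ("drug", "no_drug"):
    print(arm, round(aggregated_rate(arm), 3), round(adjusted_rate(arm), 3))
```

With these counts the pooled table favors no drug, while the gender-adjusted figures favor the drug. Which of the two answers the causal question is decided by the assumed causal story, not by the numbers: if the stratifying variable were instead affected by the drug, the pooled table would be the one to consult.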


  1. Why must it be the do() operator? It is not needed to get d-separation or its probabilistic implications, or many of the insights of causal inference; it is only an elegant way to explain Simpson’s paradox. Replacing the conditioning operator seems wrong.

    Comment by artifex — January 9, 2019 @ 7:27 am

  2. Where can we find a statement of the toy problem?

    Comment by Roy — January 9, 2019 @ 9:30 pm

  3. I think part of the disagreement stems from how you make a very strong claim (i.e. that the need for causal notation is a mathematical fact, not just another tool), the consequences of which are difficult to fully appreciate. Causality may be simple, but it is not *easy*.

    Related: in addition to the causal hierarchy, I think Heckman’s three tasks in causal inference deserve more attention from the community. The tasks in “the scientific model of causality” are:

    1.) Definitions of counterfactuals
    2.) Identification from population distributions
    3.) Identification from real data (estimation / hypothesis testing)

    As far as I can tell, most of the structural causal model literature focuses on the first two tasks; most working statisticians focus on the third. The tasks are conflated in practice, exacerbating confusion. For example – as you have pointed out – propensity score matching is an efficient estimation technique (task 3), that is asymptotically correct (task 2), assuming that the appropriate conditional independence holds (task 1). However, many (most?) users of the potential outcomes notation do not recognize the need for causal models; the syntax is in use, but often without regard for the underlying semantics.

    Because of the difficulty in using potential outcomes notation correctly, there is now a ‘history’ of a mathematical/statistical tool, that was supposed to permit rigorous causal analysis, that often produced incorrect results (due to the conditional independence assumptions being incorrect). Given the difficulty of determining whether or not, e.g. Y_x _||_ X | Z was correct for a certain analysis, statisticians often use conditioning as a substitute for intervention, which, for many problems, produces better results than, e.g. propensity score matching.

    Compounding the issue is that it is often not immediately obvious how to represent interventions of interest with the do() notation. Gelman gives the example of a health study in which X is a patient’s weight. do(x) is an unrealistic intervention; a realistic intervention to change a patient’s weight would involve a probability distribution over several possible interventions of other variables (e.g. diet, exercise).

    Finally, it’s often unclear how to analyze cyclic causal models (although it’s my understanding that this is an active research area). Given the difficulty of performing rigorous causal inference with the do() operator, I understand why many researchers choose to use conditioning as a substitute.

    I don’t know what the solution is, but I think more examples of how not using do() can fail catastrophically, and examples/tools to make using do() easier could help.

    Comment by Joshua Brulé — January 10, 2019 @ 7:18 am

  4. Dear Roy,
    1. We do not insist on the do() operator. You can invent your own. We insist only on supplementing the conditioning operator with one that will simulate action and will enable us to answer causal questions.

    2. The Book of Why is full of toy examples. But I would be elated if Gelman and his students could just answer the Simpson’s paradox question: Is the drug helpful or not for a person of unknown gender? You can also find a bunch of toy examples in here: or here:

    Comment by judea pearl — January 10, 2019 @ 7:53 am
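
The distinction raised in the comments, between conditioning on X and an operator that simulates acting on X, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the structural model (Z influences both X and Y, and X influences Y), its probabilities, and the names `simulate` and `rate_y` are all hypothetical and chosen for illustration.

```python
import random


def simulate(n, do_x=None, seed=0):
    """Draw n samples from a toy model with edges Z -> X, Z -> Y, X -> Y.

    If do_x is given, X's own mechanism is replaced by the constant do_x,
    which is how an intervention is simulated here. All probabilities are
    hypothetical.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z = rng.random() < 0.5
        if do_x is None:
            # Observational regime: Z strongly influences who gets X.
            x = rng.random() < (0.9 if z else 0.1)
        else:
            # Interventional regime: X is set by fiat, ignoring Z.
            x = bool(do_x)
        # X raises the chance of Y by 0.1; Z lowers it by 0.4.
        p_y = 0.6 + (0.1 if x else 0.0) - (0.4 if z else 0.0)
        y = rng.random() < p_y
        samples.append((z, x, y))
    return samples


def rate_y(samples, x_value=None):
    """Fraction of samples with Y true, optionally restricted to X == x_value."""
    rows = [s for s in samples if x_value is None or s[1] == x_value]
    return sum(s[2] for s in rows) / len(rows)


observed = simulate(100_000, seed=1)
seen_x1 = rate_y(observed, x_value=True)    # estimates P(Y | X = 1)
seen_x0 = rate_y(observed, x_value=False)   # estimates P(Y | X = 0)
did_x1 = rate_y(simulate(100_000, do_x=True, seed=2))    # P(Y | do(X = 1))
did_x0 = rate_y(simulate(100_000, do_x=False, seed=3))   # P(Y | do(X = 0))

print("conditioning: ", round(seen_x1, 3), "vs", round(seen_x0, 3))
print("intervening:  ", round(did_x1, 3), "vs", round(did_x0, 3))
```

In this toy model, conditioning makes X look harmful (the exact values are 0.34 versus 0.56), while setting X by intervention shows it to be helpful (0.50 versus 0.40). The joint distribution of the observed data is the same in either reading; only the added model of how X is generated separates the two quantities.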
