1. We do not insist on the do() operator; you can invent your own. We insist only on supplementing the conditioning operator with one that simulates action and enables us to answer causal questions.

2. The Book of Why is full of toy examples. But I would be elated if Gelman and his students could just answer Simpson’s question: Is the drug helpful or not for a person of unknown gender? You can also find a bunch of toy examples here: https://ucla.in/2mhxKdO or here: https://ucla.in/2N7S0K9
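
For readers who want to see the Simpson’s question worked through, here is a minimal sketch. The counts are hypothetical, chosen only so that the reversal shows up, and the code assumes gender is a pre-treatment confounder (it affects both who takes the drug and who recovers). Under that assumption, the answer for a person of unknown gender comes from averaging the gender-specific recovery rates over the population's gender mix, not from the pooled table.

```python
from fractions import Fraction

# Hypothetical counts (recovered, total) per gender and treatment arm.
# These numbers are illustrative only; they are not from the discussion above.
counts = {
    ("men", "drug"): (81, 87),
    ("men", "no_drug"): (234, 270),
    ("women", "drug"): (192, 263),
    ("women", "no_drug"): (55, 80),
}

def naive_rate(arm):
    """P(recovered | arm): plain conditioning on the pooled table."""
    recovered = sum(r for (g, a), (r, n) in counts.items() if a == arm)
    total = sum(n for (g, a), (r, n) in counts.items() if a == arm)
    return Fraction(recovered, total)

def adjusted_rate(arm):
    """P(recovered | do(arm)) = sum_g P(recovered | arm, g) * P(g).

    Valid only under the stated assumption that gender is a confounder.
    """
    population = sum(n for (_, n) in counts.values())
    rate = Fraction(0)
    for gender in ("men", "women"):
        gender_total = sum(n for (g, _), (_, n) in counts.items() if g == gender)
        recovered, n = counts[(gender, arm)]
        rate += Fraction(recovered, n) * Fraction(gender_total, population)
    return rate

for arm in ("drug", "no_drug"):
    print(arm, "naive:", float(naive_rate(arm)), "adjusted:", float(adjusted_rate(arm)))
```

With these counts, conditioning alone makes the drug look harmful, while the adjusted quantity favors the drug. If gender were instead affected by the treatment, the adjustment would be the wrong thing to do, which is exactly why the causal model has to be stated.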

]]>Related: in addition to the causal hierarchy, I think Heckman’s three tasks in causal inference deserve more attention from the community. The tasks in his “scientific model of causality” are:

1.) Definitions of counterfactuals

2.) Identification from population distributions

3.) Identification from real data (estimation / hypothesis testing)

As far as I can tell, most of the structural causal model literature focuses on the first two tasks; most working statisticians focus on the third. The tasks are conflated in practice, exacerbating confusion. For example, as you have pointed out, propensity score matching is an efficient estimation technique (task 3) that is asymptotically correct (task 2), assuming that the appropriate conditional independence holds (task 1). However, many (most?) users of the potential outcomes notation do not recognize the need for causal models; the syntax is in use, but often without regard for the underlying semantics.
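
To make the separation of the three tasks concrete, here is a rough sketch for a binary treatment X, outcome Y, and covariates Z. I have swapped in inverse propensity weighting for matching because it has a closed form; the notation e(z) for the propensity score and \hat{\tau} for the estimator is my own, and the usual consistency assumption (Y = Y_X) is taken for granted.

```latex
% Task 1 (definition / assumption):
Y_x \perp\!\!\!\perp X \mid Z, \qquad 0 < e(Z) < 1, \qquad e(z) = P(X = 1 \mid Z = z)

% Task 2 (identification from the population distribution):
E[Y_1 - Y_0] = E_{Z}\big[\, E[Y \mid X = 1, Z] - E[Y \mid X = 0, Z] \,\big]
             = E\!\left[ \frac{X Y}{e(Z)} - \frac{(1 - X)\, Y}{1 - e(Z)} \right]

% Task 3 (estimation from a finite sample of size n, with a fitted \hat{e}):
\widehat{\tau} = \frac{1}{n} \sum_{i=1}^{n}
  \left( \frac{x_i y_i}{\hat{e}(z_i)} - \frac{(1 - x_i)\, y_i}{1 - \hat{e}(z_i)} \right)
```

Nothing in task 3 can check task 1: the estimator converges to the right-hand side of task 2 whether or not the independence assumption is true.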

Because potential outcomes notation is difficult to use correctly, there is now a ‘history’ of a mathematical/statistical tool that was supposed to permit rigorous causal analysis but often produced incorrect results (because the conditional independence assumptions were wrong). Given the difficulty of determining whether, e.g., Y_x _||_ X | Z holds for a particular analysis, statisticians often use conditioning as a substitute for intervention, which, for many problems, produces better results than, e.g., propensity score matching.

Compounding the issue, it is often not immediately obvious how to represent interventions of interest with the do() notation. Gelman gives the example of a health study in which X is a patient’s weight. do(x) is an unrealistic intervention; a realistic intervention to change a patient’s weight would involve a probability distribution over several possible interventions on other variables (e.g. diet, exercise).
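
A toy sketch of the weight example may help here. Everything in it is hypothetical: the linear structural equations, the coefficients, and the function names are made up for illustration. It treats a realistic intervention as a probability distribution over do(diet, exercise) settings and averages the interventional outcomes over that distribution.

```python
# Hypothetical linear structural model; every coefficient is invented.

def expected_weight(diet, exercise):
    """E[weight | do(diet, exercise)] in kg."""
    return 80.0 - 5.0 * diet - 5.0 * exercise

def expected_outcome(diet, exercise):
    """E[health score | do(diet, exercise)].

    Exercise affects the outcome through weight and also directly.
    """
    return 0.5 * expected_weight(diet, exercise) - 2.0 * exercise

def policy_outcome(policy):
    """Expected outcome under a stochastic policy.

    `policy` maps (diet, exercise) settings to probabilities that sum to 1.
    """
    return sum(p * expected_outcome(d, e) for (d, e), p in policy.items())

diet_only = {(1, 0): 1.0}
exercise_only = {(0, 1): 1.0}
mixed = {(1, 0): 0.5, (0, 1): 0.5}

for name, policy in [("diet", diet_only), ("exercise", exercise_only), ("mixed", mixed)]:
    print(name, policy_outcome(policy))
```

In this toy model the diet-only and exercise-only policies reach the same expected weight but different expected outcomes, so "the effect of weight" is not well defined until the policy is specified. The do() operator is still doing the work; it is just applied to the variables one can actually manipulate.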

Finally, it’s often unclear how to analyze cyclic causal models (although it’s my understanding that this is an active research area). Given the difficulty of performing rigorous causal inference with the do() operator, I understand why many researchers choose to use conditioning as a substitute.

I don’t know what the solution is, but I think more examples of how not using do() can fail catastrophically, along with examples/tools that make do() easier to use, could help.

]]>