Causal Analysis in Theory and Practice

August 2, 2017

2017 Mid-Summer Update

Filed under: Counterfactual,Discussion,Epidemiology — Andrew Forney @ 12:55 am

Dear friends in causality research,

Welcome to the 2017 Mid-summer greeting from the UCLA Causality Blog.

This greeting discusses the following topics:

1. “The Eight Pillars of Causal Wisdom” and the WCE 2017 Virtual Conference Website.
2. A discussion panel: “Advances in Deep Neural Networks”,
3. Comments on “The Tale Wagged by the DAG”,
4. A new book: “The Book of Why”,
5. A new paper: Disjunctive Counterfactuals,
6. Causality in Education Award,
7. News on “Causal Inference: A Primer”

1. “The Eight Pillars of Causal Wisdom”

The tenth annual West Coast Experiments Conference was held at UCLA on April 24-25, 2017, preceded by a training workshop on April 23.

You will be pleased to know that the WCE 2017 Virtual Conference Website is now available here:
It provides videos of the talks as well as some of the papers and presentations.

The conference brought together scholars and graduate students in economics, political science and other social sciences who share an interest in causal analysis. Speakers included:

1. Angus Deaton, on Understanding and misunderstanding randomized controlled trials.
2. Chris Auld, on the ongoing confusion between regression and structural equations in the econometric literature.
3. Clark Glymour, on Explanatory Research vs Confirmatory Research.
4. Elias Bareinboim, on the solution to the External Validity problem.
5. Adam Glynn, on Front-door approaches to causal inference.
6. Karthika Mohan, on Missing Data from a causal modeling perspective.
7. Judea Pearl, on “The Eight Pillars of Causal Wisdom.”
8. Adnan Darwiche, on Model-based vs. Model-Blind Approaches to Artificial Intelligence.
9. Niall Cardin, on Causal inference for machine learning.
10. Karim Chalak, on Measurement Error without Exclusion.
11. Ed Leamer, on “Causality Complexities Example: Supply and Demand.”
12. Rosa Matzkin, on “Identification in simultaneous equation.”
13. Rodrigo Pinto, on Randomized Biased-controlled Trials.

The video of my lecture “The Eight Pillars of Causal Wisdom” can be watched here:
A transcript of the talk can be found here:

2. “Advances in Deep Neural Networks”

As part of its celebration of 50 years of the Turing Award, the ACM has organized several discussion sessions on selected topics in computer science. I participated in a panel discussion on “Advances in Deep Neural Networks”, which gave me an opportunity to share thoughts on whether learning methods based solely on data fitting can ever achieve human-level intelligence. The discussion video can be viewed here:
A position paper that defends these thoughts is available here:

3. The Tale Wagged by the DAG

An article by this title, authored by Nancy Krieger and George Davey Smith, has appeared in the International Journal of Epidemiology, IJE 2016 45(6) 1787-1808.
It is part of a special IJE issue on causal analysis which, for the reasons outlined below, should be of interest to readers of this blog.

As the title tell-tales us, the authors are unhappy with the direction that modern epidemiology has taken, which is too wedded to a two-language framework:
(1) Graphical models (DAGs) — to express what we know, and
(2) Counterfactuals (or potential outcomes) — to express what we wish to know.

The specific reasons for the authors’ unhappiness are still puzzling to me, because the article does not demonstrate concrete alternatives to current methodologies. I can only speculate, however, that it is the dazzling speed with which epidemiology has modernized its tools that lies behind the authors’ discomfort. If so, it would be safe for us to assume that the discomfort will subside as soon as researchers gain greater familiarity with the capabilities and flexibility of these new tools. I nevertheless recommend that the article, and the entire special issue of IJE, be studied by our readers, because they reflect an interesting soul-searching attempt by a forward-looking discipline to assess its progress in the wake of a profound paradigm shift.

Epidemiology, as I have written on several occasions, has been a pioneer in accepting the DAG-counterfactuals symbiosis as a ruling paradigm, way ahead of mainstream statistics and its other satellites. (The social sciences, for example, are almost there, with the exception of the model-blind branch of econometrics. See the Feb. 22, 2017 posting.)

In examining the specific limitations that Krieger and Davey Smith perceive in DAGs, readers will be amused to note that these limitations coincide precisely with the strengths for which DAGs are praised.

For example, the article complains that DAGs provide no information about variables that investigators chose not to include in the model. In their words: “the DAG does not provide a comprehensive picture. For example, it does not include paternal factors, ethnicity, respiratory infections or socioeconomic position…” (taken from the Editorial introduction). I have never considered this to be a limitation of DAGs or of any other scientific modelling. Quite the contrary. It would be a disaster if models were permitted to provide information unintended by the modeller. Instead, I have learned to admire the ease with which DAGs enable researchers to incorporate knowledge about new variables, or new mechanisms, which the modeller wishes to embrace.

Model misspecification, after all, is a problem that plagues every exercise in causal inference, no matter what framework one chooses to adopt. It can only be cured by careful model-building strategies, and by enhancing the modeller’s knowledge. Yet, when it comes to minimizing misspecification errors, DAGs have no match. The transparency with which DAGs display the causal assumptions in the model, and the ease with which the DAG identifies the testable implications of those assumptions, are incomparable; together they facilitate speedy model diagnosis and repair.

Or, to take another example, the authors call repeatedly for an ostensibly unavailable methodology which they label “causal triangulation” (it appears 19 times in the article). In their words: “In our field, involving dynamic populations of people in dynamic societies and ecosystems, methodical triangulation of diverse types of evidence from diverse types of study settings and involving diverse populations is essential.” Ironically, however, the task of treating “diverse type of evidence from diverse populations” has been accomplished quite successfully in the DAG-counterfactual framework. See, for example, the formal and complete results of Bareinboim and Pearl (2016), which have emerged from a DAG-based perspective and invoke the do-calculus. It is inconceivable for me to imagine anyone pooling data from two different designs (say experimental and observational) without resorting to DAGs or (equivalently) potential outcomes, though I am open to learn.

Another conceptual paradigm which the authors hope would liberate us from the tyranny of DAGs and counterfactuals is Lipton’s (2004) romantic aspiration for “Inference to the Best Explanation.” It is a compelling, century-old mantra, going back at least to Charles Peirce’s theory of abduction (Pragmatism and Pragmaticism, 1870) which, unfortunately, has never operationalized its key terms: “explanation,” “Best” and “inference to”. Again, I know of only one framework in which this aspiration has been explicated with sufficient precision to produce tangible results: the structural framework of DAGs and counterfactuals. See, for example, “Causes of Effects and Effects of Causes” and Halpern and Pearl (2005) “Causes and explanations: A structural-model approach”.

In summary, what Krieger and Davey Smith aspire to achieve by abandoning the structural framework has already been accomplished with the help and grace of that very framework.
More generally, what we learn from these examples is that the DAG-counterfactual symbiosis is far from being a narrow “ONE approach to causal inference” which “may potentially lead to spurious causal inference” (their words). It is in fact a broad and flexible framework within which a plurality of tasks and aspirations can be formulated, analyzed and implemented. The quest for metaphysical alternatives is not warranted.

I was pleased to note that, by and large, commentators on the Krieger and Davey Smith paper seemed to be aware of the powers and generality of the DAG-counterfactual framework, albeit not exactly for the reasons that I have described here. [footnote: I have many disagreements with the other commentators as well, but I wish to focus here on the TALE WAGGED DAG where the problems appear more glaring.] My talk on “The Eight Pillars of Causal Wisdom” provides a concise summary of those reasons and explains why I take the poetic liberty of calling these pillars “The Causal Revolution”.

All in all, I believe that epidemiologists should be commended for the incredible progress they have made in the past two decades. They will no doubt continue to develop and benefit from the new tools that the DAG-counterfactual symbiosis has spawned. At the same time, I hope that the discomfort that Krieger and Davey Smith have expressed will be temporary and that it will inspire a greater understanding of the modern tools of causal inference.

Comments on this special issue of IJE are invited on this blog.

4. The Book of WHY

As some of you know, I am co-authoring another book, titled: “The Book of Why: The new science of cause and effect”. It will attempt to present the eight pillars of causal wisdom to the general public using words, intuition and examples to replace equations. My co-author is science writer Dana MacKenzie, and our publishing house is Basic Books. If all goes well, the book will see your shelf by March 2018. Selected sections will appear periodically on this blog.

5. Disjunctive Counterfactuals

The structural interpretation of counterfactuals as formulated in Balke and Pearl (1994) excludes disjunctive conditionals, such as “had X been x1 or x2”, as well as disjunctive actions such as do(X=x1 or X=x2). In contrast, the closest-world interpretation of Lewis (1973) assigns truth values to all counterfactual sentences, regardless of the logical form of the antecedent. The next issue of the Journal of Causal Inference will include a paper that extends the vocabulary of structural counterfactuals with disjunctions, and clarifies the assumptions needed for the extension. An advance copy can be viewed here:

6. ASA Causality in Statistics Education Award

Congratulations go to Ilya Shpitser, Professor of Computer Science at Johns Hopkins University, who is the 2017 recipient of the ASA Causality in Statistics Education Award. Funded by Microsoft Research and Google, the $5,000 Award will be presented to Shpitser at the 2017 Joint Statistical Meetings (JSM 2017) in Baltimore.

Professor Shpitser has developed Master’s-level graduate course material that takes causal inference from the ivory towers of research to the level of students with a machine learning and data science background. It combines techniques of graphical and counterfactual models and provides both accessible coverage of the field and excellent conceptual, computational and project-oriented exercises for students.

These winning materials and those of the previous Causality in Statistics Education Award winners are available to download online at

Information concerning nominations, criteria and previous winners can be viewed here:
and here:

7. News on “Causal Inference: A Primer”

Wiley, the publisher of our latest book “Causal Inference in Statistics: A Primer” (2016, Pearl, Glymour and Jewell), is informing us that the book is now in its 4th printing, corrected for all the errors we (and others) caught since the first publication. To buy a corrected copy, make sure you get the “4th printing”. The trick is to look at the copyright page and make sure the last line reads: 10 9 8 7 6 5 4

If you already have a copy, look up our errata page, where all corrections are marked in red. The publisher also tells us that the Kindle version is much improved. I hope you concur.

Happy Summer-end, and may all your causes
produce healthy effects.

October 27, 2014

Are economists smarter than epidemiologists? (Comments on Imbens’s recent paper)

Filed under: Discussion,Economics,Epidemiology,General — eb @ 4:45 pm

In a recent survey on Instrumental Variables (link), Guido Imbens fleshes out the reasons why some economists “have not felt that graphical models have much to offer them.”

His main point is: “In observational studies in social science, both these assumptions [exogeneity and exclusion] tend to be controversial. In this relatively simple setting [3-variable IV setting] I do not see the causal graphs as adding much to either the understanding of the problem, or to the analyses.” [page 377]

What Imbens leaves unclear is whether graph-avoiding economists limit themselves to “relatively simple settings” because, lacking graphs, they cannot handle more than 3 variables, or whether they refrain from using graphs to prevent those “controversial assumptions” from becoming transparent, hence amenable to scientific discussion and resolution.

When students and readers ask me how I respond to people of Imbens’s persuasion who see no use in tools they vow to avoid, I direct them to the post “The deconstruction of paradoxes in epidemiology”, in which Miquel Porta describes the “revolution” that causal graphs have spawned in epidemiology. Porta observes: “I think the ‘revolution’ — or should we just call it a ‘renewal’? — is deeply changing how epidemiological and clinical research is conceived, how causal inferences are made, and how we assess the validity and relevance of epidemiological findings.”

So, what is it about epidemiologists that drives them to seek the light of new tools, while economists (at least those in Imbens’s camp) seek comfort in partial blindness, while missing out on the causal revolution? Can economists do in their heads what epidemiologists observe in their graphs? Can they, for instance, identify the testable implications of their own assumptions? Can they decide whether the IV assumptions (i.e., exogeneity and exclusion) are satisfied in their own models of reality? Of course they can’t; such decisions are intractable to the graph-less mind. (I have challenged them repeatedly to these tasks, to the sound of pin-drop silence.)

Or, are problems in economics different from those in epidemiology? I have examined the structure of typical problems in the two fields, the number of variables involved, the types of data available, and the nature of the research questions. The problems are strikingly similar.

I have only one explanation for the difference: Culture.

The arrow-phobic culture started twenty years ago, when Imbens and Rubin (1995) decided that graphs “can easily lull the researcher into a false sense of confidence in the resulting causal conclusions,” and Paul Rosenbaum (1995) echoed with “No basis is given for believing” […] “that a certain mathematical operation, namely this wiping out of equations and fixing of variables, predicts a certain physical reality” [ See discussions here. ]

Lingering symptoms of this phobia are still stifling research in the 2nd decade of our century, yet are tolerated as scientific options. As Andrew Gelman put it last month: “I do think it is possible for a forward-looking statistician to do causal inference in the 21st century without understanding graphical models.” (link)

I believe the most insightful diagnosis of the phenomenon is given by Larry Wasserman:
“It is my impression that the “graph people” have studied the Rubin approach carefully while the reverse is not true.” (link)

November 25, 2012

Conrad (Ontario/Canada) on SEM in Epidemiology

Filed under: Counterfactual,Epidemiology,structural equations — moderator @ 4:00 am

Conrad writes:

In the recent issue of IJE, Tyler VanderWeele argues that SEM should be used in Epidemiology only when 1) the interest is in a wide range of effects 2) the purpose of the analysis is to generate hypotheses. However, if the interest is in a single fixed exposure, he thinks traditional regression methods are superior.

According to him, the latter rely on fewer assumptions (e.g. we don’t need to know the functional form of the association between a confounder and exposure (or outcome) during estimation) and hence are less prone to bias. How valid is this argument, given that some (if not all) of the causal modeling methods are simply special cases of SEM (e.g. Robins’s G-methods and even the regression methods he’s talking about)?

Judea replies:

Dear Conrad,

Thank you for raising these questions about Tyler’s article. I believe several of Tyler’s statements stand the risk of being misinterpreted by epidemiologists, for they may create the impression that the use of SEM, including its nonparametric variety, is somehow riskier than the use of other techniques. This is not the case. I believe Tyler’s criticisms were aimed specifically at parametric SEM, such as those used in Arlinghaus et al. (2012), but not at nonparametric SEMs, which he favors and names “causal diagrams”. Indeed, nonparametric SEMs are blessed with unequaled transparency to assure that each and every assumption is visible and passes the scrutiny of scientific judgment.

While it is true that SEMs have the capacity to make bolder assumptions, some not discernible from experiments (e.g., no confounding between mediator and outcome), this does not mean that investigators, acting properly, would make such assumptions when they stand contrary to scientific judgment, nor does it mean that investigators are under weaker protection from the ramifications of unwarranted assumptions. Today we know precisely which of SEM’s claims are discernible from experiments (i.e., reducible to do(x) expressions) and which are not (see Shpitser and Pearl, 2008).

I therefore take issue with Tyler’s statement: “SEMs themselves tend to make much stronger assumptions than these other techniques” (from his abstract) when applied to nonparametric analysis. SEMs do not make assumptions, nor do they “tend to make assumptions”; investigators do. I am inclined to believe that Tyler’s criticisms were aimed at a specific application of SEM rather than SEM as a methodology.

Purging SEM from epidemiology would amount to purging counterfactuals from epidemiology — the latter draws its legitimacy from the former.

I also reject occasional calls to replace SEM and Causal Diagrams with weaker types of graphical models which presumably make weaker assumptions. No matter how we label alternative models (e.g., interventional graphs, agnostic graphs, causal Bayesian networks, FFRCISTG models, influence diagrams, etc.), they all must rest on judgmental assumptions, and people think science (read: SEM), not experiments. In other words, when an investigator asks him/herself whether an arrow from X to Y is warranted, the investigator does not ask whether an intervention on X would change the probability of Y (read: P(y|do(x)) ≠ P(y)) but whether the function f in the mechanism y=f(x, u) depends on x for some u. Claims that the stronger assumptions made by SEMs (compared with interventional graphs) may have unintended consequences are supported by a few contrived cases where people can craft a nontrivial f(x,u) despite the equality P(y|do(x)) = P(y). (See an example in Causality page 24.)
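To make the distinction concrete, here is a minimal sketch of one such contrived case. The mechanism y = x XOR u, the fair-coin distribution for u, and the names f, P_U, P_X, p_y_do_x, p_y and f_depends_on_x are all assumptions chosen for illustration; this is not necessarily the example given in Causality. The function f depends on x for every u, yet intervening on X leaves the distribution of Y unchanged.

```python
from fractions import Fraction
from itertools import product

# Hypothetical mechanism (an assumption for illustration): y = f(x, u) = x XOR u,
# with U a fair coin that is independent of X.
def f(x, u):
    return x ^ u

P_U = {0: Fraction(1, 2), 1: Fraction(1, 2)}
P_X = {0: Fraction(7, 10), 1: Fraction(3, 10)}  # arbitrary observational P(x)

def p_y_do_x(y, x):
    """P(Y=y | do(X=x)): fix X at x and average over the exogenous U."""
    return sum(p for u, p in P_U.items() if f(x, u) == y)

def p_y(y):
    """Observational P(Y=y), with X and U independent."""
    return sum(P_X[x] * P_U[u] for x, u in product(P_X, P_U) if f(x, u) == y)

def f_depends_on_x(u):
    """Structural question: does f(x, u) change with x for this u?"""
    return f(0, u) != f(1, u)

for x, y in product((0, 1), repeat=2):
    print(f"P(y={y} | do(x={x})) = {p_y_do_x(y, x)},  P(y={y}) = {p_y(y)}")
print("f depends on x for u in:", [u for u in P_U if f_depends_on_x(u)])
```

In this toy model every interventional probability equals 1/2, the same as the observational P(y), so an interventional graph would omit the arrow from X to Y, while the structural model keeps it because f changes with x at each value of u.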

For a formal distinction between SEM and interventional graphs (also known as “Causal Bayes networks”), see Causality, pages 23-24 and 33-36. For more philosophical discussions defending counterfactuals and SEM against false alarms see:

I hope this helps clarify the issue.

February 22, 2007

Back-door criterion and epidemiology

Filed under: Back-door criterion,Book (J Pearl),Epidemiology — moderator @ 9:03 am

The definition of the back-door condition (Causality, page 79, Definition 3.3.1) seems to be contrived. The exclusion of descendants of X (Condition (i)) seems to be introduced as an afterthought, just because we get into trouble if we don’t. Why can’t we get it from first principles: first define sufficiency of Z in terms of the goal of removing bias and, then, show that, to achieve this goal, you neither want nor need descendants of X in Z?
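The "trouble" mentioned in the question can be seen in a small numerical sketch. Everything below is an assumption made for illustration: a hypothetical binary model with Z -> X, Z -> Y and X -> M -> Y, with made-up conditional probabilities. The code compares the true P(Y=1 | do(X=x)), computed by truncated factorization, with the adjustment formula applied to Z alone (no descendants of X) and to {Z, M} (M is a descendant of X).

```python
from fractions import Fraction as F
from itertools import product

def bern(p1, v):
    """Probability that a binary variable equals v when P(v=1) = p1."""
    return p1 if v == 1 else 1 - p1

# Hypothetical model (all numbers are illustrative assumptions).
def p_z(z): return bern(F(1, 2), z)
def p_x_given_z(x, z): return bern(F(4, 5) if z else F(1, 5), x)
def p_m_given_x(m, x): return bern(F(9, 10) if x else F(1, 10), m)
def p_y_given_mz(y, m, z): return bern(F(1, 10) + F(1, 2) * m + F(3, 10) * z, y)

VARS = ("z", "x", "m", "y")
JOINT = {
    (z, x, m, y): p_z(z) * p_x_given_z(x, z) * p_m_given_x(m, x) * p_y_given_mz(y, m, z)
    for z, x, m, y in product((0, 1), repeat=4)
}

def prob(**fixed):
    """Observational probability that the named variables take the given values."""
    return sum(p for vals, p in JOINT.items()
               if all(vals[VARS.index(k)] == v for k, v in fixed.items()))

def truth(x):
    """P(Y=1 | do(X=x)) by truncated factorization: drop the factor P(x | z)."""
    return sum(p_z(z) * p_m_given_x(m, x) * p_y_given_mz(1, m, z)
               for z, m in product((0, 1), repeat=2))

def adjust(x, covariates):
    """Adjustment formula: sum over s of P(Y=1 | x, s) * P(s)."""
    total = F(0)
    for values in product((0, 1), repeat=len(covariates)):
        s = dict(zip(covariates, values))
        total += prob(y=1, x=x, **s) / prob(x=x, **s) * prob(**s)
    return total

for x in (0, 1):
    print(f"x={x}: truth={truth(x)}, adjust on Z={adjust(x, ('z',))}, "
          f"adjust on Z,M={adjust(x, ('z', 'm'))}")
```

In this toy model, adjusting for Z alone reproduces the interventional probabilities, while adding the descendant M makes the adjusted quantity the same for both values of x, hiding the effect entirely. The sketch illustrates the symptom only; it does not derive Condition (i) from first principles.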
