Causal Analysis in Theory and Practice

January 4, 2023

Causal Inference (CI) − A year in review

2022 has witnessed a major upsurge in the status of CI, primarily in its general recognition as an independent and essential component in every aspect of intelligent decision making. Visible evidence of this recognition came in the form of several prestigious prizes awarded explicitly to CI-related research accomplishments. These include (1) the Nobel Prize in economics, awarded to David Card, Joshua Angrist, and Guido Imbens for their works on cause and effect relations in natural experiments; (2) the BBVA Frontiers of Knowledge Award to Judea Pearl for “laying the foundations of modern AI”; and (3) the Rousseeuw Prize for Statistics to Jamie Robins, Thomas Richardson, Andrea Rotnitzky, Miguel Hernán, and Eric Tchetgen Tchetgen, for their “pioneering work on Causal Inference with applications in Medicine and Public Health.”
My acceptance speech at the BBVA award can serve as a gentle summary of the essence of causal inference, its basic challenges and major achievements.
It is not a secret that I have been critical of the approach Angrist and Imbens are taking in econometrics, for reasons elaborated here, and mainly here. I nevertheless think that their selection to receive the Nobel Prize in economics is a positive step for CI, in that it calls public attention to the problems that CI is trying to solve and will eventually inspire curious economists to seek a more broad-minded approach to these problems, so as to leverage the full arsenal of tools that CI has developed.
Coupled with these highlights of recognition, 2022 has seen a substantial increase in CI activities on both the academic and commercial fronts. The number of citations to CI-related articles has reached a record high of over 10,200 citations in 2022, showing positive derivatives in all CI categories. Dozens, if not hundreds, of seminars, workshops and symposia have been organized in major conferences to disseminate progress in CI research. New results on individualized decision making were prominently featured in these meetings. Several commercial outfits have come up with platforms for CI in their areas of specialization, ranging from healthcare to finance and marketing. (Company names such as #causallense and Vianai Systems come to mind.) Naturally, these activities have led to increasing demand for trained researchers and educators versed in the tools of CI; job openings explicitly requiring experience in CI have become commonplace in both industry and academia.
I am also happy to see CI becoming an issue of contention in AI and Machine Learning (ML), increasingly recognized as an essential capability for human-level AI and, simultaneously, raising the question of whether the data-fitting methodologies of Big Data and Deep Learning could ever acquire these capabilities. I’ve answered this question in the negative, though various attempts to dismiss CI as a species of “inductive bias” or “missing data problem” are occasionally being proposed as conceptualizations that could potentially replace the tools of CI. The Ladder of Causation tells us what extra-data information would be required to operationalize such metaphorical aspirations.
Researchers seeking a gentle introduction to CI are often attracted to multi-disciplinary forums or debates, where basic principles are compiled and where differences and commonalities among various approaches are compared and analyzed by leading researchers. Not many such forums were published in 2022, perhaps because the differences and commonalities are now well understood or, as I tend to believe, because CI and its Structural Causal Model (SCM) unify and embrace all other approaches. I will describe two such forums in which I participated.
(1) In March of 2022, the Association for Computing Machinery (ACM) published an anthology containing highlights of my works (1980-2020) together with commentaries and critiques from two dozen authors, representing several disciplines. The Table of Contents can be seen here. It includes 17 of my most popular papers, annotated for context and scope, followed by 17 contributed articles by colleagues and critics. The ones most relevant to CI in 2022 are in Chapters 21-26.
Among these, I consider the causal resolution of Simpson’s paradox (Chapter 22) to be one of the crowning achievements of CI. The paradox lays bare the core differences between causal and statistical thinking, and its resolution brings an end to a century of debates and controversies by the best philosophers of our time. It is also related to Lord’s Paradox, a qualitative version of Simpson’s Paradox which became a focus of endless debates with statisticians and trialists throughout 2022 (on Twitter @yudapearl). I often cite Simpson’s paradox as proof that our brain is governed by causal, not statistical, calculus.
This question − causal or statistical brain − is not a cocktail party conversation but touches on the practical question of choosing an appropriate language for casting the knowledge necessary for commencing any CI exercise. Philip Dawid − a proponent of counterfactual-free statistical languages − has written a critical essay on the topic, and my counterfactual-based rebuttal clarifies the issues involved.
(2) The second forum of inter-disciplinary discussions can be found in a special issue of the journal Observational Studies (edited by Ian Shrier, Russell Steele, Tibor Schuster and Mireille Schnitzer) in the form of interviews with Don Rubin, Jamie Robins, James Heckman and myself.
In my interview, I compiled aspects of CI that I normally skip in scholarly articles. These include historical perspectives of the development of CI, its current state of affairs and, most importantly for our purpose, the lingering differences between CI and other frameworks. I believe that this interview provides a fairly concise summary of these differences, which have only intensified in 2022.
Most disappointing to me are the graph-avoiding frameworks of Rubin, Angrist, Imbens and Heckman, which still dominate causal analysis in economics and some circles of statistics and social science. The reasons for my disappointment are summarized in the following paragraph:
Graphs are new mathematical objects, unfamiliar to most researchers in the statistical sciences, and were of course rejected as “non-scientific ad-hockery” by top leaders in the field [Rubin, 2009]. My attempts to introduce causal diagrams to statistics [Pearl, 1995; Pearl, 2000] have taught me that inertial forces play at least as strong a role in science as they do in politics. That is the reason that non-causal mediation analysis is still practiced in certain circles of social science [Hayes, 2017], “ignorability” assumptions still dominate large islands of research [Imbens and Rubin, 2015], and graphs are still tabooed in the econometric literature [Angrist and Pischke, 2014]. While most researchers today acknowledge the merits of graphs as a transparent language for articulating scientific information, few appreciate the computational role of graphs as “reasoning engines,” namely, bringing to light the logical ramifications of the information used in their construction. Some economists even go to great pains to suppress this computational miracle [Heckman and Pinto, 2015; Pearl, 2013].
My disagreements with Heckman go back to 2007, when he rejected the do-operator for metaphysical reasons, and then to 2013, when he celebrated the do-operator after renaming it “fixing” but remained in denial of d-separation. In this denial he retreated 3 decades in time while castrating graphs from their inferential power. Heckman’s 2022 interview in Observational Studies continues his on-going crusade to prove that econometrics has nothing to learn from neighboring fields. His fundamental mistake lies in assuming that the rules of do-calculus lie “outside of formal statistics”; they are in fact logically derivable from formal statistics, REGARDLESS of our modeling assumptions, but (much like theorems in geometry), once established, they save us the labor of going back to the basic axioms.
My differences with Angrist, Imbens and Rubin go even deeper, for they involve not merely the avoidance of graphs but also the First Law of Causal Inference, hence issues of transparency and credibility. These differences are further accentuated in Imbens’s Nobel lecture, which treats CI as a computer science creation, irrelevant to “credible” econometric research. In my book Causality, I present dozens of simple problems that economists need to solve but cannot, lacking the tools of CI.
It is amazing to watch leading researchers, in 2022, still resisting the benefits of CI while committing their respective fields to the tyranny of outdatedness.
To summarize, 2022 has seen an unprecedented upsurge in CI popularity, activity and stature. The challenge of harnessing CI tools to solve critical societal problems will continue to inspire creative researchers from all fields, and the aspirations of advancing towards human-level artificial intelligence will be pursued at an accelerated pace in 2023.

Wishing you a productive new year,


January 29, 2020

On Imbens’s Comparison of Two Approaches to Empirical Economics

Filed under: Counterfactual,d-separation,DAGs,do-calculus,Imbens — judea @ 11:00 pm

Many readers have asked for my reaction to Guido Imbens’s recent paper, titled, “Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics,” arXiv.19071v1 [stat.ME] 16 Jul 2019.

The note below offers brief comments on Imbens’s five major claims regarding the superiority of potential outcomes [PO] vis-à-vis directed acyclic graphs [DAGs].

These five claims are articulated in Imbens’s introduction (pages 1-3). [Quoting]:

“… there are five features of the PO framework that may be behind its current popularity in economics.”

I will address them sequentially, first quoting Imbens’s claims, then offering my counterclaims.

I will end with a comment on Imbens’s final observation, concerning the absence of empirical evidence in a “realistic setting” to demonstrate the merits of the DAG approach.

Before we start, however, let me clarify that there is no such thing as a “DAG approach.” Researchers using DAGs follow an approach called Structural Causal Model (SCM), which consists of functional relationships among variables of interest, and of which DAGs are merely a qualitative abstraction, spelling out the arguments in each function. The resulting graph can then be used to support inference tools such as d-separation and do-calculus. Potential outcomes are relationships derived from the structural model and several of their properties can be elucidated using DAGs. These interesting relationships are summarized in chapter 7 of (Pearl, 2009a) and in a Statistics Surveys overview (Pearl, 2009c).
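To make the relationship between structural equations, DAGs and potential outcomes concrete, here is a minimal sketch in Python. The two equations, the variable names and the unit u are hypothetical choices made purely for illustration; they are not taken from any of the cited papers. The sketch shows the DAG as nothing more than a record of each function's arguments, and the potential outcome Y_x(u) as the solution for Y after the equation for X is replaced by the constant x.

```python
# Toy SCM with hypothetical equations: X := f_X(U_X), Y := f_Y(X, U_Y).
def f_x(u):
    return 1 if u["u_x"] > 0 else 0

def f_y(x, u):
    return 2 * x + u["u_y"]

# The DAG is a qualitative abstraction: it only lists the arguments of each function.
dag_parents = {"X": [], "Y": ["X"]}

def solve(u, do_x=None):
    """Solve the model for unit u; do_x replaces the equation for X by a constant."""
    x = f_x(u) if do_x is None else do_x
    return {"X": x, "Y": f_y(x, u)}

def potential_outcome(u, x):
    """Y_x(u): the value of Y in the model modified by do(X = x)."""
    return solve(u, do_x=x)["Y"]

u = {"u_x": 1, "u_y": 3}          # one hypothetical unit
factual = solve(u)                # what is actually observed for this unit
y1 = potential_outcome(u, 1)      # Y_1(u)
y0 = potential_outcome(u, 0)      # Y_0(u)
print(factual, y1, y0)
```

For this unit the observed Y coincides with Y_1(u), because the unit's own equation sets X = 1; the other potential outcome is obtained from the same structural function without any additional primitive.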

Imbens’s Claim # 1
“First, there are some assumptions that are easily captured in the PO framework relative to the DAG approach, and these assumptions are critical in many identification strategies in economics. Such assumptions include monotonicity ([Imbens and Angrist, 1994]) and other shape restrictions such as convexity or concavity ([Matzkin et al.,1991, Chetverikov, Santos, and Shaikh, 2018, Chen, Chernozhukov, Fernández-Val, Kostyshak, and Luo, 2018]). The instrumental variables setting is a prominent example, and I will discuss it in detail in Section 4.2.”

Pearl’s Counterclaim # 1
It is logically impossible for an assumption to be “easily captured in the PO framework” and not simultaneously be “easily captured” in the “DAG approach.” The reason is simply that the latter embraces the former and merely enriches it with graph-based tools. Specifically, SCM embraces the counterfactual notation Y_x that PO deploys, and does not exclude any concept or relationship definable in the PO approach.

Take monotonicity, for example. In PO, monotonicity is expressed as

Y_x(u) ≥ Y_{x'}(u) for all u and all x > x'

In the DAG approach it is expressed as:

Y_x(u) ≥ Y_{x'}(u) for all u and all x > x'

(Taken from Causality pages 291, 294, 398.)

The two are identical, of course, which may seem surprising to PO folks, but not to DAG folks who know how to derive the counterfactuals Y_x from structural models. In fact, the derivation of counterfactuals in terms of structural equations (Balke and Pearl, 1994) is considered one of the fundamental laws of causation in the SCM framework; see (Bareinboim and Pearl, 2016) and (Pearl, 2015).
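The monotonicity condition above can be checked directly once the structural function for Y is written down. The sketch below assumes a binary X and four hypothetical response types standing in for the unit variable u; the type names and the function `is_monotone` are illustrative inventions, not notation from Causality.

```python
from itertools import combinations

# Hypothetical response types for a binary X; each maps x to Y_x(u).
RESPONSE = {
    "never":  lambda x: 0,       # Y is 0 whatever X is
    "always": lambda x: 1,       # Y is 1 whatever X is
    "helped": lambda x: x,       # Y follows X
    "hurt":   lambda x: 1 - x,   # Y moves against X
}

def f_y(x, u):
    """Structural function for Y; evaluating it at a fixed x gives Y_x(u)."""
    return RESPONSE[u](x)

def is_monotone(units, x_values):
    """Check Y_x(u) >= Y_x'(u) for every unit u and every pair x > x'."""
    for u in units:
        for x_lo, x_hi in combinations(sorted(x_values), 2):
            if f_y(x_hi, u) < f_y(x_lo, u):
                return False
    return True

print(is_monotone(["never", "always", "helped"], [0, 1]))
print(is_monotone(["never", "always", "helped", "hurt"], [0, 1]))
```

In this toy population, monotonicity holds exactly when no unit of the "hurt" type is present, which is the same statement whether one reads Y_x(u) as a PO primitive or as a quantity derived from the structural function.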

Imbens’s Claim # 2
“Second, the potential outcomes in the PO framework connect easily to traditional approaches to economic models such as supply and demand settings where potential outcome functions are the natural primitives. Related to this, the insistence of the PO approach on manipulability of the causes, and its attendant distinction between non-causal attributes and causal variables has resonated well with the focus in empirical work on policy relevance ([Angrist and Pischke, 2008, Manski, 2013]).”

Pearl’s Counterclaim #2
Not so. The term “potential outcome” is a latecomer to the economics literature of the 20th century, whose native vocabulary and natural primitives were functional relationships among variables, not potential outcomes. The latter are defined in terms of a “treatment assignment” and hypothetical outcome, while the former invoke only observable variables like “supply” and “demand”. Don Rubin cited this fundamental difference as sufficient reason for shunning structural equation models, which he labeled “bad science.”

While it is possible to give a PO interpretation to structural equations, the interpretation is both artificial and convoluted, especially in view of PO’s insistence on manipulability of causes. Haavelmo, Koopmans and Marschak would not hesitate for a moment to write the structural equation:

Damage = f (earthquake intensity, other factors).

PO researchers, on the other hand, would spend weeks debating whether earthquakes have “treatment assignments” and whether we can legitimately estimate the “causal effects” of earthquakes. Thus, what Imbens perceives as a helpful distinction is, in fact, an unnecessary restriction that suppresses natural scientific discourse. See also (Pearl, 2018; 2019).

Imbens’s Claim #3
“Third, many of the currently popular identification strategies focus on models with relatively few (sets of) variables, where identification questions have been worked out once and for all.”

Pearl’s Counterclaim #3

First, I would argue that this claim is actually false. Most IV strategies that economists use are valid “conditional on controls” (see examples listed in Imbens (2014)), and the criterion that distinguishes “good controls” from “bad controls” is not trivial to articulate without the help of graphs. (See A Crash Course in Good and Bad Controls.) It certainly cannot be discerned “once and for all”.

Second, even if economists are lucky enough to guess “good controls,” it is still unclear whether they focus on relatively few variables because, lacking graphs, they cannot handle more variables, or whether they refrain from using graphs to hide the opportunities missed by focusing on few pre-fabricated, “once and for all” identification strategies.

I believe both apprehensions play a role in perpetuating the graph-avoiding subculture among economists. I have elaborated on this question here: (Pearl, 2014).
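To illustrate why the choice of controls mentioned above is not innocuous, here is a small simulation sketch of one kind of bad control: a variable C that is a common effect (collider) of X and Y. The linear equations, unit coefficients, sample size and helper `ols_slopes` are all assumptions chosen for illustration; the true effect of X on Y is set to 1.

```python
import random

def ols_slopes(y, cols):
    """Least-squares slopes of y on one or two regressors (intercept removed by centering)."""
    n = len(y)

    def center(v):
        m = sum(v) / n
        return [a - m for a in v]

    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    y = center(y)
    cols = [center(c) for c in cols]
    if len(cols) == 1:
        return [dot(cols[0], y) / dot(cols[0], cols[0])]
    a, b = cols
    saa, sbb, sab = dot(a, a), dot(b, b), dot(a, b)
    say, sby = dot(a, y), dot(b, y)
    det = saa * sbb - sab * sab
    # Solve the 2x2 normal equations by Cramer's rule.
    return [(sbb * say - sab * sby) / det, (saa * sby - sab * say) / det]

rng = random.Random(0)
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]                     # Y := X + noise (true effect 1)
c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]    # C := X + Y + noise (collider)

slope_plain = ols_slopes(y, [x])[0]        # no controls: close to the true effect 1
slope_adjusted = ols_slopes(y, [x, c])[0]  # controlling for C: population value is 0 here
print(round(slope_plain, 2), round(slope_adjusted, 2))
```

Under these particular equations, adding C to the regression drives the coefficient on X toward zero even though the true effect is 1. The graph X → C ← Y reveals the problem at a glance; the regression output alone does not.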

Imbens’s Claim # 4
“Fourth, the PO framework lends itself well to accounting for treatment effect heterogeneity in estimands ([Imbens and Angrist, 1994, Sekhon and Shem-Tov, 2017]) and incorporating such heterogeneity in estimation and the design of optimal policy functions ([Athey and Wager, 2017, Athey, Tibshirani, Wager, et al., 2019, Kitagawa and Tetenov, 2015]).”

Pearl’s Counterclaim #4
Indeed, in the early 1990s, economists felt ecstatic liberating themselves from the linear tradition of structural equation models and finding a framework (PO) that allowed them to model treatment effect heterogeneity.

However, whatever role treatment heterogeneity played in this excitement should have been amplified ten-fold in 1995, when completely nonparametric structural equation models came into being, in which non-linear interactions and heterogeneity were assumed a priori. Indeed, the tools developed in the econometric literature cover only a fraction of the treatment-heterogeneity tasks that are currently managed by SCM. In particular, the latter includes such problems as “necessary and sufficient” causation, mediation, external validity, selection bias and more.

Speaking more generally, I find it odd for a discipline to prefer an “approach” that rejects tools over one that invites and embraces tools.

Imbens’s Claim #5
“Fifth, the PO approach has traditionally connected well with design, estimation, and inference questions. From the outset Rubin and his coauthors provided much guidance to researchers and policy makers for practical implementation including inference, with the work on the propensity score ([Rosenbaum and Rubin, 1983b]) an influential example.”

Pearl’s Counterclaim #5
The initial work of Rubin and his co-authors has indeed provided much needed guidance to researchers and policy makers who were in a state of desperation, having no other mathematical notation to express causal questions of interest. That happened because economists were not aware of the counterfactual content of structural equation models, and of the non-parametric extension of those models.

Unfortunately, the clumsy and opaque notation introduced in this initial work has become a ritual in the PO framework that has prevailed, and the refusal to commence the analysis with meaningful assumptions has led to several blunders and misconceptions. One such misconception concerns propensity score analysis, which researchers have taken as a tool for reducing confounding bias. I have elaborated on this misguidance in Causality, Section 11.3.5, “Understanding Propensity Scores” (Pearl, 2009a).

Imbens’s final observation: Empirical Evidence
“Separate from the theoretical merits of the two approaches, another reason for the lack of adoption in economics is that the DAG literature has not shown much evidence of the benefits for empirical practice in settings that are important in economics. The potential outcome studies in MACE, and the chapters in [Rosenbaum, 2017], CISSB and MHE have detailed empirical examples of the various identification strategies proposed. In realistic settings they demonstrate the merits of the proposed methods and describe in detail the corresponding estimation and inference methods. In contrast in the DAG literature, TBOW, [Pearl, 2000], and [Peters, Janzing, and Schölkopf, 2017] have no substantive empirical examples, focusing largely on identification questions in what TBOW refers to as “toy” models. Compare the lack of impact of the DAG literature in economics with the recent embrace of regression discontinuity designs imported from the psychology literature, or with the current rapid spread of the machine learning methods from computer science, or the recent quick adoption of synthetic control methods [Abadie, Diamond, and Hainmueller, 2010]. All came with multiple concrete examples that highlighted their benefits over traditional methods. In the absence of such concrete examples the toy models in the DAG literature sometimes appear to be a set of solutions in search of problems, rather than a set of solutions for substantive problems previously posed in social sciences.”

Pearl’s comments on: Empirical Evidence
There is much truth to Imbens’s observation. The PO excitement that swept natural experimentalists in the 1990s came with outright rejection of graphical models. The hundreds, if not thousands, of empirical economists who plunged into empirical work were warned repeatedly that graphical models may be “ill-defined,” “deceptive,” and “confusing,” and structural models have no scientific underpinning (see (Pearl, 1995; 2009b)). Not a single paper in the econometric literature has acknowledged the existence of SCM as an alternative or complementary approach to PO.

The result has been the exact opposite of what has taken place in epidemiology, where DAGs became a second language to both scholars and field workers (due in part to the influential 1999 paper by Greenland, Pearl and Robins). In contrast, PO-led economists have launched a massive array of experimental programs lacking graphical tools for guidance. I would liken it to a Phoenician armada exploring the Atlantic coast in leaky boats and with no compass to guide its way.

This depiction might seem pretentious and overly critical, considering the pride that natural experimentalists take in the results of their studies (though no objective verification of validity can be undertaken). Yet looking back at the substantive empirical examples listed by Imbens, one cannot but wonder how much more credible those studies could have been with graphical tools to guide the way. These include a friendly language to communicate assumptions, powerful means to test their implications, and ample opportunities to uncover new natural experiments (Brito and Pearl, 2002).

Summary and Recommendation 

The thrust of my reaction to Imbens’s article is simple:

It is unreasonable to prefer an “approach” that rejects tools over one that invites and embraces tools.

Technical comparisons of the PO and SCM approaches, using concrete examples, have been published since 1993 in dozens of articles and books in computer science, statistics, epidemiology, and social science, yet none in the econometric literature. Economics students are systematically deprived of even the most elementary graphical tools available to other researchers, for example, to determine if one variable is independent of another given a third, or if a variable is a valid IV given a set S of observed variables.

This avoidance can no longer be justified by appealing to “We have not found this [graphical] approach to aid the drawing of causal inferences” (Imbens and Rubin, 2015, page 25).
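The elementary graphical test mentioned above, whether one variable is independent of another given a third, can be sketched in a few lines. The code below is one standard way to decide d-separation (restrict to the ancestral graph, moralize, delete the conditioning set, test connectivity); the function names and the four-node example DAG are hypothetical and serve only as an illustration.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen

def d_separated(parents, x, y, z):
    """True if x and y are d-separated by the set z (x and y assumed not in z)."""
    keep = ancestors(parents, {x, y} | set(z))
    # Moralize the ancestral graph: link each child to its parents and "marry" co-parents.
    adj = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pa, 2):
            adj[p].add(q)
            adj[q].add(p)
    # Delete the conditioning set and look for any remaining path from x to y.
    blocked, seen, stack = set(z), {x}, [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        for w in adj[v]:
            if w not in seen and w not in blocked:
                seen.add(w)
                stack.append(w)
    return True

# Hypothetical DAG: Z -> X, Z -> Y (common cause) and X -> C <- Y (common effect).
parents = {"Z": [], "X": ["Z"], "Y": ["Z"], "C": ["X", "Y"]}
print(d_separated(parents, "X", "Y", {"Z"}))       # conditioning on the common cause
print(d_separated(parents, "X", "Y", set()))       # no conditioning
print(d_separated(parents, "X", "Y", {"Z", "C"}))  # also conditioning on the collider
```

In this example X and Y are separated by Z alone, connected when nothing is conditioned on, and connected again once the collider C is added to the conditioning set.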

To open an effective dialogue and a genuine comparison between the two approaches, I call on Professor Imbens to assume leadership in his capacity as Editor in Chief of Econometrica and invite a comprehensive survey paper on graphical methods for the front page of his Journal. This is how creative editors move their fields forward.

Balke, A. and Pearl, J. “Probabilistic Evaluation of Counterfactual Queries,” In Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle, WA, Volume I, 230-237, July 31 – August 4, 1994.

Brito, C. and Pearl, J. “General instrumental variables,” In A. Darwiche and N. Friedman (Eds.), Uncertainty in Artificial Intelligence, Proceedings of the Eighteenth Conference, Morgan Kaufmann: San Francisco, CA, 85-93, August 2002.

Bareinboim, E. and Pearl, J. “Causal inference and the data-fusion problem,” Proceedings of the National Academy of Sciences, 113(27): 7345-7352, 2016.

Greenland, S., Pearl, J., and Robins, J. “Causal diagrams for epidemiologic research,” Epidemiology, Vol. 10, No. 1, pp. 37-48, January 1999.

Imbens, G. “Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics,” arXiv.19071v1 [stat.ME] 16 Jul 2019.

Imbens, G. and Rubin, D. Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge, MA: Cambridge University Press; 2015.

Imbens, Guido W. Instrumental Variables: An Econometrician’s Perspective. Statist. Sci. 29 (2014), no. 3, 323–358. doi:10.1214/14-STS480.

Pearl, J. “Causal diagrams for empirical research,” (With Discussions), Biometrika, 82(4): 669-710, 1995.

Pearl, J. “Understanding Propensity Scores” in J. Pearl’s Causality: Models, Reasoning, and Inference, Section 11.3.5, Second edition, NY: Cambridge University Press, pp. 348-352, 2009a.

Pearl, J. “Myth, confusion, and science in causal analysis,” University of California, Los Angeles, Computer Science Department, Technical Report R-348, May 2009b.

Pearl, J. “Causal inference in statistics: An overview”  Statistics Surveys, Vol. 3, 96–146, 2009c.

Pearl, J. “Are economists smarter than epidemiologists? (Comments on Imbens’s recent paper),” Causal Analysis in Theory and Practice Blog, October 27, 2014.

Pearl, J. “Trygve Haavelmo and the Emergence of Causal Calculus,” Econometric Theory, 31: 152-179, 2015.

Pearl, J. “Does obesity shorten life? Or is it the Soda? On non-manipulable causes,” Journal of Causal Inference, Causal, Casual, and Curious Section, 6(2), online, September 2018.

Pearl, J. “On the interpretation of do(x),” Journal of Causal Inference, Causal, Casual, and Curious Section, 7(1), online, March 2019.
