Causal Analysis in Theory and Practice

February 22, 2017

Winter-2017 Greeting from UCLA Causality Blog

Filed under: Announcement,Causal Effect,Economics,Linear Systems — bryantc @ 6:03 pm

Dear friends in causality research,

In this brief greeting I would like to first call attention to an approaching deadline and then discuss a couple of recent articles.

Causality in Education Award – March 1, 2017

We are informed that the deadline for submitting a nomination for the ASA Causality in Statistics Education Award is March 1, 2017. For purpose, criteria and other information please see .

The next issue of the Journal of Causal Inference (JCI) is scheduled to appear in March 2017. See

My contribution to this issue includes a tutorial paper entitled “A Linear ‘Microscope’ for Interventions and Counterfactuals”. An advance copy can be viewed here:

Overturning Econometrics Education (or, do we need a “causal interpretation”?)

My attention was called to a recent paper by Josh Angrist and Jorn-Steffen Pischke titled “Undergraduate econometrics instruction” (an NBER working paper).

This paper advocates a pedagogical paradigm shift that has methodological ramifications beyond econometrics instruction. As I understand it, the shift stands contrary to the traditional teachings of causal inference, as defined by Sewall Wright (1920), Haavelmo (1943), Marschak (1950), Wold (1960), and other founding fathers of econometrics methodology.

In a nutshell, Angrist and Pischke start with a set of favorite statistical routines such as IV, regression, and differences-in-differences, among others, and then search for “a set of control variables needed to insure that the regression-estimated effect of the variable of interest has a causal interpretation”. Traditional causal inference (including economics) teaches us that asking whether the output of a statistical routine “has a causal interpretation” is the wrong question to ask, for it misses the direction of the analysis. Instead, one should start with the target causal parameter itself and ask whether it is ESTIMABLE (and if so, how), be it by IV, regression, differences-in-differences, or perhaps by some new routine that is yet to be discovered and ordained by name. Clearly, no “causal interpretation” is needed for parameters that are intrinsically causal; for example, “causal effect”, “path coefficient”, “direct effect”, “effect of treatment on the treated”, or “probability of causation”.

In practical terms, the difference between the two paradigms is that estimability requires a substantive model while interpretability appears to be model-free. A model exposes its assumptions explicitly, while statistical routines give the deceptive impression that they run assumption-free (hence their popular appeal). The former lends itself to judgmental and statistical tests; the latter escapes such scrutiny.

In conclusion, if an educator needs to choose between the “interpretability” and “estimability” paradigms, I would go for the latter. If traditional econometrics education is tailored to support the estimability track, I do not believe a paradigm shift is warranted towards an “interpretation seeking” paradigm such as the one proposed by Angrist and Pischke.

I would gladly open this blog for additional discussion on this topic.

I tried to post a comment on NBER (National Bureau of Economic Research), but was rejected for not being an approved “NBER family member”. If any of our readers is an “NBER family member”, feel free to post the above. Note: “NBER working papers are circulated for discussion and comment purposes.” (page 1).

November 9, 2014

Causal inference without graphs

Filed under: Counterfactual,Discussion,Economics,General — moderator @ 3:45 am

In a recent posting on this blog, Elias and Bryant described how graphical methods can help decide if a pseudo-randomized variable, Z, qualifies as an instrumental variable, namely, if it satisfies the exogeneity and exclusion requirements associated with the definition of an instrument. In this note, I aim to describe how inferences of this type can be performed without graphs, using the language of potential outcomes. This description should give students of causality an objective comparison of graph-less vs. graph-based inferences. See my exchange with Guido Imbens [here].

Every problem of causal inference must commence with a set of untestable, theoretical assumptions that the modeler is prepared to defend on scientific grounds. In structural modeling, these assumptions are encoded in a causal graph through missing arrows and missing latent variables. Graphless methods encode these same assumptions symbolically, using two types of statements:

1. Exclusion restrictions, and
2. Conditional independencies among observable and potential outcomes.

For example, consider the causal Markov chain X—>Y—>Z, which represents the structural equations:

with one omitted factor in each equation, such that X and the two omitted factors are mutually independent.
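As a sketch, a chain of this kind is commonly written as below. The function symbols f_Y, f_Z and the disturbance symbols \varepsilon_Y, \varepsilon_Z are assumed notation chosen for illustration, and the equation labels (1) and (2) are inferred from the numbering (3)-(6) used later.

```latex
\begin{align}
  y &= f_Y(x, \varepsilon_Y) \tag{1} \\
  z &= f_Z(y, \varepsilon_Z) \tag{2}
\end{align}
% Assumption of the model: X, \varepsilon_Y, \varepsilon_Z are mutually independent.
```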

These same assumptions can also be encoded in the language of counterfactuals, as follows:

(3) represents the missing arrow from X to Z, and (4)-(6) convey the mutual independence of X and the two omitted factors.
[Remark: General rules for translating graphical models to counterfactual notation are given in Pearl (2009, pp. 232-234).]
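For orientation, one common way to express these two kinds of assumptions for the chain X—>Y—>Z in counterfactual notation is sketched below. This is an illustration in assumed notation, and it is not necessarily identical to the four numbered statements (3)-(6).

```latex
% Exclusion restriction: once Y is held at y, setting X has no further effect on Z
Z_{xy}(u) = Z_{y}(u) \quad \text{for all } x, y
% Independence restrictions: X, Y_x and Z_y are mutually independent
Y_x \perp\!\!\!\perp X, \qquad Z_y \perp\!\!\!\perp \{ Y_x , X \}
```

The exclusion line plays the role of the missing arrow from X to Z, and the independence lines play the role of the independent omitted factors.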

Assume now that we are given the four counterfactual statements (3)-(6) as a specification of a model. What machinery can we use to answer questions that typically come up in causal inference tasks? One such question is, for example: is the model testable? In other words, is there an empirical test conducted on the observed variables X, Y, and Z that could prove (3)-(6) wrong? We note that none of the four defining conditions (3)-(6) is testable in isolation, because each invokes an unmeasured counterfactual entity. On the other hand, the fact that the equivalent graphical model advertises the conditional independence of X and Z given Y, X _||_ Z | Y, implies that the combination of all four counterfactual statements should yield this testable implication.
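The testable implication X _||_ Z | Y can be illustrated with a small simulation. The sketch below assumes a linear chain with Gaussian disturbances and unit coefficients (all hypothetical choices, not part of the model above); in that special case conditional independence shows up as a vanishing partial correlation, which is what the code checks.

```python
import math
import random


def simulate_chain(n, seed=0, b_xy=1.0, b_yz=1.0):
    """Draw n samples from a linear-Gaussian chain X -> Y -> Z (assumed form)."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = b_xy * x + rng.gauss(0, 1)  # omitted factor of Y
        z = b_yz * y + rng.gauss(0, 1)  # omitted factor of Z
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def partial_corr(a, b, c):
    """Correlation of a and b after linearly adjusting both for c."""
    r_ab, r_ac, r_bc = corr(a, b), corr(a, c), corr(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


xs, ys, zs = simulate_chain(20000, seed=1)
r_xz = corr(xs, zs)                    # marginal dependence of X and Z
r_xz_given_y = partial_corr(xs, zs, ys)  # should be close to zero in a chain
print(f"corr(X, Z)     = {r_xz:.3f}")
print(f"corr(X, Z | Y) = {r_xz_given_y:.3f}")
```

A partial correlation far from zero in real data would count against the chain; outside the linear-Gaussian case a different conditional-independence test would be needed.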

Another question often posed to causal inference is that of identifiability, for example, whether the causal effect of X on Z is estimable from observational studies.
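For the chain, this question has a simple answer that a simulation can make tangible: with no confounding of X, the observational regression of Z on X recovers the interventional effect. The sketch below assumes a linear-Gaussian chain with hypothetical coefficients B_XY and B_YZ, and it simulates the intervention do(X = x) by overriding the mechanism that generates X.

```python
import random

B_XY, B_YZ = 0.8, 0.5  # hypothetical path coefficients; true effect is their product


def draw(rng, x=None):
    """One sample from the chain; if x is given, X is set by intervention."""
    if x is None:
        x = rng.gauss(0, 1)  # X follows its natural mechanism
    y = B_XY * x + rng.gauss(0, 1)
    z = B_YZ * y + rng.gauss(0, 1)
    return x, y, z


def ols_slope(a, b):
    """Least-squares slope of b regressed on a."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var = sum((u - ma) ** 2 for u in a)
    return cov / var


rng = random.Random(7)
n = 50000

# Observational estimate: slope of Z on X in passively observed data.
obs = [draw(rng) for _ in range(n)]
slope_obs = ols_slope([s[0] for s in obs], [s[2] for s in obs])

# Interventional contrast: E[Z | do(X=1)] - E[Z | do(X=0)].
z_do1 = sum(draw(rng, x=1.0)[2] for _ in range(n)) / n
z_do0 = sum(draw(rng, x=0.0)[2] for _ in range(n)) / n
effect_do = z_do1 - z_do0

print(f"observational slope  = {slope_obs:.3f}")
print(f"interventional shift = {effect_do:.3f}")
```

Both numbers should be close to B_XY * B_YZ. Adding a common cause of X and Z to the simulation would break this agreement, which is why identifiability has to be decided from the model's assumptions and not from the data alone.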

Whereas graphical models enjoy inferential tools such as d-separation and do-calculus, potential-outcome specifications can use the axioms of counterfactual logic (Galles and Pearl, 1998; Halpern, 1998) to determine identification and testable implications. In a recent paper, I have combined the graphoid and counterfactual axioms to provide such symbolic machinery (link).

However, the aim of this note is not to teach potential outcome researchers how to derive the logical consequences of their assumptions but, rather, to give researchers the flavor of what these derivations entail, and the kind of problems the potential outcome specification presents vis-a-vis the graphical representation.

As most of us would agree, the chain appears more friendly than the 4 equations in (3)-(6), and the reasons are both representational and inferential. On the representational side we note that it would take a person (even an expert in potential outcomes) a pause or two to affirm that (3)-(6) indeed represent the chain process he/she has in mind. More specifically, it would take a pause or two to check if some condition is missing from the list, or whether one of the conditions listed is redundant (i.e., follows logically from the other three), or whether the set is consistent (i.e., no statement has its negation following from the other three). These mental checks are immediate in the graphical representation; the first, because each link in the graph corresponds to a physical process in nature, and the last two because the graph is inherently consistent and non-redundant. As to the inferential part, using the graphoid+counterfactual axioms as inference rules is computationally intractable. These axioms are good for confirming a derivation if one is proposed, but not for finding a derivation when one is needed.

I believe that even a cursory attempt to answer research questions using (3)-(6) would convince the reader of the merits of the graphical representation. However, the reader of this blog is already biased, having been told that (3)-(6) is the potential-outcome equivalent of the chain X—>Y—>Z. A deeper appreciation can be reached by examining a new problem, specified in potential-outcome vocabulary, but without its graphical mirror.

Assume you are given the following statements as a specification.

It represents a familiar model in causal analysis that has been thoroughly analyzed. To appreciate the power of graphs, the reader is invited to examine the representation above and to answer a few questions:

a) Is the process described familiar to you?
b) Which assumptions are you willing to defend in your interpretation of the story?
c) Is the causal effect of X on Y identifiable?
d) Is the model testable?

I would be eager to hear from readers
1. if my comparison is fair.
2. which argument they find most convincing.

October 27, 2014

Are economists smarter than epidemiologists? (Comments on Imbens’s recent paper)

Filed under: Discussion,Economics,Epidemiology,General — eb @ 4:45 pm

In a recent survey on Instrumental Variables (link), Guido Imbens fleshes out the reasons why some economists “have not felt that graphical models have much to offer them.”

His main point is: “In observational studies in social science, both these assumptions [exogeneity and exclusion] tend to be controversial. In this relatively simple setting [3-variable IV setting] I do not see the causal graphs as adding much to either the understanding of the problem, or to the analyses.” [page 377]

What Imbens leaves unclear is whether graph-avoiding economists limit themselves to “relatively simple settings” because, lacking graphs, they cannot handle more than 3 variables, or whether they refrain from using graphs to prevent those “controversial assumptions” from becoming transparent, hence amenable to scientific discussion and resolution.

When students and readers ask me how I respond to people of Imbens’s persuasion who see no use in tools they vow to avoid, I direct them to the post “The deconstruction of paradoxes in epidemiology”, in which Miquel Porta describes the “revolution” that causal graphs have spawned in epidemiology. Porta observes: “I think the “revolution — or should we just call it a renewal”? — is deeply changing how epidemiological and clinical research is conceived, how causal inferences are made, and how we assess the validity and relevance of epidemiological findings.”

So, what is it about epidemiologists that drives them to seek the light of new tools, while economists (at least those in Imbens’s camp) seek comfort in partial blindness, while missing out on the causal revolution? Can economists do in their heads what epidemiologists observe in their graphs? Can they, for instance, identify the testable implications of their own assumptions? Can they decide whether the IV assumptions (i.e., exogeneity and exclusion) are satisfied in their own models of reality? Of course they can’t; such decisions are intractable to the graph-less mind. (I have challenged them repeatedly to these tasks, to the sound of a pin-drop silence.)

Or, are problems in economics different from those in epidemiology? I have examined the structure of typical problems in the two fields, the number of variables involved, the types of data available, and the nature of the research questions. The problems are strikingly similar.

I have only one explanation for the difference: Culture.

The arrow-phobic culture started twenty years ago, when Imbens and Rubin (1995) decided that graphs “can easily lull the researcher into a false sense of confidence in the resulting causal conclusions,” and Paul Rosenbaum (1995) echoed with “No basis is given for believing” […] “that a certain mathematical operation, namely this wiping out of equations and fixing of variables, predicts a certain physical reality” [ See discussions here. ]

Lingering symptoms of this phobia are still stifling research in the 2nd decade of our century, yet are tolerated as scientific options. As Andrew Gelman put it last month: “I do think it is possible for a forward-looking statistician to do causal inference in the 21st century without understanding graphical models.” (link)

I believe the most insightful diagnosis of the phenomenon is given by Larry Wasserman:
“It is my impression that the “graph people” have studied the Rubin approach carefully while the reverse is not true.” (link)

December 27, 2012

Causal Inference Symposium: Heckman and Pearl

Filed under: Discussion,Economics,General — eb @ 2:30 pm

Judea Pearl Writes:

Last week I attended a causal inference symposium at the University of Michigan, and had a very lively discussion with James Heckman (Chicago, economics) on causal reasoning in econometrics, statistics and computer science. Video and slides of the two lectures can be watched here:

In the QA session (not in the video), I described the problems of transportability and external validity, and their solutions according to:

Heckman asked: What makes this problem different from the one that economists solve routinely — when they find a new distribution that differs from the one they estimated, they simply re-estimate the parameters by which the two differ and keep those on which they agree.

My answer stressed three facts that should be kept in mind when dealing with “transportability”:
1. We cannot speak here about differing “distributions” because transportability is a causal, not statistical problem. In other words, what needs to be re-estimated depends not on the two “distributions” but on the causal story behind the distributions. (This is shown vividly in Example 2 of R-372).

2. We are now dealing with the task of transporting “experimental findings” (e.g., causal effects), not distributions, from a place where they are available to a place where they are not estimable.

3. We cannot even speak about re-estimating “parameters” because the problem is entirely non-parametric.
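A minimal sketch of points 1 and 2, under an assumed causal story that is not taken from the lecture: the two populations differ only in the distribution of a pre-treatment covariate Z (say, age group), and the Z-specific causal effects are the same in both. Under that assumption, the experimental finding is transported by reweighting, P*(y | do(x)) = sum_z P(y | do(x), z) P*(z). All names and numbers below are hypothetical.

```python
def transport_effect(effect_by_z, target_z_dist):
    """Reweight z-specific experimental effects by the target distribution of Z.

    Valid only under the assumed story: Z is pre-treatment, and the
    z-specific effects P(y | do(x), z) are invariant across populations.
    """
    total = sum(target_z_dist.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("target distribution of Z must sum to 1")
    return sum(effect_by_z[z] * p for z, p in target_z_dist.items())


# Hypothetical z-specific effects measured in the source experiment.
effect_by_z = {"young": 0.30, "old": 0.10}

# Hypothetical age distributions in the source and target populations.
source_z = {"young": 0.7, "old": 0.3}
target_z = {"young": 0.2, "old": 0.8}

source_effect = transport_effect(effect_by_z, source_z)
target_effect = transport_effect(effect_by_z, target_z)
print(f"effect in source population: {source_effect:.2f}")
print(f"effect transported to target: {target_effect:.2f}")
```

The same pair of distributions paired with a different causal story (for example, Z being affected by the treatment) would generally call for a different formula, which is the sense in which what needs re-estimating depends on the story and not on the distributions alone.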

More comments on audience questions will follow.

December 17, 2012

Blog discussion on Causality in Econometric and Statistical education

Filed under: Announcement,Discussion,Economics — moderator @ 1:30 am

A recent discussion on Andrew Gelman’s blog has touched on some interesting points concerning the teaching of causality in econometric and statistics classes (link here). I responded to some of the discussants and, below, I share my replies with readers of this blog.

1. Andrew Gelman asked why the review in is critical of econometrics, “I thought that causality was central to econometrics; see, for example, Angrist and Pischke’s book .”

Judea Pearl replies:
Causality is indeed central to econometrics. Our survey of econometric textbooks is critical of econometric education today, not of econometric methodology proper. Econometric models, from the time of Haavelmo (1943), have been and remained causal (see ) despite two attempted hijackings, first by regressionists, and second by “quasi-experimentalists,” like Angrist and Pischke (AP). The six textbooks we reviewed reflect a painful recovery from the regressionist assault which more or less disappeared from serious econometric research, but is still obfuscating authors of econometric textbooks.

As to the debate between the structuralists and experimentalists, I address it in Section 4 of this article: (see

Your review of Angrist and Pischke’s book “Mostly Harmless Econometrics” leaves out what in my opinion is the major drawback of their methodology: sole reliance on instrumental variables and failure to express and justify the assumptions that underlie the choice of instruments. Since the choice of instruments rests on the same type of assumptions (i.e., exclusion and exogeneity) that Angrist and Pischke are determined to avoid (for being “unreliable”), readers are left with no discussion of what assumptions do go into the choice of instruments, how they are encoded in a model, what scientific knowledge can be used to defend them, and whether the assumptions have any testable implications.

In your review, you point out that Angrist and Pischke completely avoid the task of model-building; I agree. And I attribute this avoidance not to a lack of good intentions but to a lack of the mathematical tools necessary for model-building. Angrist and Pischke have deprived themselves of using such tools by making an exclusive commitment to the potential outcome language, while shunning the language of nonparametric structural models. This is something that can be appreciated only by one who has attempted to solve a problem, from start to end, in both languages, side by side. No philosophy, ideology, or hours of blog discussion can replace the insight one can gain by such an exercise.

2. A discussant named Jack writes:
An economist (econometrician) friend of mine often corresponds with Prof. Pearl, and what I understand is that Pearl believes the econometrics approach to causality is deeply, fundamentally wrong. (And econometricians tend to think Pearl’s approach is fundamentally wrong.) It sounds to me like Pearl was being purposefully snarky.

Judea Pearl replies:
Jack, I think you misunderstood what your friend told you. If you read my papers and books you will come to realize immediately that I believe the econometrics approach to causality is deeply and fundamentally right (I repeat: RIGHT, not WRONG). Though, admittedly, there have been two attempts to distort this approach by an influx of researchers from adjacent fields (see my reply to Andrew on this page, or read

Next, I think you are wrong in concluding that “econometricians tend to think Pearl’s approach is fundamentally wrong”. First, I do not offer anyone “an approach,” I offer mathematical tools to do what researchers say they wish to do, only with less effort and greater clarity; researchers may choose to use or ignore these tools. By analogy, the invention of the microscope was not “an approach” but a new tool.

Second, I do not know a single econometrician who tried my microscope and thought it is “fundamentally wrong”; the dismissals I often hear come invariably from those who refuse to look at the microscope for religious reasons.

Finally, since you went through the trouble of interpreting hearsay and labeling me “purposefully snarky,” I think you owe readers of this blog ONE concrete example where I criticize an economist for reasons that you judge to be unjustified. You be the judge.

3. An Anonymous discussant writes:
Yes, the problem with the econometrics approach is that it lumps together identification, estimation, and probability, so papers look like a Xmas tree. It all starts with chapter 1 in econometrics textbooks and all those assumptions about the disturbance, linearity, etc. Yet most discussions in causality oriented papers revolve around identification and for that you can mostly leave out functional forms, estimation, and probability.

Why carry around reams of parametric notation when it ain’t needed? One wonders how Galileo, Newton, or Franklin ever discovered anything without X’X^(-1)X’Y?

Judea Pearl replies:
To all discussants:
I hear many voices agreeing that statistics education needs a shot of relevancy, and that causality is one area where statistics education has stifled intuition and creativity. I therefore encourage you to submit nominations for the causality in statistics prize, as described in and

Please note that the criteria for the prize do not require fancy formal methods; they are problem-solving oriented. The aim is to build on the natural intuition that students bring with them, and leverage it with elementary mathematical tools so that they can solve simple problems with comfort and confidence (not like their professors). The only skills they need to acquire are: (1) Articulate the question, (2) Specify the assumptions needed to answer it and (3) Determine if the assumptions have testable implications. The reasons we cannot totally dispose of mathematical tools are: (1) scientists have local intuitions about different parts of a problem and only mathematics can put them all together coherently, (2) eventually, these intuitions will need to be combined with data to come up with assessments of strengths and magnitudes (e.g., of effects). We do not know how to combine data with intuition in any other way, except through mathematics.

Recall, Pythagoras’ theorem served to amplify, not stifle, the intuitions of ancient geometers.

May 17, 2007

More on Where Economic Modeling is Heading

Filed under: Discussion,Economics — judea @ 1:00 am

Judea Pearl writes:

My previous posting in this forum raised questions regarding Jim Heckman's analysis of causal effects, as described in his article, "The Scientific Model of Causality" (Sociological Methodology, Vol. 35 (1) page 40.)

To help answer these questions, Professor Heckman was kind enough to send me a more recent paper entitled: "Econometric Evaluation of Social Programs," by Heckman and Vytlacil (Draft of Dec. 12, 2006. Prepared for The Handbook of Econometrics, Vol. VI, ed by J. Heckman and E. Leamer, North Holland, 2006.)

This paper indeed clarifies some of my questions, yet raises others. I will share with readers my current thoughts on Heckman's approach to causality and on where causality is heading in econometrics.

(Post edited 5/4: revisions in red, thanks to feedback from David Pattison)
(Post edited 5/17: correction and new comments by LeRoy and Pearl)


March 21, 2007

Where is economic modelling today?

Filed under: Economics,Opinion — judea @ 8:30 am

In his 2005 article "The Scientific Model of Causality" (Sociological Methodology, vol. 35 (1) page 40,) Jim Heckman reviews the historical development of causal notions in econometrics, and paints an extremely complimentary picture of the current state of this development.

As an illustration of econometric methods and concepts, Heckman discusses the classical problem of estimating the causal effect of Y2 on Y1 in the following system of equations

Y1 = a1 + c12Y2 + b11X1 + b12 X2 + U1     (16a)
Y2 = a2 + c21Y1 + b21X1 + b22 X2 + U2     (16b)

where Y1 and Y2 represent, respectively, the consumption levels of two interacting agents, and X1, X2, the levels of their income.

Unexpectedly, on page 44, Heckman makes a couple of remarks that almost threw me off my chair; here they are:

"Controlled variation in external (forcing) variables is the key to defining causal effects in nonrecursive models. It is of some interest to readers of Pearl (2000) to compare my use of the standard simultaneous equations model of econometrics in defining causal parameters to his. In the context of equations (16a) and (16b), Pearl defines a causal effect by "shutting one equation down" or performing "surgery" in his colorful language."

"He implicitly assumes that "surgery," or shutting down an equation in a system of simultaneous equations, uniquely fixes one outcome or internal variable (the consumption of the other person in my example). In general, it does not. Putting a constraint on one equation places a restriction on the entire set of internal variables. In general, no single equation in a system of simultaneous equation uniquely determines any single outcome variable. Shutting down one equation might also affect the parameters of the other equations in the system and violate the requirements of parameter stability."

I wish to bring up for blog discussion the following four questions:

  1. Is Heckman right in stating that in nonrecursive systems one should not define causal effect by surgery?
  2. What is the causal effect of Y2 on Y1 in the model of Eqs. (16a)-(16b)?
  3. What does Heckman mean when he objects to surgery as the basis for defining causal parameters?
  4. What did he have in mind when he offered "… the standard simultaneous equations model of econometrics" as an alternative to surgery "in defining causal parameters"?
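To make the terms of these questions concrete, the sketch below shows what the "surgery" operation computes in the system (16a)-(16b), as that operation is described in the quoted passage: equation (16b) is removed, Y2 is held at a chosen value, and (16a) is left untouched. The parameter values are hypothetical, and the code takes no position on whether surgery is the right definition; it only shows the arithmetic.

```python
def solve_equilibrium(p, x1, x2, u1, u2):
    """Solve (16a) and (16b) jointly for (Y1, Y2), with no intervention."""
    r1 = p["a1"] + p["b11"] * x1 + p["b12"] * x2 + u1  # (16a) without the Y2 term
    r2 = p["a2"] + p["b21"] * x1 + p["b22"] * x2 + u2  # (16b) without the Y1 term
    det = 1.0 - p["c12"] * p["c21"]
    if det == 0:
        raise ValueError("c12 * c21 = 1: the system has no unique solution")
    y1 = (r1 + p["c12"] * r2) / det
    y2 = (r2 + p["c21"] * r1) / det
    return y1, y2


def y1_under_surgery(p, y2_value, x1, x2, u1):
    """Surgery: drop (16b), hold Y2 at y2_value, and evaluate (16a) as written."""
    return p["a1"] + p["c12"] * y2_value + p["b11"] * x1 + p["b12"] * x2 + u1


# Hypothetical parameter values, for illustration only.
params = {"a1": 1.0, "c12": 0.5, "b11": 0.3, "b12": 0.1,
          "a2": 2.0, "c21": 0.25, "b21": 0.2, "b22": 0.4}
x1, x2, u1, u2 = 10.0, 20.0, 0.0, 0.0

y1_eq, y2_eq = solve_equilibrium(params, x1, x2, u1, u2)

# Effect on Y1 of a unit change in Y2 under surgery.
unit_effect = (y1_under_surgery(params, y2_eq + 1.0, x1, x2, u1)
               - y1_under_surgery(params, y2_eq, x1, x2, u1))

print(f"equilibrium: Y1 = {y1_eq:.3f}, Y2 = {y2_eq:.3f}")
print(f"unit effect of Y2 on Y1 under surgery = {unit_effect:.3f}")
```

In this linear system the unit effect under surgery equals c12 regardless of the values of X1, X2 and U1, and holding Y2 at its equilibrium value reproduces the equilibrium Y1.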

The following are the best answers I could give to these questions, but I would truly welcome insights from other participants, especially economists and social scientists (including Jim Heckman, of course).

