Causal Analysis in Theory and Practice

December 27, 2012

Causal Inference Symposium: Heckman and Pearl

Filed under: Discussion,Economics,General — eb @ 2:30 pm

Judea Pearl Writes:

Last week I attended a causal inference symposium at the University of Michigan, and had a very lively discussion with James Heckman (Chicago, economics) on causal reasoning in econometrics, statistics and computer science. Video and slides of the two lectures can be watched here: http://www.psc.isr.umich.edu/pubs/video-tapes.html

In the Q&A session (not in the video), I described the problems of transportability and external validity, and their solutions according to:
http://ftp.cs.ucla.edu/pub/stat_ser/r372.pdf
http://ftp.cs.ucla.edu/pub/stat_ser/r390.pdf

Heckman asked: What makes this problem different from the one that economists solve routinely? When they find a new distribution that differs from the one they estimated, they simply re-estimate the parameters by which the two differ and keep those on which they agree.

My answer stressed three facts that should be kept in mind when dealing with “transportability”:
1. We cannot speak here about differing “distributions” because transportability is a causal, not statistical problem. In other words, what needs to be re-estimated depends not on the two “distributions” but on the causal story behind the distributions. (This is shown vividly in Example 2 of R-372).

2. We are now dealing with the task of transporting “experimental findings” (e.g., causal effects), not distributions, from a place where they are available to a place where they are not estimable.

3. We cannot even speak about re-estimating “parameters” because the problem is entirely non-parametric.
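
As a rough illustration of points 1 and 2, here is a minimal sketch of one simple transport case. It assumes that the two populations differ only in the distribution of a covariate Z (for example, age) and that the z-specific causal effects are the same in both. Under that causal assumption, and only under it, an experimental finding can be carried over by re-weighting: P*(y | do(x)) = sum_z P(y | do(x), z) P*(z), where P* refers to the target population. All names and numbers below are hypothetical and do not come from the symposium or the cited reports.

```python
# Hypothetical numbers for illustration only.

# z-specific causal effects P(y=1 | do(x=1), z), as estimated in an
# experiment run on the source population.
effect_by_z = {"young": 0.30, "old": 0.60}

# Covariate distributions: P(z) in the source, P*(z) in the target.
p_z_source = {"young": 0.80, "old": 0.20}
p_z_target = {"young": 0.40, "old": 0.60}


def average_effect(effect_by_z, p_z):
    """Return sum_z P(y | do(x), z) * P(z) for the given covariate distribution."""
    return sum(effect_by_z[z] * p_z[z] for z in effect_by_z)


# Same z-specific effects, different weights: the population-level
# effect changes even though no causal mechanism has changed.
source_effect = average_effect(effect_by_z, p_z_source)
target_effect = average_effect(effect_by_z, p_z_target)
print(round(source_effect, 2), round(target_effect, 2))
```

Whether this particular re-weighting is licensed depends on the causal diagram assumed for the two populations. If Z played a different causal role, the same pair of distributions could call for a different formula, or for none at all.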

More comments on audience questions will follow.

December 17, 2012

Blog discussion on Causality in Econometric and Statistical education

Filed under: Announcement,Discussion,Economics — moderator @ 1:30 am

A recent discussion on Andrew Gelman’s blog has touched on some interesting points concerning the teaching of causality in econometric and statistics classes (link here). I responded to some of the discussants and, below, I share my replies with readers of this blog.

1. Andrew Gelman asked why the review in http://ftp.cs.ucla.edu/pub/stat_ser/r395.pdf is critical of econometrics: “I thought that causality was central to econometrics; see, for example, Angrist and Pischke’s book.”

Judea Pearl replies:
Causality is indeed central to econometrics. Our survey of econometric textbooks http://ftp.cs.ucla.edu/pub/stat_ser/r395.pdf is critical of econometric education today, not of econometric methodology proper. Econometric models, from the time of Haavelmo (1943), have been and remain causal (see http://ftp.cs.ucla.edu/pub/stat_ser/r391.pdf ) despite two attempted hijackings, first by regressionists, and second by “quasi-experimentalists,” like Angrist and Pischke (AP). The six textbooks we reviewed reflect a painful recovery from the regressionist assault, which has more or less disappeared from serious econometric research but is still obfuscating authors of econometric textbooks.

As to the debate between the structuralists and experimentalists, I address it in Section 4 of this article: http://ftp.cs.ucla.edu/pub/stat_ser/r391.pdf

Your review of Angrist and Pischke’s book “Mostly Harmless Econometrics” leaves out what in my opinion is the major drawback of their methodology: sole reliance on instrumental variables and failure to express and justify the assumptions that underlie the choice of instruments. Since the choice of instruments rests on the same type of assumptions (i.e., exclusion and exogeneity) that Angrist and Pischke are determined to avoid (for being “unreliable”), readers are left with no discussion of what assumptions do go into the choice of instruments, how they are encoded in a model, what scientific knowledge can be used to defend them, and whether the assumptions have any testable implications.

In your review, you point out that Angrist and Pischke completely avoid the task of model-building; I agree. And I attribute this avoidance not to a lack of good intentions but to a lack of the mathematical tools necessary for model-building. Angrist and Pischke have deprived themselves of such tools by making an exclusive commitment to the potential outcome language, while shunning the language of nonparametric structural models. This is something that only someone who has attempted to solve a problem, from start to end, in both languages, side by side, can appreciate. No philosophy, ideology, or hours of blog discussion can replace the insight one can gain by such an exercise.

2. A discussant named Jack writes:
An economist (econometrician) friend of mine often corresponds with Prof. Pearl, and what I understand is that Pearl believes the econometrics approach to causality is deeply, fundamentally wrong. (And econometricians tend to think Pearl’s approach is fundamentally wrong.) It sounds to me like Pearl was being purposefully snarky.

Judea Pearl replies:
Jack, I think you misunderstood what your friend told you. If you read my papers and books you will come to realize immediately that I believe the econometrics approach to causality is deeply and fundamentally right (I repeat: RIGHT, not WRONG). Though, admittedly, there have been two attempts to distort this approach by an influx of researchers from adjacent fields (see my reply to Andrew on this page, or read http://ftp.cs.ucla.edu/pub/stat_ser/r391.pdf ).

Next, I think you are wrong in concluding that “econometricians tend to think Pearl’s approach is fundamentally wrong”. First, I do not offer anyone “an approach,” I offer mathematical tools to do what researchers say they wish to do, only with less effort and greater clarity; researchers may choose to use or ignore these tools. By analogy, the invention of the microscope was not “an approach” but a new tool.

Second, I do not know a single econometrician who tried my microscope and thought it “fundamentally wrong”; the dismissals I often hear come invariably from those who refuse to look through the microscope for religious reasons.

Finally, since you went through the trouble of interpreting hearsay and labeling me “purposefully snarky,” I think you owe readers of this blog ONE concrete example where I criticize an economist for reasons that you judge to be unjustified. You be the judge.

3. An Anonymous discussant writes:
Yes, the problem with the econometrics approach is that it lumps together identification, estimation, and probability, so papers look like a Xmas tree. It all starts with chapter 1 in econometrics textbooks and all those assumptions about the disturbance, linearity, etc. Yet most discussions in causality oriented papers revolve around identification and for that you can mostly leave out functional forms, estimation, and probability.

Why carry around reams of parametric notation when it ain’t needed? One wonders how Galileo, Newton, or Franklin ever discovered anything without (X’X)^(-1)X’Y?

Judea Pearl replies:
To all discussants:
I hear many voices agreeing that statistics education needs a shot of relevancy, and that causality is one area where statistics education has stifled intuition and creativity. I therefore encourage you to submit nominations for the causality in statistics prize, as described in http://www.amstat.org/education/causalityprize/ and http://magazine.amstat.org/blog/2012/11/01/pearl/

Please note that the criteria for the prize do not require fancy formal methods; they are problem-solving oriented. The aim is to build on the natural intuition that students bring with them, and leverage it with elementary mathematical tools so that they can solve simple problems with comfort and confidence (not like their professors). The only skills they need to acquire are: (1) Articulate the question, (2) Specify the assumptions needed to answer it and (3) Determine if the assumptions have testable implications. The reasons we cannot totally dispose of mathematical tools are: (1) scientists have local intuitions about different parts of a problem and only mathematics can put them all together coherently, (2) eventually, these intuitions will need to be combined with data to come up with assessments of strengths and magnitudes (e.g., of effects). We do not know how to combine data with intuition in any other way, except through mathematics.

Recall that the Pythagorean theorem served to amplify, not stifle, the intuitions of ancient geometers.

December 7, 2012

On Structural Equations versus Causal Bayes Networks

Filed under: Counterfactual,structural equations — eb @ 6:00 pm

We received the following query from Jim Grace (USGS – National Wetlands Research Center):
Hi Judea,

In your 2009 edition of Causality on pages 26-27 you explain your reasoning for now preferring to express causal rules from a Laplacian quasi-deterministic perspective rather than stay with the stochastic conceptualization associated with Bayesian Networks. It seems to me that a practical matter here is the reliance of traditional graph theory on discrete mathematics and the constraints that places on functional forms and, therefore, counterfactual arguments. Despite that clear logic, one sees the occasional discussion of “causal Bayes nets” and I wondered if you would dissuade people (if people can be dissuaded) from trying to evolve a causal modeling methodology with discrete Bayes nets as their starting point?

Judea Pearl answers:
Dear Jim,

I would not dissuade people from using either causal Bayesian networks or structural equation models, because the difference between the two is so minute that it is not worth the dissuasion. The question is only what question you ask yourself when you construct the diagram. If you feel more comfortable asking: “What factors determine the value of this variable?” then you construct a structural equation model. If, on the other hand, you prefer to ask: “If I intervene and wiggle this variable, would the probability of the other variable change?” then the outcome would be a causal Bayes network. Rarely do they differ (but see example on page 35 of Causality).
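
The two questions can be put side by side in a few lines of code. What follows is only a sketch under assumed specifics: a hypothetical two-variable model X -> Y with binary variables, an invented unobserved factor U, and made-up probabilities. The structural equation answers “what determines Y?”; the table derived from it answers “what happens to the probability of Y if I wiggle X?”

```python
# Hypothetical model X -> Y, binary variables; all numbers are made up.

# Distribution of the unobserved factor U that, together with X, determines Y.
P_U = {0: 0.7, 1: 0.2, 2: 0.1}


def f_y(x, u):
    """Structural equation for Y.

    u == 0: Y copies X; u == 1: Y is always 1; u == 2: Y is always 0.
    """
    if u == 0:
        return x
    return 1 if u == 1 else 0


def p_y1_do_x(x):
    """Interventional probability P(Y=1 | do(X=x)), obtained by summing over U."""
    return sum(p for u, p in P_U.items() if f_y(x, u) == 1)


# A causal Bayes network for X -> Y would store only this table.
cbn_table = {x: p_y1_do_x(x) for x in (0, 1)}
print(cbn_table)
```

In this sketch the structural version also fixes how each kind of unit (each value of U) would respond to X, which is the extra information counterfactual queries draw on. The table alone records only the interventional probabilities.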

December 4, 2012

Neyman-Rubin’s model and ASA Causality Prize

We received the following query from Megan Murphy (ASA):
Dr. Pearl,
I received the following question regarding the Causality in Statistics Education prize on twitter. I’m not sure how to answer this, perhaps you can help?

Would entries using Neyman-Rubin model even be considered? RT @AmstatNews: Causality in Statistics Education #prize magazine.amstat.org/blog/2012/11/0…

Judea Answers:
Of course! The criteria for evaluation specifically state: ‘in some mathematical language (e.g., counterfactuals, equations, or graphs)’, giving no preference to any of the three notational systems. The criteria stress capabilities to perform specific inference tasks, regardless of the tools used in performing the tasks.

For completeness, I re-list below the evaluation criteria:

• The extent to which the material submitted equips students with skills needed for effective causal reasoning. These include:

—1a. Ability to correctly classify problems, assumptions, and claims into two distinct categories: causal vs. associational

—1b. Ability to take a given causal problem and articulate in some mathematical language (e.g., counterfactuals, equations, or graphs) both the target quantity to be estimated and the assumptions one is prepared to make (and defend) to facilitate a solution

—1c. Ability to determine, in simple cases, whether control for covariates is needed for estimating the target quantity, what covariates need be controlled, what the resulting estimand is, and how it can be estimated using the observed data

—1d. Ability to take a simple scenario (or model), determine whether it has statistically testable implications, and apply data to test the assumed scenario

• The extent to which the submitted material assists statistics instructors in gaining an understanding of the basics of causal inference (as outlined in 1a-d) and prepares them to teach these basics in undergraduate and lower-division graduate classes in statistics.

Those versed in the Neyman-Rubin model are most welcome to submit nominations.

Note, however, that nominations will be evaluated on ALL four skills, 1a – 1d.
Judea
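
For readers wondering what a “simple case” of skill 1c might look like, here is one possible sketch; it is not part of the prize criteria. It assumes a hypothetical diagram Z -> X, Z -> Y, X -> Y with binary variables and invented probabilities. Under that diagram, controlling for Z is needed and the resulting estimand is sum_z P(y | x, z) P(z), which the code compares with the unadjusted association P(y | x).

```python
# Hypothetical observational model with Z -> X, Z -> Y, X -> Y (all binary).
# Every number here is invented for illustration.
P_Z1 = 0.5
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.5, (1, 1): 0.7}


def p_z(z):
    """P(Z=z)."""
    return P_Z1 if z == 1 else 1 - P_Z1


def p_x_given_z(x, z):
    """P(X=x | Z=z)."""
    return P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]


def naive(x):
    """Associational quantity P(Y=1 | X=x), with no adjustment."""
    num = sum(P_Y1_GIVEN_XZ[(x, z)] * p_x_given_z(x, z) * p_z(z) for z in (0, 1))
    den = sum(p_x_given_z(x, z) * p_z(z) for z in (0, 1))
    return num / den


def adjusted(x):
    """Adjustment estimand sum_z P(Y=1 | X=x, Z=z) * P(Z=z)."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * p_z(z) for z in (0, 1))


naive_diff = naive(1) - naive(0)
adjusted_diff = adjusted(1) - adjusted(0)
print(round(naive_diff, 2), round(adjusted_diff, 2))
```

With these numbers the unadjusted contrast is more than twice the adjusted one, because Z raises both the chance of treatment and the chance of the outcome. The adjusted figure equals the causal effect only if the assumed diagram is right.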

December 3, 2012

Judea Pearl on Potential Outcomes

Filed under: Counterfactual,Discussion,General — eb @ 7:30 pm

I recently attended a seminar presentation by Professor Tom Belin (AQM RAC seminar, UCLA, November 30, 2012), who spoke on the relationships between the potential outcome model of Neyman, Rubin and Holland, and the structural equation and graphical models which I have been advocating since 1995.

In the last part of the seminar, I made a few comments which led to a lively discussion, as well as clarification (I hope) of some basic issues which are rarely discussed in the mainstream literature.

Below is a concise summary of my remarks which I present to encourage additional discussion, questions, objections and, of course, new ideas.

Judea Pearl

Summary of my views on the relationships between the potential-outcome (PO) and Structural Causal Models (SCM) frameworks.

Formally, the two frameworks are logically equivalent; a theorem in one is a theorem in the other, and every assumption in one can be translated into an equivalent assumption in the other.

Therefore, the two frameworks can be used interchangeably and symbiotically, as it is done in the advanced literature in the health and social sciences.

However, the PO framework has also spawned an ideological movement that resists this symbiosis and discourages its faithful from using SCM or its graphical representation.

This ideological movement (which I call “arrow-phobic”) can be recognized by a total avoidance of causal diagrams or structural equations in research papers, and an exclusive use of “ignorability” type notation for expressing the assumptions that (must) underlie causal inference studies. For example, causal diagrams are meticulously excluded from the writings of Rubin, Holland, Rosenbaum, Angrist, Imbens, and their students who, by and large, are totally unaware of the inferential and representational powers of diagrams.

Formally, this exclusion is harmless because, based on the logical equivalence mentioned above, it is always possible to replace assumptions made in SCM with equivalent, albeit cumbersome assumptions in PO language, and eventually come to the correct conclusions. But practically, the exclusion forces investigators to articulate assumptions whose meaning they do not comprehend, whose plausibility they cannot judge, and whose statistical implications they cannot predict.

The arrow-phobic exclusion can be compared to a prohibition against the use of ‘multiplication’ in arithmetic. Formally, it is harmless, because one can always replace multiplication with addition (e.g., adding a number to itself n times). Yet practically, those who shun multiplication will not get very far in science.

The rejection of graphs and structural models leaves investigators with no process-model guidance and, not surprisingly, it has resulted in a number of blunders which the PO community is not very proud of.

One such blunder is Rosenbaum’s (2002) and Rubin’s (2007) declaration that “there is no reason to avoid adjustment for a variable describing subjects before treatment.”
http://www.cs.ucla.edu/~kaoru/r348.pdf
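
A standard counterexample in the graphical literature is the so-called M-structure, in which a pre-treatment variable Z is a common effect of two unobserved factors, one affecting treatment and the other affecting outcome. The sketch below is an exact enumeration over a hypothetical model of that shape (U1 -> X, U1 -> Z <- U2, U2 -> Y, and no effect of X on Y). The structure and all probabilities are assumptions chosen for illustration and are not taken from the linked report.

```python
from itertools import product


def bern(p, v):
    """Probability that a Bernoulli(p) variable takes the value v."""
    return p if v == 1 else 1 - p


def joint():
    """Yield ((x, z, y), probability) with the unobserved U1, U2 enumerated."""
    for u1, u2, x, y in product((0, 1), repeat=4):
        z = 1 if (u1 or u2) else 0           # Z is a common effect of U1 and U2
        p = 0.25                              # U1, U2 independent fair coins
        p *= bern(0.8 if u1 else 0.2, x)      # X depends on U1 only
        p *= bern(0.8 if u2 else 0.2, y)      # Y depends on U2 only, not on X
        yield (x, z, y), p


def prob(pred):
    """Total probability of the observable outcomes (x, z, y) satisfying pred."""
    return sum(p for (x, z, y), p in joint() if pred(x, z, y))


def p_y1_given(x0, z0=None):
    """P(Y=1 | X=x0), or P(Y=1 | X=x0, Z=z0) when z0 is supplied."""
    def match(x, z, y):
        return x == x0 and (z0 is None or z == z0)
    return prob(lambda x, z, y: y == 1 and match(x, z, y)) / prob(match)


# True causal effect of X on Y is zero by construction.
naive_diff = p_y1_given(1) - p_y1_given(0)
adjusted_diff = sum(
    (p_y1_given(1, z0) - p_y1_given(0, z0)) * prob(lambda x, z, y, z0=z0: z == z0)
    for z0 in (0, 1)
)
print(round(naive_diff, 3), round(adjusted_diff, 3))
```

In this constructed model the unadjusted contrast recovers the true null effect, while adjusting for the pre-treatment variable Z produces a nonzero contrast. The point of the sketch is only that “pre-treatment” alone does not settle whether a variable should be adjusted for.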

Another is Hirano and Imbens’ (2001) method of covariate selection, which prefers bias-amplifying variables in the propensity score.
http://ftp.cs.ucla.edu/pub/stat_ser/r356.pdf

The third is the use of ‘principal stratification’ to assess direct and indirect effects in mediation problems, which leads to paradoxical and unintended results.
http://ftp.cs.ucla.edu/pub/stat_ser/r382.pdf

In summary, the PO framework offers a useful analytical tool (i.e., an algebra of counterfactuals) when used in the context of a symbiotic SCM analysis. It may be harmful, however, when used as an exclusive and restrictive subculture that discourages the use of process-based tools and insights.

Additional background and technical details on the PO vs. SCM tradeoffs can be found in Section 4 of a tutorial paper (Statistics Surveys)
http://ftp.cs.ucla.edu/pub/stat_ser/r350.pdf
and in a book chapter on the Eight Myths of SEM:
http://ftp.cs.ucla.edu/pub/stat_ser/r393.pdf

Readers might also find it instructive to compare how the two paradigms frame and solve a specific problem from start to end. This comparison is given in Causality (Pearl 2009) pages 81-88, 232-234.
