Causal Analysis in Theory and Practice

September 15, 2016

Summer-end Greeting from the UCLA Causality Blog

Filed under: Uncategorized — bryantc @ 4:39 am

Dear friends in causality research,
This greeting from the UCLA Causality blog contains news and discussion on the following topics:

1. Reflections on 2016 JSM meeting.
2. The question of equivalent representations.
3. Simpson’s Paradox (Comments on four recent papers)
4. News concerning Causal Inference Primer
5. New books, blogs and other frills.

1. Reflections on JSM-2016
For those who missed the JSM 2016 meeting, my tutorial slides can be viewed here:

As you can see, I argue that current progress in causal inference should be viewed as a major paradigm shift in the history of statistics and, accordingly, nuances and disagreements are merely linguistic realignments within a unified framework. To support this view, I chose for discussion six specific achievements (called GEMS) that should make anyone connected with causal analysis proud, empowered, and mighty motivated.

The six gems are:
1. Policy Evaluation (Estimating “Treatment Effects”)
2. Attribution Analysis (Causes of Effects)
3. Mediation Analysis (Estimating Direct and Indirect Effects)
4. Generalizability (Establishing External Validity)
5. Coping with Selection Bias
6. Recovering from Missing Data

I hope you enjoy the slides and appreciate the gems.

2. The question of equivalent representations
One challenging question that came up from the audience at JSM concerned the unification of the graphical and potential-outcome frameworks: “How can two logically equivalent representations be so different in actual use?” I elaborate on this question in a separate post titled “Logically equivalent yet way too different.”

3. Simpson’s Paradox: The riddle that would not die
(Comments on four recent papers)
If you search Google for “Simpson’s paradox”, as I did yesterday, you would get 111,000 results, more than any other statistical paradox that I could name. What elevates this innocent reversal of associations to “paradoxical” status, and why it has captured the fascination of statisticians, mathematicians and philosophers for over a century, are questions that we discussed at length on this (and other) blogs. The reason I am back to this topic is the publication of four recent papers that give us a panoramic view of how the understanding of causal reasoning has progressed in communities that do not usually participate in our discussions.

4. News concerning Causal Inference – A Primer
We are grateful to Jim Grace for his in-depth review on Amazon:

For those of you awaiting the solutions to the study questions in the Primer, I am informed that the Solution Manual is now available (to instructors) from Wiley. To obtain a copy, see page 2 of: However, rumor has it that a quicker way to get it is through your local Wiley representative, at

If you encounter difficulties, please contact us at and we will try to help. Readers tell me that the solutions are more enlightening than the text. I am not surprised; there is nothing more invigorating than seeing a non-trivial problem solved from A to Z.

5. New books, blogs and other frills
We are informed that a new book by Joseph Halpern, titled “Actual Causality”, is available now from MIT Press. Readers familiar with Halpern’s fundamental contributions to causal reasoning will not be surprised to find here a fresh and comprehensive solution to the age-old problem of actual causality. Not to be missed.

Adam Kelleher writes about an interesting math-club and causal-minded blog that he is orchestrating. See his post,

Glenn Shafer just published a review paper, “A Mathematical Theory of Evidence turns 40”, celebrating the 40th anniversary of the publication of his 1976 book “A Mathematical Theory of Evidence”. I have enjoyed reading this article for nostalgic reasons; it reminded me of the stormy days in the 1980’s, when everyone was arguing for another calculus of evidential reasoning. My last contribution to that storm, just before sailing off to causality land, was this paper: Section 10 of Shafer’s article deals with his 1996 book “The Art of Causal Conjecture”. My thought: Now that the causal inference field has matured, perhaps it is time to take another look at the way Shafer views causation.

Wishing you a super productive Fall season.

J. Pearl

September 12, 2016

Logically equivalent yet way too different

Filed under: Uncategorized — bryantc @ 2:50 am

Contributor: Judea Pearl

In comparing the tradeoffs between the structural and potential outcome frameworks, I often state that the two are logically equivalent yet poles apart in terms of transparency and computational efficiency. (See Slide #34 of the JSM tutorial.) Indeed, anyone who examines how the two frameworks solve a specific problem from beginning to end (see, e.g., Slides #35-36) would find the differences astonishing.

The question naturally arises: How can two equivalent frameworks differ so substantially in actual use?

The answer is that epistemic equivalence does not mean representational equivalence. Two representations of the same information may highlight different aspects of the problem and thus differ substantially in how easy it is to solve a given problem. This is a recurrent theme in complexity analysis, but is not generally appreciated outside computer science. We saw it in our discussions with Guido Imbens, who could not accept the fact that the use of graphical models is a mathematical necessity, not just a matter of taste.

The examples usually cited in complexity analysis are combinatorial problems whose solution times depend critically on the initial representation. I hesitated to bring up these examples, fearing that they would not be compelling to readers on this blog who are more familiar with classical mathematics.

Last week I stumbled upon a very simple example that demonstrates representational differences in no uncertain terms; I would like to share it with readers.

Consider the age-old problem of finding a solution to an algebraic equation, say
y(x) = x^3 + ax^2 + bx + c = 0

This is a tough problem for those of us who do not remember Tartaglia’s solution of the cubic. (It can be made much tougher once we go to quintic equations.)

But there are many syntactic ways of representing the same function y(x) . Here is one equivalent representation:
y(x) = x(x^2 + ax) + b(x + c/b) = 0
and here is another:
y(x) = (x - x_1)(x - x_2)(x - x_3) = 0,
where x_1, x_2, and x_3 are some functions of a, b, c.

The last representation permits an immediate solution, which is:
x = x_1, x = x_2, x = x_3.

The example may appear trivial, and some may even call it cheating, saying that finding x1, x2, and x3 is as hard as solving the original problem. This is true, but the purpose of the example was not to produce an easy solution to the cubic. The purpose was to demonstrate that different syntactic ways of representing the same information (i.e., the same polynomial) may lead to substantial differences in the complexity of computing an answer to a query (i.e., find a root).

A preferred representation is one that makes certain desirable aspects of the problem explicit, thus facilitating a speedy solution. Complexity theory is full of such examples.
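To make the point concrete, here is a minimal Python sketch, not part of the original argument. It answers the query "find a root" from two representations of the same cubic. The numeric roots 1, 2, 3, the helper names, and the use of bisection as the numerical method are all assumptions chosen for illustration.

```python
def expand(roots):
    """Expand (x - r1)(x - r2)... into coefficients, highest power first."""
    coeffs = [1.0]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [hi - r * lo for lo, hi in zip([0.0] + coeffs, coeffs + [0.0])]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate a polynomial from its coefficients (Horner's rule)."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def root_from_factored(roots):
    """Factored form: a root is read off without evaluating anything."""
    return roots[0], 0  # (root, number of polynomial evaluations)


def root_from_expanded(coeffs, lo, hi, tol=1e-9):
    """Expanded form: search numerically (bisection on a sign change)."""
    f_lo = evaluate(coeffs, lo)
    assert f_lo * evaluate(coeffs, hi) < 0, "need a sign change on [lo, hi]"
    evals = 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = evaluate(coeffs, mid)
        evals += 1
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2, evals


roots = [1.0, 2.0, 3.0]   # hypothetical roots of the cubic
coeffs = expand(roots)    # the same polynomial as [1, a, b, c]

print(root_from_factored(roots))             # no evaluations needed
print(root_from_expanded(coeffs, 0.0, 1.5))  # dozens of evaluations
```

Both calls return (approximately) the same root; what differs is the work required, which is the sense in which the two representations are equivalent in content yet unequal in use.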

Note that the complexity is query-dependent. Had our goal been to find a value x that makes the polynomial y(x) equal 4, not zero, the representation above y(x) = (x - x_1)(x - x_2)(x - x_3) would offer no help at all. For this query, the representation
y(x) = (x - z_1)(x - z_2)(x - z_3) + 4
would yield an immediate solution
x = z_1, x = z_2, x = z_3,
where z_1, z_2, and z_3 are the roots of another polynomial:
x^3 + ax^2 + bx + (c - 4) = 0
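The query-dependence can be checked numerically with a small sketch. The coefficients a = -3, b = 2, c = 4 are hypothetical, picked so that the shifted polynomial factors as x(x - 1)(x - 2); the function names are likewise illustrative.

```python
def y(x, a, b, c):
    """The cubic in expanded form: x^3 + a*x^2 + b*x + c."""
    return x**3 + a * x**2 + b * x + c


def y_shifted_factored(x, zs):
    """The same cubic written as a product over the shifted roots, plus 4."""
    product = 1.0
    for z in zs:
        product *= (x - z)
    return product + 4.0


# Hypothetical numbers: x(x - 1)(x - 2) + 4 = x^3 - 3x^2 + 2x + 4.
a, b, c = -3.0, 2.0, 4.0
zs = [0.0, 1.0, 2.0]

# Both forms describe the same function ...
same = all(abs(y(x, a, b, c) - y_shifted_factored(x, zs)) < 1e-12
           for x in [-2.0, -0.5, 0.3, 1.7, 5.0])

# ... but only the second answers "where does y equal 4?" by inspection.
values_at_zs = [y(z, a, b, c) for z in zs]
print(same, values_at_zs)
```

The shifted factorization is useless for the original query (finding a zero of y), just as the original factorization is useless for this one.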

This simple example demonstrates nicely the principle that makes graphical models more efficient than alternative representations of the same causal information, say a set of ignorability assumptions. What makes graphical models efficient is the fact that they make explicit the logical ramifications of the conditional independencies conveyed by the model. Deriving those ramifications by algebraic or logical means takes substantially more work. (See for the logic of counterfactual independencies)
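As an illustration of reading independencies directly off a graph, the sketch below implements d-separation, the standard graphical criterion for this purpose. The post does not name the criterion, so its use here is an assumption. The test is done by moralizing the ancestral subgraph and checking reachability; the two tiny graphs and all function names are hypothetical.

```python
def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, x, y, given):
    """True if x and y are d-separated by `given` in the DAG `parents`."""
    given = set(given)
    keep = ancestors(parents, {x, y} | given)

    # Moralize the ancestral subgraph: link each child to its parents,
    # "marry" parents that share a child, and drop edge directions.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for i, p in enumerate(ps):
            adj[child].add(p)
            adj[p].add(child)
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)

    # x and y are separated iff every undirected path between them
    # passes through a node in `given`.
    stack, seen = [x], {x}
    while stack:
        node = stack.pop()
        if node == y:
            return False
        for nxt in adj[node]:
            if nxt not in seen and nxt not in given:
                seen.add(nxt)
                stack.append(nxt)
    return True


# Hypothetical graphs, written as child -> list of parents.
fork = {"X": ["Z"], "Y": ["Z"]}   # X <- Z -> Y
collider = {"C": ["X", "Y"]}      # X -> C <- Y

print(d_separated(fork, "X", "Y", []), d_separated(fork, "X", "Y", ["Z"]))
print(d_separated(collider, "X", "Y", []), d_separated(collider, "X", "Y", ["C"]))
```

Each query is a short graph search; obtaining the same answers from a list of independence statements would require chaining axioms by hand.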

A typical example of how nasty such derivations can get is given in Heckman and Pinto’s paper on “Causal Inference after Haavelmo” (Econometric Theory, 2015). Determined to avoid graphs at all cost, Heckman and Pinto derived conditional independence relations directly from Dawid’s axioms and the Markov condition. The result is pages upon pages of derivations of independencies that are displayed explicitly in the graph.

Of course, this and other difficulties will not persuade econometricians to use graphs; that would take a scientific revolution of Kuhnian proportions. Still, awareness of these complexity issues should give inquisitive students the ammunition to hasten the revolution and equip econometrics with modern tools of causal analysis.

They eventually will.

September 11, 2016

An interesting math and causality-minded club

Filed under: Announcement — bryantc @ 6:08 pm

from Adam Kelleher:

The math and algorithm reading group is based in NYC, and was founded when I moved here three years ago. It’s a very casual group that grew out of a reading group I was in during graduate school. Some friends who were math graduate students were interested in learning more about general relativity, and I (a physicist) was interested in learning more math. Together, we read about differential geometry, with the goal of bringing our knowledge together. We reasoned that we could learn more as a group, by pooling our different perspectives and experience, than we could individually. That’s the core motivation of our reading group: not only are we there to help each other get through the material if anyone gets stuck, but we’re also there to add what else we know (in the format of a group discussion) to the content of the material.

We’re currently reading Causality cover to cover. We’ve paused to implement some of the algorithms, and plan on pausing again soon for a review session. We intend to do a “hacking session”, to try our hands at causal inference and analysis on some open data sets.

Inspired by reading Causality, and realizing that the best open implementations of causal inference were packaged in the (old, relatively inaccessible) Tetrad package, I’ve started a modern implementation of some tools for causal inference and analysis in the causality package in Python. It’s on PyPI (pip install causality, or check the tutorial on, but it’s still a work in progress. The IC* algorithm is implemented, along with a small suite of conditional independence tests. I’m adding some classic methods for causal inference and causal effects estimation, aimed at making the package more general-purpose. I invite new contributions to help build out the package. Just open an issue, and label it an “enhancement” to kick off the discussion!

Finally, to make all of the work more accessible to people without more advanced math background, I’ve been writing a series of blog posts aimed at introducing anyone with an intermediate background in probability and statistics to the material in Causality! It’s aimed especially at practitioners, like data scientists. The hope is that more people, managers included (the intended audience for the first 3 posts), will understand the issues that come up when you’re not thinking causally. I’d especially recommend the article about understanding bias, but the whole series (still in progress) is indexed here:
