Causal Analysis in Theory and Practice

June 20, 2016

Recollections from the WCE conference at Stanford

Filed under: Counterfactual,General,Mediated Effects,structural equations — bryantc @ 7:45 am

On May 21, Kosuke Imai and I participated in a panel on Mediation at the annual meeting of the West Coast Experiment Conference, organized by the Stanford Graduate School of Business. The following are some of my recollections from that panel.

We began the discussion by reviewing causal mediation analysis and summarizing the exchange we had on the pages of Psychological Methods (2014).

My slides for the panel can be viewed here:

We ended with a consensus regarding the importance of causal mediation and the conditions for identifying Natural Direct and Indirect Effects from randomized as well as observational studies.

We proceeded to discuss the symbiosis between the structural and the counterfactual languages. Here I focused on slides 4-6 (page 3) and remarked that only those who are willing to solve a toy problem from beginning to end, using both potential outcomes and DAGs, can understand the tradeoff between the two. Such a toy problem (and its solution) was presented in slide 5 (page 3), titled “Formulating a problem in Three Languages,” and the questions that I asked the audience are still ringing in my ears. Please have a good look at these two sets of assumptions and ask yourself:

a. Have we forgotten any assumption?
b. Are these assumptions consistent?
c. Is any of the assumptions redundant (i.e. does it follow logically from the others)?
d. Do they have testable implications?
e. Do these assumptions permit the identification of causal effects?
f. Are these assumptions plausible in the context of the scenario given?

As I was discussing these questions over slide 5, the audience seemed to be in general agreement with the conclusion that, despite their logical equivalence, the graphical language enables us to answer these questions immediately, while the potential outcome language remains silent on all of them.

I consider this example to be pivotal to the comparison of the two frameworks. I hope that questions a, b, c, d, e, f will be remembered, and that speakers from both camps will be asked to address them squarely and explicitly.

The fact that graduate students made up the majority of the participants gives me hope that questions a, b, c, d, e, f will finally receive the attention they deserve.

As we discussed the virtues of graphs, I found it necessary to reiterate the observation that DAGs are more than just a “natural and convenient way to express assumptions about causal structures” (Imbens and Rubin, 2013, p. 25). Praising their transparency while ignoring their inferential power misses the main role that graphs play in causal analysis. The power of graphs lies in computing complex implications of causal assumptions (i.e., the “science”), no matter in what language those assumptions are expressed. Typical implications are: conditional independencies among variables and counterfactuals, which covariates need to be controlled to remove confounding or selection bias, whether effects can be identified, and more. These implications could, in principle, be derived from any equivalent representation of the causal assumptions, not necessarily a graphical one, but not without incurring a prohibitive computational cost. See, for example, what happens when economists try to replace d-separation with graphoid axioms.
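To make “computing implications” concrete, here is a minimal, self-contained sketch of the d-separation test itself, using the classic moralized-ancestral-graph criterion. The graphs and variable names below are illustrative choices of mine, not taken from the slides:

```python
from itertools import chain

def d_separated(parents, xs, ys, zs):
    """Check whether node sets xs and ys are d-separated given zs in a DAG.

    `parents` maps each node to the set of its parents.  The test:
    restrict to ancestors of xs|ys|zs, moralize (marry co-parents and
    drop edge directions), delete zs, and check connectivity.
    """
    # 1. Ancestral subgraph of xs ∪ ys ∪ zs.
    relevant = set(chain(xs, ys, zs))
    frontier = list(relevant)
    while frontier:
        for p in parents.get(frontier.pop(), ()):
            if p not in relevant:
                relevant.add(p)
                frontier.append(p)
    # 2. Moralize: connect each node to its parents, and co-parents to
    #    each other, with undirected edges.
    adj = {n: set() for n in relevant}
    for n in relevant:
        ps = sorted(parents.get(n, ()))
        for p in ps:
            adj[n].add(p); adj[p].add(n)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                adj[p].add(q); adj[q].add(p)
    # 3. Remove the conditioning set, then test reachability from xs.
    for z in zs:
        adj.pop(z, None)
    for n in adj:
        adj[n] -= set(zs)
    seen, stack = set(xs), list(xs)
    while stack:
        for m in adj.get(stack.pop(), ()):
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return not (seen & set(ys))

# Mediation graph X -> M -> Y with a direct edge X -> Y:
g = {"X": set(), "M": {"X"}, "Y": {"X", "M"}}
print(d_separated(g, {"X"}, {"Y"}, {"M"}))   # False: direct edge remains
# Pure chain X -> M -> Y:
g2 = {"X": set(), "M": {"X"}, "Y": {"M"}}
print(d_separated(g2, {"X"}, {"Y"}, {"M"}))  # True
# Collider X -> C <- Y: conditioning on C opens the path.
g3 = {"X": set(), "Y": set(), "C": {"X", "Y"}}
print(d_separated(g3, {"X"}, {"Y"}, set()))  # True
print(d_separated(g3, {"X"}, {"Y"}, {"C"}))  # False
```

The last pair of queries shows why the criterion is not mere bookkeeping: it detects that conditioning on a collider creates dependence, a fact that is easy to miss when assumptions are stated only in counterfactual notation.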

Following the discussion of representations, we addressed questions posed to us by the audience, in particular, five questions submitted by Professor Jon Krosnick (Political Science, Stanford).

I summarize them in the following slide:

Krosnick’s Questions to Panel
1) Do you think an experiment has any value without mediational analysis?
2) Is a separate study directly manipulating the mediator useful? How is the second study any different from the first one?
3) Imai’s correlated residuals test seems valuable for distinguishing fake from genuine mediation. Is that so? And how is it related to the traditional mediational test?
4) Why isn’t it easy to test whether participants who show the largest increases in the posited mediator show the largest changes in the outcome?
5) Why is mediational analysis any “worse” than any other method of investigation?
My answers focused on questions 2, 4, and 5, which I summarize below:

Q. Is a separate study directly manipulating the mediator useful?
Answer: Yes, it is useful if physically feasible, but it still cannot give us an answer to the basic mediation question: “What percentage of the observed response is due to mediation?” The concept of mediation is necessarily counterfactual, i.e., it sits on the top layer of the causal hierarchy (see “Causality,” chapter 1). It therefore cannot be defined in terms of population experiments, however clever. Mediation can be evaluated with the help of counterfactual assumptions such as “conditional ignorability” or “no interaction,” but these assumptions cannot be verified in population experiments.

Q. Why isn’t it easy to test whether participants who show the largest increases in the posited mediator show the largest changes in the outcome?
Answer: Translating the question into counterfactual notation, the suggested test requires the existence of a monotonic function f_m such that, for every individual, we have Y_1 – Y_0 = f_m(M_1 – M_0).

This condition expresses a feature we expect to find in mediation, but it cannot be taken as a DEFINITION of mediation. It is essentially the way indirect effects are defined in the Principal Strata framework (Frangakis and Rubin, 2002), the deficiencies of which are well known. See

In particular, imagine a switch S controlling two light bulbs L1 and L2. Positive correlation between L1 and L2 does not mean that L1 mediates between the switch and L2. Many examples of incompatibility are demonstrated in the paper above.
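The switch-and-bulbs example is easy to simulate. In the sketch below (the noise rates are illustrative assumptions of mine), the two bulbs are strongly associated even though the model contains no mechanism from L1 to L2, so intervening on L1 leaves L2 untouched:

```python
import random

random.seed(0)
n = 10_000

# Structural model: S -> L1 and S -> L2; no edge between the bulbs.
# Each bulb follows the switch, up to small independent flicker noise.
def bulb(s):
    return int(s and not (random.random() < 0.05))

data = []
for _ in range(n):
    s = random.random() < 0.5
    data.append((int(s), bulb(s), bulb(s)))

# L1 and L2 are strongly associated...
l1_on = sum(1 for s, l1, l2 in data if l1)
both_on = sum(1 for s, l1, l2 in data if l1 and l2)
print(both_on / l1_on)             # P(L2=1 | L1=1): close to 0.95

# ...yet forcing L1 on or off does nothing to L2, because L2's
# mechanism depends only on the switch, never on the other bulb.
def do_l1(force, trials):
    on = 0
    for _ in range(trials):
        s = random.random() < 0.5  # L1 is clamped to `force`,
        on += bulb(s)              # but L2 ignores it entirely
    return on / trials

print(do_l1(1, n), do_l1(0, n))    # both close to 0.475 = P(S=1) * 0.95
```

The contrast between P(L2=1 | L1=1) and P(L2=1 | do(L1=1)) is exactly the gap between correlation-based mediation tests and mediation itself.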

The conventional mediation tests (in the Baron and Kenny tradition) suffer from the same problem; they test features of mediation that are common in linear systems, but not the essence of mediation, which is universal to all systems: linear and nonlinear, with continuous as well as categorical variables.

Q. Why is mediational analysis any “worse” than any other method of investigation?
Answer: The answer is closely related to the one given to question 3). Mediation is not a “method” but a property of the population, which is defined counterfactually and therefore requires counterfactual assumptions for its evaluation. Experiments are not sufficient; in this sense, mediation is “worse” than other properties under investigation, e.g., causal effects, which can be estimated entirely from experiments.

About the only thing we can ascertain experimentally is whether the (controlled) direct effect differs from the total effect, but we cannot evaluate the extent of mediation.

Another way to appreciate why stronger assumptions are needed for mediation is to note that non-confoundedness is not the same as ignorability. For non-binary variables one can construct examples where X and Y are not confounded (i.e., P(y|do(x)) = P(y|x)) and yet are not ignorable (i.e., Y_x is not independent of X). Mediation requires ignorability in addition to non-confoundedness.
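The claim can be verified with exact arithmetic. The distribution below is a constructed example of my own (not from the post); a three-valued treatment is needed, because for binary X the law of total probability forces the two notions to coincide. The trick is to make each diagonal entry P(Y_x=1 | X=x) equal the marginal P(Y_x=1) while the off-diagonal entries disagree:

```python
from fractions import Fraction as F

# Three-valued treatment X, uniform, with binary potential outcomes.
# p_yx_given_x[x][x2] = P(Y_x = 1 | X = x2).  Diagonal entries equal
# the marginal P(Y_x = 1) (no confounding); off-diagonal entries do
# not (ignorability fails).
p_x = [F(1, 3)] * 3
p_yx_given_x = [
    [F(5, 10), F(3, 10), F(7, 10)],   # counterfactual Y_0
    [F(4, 10), F(5, 10), F(6, 10)],   # counterfactual Y_1
    [F(6, 10), F(4, 10), F(5, 10)],   # counterfactual Y_2
]

for x in range(3):
    # P(y|do(x)) = P(Y_x = 1), averaging over the population:
    marginal = sum(p_yx_given_x[x][x2] * p_x[x2] for x2 in range(3))
    # P(y|x) = P(Y_x = 1 | X = x), by the consistency rule:
    observed = p_yx_given_x[x][x]
    print(x, marginal == observed)                       # True: unconfounded
    print(x, all(p == marginal for p in p_yx_given_x[x]))  # False: not ignorable
```

Every marginal here works out to 1/2, matching its diagonal entry, so P(y|do(x)) = P(y|x) for all x; yet Y_x clearly varies with X, so ignorability, and hence the mediation machinery that relies on it, fails.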

Overall, the panel was illuminating, primarily due to the active participation of curious students. It gave me good reasons to believe that Political Science is destined to become a bastion of modern causal analysis. I wish economists would follow suit, despite the hurdles they face in getting causal analysis into economics education.


July 14, 2014

Who Needs Causal Mediation?

Filed under: Discussion,Mediated Effects — eb @ 7:45 pm

A recent discussion that might be of interest to readers took place on SEMNET, a Structural Equation Modeling Discussion Group, which appeals primarily to traditional SEM researchers who, generally speaking, are somewhat bewildered by the recent fuss about modern causal analysis. This particular discussion focused on “causal mediation”.

An SEMNET user, Emil Coman, asked (my paraphrasing):
“Who needs causal mediation (CM)?”
All it gives us is: (a) the ability to cope with confounders of the M—>Y relation and (b) the ability to handle interactions. Both (a) and (b) are SEM-fixable; (a) by adjusting for those confounders and (b) by using Bengt Muthen’s software (Mplus), whenever we suspect interactions.

To continue, click here.

November 19, 2013

The Key to Understanding Mediation

Filed under: Definition,General,Mediated Effects — moderator @ 3:46 am

Judea Pearl Writes:

For a long time I could not figure out why SEM researchers find it hard to embrace the “causal inference approach” to mediation, which is based on counterfactuals. My recent conversations with David Kenny and Bengt Muthen have opened my eyes, and I am now pretty sure that I have found both the obstacle and the key to making causal mediation an organic part of SEM research.

Here is the key:

Why are we tempted to “control for” the mediator M when we wish to estimate the direct effect of X on Y? The reason is that, if we succeed in preventing M from changing, then whatever changes we measure in Y are attributable solely to variations in X, and we are then justified in proclaiming the observed effect the “direct effect of X on Y.” Unfortunately, the language of probability theory does not possess the notation to express the idea of “preventing M from changing” or “physically holding M constant.” The only operation probability theory allows is “conditioning,” which is what we do when we “control for M” in the conventional way (i.e., let M vary, but ignore all samples except those that match a specified value of M). This habit is just plain wrong, and it is the mother of many confusions in the practice of SEM.
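The gap between conditioning on M and physically holding M constant shows up in a small simulation. The linear model and all coefficients below are my own illustrative assumptions, not from the post; the point is that with a latent confounder U of the M-to-Y relation, regression “control” for M is biased, while intervening on M recovers the direct effect:

```python
import random

random.seed(1)

# Toy linear model (coefficients are illustrative assumptions):
#   M = a*X + e*U            (U is a latent confounder of M and Y)
#   Y = c*X + b*M + d*U + noise
# The direct effect of X on Y, with M physically held fixed, is c = 1.0.
a, b, c, d, e = 0.8, 0.5, 1.0, 1.5, 1.0

def simulate(n, do_m=None):
    rows = []
    for _ in range(n):
        x = random.gauss(0, 1)
        u = random.gauss(0, 1)
        m = do_m if do_m is not None else a * x + e * u
        y = c * x + b * m + d * u + random.gauss(0, 0.1)
        rows.append((x, m, y))
    return rows

def coef_on_x(rows):
    # OLS coefficient on X when regressing Y on (X, M), via the normal
    # equations (no intercept: every variable is zero-mean by design).
    sxx = sxm = smm = sxy = smy = 0.0
    for x, m, y in rows:
        sxx += x*x; sxm += x*m; smm += m*m; sxy += x*y; smy += m*y
    det = sxx * smm - sxm * sxm
    return (smm * sxy - sxm * smy) / det

# "Controlling for M" by conditioning: biased, because M is a collider
# of X and the latent U; the estimate converges to c - d*a/e = -0.2.
obs = simulate(50_000)
print(coef_on_x(obs))

# Physically holding M constant, do(M = 0): recovers c = 1.0.
fixed = simulate(50_000, do_m=0.0)
slope = sum(x * y for x, m, y in fixed) / sum(x * x for x, m, y in fixed)
print(slope)
```

Note that the conditioning estimate is not merely noisy; it converges to the wrong quantity (here, a negative number for a positive direct effect), which is exactly the confusion described above.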

To find out why, you are invited to visit the paragraph starting with “In the remaining of this note, …” on page 2.


October 26, 2013

Comments on Kenny’s Summary of Causal Mediation

Filed under: Counterfactual,Indirect effects,Mediated Effects — moderator @ 12:00 am

David Kenny’s website has recently been revised to include a section on the Causal Inference Approach to Mediation. As many readers know, Kenny has pioneered mediation analysis in the social sciences through his seminal papers with Judd (1981) and Baron (1986) and has been an active leader in this field. His original approach, often referred to as the “Baron and Kenny (BK) approach,” is grounded in conservative Structural Equation Modeling (SEM) analysis, in which causal relationships are asserted with extreme caution and the boundaries between statistical and causal notions vary appreciably among researchers.

It is very significant therefore that Kenny has decided to introduce causal mediation analysis to the community of SEM researchers which, until very recently, felt alienated from recent advances in causal mediation analysis, primarily due to the counterfactual vocabulary in which it was developed and introduced. With Kenny’s kind permission, I am posting his description below, because it is one of the few attempts to explain causal inference in the language of traditional SEM mediation analysis and, thus, it may serve to bridge the barriers between the two communities.

Next you can find Kenny’s new posting, annotated with my comments. In these comments, I have attempted to further clarify the bridges between the two cultures; the “traditional” and the “causal.” I will refer to the former as “BK” (for Baron and Kenny) and to the latter as “causal” (for lack of a better word) although, conceptually, both BK and SEM are fundamentally causal.

Click here for the full post.

September 4, 2011

Comments on an article by Grice, Shlimgen and Barrett (GSB): “Regarding Causation and Judea Pearl’s Mediation Formula”

Filed under: Discussion,Mediated Effects,Opinion — moderator @ 3:00 pm

Stan Mulaik called my attention to a recent article by Grice, Shlimgen and Barrett (GSB) (linked here) which is highly critical of structural equation modeling (SEM) in general, and of the philosophy and tools that I presented in “The Causal Foundation of SEM” (Pearl, 2011). In particular, GSB disagree with the conclusions of the Mediation Formula, a tool for assessing what portion of a given effect is mediated through a specific pathway.

I responded with a detailed account of the disagreements between us (copied below), which can be summarized as follows:


1. The “OOM” analysis used by GSB is based strictly on frequency tables (or “multi-grams”) and, as such, cannot assess cause-effect relations without committing to some causal assumptions. Those assumptions are missing from GSB’s account, possibly due to their rejection of SEM.

2. I define precisely what is meant by “the extent to which the effect of X on Y is mediated by a third variable, say Z,” and demonstrate both why such questions are important in decision making and model building and why they cannot be captured by observation-oriented methods such as OOM.

3. Using the same data and a slightly different design, I challenge GSB to answer a simple cause-effect question with their method (OOM), or with any method that dismisses SEM or causal algebra as unnecessary.

4. I further challenge GSB to present us with ONE RESEARCH QUESTION that they can answer and that is not answered swiftly, formally, and transparently by the SEM methodology presented in Pearl (2011) (starting, of course, with the same assumptions and the same data).

5. I explain what gives me the assurance that no such research question will ever be found, and why even the late David Freedman, whom GSB lionize for his staunch criticism of SEM, converted to SEM thinking toward the end of his life.

6. I alert GSB to two systematic omissions from their writings and posted arguments, without which no comparison can be made to other methodologies:
(a) A clear statement of the research question that the investigator attempts to answer, and
(b) A clear statement of the assumptions that the investigator is willing to make about reality.

Click here for the full response.


August 7, 2007

Mediated Effects

Filed under: Discussion,JSM,Mediated Effects — moderator @ 10:28 am

David Judkins writes:

I just saw Dylan Small give a very interesting talk in Salt Lake City on mediation analysis using random assignment interacted with baseline covariates as instrumental variables. He mentioned that Albert (2007) just established a formal definition for mediated effects with Neyman-Rubin causal language. Anyone know which Albert? Is it James Albert at Bowling Green? Any rival formal definitions for mediated effects? Page 165 of Pearl's 2000 text has a definition of indirect effects, but I didn't find it quite as satisfying as the version that Small put on the screen last week.
