Causal Analysis in Theory and Practice

May 3, 2010

On Mediation, counterfactuals and manipulations

Filed under: Discussion,Opinion — moderator @ 9:00 pm

Opening remarks

A few days ago, Dan Sharfstein posed a question regarding the “well-definedness” of “direct effects” in situations where the mediating variables cannot be manipulated. Dan’s question triggered a private email discussion that has culminated in a posting by Thomas Richardson and Jamie Robins (below), followed by Judea Pearl’s reply.

We urge more people to join this important discussion.

Thomas Richardson and James Robins’ discussion:


There has recently been some discussion of mediation and direct effects.

There are at least two issues here:

(1) Which counterfactuals are well defined.

(2) Even when counterfactuals are well defined, should we include assumptions that identify effects (i.e., the natural direct effect) that could never be confirmed, even in principle, by a Randomized Controlled Trial (RCT)?
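
For readers less familiar with the term in (2), the natural (pure) direct effect is usually written as below. This is the standard potential-outcome form, not notation taken from this post; the symbols A (treatment, with levels a and a*), M (mediator) and Y (outcome) are assumptions of this sketch.

```latex
\mathrm{NDE}(a, a^{*}) \;=\; E\big[\,Y\big(a,\, M(a^{*})\big)\,\big] \;-\; E\big[\,Y\big(a^{*},\, M(a^{*})\big)\,\big]
```

The first term sets treatment to a while holding the mediator at the value it would have taken under a*. No single randomized assignment observes both Y(a, m) and M(a*) on the same unit, which is why the question of confirmation by an RCT arises.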

As to (1), it is clear to most that all counterfactuals are vague to a certain extent and can be made more precise by carefully describing the (quite possibly only hypothetical) intervention you want the counterfactual to represent. For this reason, whether you take manipulation or causality as ontologically primary, we need to relate causation to manipulation to clarify and make more precise which counterfactual world we are considering.

On (2) we have just finished a long paper on the issue, fleshing out considerably an argument I (Jamie) made at the American Statistical Association (in 2005) discussing a talk by Judea on natural (pure and total) direct effects.

“Alternative Graphical Causal Models and the Identification of Direct Effects”

It is available at

Here is a brief summary:

Click here for the full post.

Best wishes,

Jamie Robins and Thomas Richardson

Judea Pearl’s reply:

As to which counterfactuals are “well defined”, my position is that counterfactuals attain their “definition” from the laws of physics and, therefore, they are “well defined” before one even contemplates any specific intervention. Newton concluded that tides are DUE to lunar attraction without thinking about manipulating the moon’s position; he merely envisioned how water would react to gravitational force in general.

In fact, counterfactuals (e.g., f=ma) earn their usefulness precisely because they are not tied to a specific manipulation, but can serve a vast variety of future interventions, whose details we do not know in advance; it is the duty of the intervenor to make precise how each anticipated manipulation fits into our store of counterfactual knowledge, also known as “scientific theories”.

Regarding identifiability of mediation, I have two comments to make: one related to your Minimal Causal Models (MCM) and one related to the role of structural equation models (SEM) as the logical basis of counterfactual analysis.

Click here for Judea’s reply.

Best regards,

Judea Pearl

1 Comment »

  1. Dear Judea,

    Thank you very much for posting our contribution together with your rapid
    reply.

    We will provide a full response in a subsequent journal paper.

    I take this opportunity simply to point out that the Minimal Counterfactual
    Models (MCMs) we introduce are not the same as causal Bayesian networks
    (CBNs). In our paper we call what you call CBNs, “agnostic causal DAGs”
    (following Spirtes et al. 1993).

    Though CBNs and MCMs are similar in some respects, there are clear
    differences (see Table 1 on p.47):

    (1) MCMs posit counterfactuals, in fact all of those counterfactuals present
    in the NPSEM for the same graph. CBNs do not.

    (2) The MCM identifies the effect of treatment on the treated (ETT) for
    binary treatment; this contrast does not even exist within a CBN.

    (3) We can obtain bounds on natural direct effects under an MCM. Again,
    these contrasts do not exist in the CBN framework.

    (4) If one wishes, an MCM may be written as a structural equation model with
    a particular dependence structure among the error terms / counterfactual

    (5) Viewed as “models” in the statistical sense, i.e. sets of distributions,
    both the MCM and the NPSEM for a given graph define models over (the same)
    counterfactuals. Under this (statistical) usage of the word ‘model’ the
    NPSEM for G is a sub-model of the MCM for G because the NPSEM imposes more
    independencies among counterfactuals and thus represents fewer
    distributions. Thus any distribution in the NPSEM model is also a
    distribution in the MCM model. [In contrast computer scientists often use
    the term ‘model’ to refer to a particular distribution; under this usage,
    every NPSEM *is* an MCM!]

    All of these points are made in some detail in our paper.

    Consequently your assertion that Figure 1.6(a) in your book, which depicts a
    CBN, ‘is’ an MCM is incorrect, as is your suggestion that the ETT is not
    defined in an MCM.

    Thus for the purposes of assisting discussion and attempting to avoid future
    misunderstandings I think it would be helpful if you refrain from referring
    to CBNs and MCMs as if they were one and the same.


    Finally, to give an example of a counterfactual, I feel that your reply
    would have been more relevant to someone who was arguing against the
    existence of counterfactuals, or against reasoning with counterfactuals.

    As the above hopefully makes very clear, neither of these is true of Jamie
    or me, hence I must confess some disappointment that few of the points you
    raise have much relevance to our position. We are merely arguing against
    making assumptions that are intrinsically untestable, yet lead to
    identification claims.

    We do not know of many contexts or examples, e.g. in Epidemiology or Social
    Sciences, where people have anything close to the level of knowledge needed
    to vouch for a not wholly testable assumption, such as your assumption that
    Z is independent of ‘U’ the structural error term for Y. In our opinion, in
    most analyses of observational data, it is a very high bar merely to attempt
    to obtain those inferences that one could have made had one performed
    randomized experiments (even simply defining the relevant randomized
    experiment is often very challenging).

    The fact that a mis-specified NPSEM may lead to erroneous inferences
    regarding direct effects that cannot be detected from any randomized
    experiment (intervening on the variables in the graph) should be of
    considerable concern to most researchers in the Social Sciences and
    Epidemiology. They should be aware that whenever they write down an NPSEM
    you are expecting them to vouch for a level of knowledge that even their
    colleagues who are lucky enough to perform randomized experiments on the
    variables concerned would not have access to.
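
The concern in the preceding paragraph can be illustrated with a toy calculation. The sketch below is a hypothetical construction, not an example from the paper or from this comment: it assumes binary M and Y, fixes Y(0, m) = 0, and compares two counterfactual laws over (M(0), M(1), Y(1,0), Y(1,1)). In the first, all four variables are independent fair coins, as the NPSEM independence assumptions would require. In the second, Y(1,0) is set equal to 1 - M(0), a cross-world dependence. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def npsem_law():
    """Law of (M0, M1, Y10, Y11): four mutually independent fair coins."""
    return {w: HALF ** 4 for w in product((0, 1), repeat=4)}

def cross_world_law():
    """Same single-world margins, but Y(1,0) = 1 - M(0) (hypothetical dependence)."""
    return {(m0, m1, 1 - m0, y11): HALF ** 3
            for m0, m1, y11 in product((0, 1), repeat=3)}

def single_world_summary(law):
    """Distributions involving one treatment level at a time:
    the law of M(0), and the joint laws of (M(1), Y(1,0)) and (M(1), Y(1,1))."""
    out = {}
    for (m0, m1, y10, y11), p in law.items():
        for key in (("M0", m0), ("M1,Y10", m1, y10), ("M1,Y11", m1, y11)):
            out[key] = out.get(key, 0) + p
    return out

def nested_mean(law):
    """E[Y(1, M(0))]: the cross-world term of the natural direct effect.
    Since Y(0, m) = 0 is assumed here, this equals the natural direct effect."""
    return sum(p * (y11 if m0 else y10)
               for (m0, m1, y10, y11), p in law.items())

if __name__ == "__main__":
    a, b = npsem_law(), cross_world_law()
    print(single_world_summary(a) == single_world_summary(b))
    print(nested_mean(a), nested_mean(b))
```

Under these assumptions the two laws agree on every single-world distribution computed here, yet the natural direct effect is 1/2 under the first law and 3/4 under the second, so experiments that intervene on A and M alone could not tell them apart.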

    Best wishes,


    Comment by Thomas Richardson — May 4, 2010 @ 11:58 pm
