Causal Analysis in Theory and Practice

April 24, 2000

Causality and the mystical error terms

Filed under: General,structural equations — moderator @ 12:00 am

From David Kenny (University of Connecticut) 

Let me just say that it is very gratifying to see a philosopher give the problem of causality some serious attention. Moreover, you discuss the concept as it is used in the contemporary social sciences. I have been bothered by the fact that all too many social scientists try to avoid saying "cause" when that is clearly what they mean to say. Thank you!

I have not finished your book, but I cannot resist making one point to you. In Section 5.4, you discuss the meaning of structural coefficients, but you spend a good deal of time discussing the meaning of epsilon, or e. It seems to me that e has a very straightforward meaning in SEM. If the true equation for y is

y = Bx + Cz + Dq + etc + r, where r is meant to allow for some truly random component, then e = Cz + Dq + etc + r, that is, the sum of the omitted variables. The difficulty in SEM is that usually, though not always, it must be assumed for identification purposes that e and x have zero correlation. Perhaps this is the standard "omitted variables" explanation of e that you allude to, but it does not seem at all mysterious, at least to me.
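To make the point concrete, here is a minimal simulation sketch of the equation above with a single omitted variable z. The coefficient values, the 0.8 dependence of x on z, and the helper names (`cov`, `corr`, `simulate`) are all assumptions chosen for illustration; none of them come from the book or the letter. The sketch compares two different "error terms": the structural e = Cz + r, which is correlated with x whenever z is, and the residual from a least-squares fit of y on x, which is uncorrelated with x by construction.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def corr(a, b):
    """Pearson correlation."""
    return cov(a, b) / (cov(a, a) * cov(b, b)) ** 0.5


def simulate(n=50_000, B=2.0, C=1.5, seed=0):
    """Hypothetical model: y = B*x + C*z + r, with z omitted by the analyst."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: x depends on the omitted variable z, so x and e are correlated.
    x = [0.8 * zi + rng.gauss(0, 1) for zi in z]
    r = [rng.gauss(0, 1) for _ in range(n)]
    y = [B * xi + C * zi + ri for xi, zi, ri in zip(x, z, r)]

    # Structural error: the sum of omitted factors plus the random component.
    e_structural = [C * zi + ri for zi, ri in zip(z, r)]

    # Least-squares slope of y on x, and the corresponding regression residual.
    c_ols = cov(x, y) / cov(x, x)
    mx, my = mean(x), mean(y)
    resid = [(yi - my) - c_ols * (xi - mx) for xi, yi in zip(x, y)]

    return c_ols, corr(x, e_structural), corr(x, resid)


if __name__ == "__main__":
    c_ols, corr_structural, corr_residual = simulate()
    print("least-squares slope:", c_ols)
    print("corr(x, structural e):", corr_structural)
    print("corr(x, regression residual):", corr_residual)
```

Under these assumed values the least-squares slope settles away from B = 2.0, because the fit absorbs part of the omitted Cz term, while the regression residual stays uncorrelated with x up to rounding. Whether the structural e can be taken as uncorrelated with x is a substantive question about the omitted factors, not something the fit can reveal.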


  1. We are in perfect agreement on the error terms. If we choose to represent the equation for y with just one independent variable, say y = cx + e, then e is indeed none other than e = Cz + Dq + etc + r. This is precisely what I meant by "the standard omitted variables explanation of e", namely, the sum of all omitted "true" factors.

    This explanation is not "mysterious" to you and me, but it is mysterious to many researchers who confuse this interpretation with the error terms in regression analysis, which are orthogonal to x by definition (see Section 5.1.2 and footnote 25, p. 244). It is also mysterious to a casual reader of the current SEM literature, where the distinction is not made clear, and where the orthogonality condition is imposed either as a matter of mathematical convenience or, worse yet, "for identification purposes" (rather than by substantive considerations). This concept of making "identifying assumptions", which seems to have emerged from, and been nurtured in, the econometrics literature, has always been a mystery to me. How can one hope to get something useful from a model that is distorted for purposes of identification, and thus may conflict with one's perception of reality? On page 163 I argue that the interpretation of e as a summary of omitted factors should help an investigator sharpen his/her perception of reality and critically evaluate whether it makes sense to assume that e and x are orthogonal.

    Best wishes,
    Judea Pearl

    Comment by judea — April 24, 2000 @ 12:00 pm

  2. I came across this discussion in an attempt to check my understanding of how to generate potential outcomes, as outlined in J. Pearl’s “Structural Theory of Causation”; I hope I am not saying something entirely stupid. I understood that error terms were independent of the other parents of Y and thus could be omitted from the DAG. To consider a simple setting, suppose that X is binary, U is latent and discrete, Y is also discrete, and X and U are the only parents of Y. Let p(i,x,u) = P(Y <= i | U=u, X=x), and let e be sampled from the uniform distribution on (0,1); then I would set Y(x) = i if p(i-1,x,u) < e <= p(i,x,u). Does this make sense? If so, it seems to imply that the joint distribution of Y(0), Y(1) is exactly the distribution with maximum positive association in the Fréchet class with margins given by p(y,x,u).

    Comment by antonio forcina — April 7, 2010 @ 10:36 am
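The construction in the second comment can be sketched in a few lines. The conditional CDF table `CDF` below is entirely hypothetical (three outcome levels, made-up probabilities), as are the function names. The sketch draws a single e, uses it for both Y(0) and Y(1) at a fixed u, and compares the empirical joint CDF with the upper Fréchet bound min(p(i,0,u), p(j,1,u)).

```python
import bisect
import random

# Hypothetical conditional CDFs p(i, x, u) = P(Y <= i | X=x, U=u) for Y in {0, 1, 2}.
# Keys are (x, u); list position i holds p(i, x, u). Values are made up.
CDF = {
    (0, 0): [0.5, 0.8, 1.0],
    (1, 0): [0.2, 0.6, 1.0],
    (0, 1): [0.3, 0.9, 1.0],
    (1, 1): [0.1, 0.4, 1.0],
}


def potential_outcome(x, u, e):
    """Return the i with p(i-1, x, u) < e <= p(i, x, u)."""
    # bisect_left gives the first index whose CDF value is >= e.
    return bisect.bisect_left(CDF[(x, u)], e)


def sample_pair(u, rng):
    """Draw one e and use it for both potential outcomes at this u."""
    e = 1.0 - rng.random()  # uniform on (0, 1]
    return potential_outcome(0, u, e), potential_outcome(1, u, e)


def empirical_joint_cdf(u, i, j, n=20_000, seed=0):
    """Monte Carlo estimate of P(Y(0) <= i, Y(1) <= j | U=u)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        y0, y1 = sample_pair(u, rng)
        hits += (y0 <= i and y1 <= j)
    return hits / n


def frechet_upper(u, i, j):
    """Upper Frechet bound on the joint CDF given the two margins."""
    return min(CDF[(0, u)][i], CDF[(1, u)][j])


if __name__ == "__main__":
    for u in (0, 1):
        for i in range(3):
            for j in range(3):
                print(u, i, j, empirical_joint_cdf(u, i, j), frechet_upper(u, i, j))
```

The agreement follows directly from the shared draw: Y(0) <= i and Y(1) <= j both hold exactly when e <= min(p(i,0,u), p(j,1,u)). Note that this is a statement conditional on U = u, and that sharing e across the two values of x is a modelling choice; drawing a separate e for each x would leave the margins unchanged but give a different joint distribution.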
