Causal Analysis in Theory and Practice

March 21, 2007

Where is economic modelling today?

Filed under: Economics,Opinion — judea @ 8:30 am

In his 2005 article "The Scientific Model of Causality" (Sociological Methodology, vol. 35, no. 1, p. 40), Jim Heckman reviews the historical development of causal notions in econometrics and paints an extremely complimentary picture of the current state of this development.

As an illustration of econometric methods and concepts, Heckman discusses the classical problem of estimating the causal effect of Y2 on Y1 in the following system of equations

Y1 = a1 + c12 Y2 + b11 X1 + b12 X2 + U1     (16a)
Y2 = a2 + c21 Y1 + b21 X1 + b22 X2 + U2     (16b)

where Y1 and Y2 represent, respectively, the consumption levels of two interacting agents, and X1, X2, the levels of their income.
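For concreteness, here is a small symbolic sketch (using Python's sympy; the symbols are placeholders taken from Eqs. (16a)-(16b)) of what "surgery" does in this nonrecursive system: replace Eq. (16b) by the constant assignment Y2 = y2, leave Eq. (16a) intact, and solve the modified model. Under the surgery definition, the causal effect of Y2 on Y1 then comes out as the coefficient c12.

```python
import sympy as sp

# Structural parameters, exogenous terms, and variables of Eqs. (16a)-(16b)
a1, a2, c12, c21, b11, b12, b21, b22 = sp.symbols('a1 a2 c12 c21 b11 b12 b21 b22')
X1, X2, U1, U2, Y1, Y2, y2 = sp.symbols('X1 X2 U1 U2 Y1 Y2 y2')

# The original nonrecursive (simultaneous) system
eq16a = sp.Eq(Y1, a1 + c12*Y2 + b11*X1 + b12*X2 + U1)
eq16b = sp.Eq(Y2, a2 + c21*Y1 + b21*X1 + b22*X2 + U2)

# "Surgery": shut down Eq. (16b) by replacing it with the constant
# assignment Y2 = y2, then solve the modified system for (Y1, Y2).
surgical = sp.solve([eq16a, sp.Eq(Y2, y2)], [Y1, Y2], dict=True)[0]

# The causal effect of Y2 on Y1 is the rate of change of the
# post-surgery solution for Y1 with respect to the set value y2.
effect = sp.diff(surgical[Y1], y2)
print(effect)  # c12
```

Note that the exogenous variables X1, X2, U1 pass through the surgery untouched; only the mechanism determining Y2 is replaced.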

Unexpectedly, on page 44, Heckman makes a couple of remarks that almost threw me off my chair; here they are:

"Controlled variation in external (forcing) variables is the key to defining causal effects in nonrecursive models. It is of some interest to readers of Pearl (2000) to compare my use of the standard simultaneous equations model of econometrics in defining causal parameters to his. In the context of equations (16a) and (16b), Pearl defines a causal effect by "shutting one equation down" or performing "surgery" in his colorful language."

"He implicitly assumes that "surgery," or shutting down an equation in a system of simultaneous equations, uniquely fixes one outcome or internal variable (the consumption of the other person in my example). In general, it does not. Putting a constraint on one equation places a restriction on the entire set of internal variables. In general, no single equation in a system of simultaneous equation uniquely determines any single outcome variable. Shutting down one equation might also affect the parameters of the other equations in the system and violate the requirements of parameter stability."

I wish to bring up for blog discussion the following four questions:

  1. Is Heckman right in stating that in nonrecursive systems one should not define causal effect by surgery?
  2. What is the causal effect of Y2 on Y1 in the model of Eqs. (16a)-(16b)?
  3. What does Heckman mean when he objects to surgery as the basis for defining causal parameters?
  4. What did he have in mind when he offered "… the standard simultaneous equations model of econometrics" as an alternative to surgery "in defining causal parameters"?

The following are the best answers I could give to these questions, but I would truly welcome insights from other participants, especially economists and social scientists (including Jim Heckman, of course).


March 19, 2007

JMLR Special Topic on Causality: Call for papers

Filed under: Announcement — moderator @ 5:09 pm

For those who may be interested:

The Journal of Machine Learning Research (www.JMLR.org) is soliciting papers on all aspects of causality in machine learning, including theory, algorithms, and applications. Papers making conceptual advances, making connections between various frameworks or disciplines, presenting novel methods appropriately supported by experiments and/or theory, or presenting novel applications are particularly encouraged.

More details, including a list of suggested topics of interest and important dates, can be found at:
http://discover.mc.vanderbilt.edu/discover/public/jmlr_special_topic_causality.html

Guest editors are:
Constantin Aliferis, Vanderbilt University
Gregory Cooper, University of Pittsburgh
Andre Elisseeff, IBM Research
Isabelle Guyon, Clopinet
Peter Spirtes, Carnegie Mellon University

February 27, 2007

Counterfactuals in linear systems

Filed under: Counterfactual,Linear Systems — judea @ 4:08 pm

What do we know about counterfactuals in linear models?

Here is a neat result concerning the testability of counterfactuals in linear systems.
We know that counterfactual queries of the form P(Yx=y|e) may or may not be empirically identifiable, even in experimental studies. For example, the probability of causation, P(Yx=y|x',y'), is in general not identifiable from experimental data when X and Y are binary (Causality, p. 290, Corollary 9.2.12). (Footnote: A complete graphical criterion for distinguishing testable from nontestable counterfactuals is given in Shpitser and Pearl (2007, forthcoming).)

This note shows that things are much friendlier in linear analysis:

Claim A. Any counterfactual query of the form E(Yx|e) is empirically identifiable in linear causal models, where e is arbitrary evidence.

Claim B. E(Yx|e) is given by

E(Yx|e) = E(Y|e) + T [x − E(X|e)]      (1)

where T is the total effect coefficient of X on Y, i.e.,

T = d E[Yx]/dx = E(Y|do(x+1)) – E(Y|do(x))      (2)

Thus, whenever the causal effect T is identified, E(Yx|e) is identified as well.
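As a sanity check, here is a small Monte Carlo sketch of Claim B (the coefficients and the evidence event below are assumed for illustration). In a linear model, the unit-level counterfactual satisfies Yx = Y + T(x − X), so averaging over any evidence set reproduces formula (1).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
T = 1.5  # total effect of X on Y (assumed value for illustration)

# Linear SCM with a hidden confounder U:  X <- U -> Y,  X -> Y
U  = rng.normal(size=n)
Ux = rng.normal(size=n)
Uy = rng.normal(size=n)
X  = 0.8 * U + Ux
Y  = T * X + 0.6 * U + Uy

# Evidence e: we happened to observe Y > 2 (any event would do)
e = Y > 2.0
x = 1.0  # hypothetical setting of X

# Unit-level counterfactual in a linear model: setting X to x shifts
# Y by T times the change in X, so Y_x(u) = Y(u) + T*(x - X(u)).
Yx = Y + T * (x - X)

lhs = Yx[e].mean()                         # E(Y_x | e) computed directly
rhs = Y[e].mean() + T * (x - X[e].mean())  # formula (1)
print(lhs, rhs)
```

By linearity of the average, the two quantities agree up to floating-point error, for any evidence event e.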


February 22, 2007

Moderator’s Update

Filed under: Announcement — moderator @ 10:20 pm

Thanks for visiting! We have been overwhelmed by the interest in our new blog, and we hope to continue improving this forum with your suggestions and comments (please keep them coming!). To our new visitors: we welcome you to browse through our archives and invite you to contribute any questions or opinions that you may have on the topic. Please keep in mind that while previous posts have been taken from discussions regarding the book "Causality" by Judea Pearl, the scope of this blog is not intended to be limited to a particular book or view. As we continue to develop the blog, you can expect to see new content including regular commentaries and topics of interest, conference/workshop announcements, abstracts/reviews of recent articles, and, most importantly, your contributions. We hope to hear from you!

Back-door criterion and epidemiology

Filed under: Back-door criterion,Book (J Pearl),Epidemiology — moderator @ 9:03 am

The definition of the back-door condition (Causality, page 79, Definition 3.3.1) seems to be contrived. The exclusion of descendants of X (Condition (i)) seems to be introduced as an afterthought, just because we get into trouble if we don't. Why can't we derive it from first principles: first define sufficiency of Z in terms of the goal of removing bias, and then show that, to achieve this goal, you neither want nor need descendants of X in Z.
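One way to see the trouble that descendants of X cause is a toy simulation (hypothetical linear model, assumed coefficients): if the descendant we adjust for is a mediator M on the path from X to Y, conditioning on it blocks the very causal path we are trying to measure.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Hypothetical linear model X -> M -> Y; M is a descendant of X,
# and all of X's effect on Y flows through M.
X = rng.normal(size=n)
M = 1.0 * X + rng.normal(size=n)
Y = 2.0 * M + rng.normal(size=n)
# True total effect of X on Y: 1.0 * 2.0 = 2.0

# No back-door paths exist, so the empty set Z = {} is admissible:
unadjusted = np.polyfit(X, Y, 1)[0]

# "Adjusting" for the descendant M: coefficient of X in Y ~ X + M
A = np.column_stack([X, M, np.ones(n)])
adjusted = np.linalg.lstsq(A, Y, rcond=None)[0][0]

print(round(unadjusted, 2))  # ~2.0: the correct total effect
print(round(adjusted, 2))    # ~0.0: conditioning on M kills the effect
```

Of course this shows only one failure mode (descendants that are mediators); conditioning on descendants that are colliders' offspring creates bias by a different mechanism.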

October 15, 2006

The validity of G-estimation

Filed under: Book (J Pearl),G-estimation — moderator @ 12:00 pm

From a previous correspondence with Eliezer S. Yudkowsky, Research Fellow, Singularity Institute for Artificial Intelligence, Santa Clara, CA

The following paragraph appears on p. 103, shortly after eq. 3.63 in my copy of Causality:

"To place this result in the context of our analysis in this chapter, we note that the class of semi-Markovian models satisfying assumption (3.62) corresponds to complete DAGs in which all arrowheads pointing to Xk originate from observed variables."

It looks to me like this is a sufficient, but not necessary, condition to satisfy 3.62. It appears to me that the necessary condition is that no confounder exist between any Xi and Lj with i < j and that no confounder exist between any Xi and the outcome variable Y. However, a confounding arc between any Xi and Xj, or a confounding arc between Li and Xj with i <= j, should not render the causal effect non-identifiable. For example, even if a confounding arc exists between X2 and X3 (but no other confounding arcs exist in the model), the causal effect on Y of setting X2=x2 and X3=x3 should be the same as the distribution on Y if we observe x2 and x3.

It is also not necessary that the DAG be complete.

September 1, 2006

Welcome to the Causality Blog

Filed under: Announcement — moderator @ 8:53 pm

Thank you for visiting. This blog is devoted to the topic of causation and is driven by visitors with an interest in the field.

  • To find out about the purpose of this blog, please click here.
  • To submit a topic or question for discussion, please click here and complete the simple form.
  • To see our most recent breakthroughs, please click here.

We appreciate your interest in causality and hope to see your views on this subject!

May 8, 2006

Identifying conditional plans

Filed under: Book (J Pearl),Plans — moderator @ 12:00 am

Section 4.2 of the book (p. 113) gives an identification condition and estimation formula for the effect of a conditional action, namely, the effect of an action do(X=g(z)) where Z is a measurement taken prior to the action. Is this formula generalizable to the case of several actions, i.e., a conditional plan?

The difficulty is that this formula was derived on the assumption that X does not change the value of Z. However, in a multi-action plan, some actions in X could change observations Z that are used to guide future actions. We do not have notation for distinguishing post-intervention from pre-intervention observations.
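For the single-action case, at least, the Section 4.2 formula can be checked numerically. Below is a sketch (hypothetical model, assumed coefficients and policy g) comparing a direct simulation of the conditional action do(X = g(Z)) against the adjustment formula E[Y | do(X=g(Z))] = Σ_z E[Y | do(x), z]|_{x=g(z)} P(z).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

# Hypothetical model: Z -> X, Z -> Y, X -> Y (assumed coefficients)
def simulate(policy=None):
    Z = rng.binomial(1, 0.4, size=n)     # P(Z=1) = 0.4
    if policy is None:
        X = Z + rng.normal(size=n)       # natural mechanism for X
    else:
        X = policy(Z)                    # conditional action do(X = g(Z))
    Y = 2.0 * X + 1.0 * Z + rng.normal(size=n)
    return Z, X, Y

g = lambda z: np.where(z == 1, 3.0, 0.0)  # act differently per observed Z

# Direct simulation of the conditional action
_, _, Y_policy = simulate(policy=g)
direct = Y_policy.mean()

# Adjustment formula: sum over z of E[Y | do(x), z] at x = g(z), times P(z).
# Under do(X=x), Z is unaffected, so E[Y | do(x), Z=z] = 2x + z here.
formula = 0.0
for z, pz in [(0, 0.6), (1, 0.4)]:
    x = float(g(np.array([z]))[0])
    formula += (2.0 * x + 1.0 * z) * pz

print(direct, formula)  # both close to 2.8 in this model
```

The multi-action difficulty raised above does not arise here precisely because the single action cannot alter Z; the open question is what replaces this formula when later actions can change the observations guiding them.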

February 16, 2006

The meaning of counterfactuals

Filed under: Counterfactual,Definition — moderator @ 12:00 am

From Dr. Patrik Hoyer (University of Helsinki, Finland):

I have a hard time understanding what counterfactuals are actually useful for. To me, they seem to be answering the wrong question. In your book, you give at least a couple of different reasons for when one would need the answer to a counterfactual question, so let me tackle these separately:

  1. Legal questions of responsibility. From your text, I infer that the American legal system says that a defendant is guilty if he or she caused the plaintiff's misfortune. You take this to mean that the defendant is to be sentenced if the plaintiff would not have suffered the misfortune had the defendant not acted the way he or she did. So we have a counterfactual question that needs to be answered to establish responsibility. But in my mind, the law is clearly flawed. Responsibility should rest with the predicted outcome of the defendant's action, not with what actually happened. Let me take a simple example: say that I am playing a simple dice game for my team. Two dice are to be thrown, and I am to bet on either (a) two sixes are thrown, or (b) anything else comes up. If I guess correctly, my team wins a dollar; if I guess wrongly, my team loses a dollar. I bet (b), but am unlucky, and two sixes actually come up. My team loses a dollar. Am I responsible for my team's failure? Surely, in the counterfactual sense, yes: had I bet differently, my team would have won. But any reasonable person on the team would thank me for betting the way I did. In the same fashion, a doctor should not be held responsible if he administers, for a serious disease, a drug which cures 99.99999% of the population but kills 0.00001%, even if he was unlucky and his patient died. If the law is based on the counterfactual notion of responsibility, then the law is seriously flawed, in my mind.

    A further example is the one on page 323 of your book: the desert traveler. Surely, both Enemy-1 and Enemy-2 are equally 'guilty' of trying to murder the traveler. Attempted murder should equal murder. In my mind, the only rationale for giving a shorter sentence for attempted murder is that the defendant is apparently not so good at murdering people, so it is not so important to lock him away… (?!)

  2. The use of context in decision-making. On page 217, you write "At this point, it is worth emphasizing that the problem of computing counterfactual expectations is not an academic exercise; it represents in fact the typical case in almost every decision-making situation." I agree that context is important in decision making, but do not agree that we need to answer counterfactual questions.

    In decision making, the thing we want to estimate is P(future | do(action), see(context)). This is of course a regular do-probability, not a counterfactual query. So why do we need to compute counterfactuals?

    In your example in section 7.2.1, your query (3) is: "Given that the current price is P=p0, what would be the expected value of the demand Q if we were to control the price at P=p1?" You argue that this is counterfactual. But what if we introduce into the graph new variables Qtomorrow and Ptomorrow, with parent sets (U1, I, Ptomorrow) and (W, U2, Qtomorrow), respectively, and with the same connection strengths d1, d2, b2, and b1. Now query (3) reads: "Given that we observe P=p0, what would be the expected value of the demand Qtomorrow if we perform the action do(Ptomorrow=p1)?" This is exactly the same question, but it is not counterfactual; it is just P(Qtomorrow | do(Ptomorrow=p1), see(P=p0)). Obviously, we get the correct answer by doing the counterfactual analysis, but the question per se is no longer counterfactual and can be computed using the regular do( )-machinery. I guess this is the idea of your 'twin network' method of computing counterfactuals. In this case, why say that we are computing a counterfactual when what we really want is a prediction (i.e., a regular do-expression)?

  3. In the latter part of your book, you use counterfactuals to define concepts such as 'the cause of X' or 'necessary and sufficient cause of Y'. Again, I can understand that it is tempting to mathematically define such concepts since they are in use in everyday language, but I do not think that this is generally very helpful. Why do we need to know 'the cause' of a particular event? Yes, we are interested in knowing the 'causes' of events in the sense that they allow us to predict the future, but this is again a case of point (2) above.

    To put it in the most simplified form, my argument is the following: regardless of whether we represent individuals, businesses, organizations, or governments, we are constantly faced with decisions about how to act (and these are the only decisions we have!). What we want to know is what will likely happen if we act in particular ways. So we want to know P(future | do(action), see(context)). We neither want nor need the answers to counterfactuals.

Where does my reasoning go wrong?
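For what it is worth, the ex-ante comparison appealed to in point 1 can be made explicit. This short sketch computes the exact expected payoff of each bet in the dice example, which is what the "reasonable person on the team" is implicitly evaluating.

```python
from fractions import Fraction

# Probability of two sixes on a throw of two fair dice
p_two_sixes = Fraction(1, 36)

# Win a dollar on a correct guess, lose a dollar otherwise
ev_bet_a = p_two_sixes * 1 + (1 - p_two_sixes) * (-1)   # bet (a): two sixes
ev_bet_b = (1 - p_two_sixes) * 1 + p_two_sixes * (-1)   # bet (b): anything else

print(ev_bet_a)  # -17/18
print(ev_bet_b)  # 17/18
```

Bet (b) is the only defensible choice before the dice are thrown, even on the occasions when it loses.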

February 16, 2004

Submodel for subgraphs of direct effect removal

Filed under: do-calculus — moderator @ 12:00 am

From Susan Scott, Australia

In the do-calculus inference rules, I understand how the subgraph Gx is generated from the submodel do(X = x), by removing the arrows pointing into X (its direct causes), and why d-separation is therefore a valid test for conditional independence. However, I don't understand the submodel for the subgraphs representing the removal of direct effects (the arrows emanating from X). Would you please explain the submodel I could use to interpret this subgraph and what distribution it represents?

