The meaning of counterfactuals
From Dr. Patrik Hoyer (University of Helsinki, Finland):
I have a hard time understanding what counterfactuals are actually useful for. To me, they seem to be answering the wrong question. In your book, you give at least a couple of different reasons for when one would need the answer to a counterfactual question, so let me tackle these separately:
- Legal questions of responsibility. From your text, I infer that the American legal system says that a defendant is guilty if he or she caused the plaintiff's misfortune. You take this to mean that if the plaintiff would not have suffered misfortune had the defendant not acted the way he or she did, then the defendant is to be sentenced. So we have a counterfactual question that needs to be determined to establish responsibility. But in my mind, the law is clearly flawed. Responsibility should rest with the predicted outcome of the defendant's action, not with what actually happened. Let me take a simple example: say that I am playing a simple dice game for my team. Two dice are to be thrown and I am to bet on either (a) two sixes are thrown, or (b) anything else comes up. If I guess correctly, my team wins a dollar; if I guess wrongly, my team loses a dollar. I bet (b), but am unlucky and two sixes actually come up. My team loses a dollar. Am I responsible for my team's failure? Surely, in the counterfactual sense, yes: had I bet differently my team would have won. But any reasonable person on the team would thank me for betting the way I did. In the same fashion, a doctor should not be held responsible if he administers, for a serious disease, a drug which cures 99.99999% of the population but kills 0.00001%, even if he was unlucky and his patient died. If the law is based on the counterfactual notion of responsibility then the law is seriously flawed, in my mind.
A further example is that on page 323 of your book: the desert traveler. Surely, both Enemy-1 and Enemy-2 are equally 'guilty' for trying to murder the traveler. Attempted murder should equal murder. In my mind, the only rationale for giving a shorter sentence for attempted murder is that the defendant is apparently not so good at murdering people so it is not so important to lock him away… (?!)
- The use of context in decision-making. On page 217, you write "At this point, it is worth emphasizing that the problem of computing counterfactual expectations is not an academic exercise; it represents in fact the typical case in almost every decision-making situation." I agree that context is important in decision making, but do not agree that we need to answer counterfactual questions.
In decision making, the things we want to estimate is P(future | do(action), see(context) ). This is of course a regular do-probability, not a counterfactual query. So why do we need to compute counterfactuals?
In your example in section 7.2.1, your query (3): "Given that the current price is P=p0, what would be the expected value of the demand Q if we were to control the price at P=p1?". You argue that this is counterfactual. But what if we introduce into the graph new variables Qtomorrow and Ptomorrow, with parent sets (U1, I, Ptomorrow) and (W, U2, Qtomorrow), respectively, and with the same connection-strengths d1, d2, b2, and b1. Now query (3) reads: "Given that we observe P=p0, what would be the expected value of the demand Qtomorrow if we perform the action do(Ptomorrow=p1)?" This is the same exact question but it is not counterfactual, it is just P(Qtomorrow | do(Ptomorrow=p1), see(P=p0)). Obviously, we get the correct answer by doing the counterfactual analysis, but the question per se is no longer counterfactual and can be computed using regular do( )-machinery. I guess this is the idea of your 'twin network' method of computing counterfactuals. In this case, why say that we are computing a counterfactual when what we really want is prediction (i.e. a regular do-expression)?
- In the latter part of your book, you use counterfactuals to define concepts such as 'the cause of X' or 'necessary and sufficient cause of Y'. Again, I can understand that it is tempting to mathematically define such concepts since they are in use in everyday language, but I do not think that this is generally very helpful. Why do we need to know 'the cause' of a particular event? Yes, we are interested in knowing 'causes' of events in the sense that they allow us to predict the future, but this is again a case of point (2) above.
To put it in the most simplified form, my argument is the following: Regardless of whether we represent individuals, businesses, organizations, or government, we are constantly faced with decisions of how to act (and these are the only decisions we have!). What we want to know is what will likely happen if we act in particular ways. So what we want to know is P(future | do(action), see(context) ). We neither want nor need the answers to counterfactuals.
Where does my reasoning go wrong?
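The dice game from the first point can be worked through numerically. The following is a minimal sketch in Python, assuming fair dice and the one-dollar stakes described above; the function names are hypothetical and chosen only for illustration. It separates the expected payoff of each bet before the throw from the payoff each bet would have had once two sixes are known to have come up.

```python
from fractions import Fraction

# Assumption: two fair dice, so two sixes has probability 1/36.
P_DOUBLE_SIX = Fraction(1, 36)

def expected_payoff(bet):
    """Expected payoff in dollars, evaluated before the dice are thrown."""
    p_win = P_DOUBLE_SIX if bet == "a" else 1 - P_DOUBLE_SIX
    return p_win * 1 + (1 - p_win) * (-1)

def payoff(bet, double_six):
    """Payoff in dollars once the outcome of the throw is known."""
    wins = (bet == "a") == double_six
    return 1 if wins else -1

# Before the throw: bet (b) is far better in expectation.
print(expected_payoff("a"), expected_payoff("b"))

# After two sixes came up: bet (a) would have won, bet (b) lost.
print(payoff("a", True), payoff("b", True))
```

The first line of output compares the two bets as predictions, and the second compares them in the world that actually occurred; the letter's objection is that responsibility should track the former.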
"In decision making, the things we want to estimate is P(future | do(action), see(context)). This is of course a regular do-probability, not a counterfactual query. So why do we need to compute counterfactuals?"
The answer is that, in certain cases, the variables entering into "context" are CONSEQUENCES of the "action", and the expression P(y|do(x), z) is defined as the probability of y given that we do X=x and LATER observe Z=z, which is not the probability of y given that we first observe Z=z and then do X=x.
This confusion disappears, of course, when we have a sequential, time-indexed model. But, working with static models as in my book, we do not have the language to express the probability P of Y=y given that we first observe Z=z and then do X=x. Counterfactuals give us a way of expressing this probability, by writing
P = P(y_x | z). Note that P = P(y_x | z) = P(y | do(x), z) if z is a non-descendant of x.
I have elaborated on this point in
Pearl, J., "The logic of counterfactuals in causal inference (Discussion of 'Causal inference without counterfactuals' by A.P. Dawid)," Journal of the American Statistical Association, Vol. 95, No. 450, 428–435, June 2000.
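The difference between "do X=x and later observe Z=z" and "first observe Z=z and then do X=x" can be checked on a toy model. The following is a minimal sketch, not taken from the book: the binary structural equations, the probabilities, and all function names are assumptions made up for illustration. Z is a consequence (descendant) of X, while W is a noisy reading of the background factor U that does not depend on X.

```python
from fractions import Fraction
from itertools import product

# Hypothetical independent exogenous variables.
P_U = {0: Fraction(1, 2), 1: Fraction(1, 2)}     # background factor
P_UX = {0: Fraction(1, 2), 1: Fraction(1, 2)}    # sets X when nobody intervenes
P_N = {0: Fraction(9, 10), 1: Fraction(1, 10)}   # measurement noise in W

def solve(u, ux, n, do_x=None):
    """Solve the (assumed) structural equations; do_x replaces X's equation."""
    w = u ^ n                          # W: non-descendant of X
    x = ux if do_x is None else do_x
    z = x ^ u                          # Z: descendant of X
    y = x & u                          # Y: outcome
    return {"W": w, "X": x, "Z": z, "Y": y}

def worlds():
    """Enumerate exogenous settings together with their probabilities."""
    for u, ux, n in product(P_U, P_UX, P_N):
        yield (u, ux, n), P_U[u] * P_UX[ux] * P_N[n]

def p_do_then_see(x, var, val):
    """P(Y=1 | do(X=x), var=val): the evidence is observed AFTER the action."""
    num = den = Fraction(0)
    for (u, ux, n), p in worlds():
        post = solve(u, ux, n, do_x=x)
        if post[var] == val:
            den += p
            num += p * post["Y"]
    return num / den

def p_see_then_do(x, var, val):
    """P(Y_x=1 | var=val): the evidence is observed BEFORE the action."""
    num = den = Fraction(0)
    for (u, ux, n), p in worlds():
        if solve(u, ux, n)[var] == val:                 # condition on the actual world
            den += p
            num += p * solve(u, ux, n, do_x=x)["Y"]     # same u, X forced to x
    return num / den

# Z is a descendant of X: the two quantities differ.
print(p_do_then_see(1, "Z", 1), p_see_then_do(1, "Z", 1))   # 0 1/2

# W is a non-descendant of X: the two quantities coincide.
print(p_do_then_see(1, "W", 1), p_see_then_do(1, "W", 1))   # 9/10 9/10
```

In this sketch the counterfactual quantity is computed by conditioning on the evidence in the unmodified model and then re-solving the equations with X forced to x under the same exogenous values, which is the usual abduction, action, prediction recipe. Only for the descendant Z does this disagree with the ordinary do-expression.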
Thanks for your illuminating questions. I hope that they, together with my attempted answers, will help other readers with similar difficulties.
Best wishes,
Judea Pearl
Comment by judea — November 2, 2006 @ 1:10 am
A new causality blog
A group of from University of California in Los Angeles, including the popular author of books on Bayesian networks (sometimes referred to as belief networks or as graphical models, as they aren’t Bayesian in the Bayesian statistics sense) and causali…
Trackback by Statistical Modeling, Causal Inference, and Social Science — January 24, 2007 @ 5:40 pm
For what are counterfactuals useful? Well, for saying what I am doing now: if you hadn't asked the question, I wouldn't be writing this. It is a matter of fact that the definition of culpability in English and American tort law requires a counterfactual determination of causality, whether you think that appropriate or not. So if you were to be a tort lawyer, you would have to reason with them; it would obviously be useful for your profession (there! Another two counterfactuals!).
Another example: the definition of "Class A Mishap" (= most serious type of accident) for the U.S. Air Force explicitly uses a counterfactual formulation. So if you are a USAF accident investigator, counterfactuals are central to your work. Indeed, judgements of causal influence in aircraft accident investigation are often made using a counterfactual criterion. Two divisions of Siemens Transportation Systems, the Rail Automation Division, which is the world market leader in rail signalling systems, and the Mass Transit Division, which builds trams amongst other things, use a counterfactual criterion for determining causality when they hold internal investigations into product defects (they both use my causal analysis method WBA).
Observe that accidents are once-only things. In our experience, no two accidents are causally identical. So it is mostly infeasible to use Bayesian-net techniques, or indeed any technique that requires a Humean "constant conjunction" interpretation of causality, to establish the accident causality pattern. You don't get to iterate and test, and the available statistics reflect far more human decisions on what is important than they reflect objective causal influences. WBA uses the Lewis counterfactual interpretation of causal factor to establish causality, and the counterfactuals themselves are expert judgments.
Peter Ladkin
Comment by Peter Ladkin — February 23, 2007 @ 12:49 am
Dear Dr. Hoyer,
I wanted to comment on your questions regarding the utility of counterfactuals in causal reasoning. One area in which I believe that counterfactuals are very useful and natural is in explanatory arguments, including legal ones. Counterfactuals provide a nice tool for dealing with the characteristic opacity of causal reports. However general causal knowledge is, causal reports refer to particulars in the world: that is, to such-and-such an event happening at a particular time and in a particular circumstance. As a very simple example, consider my driving to work in the morning and arriving late because of the traffic I encountered along Highway 101. The following are two possible causal reports:
- Driving on Highway 101 to work caused me to be late for work.
- Driving to work in my Nissan caused me to be late for work.
The first is the desirable conclusion (i.e., I did not have any car problems, it was the traffic on 101). However, the event referred to in the antecedent of the second report is clearly a true description of what I did (that I drove my Nissan to work). Notice that we cannot assume that the underlying causal knowledge will be structured so as to be instantiated for every possible pair of descriptions; i.e., Drive(Me,101,Work) CAUSES Arrive(Me,Late,Work). Counterfactual reasoning, however, provides a straightforward and intuitive way to deal with the problem of opacity: If I had NOT taken Highway 101 to work I would not have been late for work. (Here negation is assumed to have narrow scope: I take some other highway.)
A further use for causal reasoning in an explanatory setting involves the proper formalization of causal connectives often made use of in causal reports. These include terms such as: prevents, enables, helps, hinders, etc. All of these have natural interpretations in counterfactual terms, reflecting different levels of causal connectedness.
I found your example in (1) quite interesting. I think that what is missing, that leads to the paradoxical conclusion you observe, is some attribution of agency (i.e., control over actions or background intention relative to the action in question) on the part of the person making the bet. So, while it seems completely reasonable to state that the agent intended to throw the dice, he did not intend to throw a pair of sixes (at least if he is rational) and so is not responsible for the team losing the dollar because he threw two sixes. I think, however, one could make an argument that by betting the way he did, or by betting at all, he was responsible for the team losing.
If you are interested in some of my work in this area, please take a look at:
Ortiz, Charles L., A commonsense language for reasoning about causation and rational action. AI Journal, vol. 111, no. 2, pp. 73, 1999.
Ortiz, Charles L., Explanatory Update Theory: applications of counterfactual reasoning to causation, vol. 108, no. 1-2, pp. 125-178, 1999.
Charlie Ortiz
Artificial Intelligence Center
SRI International
Comment by Charlie Ortiz — February 26, 2007 @ 12:41 pm