### In Defense of Unification (Comments on West and Koch’s review of *Causality*)

A new review of my book *Causality* (Pearl, 2009) has appeared in the Journal of Structural Equation Modeling (SEM), authored by Stephen West and Tobias Koch (W-K). See http://bayes.cs.ucla.edu/BOOK-2K/west-koch-review2014.pdf

I find the main body of the review quite informative, and I thank the reviewers for taking the time to give SEM readers an accurate summary of each chapter, as well as a lucid description of the key ideas that tie the chapters together. However, when it comes to accepting the logical conclusions of the book, the reviewers seem reluctant, and tend to cling to traditions that lack the language, tools and unifying perspective to benefit from the chapters reviewed.

The reluctance culminates in the following paragraph:

“We value Pearl’s framework and his efforts to show that other frameworks can be translated into his approach. Nevertheless we believe that there is much to be gained by also considering the other major approaches to causal inference.”

W-K seem to value my “efforts” toward unification, but not the unification itself, and we are not told whether they doubt the validity of the unification, or whether they doubt its merits.

Or do they accept the merits and still see “much to be gained” by pre-unification traditions? If so, what is it that can be gained by those traditions and why can’t these gains be achieved within the unified framework presented in *Causality*?

In their explanation of why they do not embrace the unification whole-heartedly, they seem to be speaking about another book entirely, not the one I have written. In their words:

“[In Pearl’s framework] there are relatively few unknown causal influences in the system, the units tend to be homogeneous, causal effects are assumed to be stable across settings and time, and levels of treatments can be set to specified values as reflected in the do(x) operator….”

This description would come as a total surprise to readers of *Causality*, because none of the listed shortcomings apply to the methods described in the book, as I will soon elaborate, and as would be attested by anyone who tries to solve concrete problems using several alternative methods. To the best of my knowledge, chapter 7 of *Causality* (especially pages 231-234) is still the only text available in which concrete problems are solved side by side in both the structural and potential outcome approaches. I invite readers to examine, for example, the smoking-tar-cancer story on page 83 of *Causality* and determine, using “other major approaches,” whether the effect of smoking on cancer can be estimated from the assumed model. It is a simple example, only a few lines of text, but truly instructive; Heckman and Pinto (2014) have labored over 20 full pages trying to fit and solve it within “traditional econometrics.” See Econometric Theory, doi:10.1017/S026646661400022X

Thus, can the statement “much to be gained by other major approaches” be defended without trying out those approaches on concrete examples?
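For readers who want to try the smoking-tar-cancer example numerically, here is a minimal sketch. It assumes a binary model in which a hidden factor U affects both smoking (X) and cancer (Y), while tar (Z) depends on smoking alone. All probabilities are made up for illustration, and the function names are hypothetical. The sketch computes the front-door estimate from the observational distribution P(x, z, y) only, and compares it with the interventional distribution computed from the full model (which an analyst would not have).

```python
import numpy as np

# Hypothetical parameters (illustrative only, not from the book or the review).
p_u = np.array([0.5, 0.5])                  # P(U=u), U is unobserved
p_x_given_u = np.array([[0.8, 0.2],         # P(X=x | U=0)
                        [0.2, 0.8]])        # P(X=x | U=1)
p_z_given_x = np.array([[0.9, 0.1],         # P(Z=z | X=0)
                        [0.1, 0.9]])        # P(Z=z | X=1)
p_y1_given_zu = np.array([[0.1, 0.5],       # P(Y=1 | Z=0, U=u)
                          [0.3, 0.9]])      # P(Y=1 | Z=1, U=u)
# Indexed [z, u, y]
p_y_given_zu = np.stack([1 - p_y1_given_zu, p_y1_given_zu], axis=-1)

# Full joint P(u, x, z, y), then marginalize out the hidden U.
joint = np.einsum("u,ux,xz,zuy->uxzy",
                  p_u, p_x_given_u, p_z_given_x, p_y_given_zu)
p_xzy = joint.sum(axis=0)                   # observational P(x, z, y)


def front_door(p_xzy):
    """P(y | do(x)) = sum_z P(z|x) sum_x' P(y|x',z) P(x'), from P(x,z,y) only."""
    p_x = p_xzy.sum(axis=(1, 2))            # P(x)
    p_xz = p_xzy.sum(axis=2)                # P(x, z)
    pz_x = p_xz / p_x[:, None]              # P(z | x)
    py_xz = p_xzy / p_xz[:, :, None]        # P(y | x, z)
    inner = np.einsum("xzy,x->zy", py_xz, p_x)   # sum_x' P(y|x',z) P(x')
    return pz_x @ inner                     # indexed [x, y]


def ground_truth():
    """P(y | do(x)) computed from the full model, using the hidden U."""
    return np.einsum("u,xz,zuy->xy", p_u, p_z_given_x, p_y_given_zu)


def naive_conditional(p_xzy):
    """Ordinary conditioning P(y | x), which is biased by the hidden U."""
    p_xy = p_xzy.sum(axis=1)
    return p_xy / p_xy.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    print("front-door P(Y=1 | do(X=1)):", front_door(p_xzy)[1, 1])
    print("true       P(Y=1 | do(X=1)):", ground_truth()[1, 1])
    print("naive      P(Y=1 | X=1)    :", naive_conditional(p_xzy)[1, 1])
```

In this toy model the front-door estimate coincides with the interventional quantity even though U is never observed, while plain conditioning on X does not.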

I assume W-K reached their first conclusion (that the structural approach assumes few unknown causal influences) by counting variables in the examples illustrated in *Causality* and comparing them to the hundreds of “unknown causal influences” that haunt the complex real-life problems that they and other researchers are undoubtedly “very concerned about.” But if one accepts the logical equivalence of the structural and potential outcome approaches, then surely the same assumptions that permit a potential outcome practitioner to ignore the many “unknown causal influences” in a messy practical problem can also be invoked in the structural approach, and achieve identical results. The only difference would be the transparency of the justification.

In fact, I am not aware of a system with many causal influences that can be analyzed through any other representation language except structural equations. Specifically, I have not seen ignorability assumptions ever justified in a system with more than 3-4 variables. (Skeptics are invited to try it on Fig. 7.5 on page 232 of *Causality.*) What, then, is the utility of boasting about being “very concerned about” many “unknown causal influences” if one cannot handle even a toy problem with 3-4 causal influences? And what is the utility of comparing variable counts in textbook exercises versus unknown variables in ill-understood real life problems?

I am similarly puzzled by W-K’s other findings. For example: “[In Pearl’s framework] the units tend to be homogeneous.” The opposite is in fact the case; unit heterogeneity is assumed a priori in nonparametric models. See http://ftp.cs.ucla.edu/pub/stat_ser/r406.pdf

Or, “[In Pearl’s framework] causal effects are assumed to be stable across settings and time.” Not so. Causal effects are allowed to vary violently across settings (see page 113 and pages 354-355) and, when dynamic treatments are concerned (as in chapter 4), they are allowed to vary across time as well.

Or, “levels of treatments can be set [externally] to specified values.” Not so! Levels of treatments need not be set at all, as in the case of noncompliance (chapter 8), where the exposure X=x may not result from direct intervention. The do(x) operator does not require one to imagine the physical manipulation of X (as is required in the potential outcome approach). It is merely a notational device that distinguishes counterfactual conditionals from probabilistic conditionalization. I have yet to see a policy question that cannot be expressed using the do(x) operator or its unit-level companions: counterfactual conditionals.
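To make the notational distinction concrete, consider a sketch under an assumed toy model (not one discussed in the review): a background factor U influences both X and Y, and X influences Y. The two conditionals then differ only in how U is weighted:

```latex
\begin{align*}
P(y \mid x)        &= \sum_{u} P(y \mid x, u)\, P(u \mid x)
  && \text{(probabilistic conditioning)} \\
P(y \mid do(x))    &= \sum_{u} P(y \mid x, u)\, P(u)
  && \text{(interventional conditional)}
\end{align*}
```

The two expressions coincide whenever U is independent of X, and in general they do not. Nothing in the second expression refers to how, or whether, X is physically manipulated.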

W-K present the following arguments in favor of Rubin’s potential outcomes approach:

(i) “Rubin’s potential outcomes model defines a causal effect as the difference between the response of a single unit given treatment or control at the same time and in the same context.”

Since the structural definition of Y_x(u) is logically equivalent to Rubin’s Y_x(u), why should not the former inherit all the merits of the latter? It indeed does. Chapter 9 and pp. 396-398 define and deal with unit-level counterfactuals, and show that, contrary to conventional wisdom, properties of unit-level counterfactuals can be inferred from population data. (See also http://ftp.cs.ucla.edu/pub/stat_ser/r431.pdf and http://ftp.cs.ucla.edu/pub/stat_ser/r375.pdf)

(ii) “Rubin does not make homogeneity assumptions.”

Nor does the structural approach, as explained above.

(iii) “Rubin’s perspective also offers procedures for addressing situations in which assumptions fail (e.g., treatment noncompliance).”

Treatment noncompliance is not a situation in which assumptions fail. Rather, it is a situation in which the assumptions are different than those made in controlled randomized trials and, not surprisingly, by being explicit about the new set of assumptions, the structural framework has produced some of the most ambitious conclusions that partially identified situations would permit. For example, chapter 8 of *Causality* derives universal bounds for ACE and ETT under noncompliance. It then supplements them with instrumental inequalities, as well as with vivid illustrations of how the Bayesian posterior of ACE varies with prior assumptions and with sample size. These developments were hardly shy of “addressing situations in which assumptions fail (e.g., treatment noncompliance).”

(iv) “Rubin often works in the environment of health research.”

My latest readings of the health research literature assure me that epidemiologists and bio-statisticians have adopted DAGs as a preferred communication language (see, e.g., Glymour (2006), Wilcox (2006), Rothman et al. (2008), Howards et al. (2012)). Social and behavioral scientists are not far behind (see, e.g., Morgan (2013), Lee (2012)). I am surprised that W-K are not aware of this trend.

As another alternative approach to causal inference, W-K hail Campbell’s perspective which I, confessedly, have not been able to use in my research. True, Campbell “offers researchers lists of threats to internal [and external] validity.” These lists were probably very useful three decades ago, prior to the mathematization of causality. But I am curious to learn how they can be used today, when new tools are available, which permit us to detect and neutralize threats with mathematical precision. Of course, I would not rule out the possibility that Campbell’s lists are still used by researchers who seek heuristic shortcuts to circumvent formal analysis. Such shortcuts benefit many sciences, and can be practiced in harmony with the formal approach, as long as we understand the boundaries of their applicability. (I am reminded of the laws of thermodynamics that were formulated and practiced prior to their derivation from the more fundamental principles of statistical mechanics, and are still used today by engineers and physicists.) What we need therefore is a detailed analysis of how Campbell’s heuristics follow from the structural theory of causation, and in what types of situations they provide good approximations to the latter. Such analysis would be illuminating, and I hope someone from Campbell’s camp undertakes this task.

At the same time, we must acknowledge that it was graphical methods that produced solutions to the two problems that Campbell and Rubin set out to solve three or four decades ago: external validity, in the case of Campbell, and internal validity (or control of confounding) in the case of Rubin. The first is mathematized in (Pearl and Bareinboim, 2014) and the second in (Shpitser and Pearl, 2008). Threats may warn us of problems; they do not solve problems.

Finally, W-K’s review ends with a recommendation that is both puzzling and revealing. It states: “We encourage SEM researchers following Pearl’s framework to be skeptical, and to fiercely confront their preferred model with strong alternative models in the tradition of economics.”

The puzzling part is that this recommendation is addressed to “followers of Pearl’s framework” and not to those following the alternative perspectives of Rubin or Campbell. After all, Rubin too makes modeling assumptions (in W-K’s words: “Rubin assumes strong ignorability…”), so why do we not hear an encouragement to be skeptical of these ignorability assumptions as well?

I know the answer, and I think W-K know it too — one cannot be skeptical of assumptions that one does not comprehend. Ignorability assumptions are cognitively formidable, hence they are not meant to be understood and to be judged for plausibility. They are made “because they justify the use of available statistical methods and not because they are truly believed” (Joffe, 2010). By contrast, these same assumptions are shining loud and clear in the structural approach, and can therefore be submitted to skeptical scrutiny. Moreover, the structural representation explicates the testable implications of its modeling assumptions, while the potential outcome representation does not.

To summarize: If you are a follower of the structural approach, you should take W-K’s recommendation with great pride. You were singled out to be skeptical of your model, because you are uniquely capable of explicating and interpreting your modeling assumptions and uniquely capable of testing their implications; your rivals can’t.

I invite Stephen West and Tobias Koch to an open discussion on whether any other interpretation of their recommendation should be considered, given the unique capabilities of the methodology advanced in *Causality*.

**Refs.**

Glymour, M. M. (2006) Using causal diagrams to understand common problems in social epidemiology. In Methods in Social Epidemiology. John Wiley and Sons, San Francisco, CA, 393-428.

Howards, Schisterman, Poole, Kaufman and Weinberg (2012) “Toward a Clearer Definition of Confounding” Revisited with Directed Acyclic Graphs. American Journal of Epidemiology 176(6): 506-511.

Joffe, Marshall M.; Yang, Wei Peter; and Feldman, Harold I. (2010) “Selective Ignorability Assumptions in Causal Inference,” The International Journal of Biostatistics: Vol. 6: Iss. 2, Article 11. DOI: 10.2202/1557-4679.1199 Available at: http://www.bepress.com/ijb/vol6/iss2/11

Lee, J. J. (2012) Correlation and causation in the study of personality. European Journal of Personality 26: 372-390.

Morgan, S.L., (Ed) Handbook of Causal Analysis for Social Research, Springer, 2013.

Pearl, J. (2009) Causality: Models, Reasoning and Inference, Cambridge University Press (2nd Edition).

Pearl and Bareinboim (2014) External validity: From do-calculus to transportability across populations. http://ftp.cs.ucla.edu/pub/stat_ser/r400.pdf. Forthcoming Statistical Science.

Rothman, K.J., Greenland S. & Lash T.L., eds. (2008) Modern Epidemiology, 3rd ed. Philadelphia: Lippincott.

Shpitser and Pearl (2008) Complete identification methods for the causal hierarchy. Journal of Machine Learning Research 9: 1941-1979. http://ftp.cs.ucla.edu/pub/stat_ser/r336-published.pdf

Wilcox, Allen (2006), “The Perils of Birth Weight – A Lesson from Directed Acyclic Graphs” American Journal of Epidemiology 164(11):1121-1123