# Causal Analysis in Theory and Practice

## November 29, 2014

### On the First Law of Causal Inference

Filed under: Counterfactual,Definition,Discussion,General — judea @ 3:53 am

In several papers and lectures I have used the rhetorical title “The First Law of Causal Inference” when referring to the structural definition of counterfactuals:

Y_x(u) = Y_{M_x}(u)     (1)

The more I talk with colleagues and students, the more I am convinced that the equation deserves the title. In this post, I will explain why.

As many readers of Causality (Ch. 7) would recognize, Eq. (1) defines the potential outcome, or counterfactual, Y_x(u) in terms of a structural equation model M and a submodel, M_x, in which the equation determining X is replaced by a constant X=x. Computationally, the definition is straightforward. It says that, if you want to compute the counterfactual Y_x(u), namely, to predict the value that Y would take had X been x (in unit U=u), all you need to do is, first, mutilate the model by replacing the equation for X with X=x and, second, solve for Y. What you get IS the counterfactual Y_x(u). Nothing could be simpler.
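To make the two-step recipe concrete, here is a minimal Python sketch. The model `M`, its two equations, and the helper names `solve`, `mutilate`, and `counterfactual` are illustrative assumptions, not taken from Causality; any recursive set of equations would do.

```python
def solve(equations, u):
    """Solve a recursive (acyclic) model by evaluating its equations in order."""
    values = {}
    for name, f in equations.items():
        values[name] = f(values, u)
    return values


def mutilate(equations, var, value):
    """Build the submodel M_x: the equation for `var` is replaced by a constant."""
    modified = dict(equations)  # replacing a key keeps its position in the order
    modified[var] = lambda v, u: value
    return modified


def counterfactual(equations, u, var, value, outcome):
    """Y_x(u): solve the mutilated model for `outcome`."""
    return solve(mutilate(equations, var, value), u)[outcome]


# Hypothetical model M: X listens to u_x; Y listens to X and u_y.
M = {
    "X": lambda v, u: u["u_x"],
    "Y": lambda v, u: 2 * v["X"] + u["u_y"],
}

unit = {"u_x": 1, "u_y": 3}
actual_y = solve(M, unit)["Y"]                          # 2*1 + 3
y_had_x_been_0 = counterfactual(M, unit, "X", 0, "Y")   # 2*0 + 3
```

For this unit, Y is actually 5 and would have been 3 had X been 0. Nothing beyond the original equations and a substitution is involved.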

So, why is it so “fundamental”? Because from this definition we can also get probabilities on counterfactuals (once we assign probabilities, P(U=u), to the units), joint probabilities of counterfactuals and observables, conditional independencies over counterfactuals, graphical visualization of potential outcomes, and many more. [Including, of course, Rubin’s “science”, Pr(X, Y(0), Y(1))]. In short, we get everything that an astute causal analyst would ever wish to define or estimate, given that he/she is into solving serious problems in causal analysis, say policy analysis, or attribution, or mediation. Eq. (1) is “fundamental” because everything that can be said about counterfactuals can also be derived from this definition.
[See the following papers for illustration and operationalization of this definition:
http://ftp.cs.ucla.edu/pub/stat_ser/r431.pdf
http://ftp.cs.ucla.edu/pub/stat_ser/r391.pdf
http://ftp.cs.ucla.edu/pub/stat_ser/r370.pdf
also, Causality chapter 7.]
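As a rough illustration of how probabilities of counterfactuals follow once P(U=u) is assigned, the sketch below enumerates the units of a tiny binary model and tabulates the joint Pr(X, Y(0), Y(1)). The mechanisms `f_x` and `f_y` and the probabilities in `p_u` are assumptions chosen only for the example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def f_x(u):
    """Hypothetical equation for X: it simply copies u_x."""
    return u[0]


def f_y(x, u):
    """Hypothetical equation for Y: Y = 1 if either X = 1 or u_y = 1."""
    return max(x, u[1])


# Assumed P(U=u): u_x and u_y independent, P(u_x=1) = 1/2, P(u_y=1) = 1/4.
p_u = {
    (ux, uy): Fraction(1, 2) * (Fraction(1, 4) if uy else Fraction(3, 4))
    for ux, uy in product((0, 1), repeat=2)
}

joint = defaultdict(Fraction)  # Pr(X, Y(0), Y(1))
for u, p in p_u.items():
    x = f_x(u)        # observed X in the unmutilated model
    y0 = f_y(0, u)    # Y_0(u): equation for X replaced by X = 0
    y1 = f_y(1, u)    # Y_1(u): equation for X replaced by X = 1
    joint[(x, y0, y1)] += p

# Average causal effect E[Y(1) - Y(0)], read directly off the joint.
ace = sum(p * (y1 - y0) for (x, y0, y1), p in joint.items())
```

Every quantity here is a sum over units u, weighted by P(U=u); the counterfactuals themselves come from Eq. (1) applied unit by unit.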

However, it recently occurred to me that the conceptual significance of this definition is not fully understood among causal analysts, not only among “potential outcome” enthusiasts, but also among structural equations researchers who practice causal analysis in the tradition of Sewall Wright, O.D. Duncan, and Trygve Haavelmo. Commenting on the flood of methods and results that emerge from this simple definition, some writers view it as a mathematical gimmick that, while worthy of attention, needs to be regarded with suspicion. Others labeled it “an approach” that needs to be considered together with “other approaches” to causal reasoning, but not as a definition that justifies and unifies those other approaches.

Even authors who advocate a symbiotic approach to causal inference — graphical and counterfactuals — occasionally fail to realize that the definition above provides the logic for any such symbiosis, and that it constitutes in fact the semantical basis for the potential-outcome framework.

I will start by addressing the non-statisticians among us; i.e., economists, social scientists, psychometricians, epidemiologists, geneticists, meteorologists, environmental scientists and more, namely, empirical scientists who have been trained to build models of reality to assist in analyzing data that reality generates. To these readers I want to assure that, in talking about model M, I am not talking about a newly invented mathematical object, but about your favorite and familiar model that has served as your faithful oracle and guiding light since college days, the one that has kept you cozy and comfortable whenever data misbehaved. Yes, I am talking about the equation

that you put down when your professor asked: How would household spending vary with income, or, how would earnings increase with education, or how would cholesterol level change with diet, or how would the length of the spring vary with the weight that loads it. In short, I am talking about innocent equations that describe what we assume about the world. They now call them “structural equations” or SEM in order not to confuse them with regression equations, but that does not make them more of a mystery than apple pie or pickled herring. Admittedly, they are a bit mysterious to statisticians, because statistics textbooks rarely acknowledge their existence [Historians of statistics, take notes!] but, otherwise, they are the most common way of expressing our perception of how nature operates: A society of equations, each describing what nature listens to before determining the value it assigns to each variable in the domain.

Why am I elaborating on this perception of nature? To allay any fears that what is put into M is some magical super-smart algorithm that computes counterfactuals to impress the novice, or to spitefully prove that potential outcomes need no SUTVA, nor manipulation, nor missing data imputation; M is none other than your favorite model of nature and, yet, please bear with me, this tiny model is capable of generating, on demand, all conceivable counterfactuals: Y(0), Y(1), Y_x, Y_{127}, X_z, Z(X(y)), etc., on and on. Moreover, every time you compute these potential outcomes using Eq. (1) they will obey the consistency rule, and their probabilities will obey the laws of probability calculus and the graphoid axioms. And, if your model justifies “ignorability” or “conditional ignorability,” these too will be respected in the generated counterfactuals. In other words, ignorability conditions need not be postulated as auxiliary constraints to justify the use of available statistical methods; no, they are derivable from your own understanding of how nature operates.
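The consistency rule (if X(u)=x, then Y_x(u)=Y(u)) can be checked mechanically on any small model. The sketch below does so for a hypothetical three-variable model with arrows X → W → Y and X → Y; the equations and the `solve` helper are assumptions made up for the illustration.

```python
from itertools import product


def solve(u, do=None):
    """Solve a hypothetical model; `do` maps variables to forced constants."""
    do = do or {}
    u_x, u_w, u_y = u
    v = {}
    v["X"] = do["X"] if "X" in do else u_x
    v["W"] = do["W"] if "W" in do else (v["X"] + u_w) % 2
    v["Y"] = do["Y"] if "Y" in do else v["X"] * v["W"] + u_y
    return v


violations = []
for u in product((0, 1), repeat=3):
    actual = solve(u)
    for var in ("X", "W"):
        # Force `var` to the value it took anyway, then re-solve the submodel.
        forced = solve(u, do={var: actual[var]})
        if forced != actual:
            violations.append((u, var))
```

The list `violations` stays empty for every unit: replacing an equation by the value it would have produced anyway cannot change the solution, which is why consistency need not be postulated separately.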

In short, it is a miracle.

Not really! It should be self-evident. Counterfactuals must be built on the familiar if we wish to explain why people communicate with counterfactuals starting at age 4 (“Why is it broken?” “Let’s pretend we can fly”). The same applies to science; scientists have communicated with counterfactuals for hundreds of years, even though the notation and mathematical machinery needed for handling counterfactuals were made available to them only in the 20th century. This means that the conceptual basis for a logic of counterfactuals resides already within the scientific view of the world, and need not be crafted from scratch; it need not divorce itself from the scientific view of the world. It surely should not divorce itself from scientific knowledge, which is the source of all valid assumptions, or from the format in which scientific knowledge is stored, namely, SEM.

Here I am referring to people who claim that potential outcomes are not explicitly represented in SEM, and explicitness is important. First, this is not entirely true. I can see (Y(0), Y(1)) in the SEM graph as explicitly as I see whether ignorability holds there or not. [See, for example, Fig. 11.7, page 343 in Causality]. Second, once we accept SEM as the origin of potential outcomes, as defined by Eq. (1), counterfactual expressions can enter our mathematics proudly and explicitly, with all the inferential machinery that the First Law dictates. Third, consider by analogy the teaching of calculus. It is feasible to teach calculus as a stand-alone symbolic discipline without ever mentioning the fact that y'(x) is the slope of the function y=f(x) at point x. It is feasible, but not desirable, because it is helpful to remember that f(x) comes first, and all other symbols of calculus, e.g., f'(x), f''(x), [f(x)/x]', etc. are derivable from one object, f(x). Likewise, all the rules of differentiation are derived from interpreting y'(x) as the slope of y=f(x).

First, I would have liked to convince potential outcome enthusiasts that they are doing harm to their students by banning structural equations from their discourse, thus denying them awareness of the scientific basis of potential outcomes. But this attempted persuasion has been going on for the past two decades and, judging by the recent exchange with Guido Imbens (link), we are not closer to an understanding than we were in 1995. Even an explicit demonstration of how a toy problem would be solved in the two languages (link) did not yield any result.

Second, I would like to call the attention of SEM practitioners, including of course econometricians, quantitative psychologists and political scientists, and explain the significance of Eq. (1) in their fields. To them, I wish to say: If you are familiar with SEM, then you have all the mathematical machinery necessary to join the ranks of modern causal analysis; your SEM equations (hopefully in nonparametric form) are the engine for generating and understanding counterfactuals. True, your teachers did not alert you to this capability; it is not their fault, they did not know of it either. But you can now take advantage of what the First Law of causal inference tells you. You are sitting on a gold mine; use it.

Finally, I would like to reach out to authors of traditional textbooks who wish to introduce a chapter or two on modern methods of causal analysis. I have seen several books that devote 10 chapters to the SEM framework: identification, structural parameters, confounding, instrumental variables, selection models, exogeneity, model misspecification, etc., and then add a chapter to introduce potential outcomes and cause-effect analyses as useful newcomers, yet alien to the rest of the book. This leaves students to wonder whether the first 10 chapters were worth the labor. Eq. (1) tells us that modern tools of causal analysis are not newcomers, but follow organically from the SEM framework. Consequently, one can leverage the study of SEM to make causal analysis more palatable and meaningful.

Please note that I have not mentioned graphs in this discussion; the reason is simple: graphical modeling constitutes The Second Law of Causal Inference.

Enjoy both,
Judea

## November 16, 2014

### On DAGs, Instruments and Social Networks

Filed under: General — eb @ 4:26 pm

Apropos the lively discussion we have had here on graphs and IV models, Felix Elwert writes:

“Dear Judea, here’s an IV paper using DAGs that we recently published. We use DAGs to evaluate genes as instrumental variables for causal peer effects in social networks. The substantive question is whether obesity is contagious among friends. We evaluated both single-IV and IV-set candidates. We found DAGs especially helpful for evaluating variations within a large class of qualitative data-generating processes by playing through lots of increasingly realistic variations of our main DGPs (Table 1). Being able to discuss identification purely in terms of qualitative causal statements (e.g., “fat genes may affect latent causes of friendship formation”) was very helpful. One of our goals was to check how far one can relax the DGP before identification would break down. After permitting a host of potential hurdles (i.e., eliminating exclusions left and right by drawing lots of additional arrows), we concluded that genes alone won’t realistically work, but that time-varying gene expression might give valid IVs for peer effects. Under linearity, we found suggestive evidence for the transmission of obesity in social networks.”

All the best,

Felix

## November 9, 2014

### Causal inference without graphs

Filed under: Counterfactual,Discussion,Economics,General — moderator @ 3:45 am

In a recent posting on this blog, Elias and Bryant described how graphical methods can help decide if a pseudo-randomized variable, Z, qualifies as an instrumental variable, namely, if it satisfies the exogeneity and exclusion requirements associated with the definition of an instrument. In this note, I aim to describe how inferences of this type can be performed without graphs, using the language of potential outcomes. This description should give students of causality an objective comparison of graph-less vs. graph-based inferences. See my exchange with Guido Imbens [here].

Every problem of causal inference must commence with a set of untestable, theoretical assumptions that the modeler is prepared to defend on scientific grounds. In structural modeling, these assumptions are encoded in a causal graph through missing arrows and missing latent variables. Graphless methods encode these same assumptions symbolically, using two types of statements:

1. Exclusion restrictions, and
2. Conditional independencies among observable and potential outcomes.

For example, consider the causal Markov chain X → Y → Z, which represents the structural equations:

y = f(x, ε_1)     (1)

z = g(y, ε_2)     (2)

with ε_1 and ε_2 being omitted factors such that X, ε_1, and ε_2 are mutually independent.

These same assumptions can also be encoded in the language of counterfactuals, as follows:

(3) represents the missing arrow from X to Z, and (4)-(6) convey the mutual independence of X, ε_1, and ε_2.
[Remark: General rules for translating graphical models to counterfactual notation are given in Pearl (2009, pp. 232-234).]

Assume now that we are given the four counterfactual statements (3)-(6) as a specification of a model; what machinery can we use to answer questions that typically come up in causal inference tasks? One such question is, for example, is the model testable? In other words, is there an empirical test conducted on the observed variables X, Y, and Z that could prove (3)-(6) wrong? We note that none of the four defining conditions (3)-(6) is testable in isolation, because each invokes an unmeasured counterfactual entity. On the other hand, the fact that the equivalent graphical model advertises the conditional independence of X and Z given Y, X _||_ Z | Y, implies that the combination of all four counterfactual statements should yield this testable implication.
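The testable implication X _||_ Z | Y can be verified by brute force on a small binary instance of the chain. The sketch below is only an illustration: the XOR mechanisms, the names `e1` and `e2` for the two omitted factors, and their probabilities are assumptions chosen so that the joint distribution can be computed exactly.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed distributions for the mutually independent X, e1, e2.
P_X = {0: Fraction(1, 2), 1: Fraction(1, 2)}
P_E1 = {0: Fraction(3, 4), 1: Fraction(1, 4)}
P_E2 = {0: Fraction(2, 3), 1: Fraction(1, 3)}


def joint_distribution(g):
    """Exact joint P(x, y, z) for y = x XOR e1 and z = g(x, y, e2)."""
    joint = defaultdict(Fraction)
    for x, e1, e2 in product((0, 1), repeat=3):
        y = x ^ e1
        z = g(x, y, e2)
        joint[(x, y, z)] += P_X[x] * P_E1[e1] * P_E2[e2]
    return joint


def x_indep_z_given_y(joint):
    """Check P(x, y, z) * P(y) == P(x, y) * P(y, z) in every cell."""
    p_y = defaultdict(Fraction)
    p_xy = defaultdict(Fraction)
    p_yz = defaultdict(Fraction)
    for (x, y, z), p in list(joint.items()):
        p_y[y] += p
        p_xy[(x, y)] += p
        p_yz[(y, z)] += p
    return all(
        joint[(x, y, z)] * p_y[y] == p_xy[(x, y)] * p_yz[(y, z)]
        for x, y, z in product((0, 1), repeat=3)
    )


# The chain X -> Y -> Z: Z listens only to Y and e2.
chain = joint_distribution(lambda x, y, e2: y ^ e2)
# A rival model that adds a direct arrow X -> Z.
with_direct_arrow = joint_distribution(lambda x, y, e2: x ^ y ^ e2)

chain_passes = x_indep_z_given_y(chain)
direct_arrow_passes = x_indep_z_given_y(with_direct_arrow)
```

In this instance the chain satisfies the independence exactly, while the model with the extra arrow violates it, so observational data on X, Y, and Z could in principle refute the chain assumptions.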

Another question often posed to causal inference is that of identifiability, for example, whether the causal effect of X on Z is estimable from observational studies.

Whereas graphical models enjoy inferential tools such as d-separation and do-calculus, potential-outcome specifications can use the axioms of counterfactual logic (Galles and Pearl, 1998; Halpern, 1998) to determine identification and testable implications. In a recent paper, I have combined the graphoid and counterfactual axioms to provide such symbolic machinery (link).

However, the aim of this note is not to teach potential outcome researchers how to derive the logical consequences of their assumptions but, rather, to give researchers the flavor of what these derivations entail, and the kind of problems the potential outcome specification presents vis-à-vis the graphical representation.

As most of us would agree, the chain appears more friendly than the 4 equations in (3)-(6), and the reasons are both representational and inferential. On the representational side we note that it would take a person (even an expert in potential outcomes) a pause or two to affirm that (3)-(6) indeed represent the chain process he/she has in mind. More specifically, it would take a pause or two to check if some condition is missing from the list, or whether one of the conditions listed is redundant (i.e., follows logically from the other three) or whether the set is consistent (i.e., no statement has its negation following from the other three). These mental checks are immediate in the graphical representation; the first, because each link in the graph corresponds to a physical process in nature, and the last two because the graph is inherently consistent and non-redundant. As to the inferential part, using the graphoid+counterfactual axioms as inference rules is computationally intractable. These axioms are good for confirming a derivation if one is proposed, but not for finding a derivation when one is needed.

I believe that even a cursory attempt to answer research questions using (3)-(6) would convince the reader of the merits of the graphical representation. However, the reader of this blog is already biased, having been told that (3)-(6) is the potential-outcome equivalent of the chain X—>Y—>Z. A deeper appreciation can be reached by examining a new problem, specified in potential-outcome vocabulary, but without its graphical mirror.

Assume you are given the following statements as a specification.

It represents a familiar model in causal analysis that has been thoroughly analyzed. To appreciate the power of graphs, the reader is invited to examine the representation above and to answer a few questions:

a) Is the process described familiar to you?
b) Which assumptions are you willing to defend in your interpretation of the story?
c) Is the causal effect of X on Y identifiable?
d) Is the model testable?

I would be eager to hear from readers
1. if my comparison is fair.
2. which argument they find most convincing.

## October 29, 2014

### Fall Greetings from UCLA Causality Blog

Filed under: Announcement,General — eb @ 6:10 am

Friends in causality research,
This Fall greeting from UCLA Causality blog contains:

A. News items concerning causality research,
B. New postings, new problems and new solutions.

A. News items concerning causality research
A1. The American Statistical Association has announced an early submission deadline for the 2015 “Causality in Statistics Education Award” — February 15, 2015.
For details and selection criteria, see http://www.amstat.org/education/causalityprize/

A2. Vol. 2 Issue 2 of the Journal of Causal Inference (JCI) is now out, and can be viewed here:
http://www.degruyter.com/view/j/jci.2014.2.issue-2/issue-files/jci.2014.2.issue-2.xml
As always, submissions are welcome on all aspects of causal analysis, especially those deemed methodological.

A3. A new tutorial, Causality for Policy Assessment and Impact Analysis, is offered by BayesiaLab; see here.

A4. A Conference on Counterfactual Analysis for Policy Evaluation will take place at USC, November 20, 2014
http://dornsife.usc.edu/conferences/cafe-conference-2014/

A5. A Conference focused on Causal Inference will take place at Kyoto, Japan, November 17-18, 2014
Kyoto International Conference on Modern Statistics in the 21st Century
General info: http://www.kakenhyoka.jp/conference/index_en.html
Program: http://www.kakenhyoka.jp/conference/file/program.pdf

B. New postings, new problems and new solutions.
B1. A confession of a graph-avoiding econometrician.

Guido Imbens explains why some economists do not find causal graphs to be helpful. Miquel Porta describes the impact of causal graphs in epidemiology as a “revolution”. The question naturally arises: “Are economists smarter than epidemiologists?” or, “What drives epidemiologists to seek the light of new tools while graph-avoiding economists resign themselves to partial blindness?”

B2. Lord’s Paradox Revisited — (Oh Lord! Kumbaya!)

This is a historical journey which traces Lord’s paradox back to its original formulation (1967), resolves it using modern tools of causal analysis, explains why it presented difficulties in previous attempts at resolution and, finally, addresses the general issue of whether adjustment for pre-existing conditions is justified in group comparison applications.

B3. “Causes of Effects and Effects of Causes”
http://ftp.cs.ucla.edu/pub/stat_ser/r431.pdf

An expansion of a previous note with the same title, including an additional demonstration that “causes of effects” are not metaphysical (Dawid, 2000) and a simple visualization of how the probability of necessity (PN) is shaped by experimental and observational findings. It comes together with “A note on Causes of Effects” (link), a rebuttal to recent attempts at mystification.

## October 27, 2014

### Are economists smarter than epidemiologists? (Comments on Imbens’s recent paper)

Filed under: Discussion,Economics,Epidemiology,General — eb @ 4:45 pm

In a recent survey on Instrumental Variables (link), Guido Imbens fleshes out the reasons why some economists “have not felt that graphical models have much to offer them.”

His main point is: “In observational studies in social science, both these assumptions [exogeneity and exclusion] tend to be controversial. In this relatively simple setting [3-variable IV setting] I do not see the causal graphs as adding much to either the understanding of the problem, or to the analyses.” [page 377]

What Imbens leaves unclear is whether graph-avoiding economists limit themselves to “relatively simple settings” because, lacking graphs, they cannot handle more than 3 variables, or do they refrain from using graphs to prevent those “controversial assumptions” from becoming transparent, hence amenable to scientific discussion and resolution.

When students and readers ask me how I respond to people of Imbens’s persuasion who see no use in tools they vow to avoid, I direct them to the post “The deconstruction of paradoxes in epidemiology”, in which Miquel Porta describes the “revolution” that causal graphs have spawned in epidemiology. Porta observes: “I think the “revolution — or should we just call it a renewal”? — is deeply changing how epidemiological and clinical research is conceived, how causal inferences are made, and how we assess the validity and relevance of epidemiological findings.”

So, what is it about epidemiologists that drives them to seek the light of new tools, while economists (at least those in Imbens’s camp) seek comfort in partial blindness, while missing out on the causal revolution? Can economists do in their heads what epidemiologists observe in their graphs? Can they, for instance, identify the testable implications of their own assumptions? Can they decide whether the IV assumptions (i.e., exogeneity and exclusion) are satisfied in their own models of reality? Of course they can’t; such decisions are intractable to the graph-less mind. (I have challenged them repeatedly to these tasks, to the sound of a pin-drop silence.)

Or, are problems in economics different from those in epidemiology? I have examined the structure of typical problems in the two fields, the number of variables involved, the types of data available, and the nature of the research questions. The problems are strikingly similar.

I have only one explanation for the difference: Culture.

The arrow-phobic culture started twenty years ago, when Imbens and Rubin (1995) decided that graphs “can easily lull the researcher into a false sense of confidence in the resulting causal conclusions,” and Paul Rosenbaum (1995) echoed with “No basis is given for believing” […] “that a certain mathematical operation, namely this wiping out of equations and fixing of variables, predicts a certain physical reality” [ See discussions here. ]

Lingering symptoms of this phobia are still stifling research in the 2nd decade of our century, yet are tolerated as scientific options. As Andrew Gelman put it last month: “I do think it is possible for a forward-looking statistician to do causal inference in the 21st century without understanding graphical models.” (link)

I believe the most insightful diagnosis of the phenomenon is given by Larry Wasserman:
“It is my impression that the “graph people” have studied the Rubin approach carefully while the reverse is not true.” (link)

## September 2, 2014

### In Defense of Unification (Comments on West and Koch’s review of *Causality*)

Filed under: Discussion,General,Opinion — moderator @ 3:05 am

A new review of my book *Causality* (Pearl, 2009) has appeared in the Journal of Structural Equation Modeling (SEM), authored by Stephen West and Tobias Koch (W-K). See http://bayes.cs.ucla.edu/BOOK-2K/west-koch-review2014.pdf

I find the main body of the review quite informative, and I thank the reviewers for taking the time to give SEM readers an accurate summary of each chapter, as well as a lucid description of the key ideas that tie the chapters together. However, when it comes to accepting the logical conclusions of the book, the reviewers seem reluctant, and tend to cling to traditions that lack the language, tools and unifying perspective to benefit from the chapters reviewed.

The reluctance culminates in the following paragraph:
“We value Pearl’s framework and his efforts to show that other frameworks can be translated into his approach. Nevertheless we believe that there is much to be gained by also considering the other major approaches to causal inference.”

W-K seem to value my “efforts” toward unification, but not the unification itself, and we are not told whether they doubt the validity of the unification, or whether they doubt its merits.
Or do they accept the merits and still see “much to be gained” by pre-unification traditions? If so, what is it that can be gained by those traditions and why can’t these gains be achieved within the unified framework presented in *Causality*?

## August 1, 2014

### Mid-Summer Greetings from UCLA Causality Blog

Filed under: Announcement,General — moderator @ 3:35 pm

Dear friends in causality research,

This greeting from UCLA Causality blog contains:
A. News items concerning causality research,
B. New postings, publications, slides and videos,
C. New scientific questions and some answers.

A. News items concerning causality research
A.1 The American Statistical Association has announced the 2014 winners of the “Causality in Statistics Education Award.” See http://www.amstat.org/newsroom/pressreleases/2014-CausalityinStatEdAward.pdf

Congratulations go to the honorees, Maya Peterson and Laura B. Balzer (UC Berkeley, biostatistics department), who will each receive \$5000 and a plaque at the 2014 Joint Statistical Meetings (JSM 2014) in Boston.

A.2 Vol. 2 Issue 2 of the Journal of Causal Inference (JCI) is scheduled to appear September, 2014. The TOC can be viewed here: http://degruyter.com/view/j/jci (click on READ CONTENT, under the cover picture)
As always, submissions are welcome on all aspects of causal analysis, especially those deemed heretical.

A.3 The 2014 World Congress on Epidemiology (IEA) will include a pre-conference program with two short courses dedicated to causal inference.
http://www.iea-course.org/index.php/pre-conference-course/program/program
IEA-2014, Anchorage, Alaska, August 16, 2014.

B. New postings, publications, slides and videos
B.1 An interesting blog page dedicated to Sewall Wright’s 1921 paper “Correlation and causation” can be viewed here http://evaluatehelp.blogspot.com/2014/05/wright1st.html

It is intriguing to see how the first causal diagram came to the attention of the scientific community, in 1921. (It was immediately attacked, of course, by students of Karl Pearson.)

B.2 A video of my recent interview with professor Nick Jewell (UC Berkeley) concerning Causal Inference in Statistics, can now be watched by going to www.statisticsviews.com and clicking on the link next to the image.

B.3 A new review of Causality (Cambridge, 2009) has appeared in the Journal of Structural Equation Modeling, authored by Stephen West and Tobias Koch. See http://bayes.cs.ucla.edu/BOOK-2K/west-koch-review2014.pdf
My comments on this review will be posted here in a few days; stay tuned.

B.4 The paper “Trygve Haavelmo and the Emergence of Causal Calculus” is now available online in Econometric Theory (10 June 2014); see here.

To the best of my knowledge, this is the first article on modern causal analysis that managed to penetrate the walls of mainstream econometric literature. Only time will tell whether this publication will help soften the enigmatic resistance of traditional economists to modern tools of causal analysis. Oddly, even those economists who have come to accept the structural reading of counterfactuals (e.g., Heckman and Pinto, 2013) still find it difficult to accept the second principle of causal inference: reading independencies from the model’s structure. See http://ftp.cs.ucla.edu/pub/stat_ser/r420.pdf

At any rate, the editors, Olav Bjerkholt and Peter Phillips, deserve a medal of courage for their heroic effort to create a dialogue between two civilizations.

B.5 To further facilitate this dialogue, Bryant Chen and I wrote a survey paper http://ftp.cs.ucla.edu/pub/stat_ser/r428.pdf which summarizes and illustrates the benefits of graphical tools in the context of linear models, where most economists feel secure and comfortable.

C. New scientific questions and some answers

R-425. “Recovering from Selection Bias in Causal and Statistical Inference,” with E. Bareinboim and J. Tian.
We ask: Is there a general, non-parametric solution to the selection-bias problem posed by Berkson?
The answer is: Yes. The problem is illuminated, generalized and solved using graphical models — the language where knowledge resides.
(The article just received the Best Paper Award at the Annual Conference of the American Association for Artificial Intelligence (AAAI-2014), July 30, 2014.)
http://ftp.cs.ucla.edu/pub/stat_ser/r425.pdf

R-431. “Causes of effects and Effects of Causes”.
Question: Is it really the case that modern methods of causal analysis have neglected to deal with “causes of effects”, as claimed by a recent paper of Dawid, Fienberg and Faigman (2013)?
Answer: Quite the contrary! See here:
http://ftp.cs.ucla.edu/pub/stat_ser/r431.pdf

R-428. “Testable Implications of Linear Structural Equation Models” with Bryant Chen and Jin Tian.
We ask: Is there a systematic way of unveiling the testable implications of a linear model with latent variables?
Answer: We provide an algorithm for doing so.
http://ftp.cs.ucla.edu/pub/stat_ser/r428.pdf

Finally, don’t miss previous postings on our blog, for example:
2. Who Needs Causal Mediation?
3. On model-based vs. ad-hoc methods

Wishing you a productive summer,
Judea

## July 14, 2014

Filed under: Discussion,General,Simpson's Paradox — eb @ 9:10 pm

Simpson’s paradox must have an unbounded longevity, partly because traditional statisticians, so it seems, are still refusing to accept the fact that the paradox is causal, not statistical (link to R-414).

This was demonstrated recently in an April discussion on Gelman’s blog where the paradox was portrayed again as one of those typical cases where conditional associations are different from marginal associations. Strangely, only one or two discussants dared call: “Wait a minute! This is not what the paradox is about!” — to little avail.

To watch the discussion more closely, click http://andrewgelman.com/2014/04/08/understanding-simpsons-paradox-using-graph/ .

### On model-based vs. ad-hoc methods

Filed under: Definition,Discussion,General — eb @ 7:30 pm

A lively discussion flared up early this month on Andrew Gelman’s blog (garnering 114 comments!) which should be of some interest to readers of this blog.

The discussion started with a quote from George Box (1979) on the advantages of model-based approaches, and drifted into related topics such as

(1) What is a model-based approach,

(2) Whether mainstream statistics encourages this approach,

(3) Whether statistics textbooks and education have given face to reality,

(4) Whether a practicing statistician should invest time learning causal modeling, or wait till it “proves itself” in the real messy world?

I share highlights of this discussion here, because I believe many readers have faced similar disputations and misunderstandings in conversations with pre-causal statisticians.

## April 1, 2014

### Spring Greetings from UCLA Causality Blog

Filed under: General — eb @ 5:00 pm

Dear friends in causality research,

This greeting from UCLA Causality blog contains:
A. News items concerning causality research,
B. New postings, publications, slides and videos,
C. Debates, controversies and strange articles,
D. New scientific questions and some answers.

1. Nominations are invited for the 2nd ASA “Causality in Statistical Education” Award.
The deadline is April 15, and the background information can be viewed here:
http://magazine.amstat.org/blog/2012/11/01/pearl/
http://magazine.amstat.org/blog/2013/08/01/causality-in-stat-edu/.

Nominations and questions should be sent to the ASA office at .
Visit http://www.amstat.org/education/causalityprize/ for nomination information.

Note: This year, the Award carries a \$10,000 prize, which may be split into two \$5,000 prizes.

2. Journal of Causal Inference – Vol. 2, Issue 1
The third issue of the Journal of Causal Inference is on its way, and a posting date has been set for April 15th, 2014.
The table of contents can be viewed here:
http://tiny.cc/jci_2_1 ,
while the first two issues are here:
http://www.degruyter.com/view/j/jci
(click on READ CONTENT, under the cover picture)

As always, submissions are welcome on all aspects of causal analysis, especially those deemed heretical.

3. Causality book – 2nd Edition, 3rd printing
Many have been asking how to ensure that the copy they get is the latest, and not some earlier printing of Causality (2009).
The trick is to examine the copyright page and make sure it says: “Reprinted with corrections 2013”

Again, if you have an older printing and do not wish to buy another copy, all changes are marked in red here:
http://bayes.cs.ucla.edu/BOOK-09/causality2-errata-updated7_3_13.pdf

If we thought that Bertrand Russell’s dismissal of causality as “a relic of a bygone age” was a passing episode, we were wrong. Danny Hillis has a new essay nominating causality as the one scientific tenet that ought to be discarded.
http://www.edge.org/response-detail/25435

His bottom line: “We will come to appreciate that causes and effects do not exist in nature, that they are just convenient creations of our own minds.”

I for one would rather explore the cognitive and computational advantages of these “convenient creations” than speculate on their non-existence in nature (see Causality, pages 419-420). The same goes for “free will”, “explanation”, “responsibility”, “agency”, “credit and blame” and other convenient creations that make up what we call “the understanding.”

5. Causality is Alive
In contrast to Hillis’s non-existence theory, we were delighted last month to get an existence proof from DARPA (Defense Advanced Research Projects Agency), announcing a new research program entitled “Big Mechanism,” or “Big Mechanism Seeks the ‘Whys’ Hidden in Big Data”
http://www.darpa.mil/NewsEvents/Releases/2014/02/20.aspx

In a nutshell, this program aims to “leapfrog state-of-the-art big data analytics by developing automated technologies to help explain the causes and effects that drive complicated systems.” At the end of the announcement we read a familiar and visionary prediction: “By emphasizing causal models and explanation, Big Mechanism may be the future of science.”

I don’t think many on this list would object to this prediction, though we are perhaps in the best position to appreciate the difficulties.

6. Simpson’s paradox, a new debate
A lively debate on Simpson’s paradox broke out again last month on Andrew Gelman’s blog (95 comments),
triggered by four papers on the subject published in The American Statistician (February, 2014).

The debate raged among three camps.
a) Those who think Simpson’s paradox occurs when “regression coefficients change if you add more predictors”; therefore, no causality is needed, except that some regressors are “somehow wrong” and others are somehow right.

b) Those who think that “peeling away the paradox is as easy (or hard) as avoiding a comparison of apples and oranges, a concept requiring no mention of causality.”

c) Those (including this writer) who believe that intuitive notions such as “somehow wrong” and “apples and oranges” emanate from the causal structure of the story behind the data and, therefore, are all derivable mechanically from the causal graph. See
http://ftp.cs.ucla.edu/pub/stat_ser/r414.pdf
http://ftp.cs.ucla.edu/pub/stat_ser/R264.pdf

As an aside, Johannes Textor informs me that the Simpson’s machine described in r414.pdf is now available on
http://dagitty.net/learn/simpson/
for users to play with for fun and profit. Enjoy!
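
To make the disagreement among the three camps concrete, here is a minimal Python sketch of the reversal itself. The counts are illustrative numbers of the kind used in textbook presentations of the paradox, not data from the debate, and the names `counts`, `stratum_rate`, `aggregate_rate` and `adjusted_rate` are hypothetical helpers introduced only for this example. The adjustment shown, sum over z of P(y | x, z) P(z), gives the causal effect only under the assumption that the stratifying variable is a confounder; under a different causal graph (say, the variable is a mediator) the aggregate comparison would be the appropriate one, which is the point of camp (c).

```python
from fractions import Fraction

# Illustrative counts (assumed for this sketch): (recovered, total)
# for each (treatment, stratum) cell.
counts = {
    ("drug", "men"): (81, 87),
    ("drug", "women"): (192, 263),
    ("no_drug", "men"): (234, 270),
    ("no_drug", "women"): (55, 80),
}
STRATA = ("men", "women")


def stratum_rate(treatment, stratum):
    """Recovery rate within one stratum: P(recovery | treatment, stratum)."""
    recovered, total = counts[(treatment, stratum)]
    return Fraction(recovered, total)


def aggregate_rate(treatment):
    """Recovery rate with strata pooled: P(recovery | treatment)."""
    recovered = sum(r for (t, _s), (r, _n) in counts.items() if t == treatment)
    total = sum(n for (t, _s), (_r, n) in counts.items() if t == treatment)
    return Fraction(recovered, total)


def adjusted_rate(treatment):
    """Stratum-weighted rate: sum_z P(recovery | treatment, z) * P(z).

    Equals the effect of intervening on treatment only if the stratum
    variable is a confounder (an assumption about the causal graph).
    """
    grand_total = sum(n for (_r, n) in counts.values())
    result = Fraction(0)
    for stratum in STRATA:
        stratum_total = sum(n for (_t, s), (_r, n) in counts.items() if s == stratum)
        result += stratum_rate(treatment, stratum) * Fraction(stratum_total, grand_total)
    return result


if __name__ == "__main__":
    for stratum in STRATA:
        print(stratum, float(stratum_rate("drug", stratum)),
              float(stratum_rate("no_drug", stratum)))
    print("pooled", float(aggregate_rate("drug")), float(aggregate_rate("no_drug")))
    print("adjusted", float(adjusted_rate("drug")), float(adjusted_rate("no_drug")))
```

In these numbers the drug looks better within each stratum and worse in the pooled table. The data alone cannot say which comparison to trust; that choice comes from the causal story.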

7. Who is a Bayesian?
Another lively debate (105 comments) addressed the 250-year-old question: “Who is a Bayesian?”
http://andrewgelman.com/2014/01/16/22571/

Some think that “Bayes is the analysis of subjective beliefs” and some think that “Bayes is using Bayes rule”, be it with beliefs or with frequencies. My own opinion is summarized as:

“Bayes means:
(1) using knowledge we possess prior to obtaining data,
(2) encoding such knowledge in the language of probabilities,
(3) combining those probabilities with data, and
(4) accepting the combined results as a basis for decision making and performance evaluation.”
More in http://ftp.cs.ucla.edu/pub/stat_ser/r284-reprint.pdf
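
One possible reading of the four steps above, sketched in Python with a conjugate Beta-Binomial update. This is an illustration under stated assumptions, not an example from the cited paper: the prior parameters, the observed counts, the decision threshold, and the helper names `beta_binomial_update` and `beta_mean` are all hypothetical.

```python
from fractions import Fraction


def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return Fraction(a, a + b)


# Steps (1) and (2): prior knowledge encoded as a probability distribution.
# Beta(2, 8) is an assumed prior expressing a belief that the rate is near 0.2.
prior = (2, 8)

# Step (3): combine the prior with (hypothetical) data, 7 successes and 3 failures.
posterior = beta_binomial_update(*prior, successes=7, failures=3)

# Step (4): accept the combined result as the basis for a decision.
# The threshold of 1/3 is an arbitrary value chosen for illustration.
THRESHOLD = Fraction(1, 3)
act = beta_mean(*posterior) > THRESHOLD

if __name__ == "__main__":
    print("prior mean:", float(beta_mean(*prior)))
    print("posterior parameters:", posterior)
    print("posterior mean:", float(beta_mean(*posterior)))
    print("act:", act)
```

The posterior mean lands between the prior mean and the observed frequency, which is the sense in which the prior knowledge and the data are "combined."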

However, my main point was that, rather than arguing about who deserves the honor of being a “Bayesian,” we should discuss what methods better utilize prior knowledge, regardless of whether it is encoded as probabilities or as causal stories.

8. New slides and videos available
* Richard Scheines informed me that slides and videos for the workshop on graphical causal model search at CMU (Oct. 2013) are now available at:
http://www.hss.cmu.edu/philosophy/casestudiesworkshop.php

* Video of a tutorial on “Causes and Counterfactuals” presented at NIPS-2013 (by Pearl and Bareinboim) is available here:
http://research.microsoft.com/apps/video/default.aspx?id=206977

* Video of a lecture presented at Columbia University Institute for Data Sciences is available here:
http://idse.columbia.edu/seminarvideo_judeapearl

* Video of a public lecture presented at NYU-Poly is available here:

9. New scientific questions and some of their solutions.
Several new papers have been posted at
http://bayes.cs.ucla.edu/csl_papers.html
that might earn your attention. Among them:

R-415 “On the Testability of Models with Missing Data”
in which we address the question of whether any data-generating model can be submitted to statistical test, once data are corrupted by missingness. The answer turns out to be positive, and we present sufficient conditions for testability in all three categories: MCAR, MAR and NMAR.
http://ftp.cs.ucla.edu/pub/stat_ser/r415.pdf

R-421 “Reply to Commentary by Imai, Keele, Tingley and Yamamoto, concerning Causal Mediation Analysis”.
It clarifies how Structural Causal Models (SCM) unify the graphical and potential outcome frameworks, and why ignorability-based assumptions require graphical interpretations before they can be judged for plausibility. It also explains why traditional mediation analysts are so reluctant to adopt modern methods of causal mediation; I blame habitual addiction to Bayes conditionalization for this resistance.
http://ftp.cs.ucla.edu/pub/stat_ser/r421.pdf

R-422 “Is Scientific Knowledge useful for Policy Analysis? A Peculiar Theorem says: No”
We ask: Why is it that knowing the effect of smoking on cancer does not help us assess the merits of banning cigarette advertisements?
We speculate on the ramifications of this peculiarity in nonparametric analysis.
http://ftp.cs.ucla.edu/pub/stat_ser/r422.pdf

10. Wishing you a happy and productive spring,
and may your deeds go for a good cause.

Judea
