Judea Pearl Writes:
Researchers using causal diagrams have surely noticed that, despite tremendous progress in causal modeling over the past three decades, editors and reviewers persist in raising questions about the usefulness of causal diagrams, noting that their structure rests largely on untested or untestable assumptions and, hence, that they cannot serve as a basis for policy evaluation or personal decision-making.
Questions such as:
“What if I do not have the graph?”
“What if I am not sure about the absence of this or that confounder?”
“What if I do not have the scientific knowledge required to construct the graph?”
and more, come up again and again, especially from editors and reviewers who have not had first-hand experience in causal inference research.
As a service to readers of this blog, I would like to share the way I usually answer such questions.
1. First and foremost, I remind critics that ALL causal conclusions in nonexperimental settings must be based on untested, judgmental assumptions that investigators are prepared to defend on scientific grounds. Investigators must therefore choose among five options:
(1) confine studies to randomized experiments,
(2) abandon causal analysis altogether and resign themselves to descriptive statistics,
(3) use popular software packages and hope that the assumptions embedded in them match reality,
(4) use opaque yet scientific-sounding jargon to hide the assumptions,* or
(5) make those assumptions explicit and represent them in a graph.
* This group has managed to garner scientific respectability (in the eyes of editors and reviewers) through vocabulary such as “assuming ignorability” or “assuming data are MAR,” though one can rarely tell (without graphs) whether those conditions hold in any given problem.
2. A second argument I use reminds critics that to understand what the world should be like for a given procedure to work is of no lesser scientific value than seeking evidence for how the world works; the latter cannot advance without the former. Phrased differently, I argue that going from a model of reality to its consequences and going from the consequences to our model of reality are two sides of the coin named science, and that methods for constructing models of reality can only be developed once we understand what the implications of those models are.
In support of this argument I mention the fact that, if we examine carefully the development of ideas in the data-intensive sciences, we find that most innovative works germinated from model-based questions: assuming that a model is known and asking what its consequences are, what algorithms can either exploit or stand robust to its idiosyncratic features, whether any algorithm can produce the desired result, and whether the model has testable implications. Not surprisingly, the testable implications that mathematics can deduce from a given graph are nowadays used to help researchers learn causal models from raw data, reinforcing the thesis that deduction precedes induction.
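To make “testable implications deduced from a graph” concrete, here is a minimal sketch of my own (the chain structure, coefficients, and variable names are all illustrative, not from the post): the chain graph X → Y → Z implies the conditional independence X ⟂ Z | Y, which can be checked against raw data — in the linear-Gaussian case, as a vanishing partial correlation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)   # Y listens to X
z = 0.8 * y + rng.normal(size=n)   # Z listens to Y only

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing out c."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(ra, rb)[0, 1]

marginal = np.corrcoef(x, z)[0, 1]   # clearly nonzero: X and Z are dependent
conditional = partial_corr(x, z, y)  # near zero, as the graph predicts
print(f"corr(X,Z) = {marginal:.3f}, corr(X,Z | Y) = {conditional:.3f}")
```

If the partial correlation were found to be far from zero, the chain model would be refuted — this is the sense in which a graph, once drawn, exposes itself to the data.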
3. Finally, I am occasionally pressed to explain why authors using causal diagrams draw more criticism than, say, authors who hide their assumptions under the carpet, or those who tell us, chapter and verse, what threats may await those who ignore them. My explanation:
“… threats are safer to cite than assumptions. He who cites “threats” appears prudent, cautious, and thoughtful, whereas he who seeks licensing assumptions risks suspicions of attempting to endorse those assumptions, or of pretending to know when they hold true.
Second, assumptions are self-destructive in their honesty. The more explicit the assumption, the more criticism it invites, for it tends to trigger a richer space of alternative scenarios in which the assumption may fail. Researchers prefer therefore to declare threats in public and make assumptions in private.”
(Pearl and Bareinboim, 2011) http://ftp.cs.ucla.edu/pub/stat_ser/r372.pdf
In short, causal diagrams invite the harshest criticism because they make assumptions more explicit and more transparent than other representation schemes. As such, they do not offer an easy ride for seekers of comfort, but a faster and safer road for seekers of truth.
Addendum (specifically for missing-data researchers).
Missing-data research provides a vivid example of where skepticism toward graphs comes from. For over three decades, the field has been dominated by Rubin’s (1976) classification into three so-called “missingness mechanisms”: (1) Missing Completely At Random (MCAR), (2) Missing At Random (MAR), and (3) Missing Not At Random (MNAR). A given problem cannot be assigned to any of these categories on the basis of data alone; the classification must rely on human judgment. Moreover, the information needed for making it is more judgmentally demanding than that required for constructing a graphical model of the missingness process. In particular, the classification requires knowledge of the probability distribution of the unobserved variables behind the data. Because the definition of MAR invokes relationships among events, rather than among variables, the classification cannot rely on generic scientific knowledge but must remain a matter of faith. Researchers are instructed to classify problems by rules of thumb, and then to place their trust in procedures for which no performance guarantees are available unless the problem falls in the (untestable) MAR category.
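The mechanism-dependence of this classification can be illustrated with a small simulation of my own (the variables, mechanisms, and adjustment are assumed for illustration; none appear in the post). Two mechanisms delete values of Y: a MAR-like one, where the missingness listens only to the fully observed X, and an MNAR-like one, where it listens to Y itself. The same adjustment formula — fit E[Y | X] on observed cases and average over all X — recovers E[Y] only when the MAR-like graph actually holds, and nothing in the observed data announces which case we are in.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)                     # true E[Y] = 0

# MAR-like mechanism: missingness of Y depends only on the observed X.
r_mar = rng.random(n) < 1.0 / (1.0 + np.exp(-x))
# MNAR-like mechanism: missingness of Y depends on Y itself.
r_mnar = rng.random(n) < 1.0 / (1.0 + np.exp(-y))

def adjusted_mean(y, r, x):
    """Estimate E[Y] by fitting E[Y | X] on observed cases and averaging
    the fit over ALL X -- valid only if Y is independent of its
    missingness given X (i.e., only under the MAR-like graph)."""
    coef = np.polyfit(x[r], y[r], 1)
    return np.polyval(coef, x).mean()

cc_mar = y[r_mar].mean()                 # complete-case mean: biased upward
adj_mar = adjusted_mean(y, r_mar, x)     # recovers E[Y] = 0
adj_mnar = adjusted_mean(y, r_mnar, x)   # same formula fails under MNAR
print(f"complete-case (MAR): {cc_mar:.3f}")
print(f"adjusted (MAR):      {adj_mar:.3f}")
print(f"adjusted (MNAR):     {adj_mnar:.3f}")
```

The point of the sketch is that the validity of the adjustment is licensed by an assumption about the missingness mechanism — exactly the kind of assumption a graph makes explicit and a label like “MAR” leaves implicit.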
Worse yet, the performance guarantees provided by traditional methodologies are rather feeble. All we have today is a guarantee that, if the data are MAR, then the ML (as well as EM and MI) estimator is consistent.** Most remarkably, almost no guarantees have been found for the MNAR category, where most problem instances reside.
** Practically, this means that there exists an iterative algorithm, such as EM, which, if it converges to a global optimum, also converges to a consistent estimate. In the case of non-convergence, however, researchers receive no indication of whether the failure is due to theoretical impediments to estimation, to local maxima, or to mis-parameterization of the problem.
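The footnote’s consistency claim can be sketched in a few lines, under assumptions of my own choosing (a linear-Gaussian model with an X-dependent, hence MAR, missingness mechanism; none of these specifics come from the post): the E-step fills each missing Y with its conditional expectation under the current fit, and the M-step refits on the completed data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)            # true E[Y] = 1.0
r = rng.random(n) < 1.0 / (1.0 + np.exp(-2 * x))  # MAR: depends on X only

# EM-style iteration: E-step replaces each missing Y by its conditional
# mean under the current linear fit; M-step refits the line on the
# completed data.  Under MAR this converges to a consistent estimate.
y_fill = np.where(r, y, y[r].mean())              # crude starting imputation
for _ in range(50):
    coef = np.polyfit(x, y_fill, 1)               # M-step
    y_fill = np.where(r, y, np.polyval(coef, x))  # E-step
em_mean = y_fill.mean()
cc_mean = y[r].mean()                             # complete-case mean: biased
print(f"EM estimate of E[Y]: {em_mean:.3f} (truth 1.0); "
      f"complete-case mean: {cc_mean:.3f}")
```

Note how much the guarantee leans on the MAR assumption: had the missingness depended on Y itself, the very same iteration would converge just as smoothly — to the wrong answer, with no warning from the algorithm.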
For over three and a half decades, data analysts have grown accustomed to accepting this unhappy state of affairs as inevitable and have rarely questioned whether it is the best one can do. Yet no sooner did graph-based results begin to shed light on the missing-data landscape ( http://ftp.cs.ucla.edu/pub/stat_ser/r410.pdf, http://ftp.cs.ucla.edu/pub/stat_ser/r417.pdf ) than skeptics were there to ask: “And where does the graph come from?” or “One can never tell if the model is correct.” In other words, investigators would rather endure another decade of darkness under the uninterpretable and untestable assumptions of MAR than enjoy the light that shines from graphical models, where assumptions are meaningful and results extend well into MNAR territory. There is, evidently, something terribly scary in the honesty of confessed assumptions, however necessary they may be.
I hope readers can use some of these arguments in future encounters with the perennial question, “But where does the graph come from?”, and share their experiences with us.