{"id":1824,"date":"2018-01-10T09:15:24","date_gmt":"2018-01-10T09:15:24","guid":{"rendered":"http:\/\/causality.cs.ucla.edu\/blog\/?p=1824"},"modified":"2018-01-10T22:03:16","modified_gmt":"2018-01-10T22:03:16","slug":"facts-and-fiction-in-the-missing-data-framework","status":"publish","type":"post","link":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/2018\/01\/10\/facts-and-fiction-in-the-missing-data-framework\/","title":{"rendered":"Facts and Fiction from the &#8220;Missing Data Framework&#8221;"},"content":{"rendered":"<p>Last month, Karthika Mohan and I received a strange review from a prominent Statistical Journal. Among other comments, we found the following two claims about a conception called &#8220;missing data framework.&#8221;<\/p>\n<p><strong>Claim-1:\u00a0<\/strong>&#8220;The role of missing data analysis in causal inference is well understood (eg causal inference theory based on counterfactuals relies on the missing data framework).<br \/>\nand<br \/>\n<strong>Claim-2:\u00a0<\/strong>&#8220;While missing data methods can form tools for causal inference, the converse cannot be true.&#8221;<\/p>\n<p>I am sure that you have seen similar claims made in the literature, in lecture notes, in reviews of technical papers,\u00a0or informal conversations in the cafeteria. Oddly, based on everything that we have read and researched\u00a0about missing data we came to believe that both\u00a0statements are false. 
Still, these claims are being touted widely, routinely, and unabashedly, with only scattered attempts to explicate their content in open discussion.<\/p>\n<p>Below, we venture to challenge the two claims, hoping to elicit your comments and to come to some understanding of what actually is meant by the phrase &#8220;missing data framework&#8221;: what is being &#8220;framed&#8221; and what remains &#8220;un-framed.&#8221;<\/p>\n<p><strong>Challenging Claim-1<br \/>\n<\/strong><\/p>\n<hr \/>\n<p>It is incorrect to suppose that the role of missing data analysis in causal inference is &#8220;well understood.&#8221; Quite the opposite. Researchers adhering to missing data analysis invariably invoke an ad-hoc assumption called &#8220;conditional ignorability,&#8221; often decorated as an &#8220;ignorable treatment assignment mechanism,&#8221; which is far from being &#8220;well understood&#8221; by those who make it, let alone those who need to judge its plausibility.<\/p>\n<p>For readers versed in graphical modeling, &#8220;conditional ignorability&#8221; is none other than the back-door criterion that students learn in the second class on causal inference, and which &#8220;missing-data&#8221; advocates have vowed to avoid at all costs. 
As we know, this criterion can easily be interpreted and verified when background knowledge is presented in graphical form but, as you can imagine, it turns into a frightening enigma for those who shun the light of graphs. Still, the simplicity of reading this criterion off a graph makes it easy to test whether those who rely heavily on ignorability assumptions know what they are assuming. The results of this test are discomforting.<\/p>\n<p>Marshall Joffe, at Johns Hopkins University, summed up his frustration with the practice and &#8220;understanding&#8221; of ignorability in these words: &#8220;Most attempts at causal inference in observational studies are based on assumptions that treatment assignment is ignorable. Such assumptions are usually made casually, largely because they justify the use of available statistical methods and not because they are truly believed.&#8221; [Joffe et al., 2010, &#8220;Selective Ignorability Assumptions in Causal Inference,&#8221; The International Journal of Biostatistics: Vol. 6: Iss. 2, Article 11. DOI: 10.2202\/1557-4679.1199. Available at: <a href=\"http:\/\/www.bepress.com\/ijb\/vol6\/iss2\/11\" target=\"_blank\" rel=\"noopener\">http:\/\/www.bepress.com\/ijb\/vol6\/iss2\/11<\/a>]<\/p>\n<p>My personal conversations with leaders of the missing data approach to causation (these include seasoned researchers, educators, and prolific authors) concluded with an even darker picture. 
None of those leaders was able to take a toy example of 3-4 variables and determine whether conditional ignorability holds in the examples presented. It is not their fault, of course; determining conditional ignorability is a hard cognitive and computational task that ordinary mortals cannot accomplish in their heads, without the aid of graphs. (I base this assertion both on first-hand experience with students and colleagues and on intimate familiarity with issues of problem complexity and cognitive load.)<\/p>\n<p>Unfortunately, the mantra &#8220;missing data analysis in causal inference is well understood&#8221; continues to be chanted at ever-increasing intensity, building faith among the faithful and luring chanters to assume ignorability as self-evident. Worse yet, the mantra blinds researchers to how an improved level of understanding can emerge by abandoning the missing-data prism altogether and conducting causal analysis in its natural habitat, using scientific models of reality rather than unruly patterns of missingness in the data.<\/p>\n<p>A typical example of this trend is a recent article by Ding and Li titled &#8220;Causal Inference: A Missing Data Perspective&#8221;:<br \/>\n<a href=\"https:\/\/arxiv.org\/pdf\/1712.06170.pdf\" target=\"_blank\" rel=\"noopener\">https:\/\/arxiv.org\/pdf\/1712.06170.pdf<\/a><br \/>\nSure enough, already on the ninth line of the abstract, the authors assume away non-ignorable treatments and then, having reached the safety zone of classical statistics, launch statistical estimation exercises on a variety of estimands. 
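(How mechanical the graphical check actually is can be shown in a few lines of code. The sketch below is our own illustration; the 4-variable model, the variable names, and the helper functions are all hypothetical. It tests whether a candidate covariate set satisfies the back-door criterion, using the standard moralized-ancestral-graph test for d-separation.)

```python
from itertools import combinations

# Hypothetical toy model: treatment X, outcome Y, and pre-treatment
# covariates Z1, Z2.  Edges are (parent, child) pairs.
EDGES = [("Z1", "X"), ("Z1", "Z2"), ("Z2", "Y"), ("X", "Y")]

def parents(edges, v):
    return {a for a, b in edges if b == v}

def descendants(edges, v):
    seen, frontier = set(), {v}
    while frontier:
        frontier = {b for a, b in edges if a in frontier} - seen
        seen |= frontier
    return seen

def ancestral(edges, nodes):
    # Close `nodes` under the parent relation.
    out = set(nodes)
    while True:
        grown = out | {a for a, b in edges if b in out}
        if grown == out:
            return out
        out = grown

def d_separated(edges, x, y, z):
    # Standard test: restrict to ancestors of {x, y} | z, moralize
    # (marry co-parents), drop z, then check whether x can reach y.
    keep = ancestral(edges, {x, y} | z)
    sub = [(a, b) for a, b in edges if a in keep and b in keep]
    links = {frozenset(e) for e in sub}
    for v in keep:
        for a, b in combinations(sorted(parents(sub, v)), 2):
            links.add(frozenset((a, b)))
    nodes = keep - z
    adj = {v: set() for v in nodes}
    for link in links:
        a, b = tuple(link)
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = {x}, [x]
    while stack:
        for w in adj[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return y not in seen

def backdoor(edges, x, y, z):
    # Back-door criterion: z contains no descendant of x, and z blocks
    # every back-door path from x to y (x's outgoing edges removed).
    if z & descendants(edges, x):
        return False
    return d_separated([(a, b) for a, b in edges if a != x], x, y, z)

print(backdoor(EDGES, "X", "Y", {"Z1"}))  # True:  {Z1} closes X <- Z1 -> Z2 -> Y
print(backdoor(EDGES, "X", "Y", set()))   # False: the back-door path stays open
```

With the graph in hand, the question &#8220;is treatment ignorable given Z?&#8221; becomes a routine reachability computation; without the graph, the same question must be settled by unaided intuition over distributions of counterfactuals.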
This creates the impression that the &#8220;missing data perspective&#8221; is sufficient for conducting &#8220;causal inference&#8221; when, in fact, the entire analysis rests on the assumption of ignorability, the one assumption that the missing data perspective lacks the tools to address.<\/p>\n<p>The second part of Claim-1 is equally false: &#8220;causal inference theory based on counterfactuals relies on the missing data framework&#8221;. This may be true for the causal inference theory developed by Rubin (1974) and expanded in Imbens and Rubin&#8217;s book (2015), but certainly not for the causal inference theory developed in (Pearl, 2000, 2009), which is also based on counterfactuals yet in no way relies on &#8220;the missing data framework&#8221;. On the contrary, page after page of (Pearl, 2000, 2009) emphasizes that counterfactuals are natural derivatives of the causal model used, and do not require the artificial interpolation tools (eg imputation or matching) advocated by the missing data paradigm. Indeed, model-blind imputation can be shown to invite disaster in the class of &#8220;non-ignorable&#8221; problems, something that is rarely acknowledged in the imputation-addicted literature. The very idea that certain parameters are not estimable, regardless of how clever the imputation, is foreign to the missing data way of thinking. The same goes for the idea that some parameters are estimable while others are not.<\/p>\n<p>In the past five years, we have done extensive reading into the missing data literature. 
[For a survey, see: <a href=\"http:\/\/ftp.cs.ucla.edu\/pub\/stat_ser\/r473-L.pdf\">http:\/\/ftp.cs.ucla.edu\/pub\/stat_ser\/r473-L.pdf<\/a>] It has become clear to us that this framework falls short of addressing three fundamental problems of modern causal analysis: (1) to determine whether there exist sets of covariates that render treatments &#8220;ignorable&#8221;; (2) to estimate causal effects in cases where such sets do not exist; and (3) to decide whether one&#8217;s modeling assumptions are compatible with the observed data.<\/p>\n<p>It takes a theological leap of faith to imagine that a framework avoiding these fundamental problems can serve as an intellectual basis for a general theory of causal inference, a theory that has tackled those problems head-on, and successfully so. Causal inference theory has advanced significantly beyond this stage: nonparametric estimability conditions have been established for causal and counterfactual relationships in both ignorable and non-ignorable problems. Can a framework bound to ignorability assumptions serve as a basis for one that has emancipated itself from such assumptions? We doubt it.<\/p>\n<p><strong>Challenging Claim-2<br \/>\n<\/strong><\/p>\n<hr \/>\n<p>We come now to Claim-2, concerning the possibility of a causality-free interpretation of missing data problems. It is indeed possible to pose a missing data problem in purely statistical terms, totally void of &#8220;missingness mechanism&#8221; vocabulary, void even of conditional independence assumptions. But this is rarely done, because the answer is trivial: none of the parameters of interest would be estimable without such assumptions (i.e., the likelihood function is flat). 
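(The hazard flagged above for &#8220;non-ignorable&#8221; problems can be made vivid with a few lines of simulation. The data-generating process below is a hypothetical construction of our own, not taken from any cited paper: when the probability that a value goes missing depends on the value itself, the complete-case average stays biased no matter how large the sample grows.)

```python
import random

random.seed(0)

# Hypothetical MNAR process: Y is exponential with E[Y] = 1, and larger
# values of Y are more likely to go missing -- missingness depends on
# the very value that is missing, the defining feature of MNAR.
n = 100_000
ys = [random.expovariate(1.0) for _ in range(n)]
observed = [y for y in ys if random.random() > min(1.0, 0.8 * y)]

true_mean = sum(ys) / len(ys)             # close to 1.0
cc_mean = sum(observed) / len(observed)   # complete-case mean: biased low

print(f"true mean {true_mean:.2f}, complete-case mean {cc_mean:.2f}")
```

No amount of model-blind imputation from the observed values repairs this gap; recovering E[Y] requires an assumption about the missingness mechanism itself, which is exactly the kind of causal information that graph-based analysis makes explicit.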
In theory, one can argue that there is really nothing causal about a &#8220;missingness mechanism&#8221; as conceptualized by Rubin (1976), since it is defined in terms of conditional independence relations, a purely statistical notion that requires no reference to causation.<\/p>\n<p>Not quite! The conditional independence relations that define missingness mechanisms are fundamentally different from those invoked in standard statistical analysis. In standard statistics, independence assumptions are presumed to hold in the distribution that governs the observed data, whereas in missing-data problems, the needed independencies are assumed to hold in the distribution of variables that are only partially observed. In other words, the independence assumptions invoked in missing data analysis are necessarily judgmental, and only rarely do they have testable implications in the available data. [Fully developed in: <a href=\"http:\/\/ftp.cs.ucla.edu\/pub\/stat_ser\/r473-L.pdf\">http:\/\/ftp.cs.ucla.edu\/pub\/stat_ser\/r473-L.pdf<\/a>]<\/p>\n<p>It behooves us to ask what kind of knowledge is needed for making reliable conditional independence judgments about a specific, yet partially observed, problem domain. The graphical models literature has an unambiguous answer to this question: our judgments about statistical dependencies stem from our knowledge about causal dependencies, and the latter are organized in graphical form. The non-graphical literature has thus far avoided this question, presumably because it is a psychological issue that resides outside the scope of statistical analysis.<\/p>\n<p>Psychology or not, the evidence from the behavioral sciences is overwhelming that judgments about statistical dependence emanate from causal intuition. [see D. 
Kahneman, &#8220;Thinking, Fast and Slow,&#8221;<br \/>\nChapter 16: Causes Trump Statistics]<\/p>\n<p>In light of these considerations, we would dare call for a re-examination of the received mantra (2), &#8220;while missing data methods can form tools for causal inference, the converse cannot be true,&#8221; and reverse it, to read:<\/p>\n<p>2&#8242;. &#8220;While causal inference methods provide tools for solving missing data problems, the converse cannot be true.&#8221;<\/p>\n<p>We base this claim on the following observations:<\/p>\n<p>1. The assumptions needed to define the various types of missing data mechanisms are causal in nature. Articulating those assumptions in causal vocabulary is natural and therefore results in model transparency and credibility.<\/p>\n<p>2. Estimability analysis based on causal modeling of missing data problems has charted new territory, including problems in the MNAR category (i.e., Missing Not At Random), which were inaccessible to conventional missing-data analysis. In comparison, imputation-based approaches to missing data provide no guarantees of convergence (to consistent estimates) except in the narrow, and unrecognizable, class of problems in which ignorability holds.<\/p>\n<p>3. 
Causal modeling of missing data problems has uncovered new ways of testing assumptions, tests that are infeasible in conventional missing-data analysis.<\/p>\n<p>Perhaps even more convincingly, we were able to prove that no algorithm exists that decides whether a parameter is estimable without examining the causal structure of the model; statistical information alone is insufficient.<\/p>\n<p>We hope these arguments convince even the staunchest missing data enthusiast to switch mantras and treat missing data problems for what they are: causal inference problems.<\/p>\n<p>Judea Pearl, UCLA,<br \/>\nKarthika Mohan, UC Berkeley<br \/>\n&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8211;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last month, Karthika Mohan and I received a strange review from a prominent Statistical Journal. Among other comments, we found the following two claims about a conception called &#8220;missing data framework.&#8221; Claim-1:\u00a0&#8220;The role of missing data analysis in causal inference is well understood (eg causal inference theory based on counterfactuals relies on the missing data 
[&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[39],"tags":[],"class_list":["post-1824","post","type-post","status-publish","format-standard","hentry","category-missing-data"],"_links":{"self":[{"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/posts\/1824","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=1824"}],"version-history":[{"count":3,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/posts\/1824\/revisions"}],"predecessor-version":[{"id":1830,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/posts\/1824\/revisions\/1830"}],"wp:attachment":[{"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=1824"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=1824"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=1824"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}