NIPS 2017: Q&A Follow-up
http://ftp.cs.ucla.edu/pub/stat_ser/r475.pdf
NIPS 17 – What If? Workshop Slides (PDF)
NIPS 17 – What If? Workshop Slides (PPT [zipped])
I have also received interesting questions at the end of my talk, which I could not fully answer in the short break we had. I will try to answer them below.
Q1: What do you mean by the “Causal Revolution”?
Ans.1: “Revolution” is a poetic word to summarize Gary King’s observation: “More has been learned about causal inference in the last few decades than the sum total of everything that had been learned about it in all prior recorded history” (see cover of Morgan and Winship’s book, 2015). It captures the miracle that only three decades ago we could not write a formula for “Mud does not cause Rain” and, today, we can formulate and estimate every causal or counterfactual statement.
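One way to write “Mud does not cause Rain” in do-notation is sketched below. The variable names mud and rain are labels chosen here for illustration, and the inequality assumes that seeing mud is evidence of rain:

```latex
% Intervening to create mud leaves the probability of rain unchanged,
% even though observing mud raises it.
P(\text{rain} \mid do(\text{mud})) = P(\text{rain}),
\qquad
P(\text{rain} \mid \text{mud}) > P(\text{rain}).
```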
Q2: Are the estimates produced by graphical models the same as those produced by the potential outcome approach?
Ans.2: Yes, provided the two approaches start with the same set of assumptions. The assumptions in the graphical approach are advertised in the graph, while those in the potential outcome approach are articulated separately by the investigator, using counterfactual vocabulary.
Q3: The method of imputing potential outcomes to individual units in a table appears totally different from the methods used in the graphical approach. Why the difference?
Ans.3: Imputation works only when certain assumptions of conditional ignorability hold. The table itself does not show us what the assumptions are, nor what they mean. To see what they mean we need a graph, since no mortal can process such assumptions in his/her head. The apparent difference in procedures reflects the insistence (in the graphical framework) on seeing the assumptions, rather than wishing them away.
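To make Ans.2 and Ans.3 concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the graph Z -> X, Z -> Y, X -> Y, the numeric parameters, and the function names. In this graph, conditional ignorability given Z holds, so filling each unit's missing potential outcome with a stratum mean and applying the back-door adjustment formula should give the same estimate.

```python
import random
from collections import defaultdict


def simulate(n=20000, seed=0):
    """Hypothetical data from the graph Z -> X, Z -> Y, X -> Y (true effect of X on Y is 2.0)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0                    # confounder
        x = 1 if rng.random() < (0.8 if z else 0.2) else 0    # treatment depends on Z
        y = 2.0 * x + 3.0 * z + rng.gauss(0.0, 1.0)           # outcome depends on X and Z
        rows.append((z, x, y))
    return rows


def stratum_means(rows):
    """Mean of Y within each (z, x) cell."""
    total = defaultdict(float)
    count = defaultdict(int)
    for z, x, y in rows:
        total[(z, x)] += y
        count[(z, x)] += 1
    return {cell: total[cell] / count[cell] for cell in total}


def ate_backdoor(rows):
    """Graphical route: sum over z of [E(Y|X=1,z) - E(Y|X=0,z)] * P(z)."""
    m = stratum_means(rows)
    n_z = defaultdict(int)
    for z, _, _ in rows:
        n_z[z] += 1
    return sum((m[(z, 1)] - m[(z, 0)]) * n_z[z] / len(rows) for z in n_z)


def ate_imputation(rows):
    """Potential-outcome route: impute the unobserved outcome of each unit
    with the mean of the opposite arm in the same Z stratum, then average Y1 - Y0."""
    m = stratum_means(rows)
    diffs = []
    for z, x, y in rows:
        y1 = y if x == 1 else m[(z, 1)]
        y0 = y if x == 0 else m[(z, 0)]
        diffs.append(y1 - y0)
    return sum(diffs) / len(diffs)


def ate_naive(rows):
    """Difference of means that ignores Z (biased in this graph)."""
    treated = [y for _, x, y in rows if x == 1]
    control = [y for _, x, y in rows if x == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    data = simulate()
    print("back-door adjustment:", ate_backdoor(data))
    print("imputation:          ", ate_imputation(data))
    print("naive difference:    ", ate_naive(data))
```

The two routes agree up to floating-point rounding because both rest on the same assumption, namely that Z is sufficient for adjustment. The table of imputed outcomes never states that assumption; the graph does. The naive difference is biased upward here because Z raises both X and Y.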
Q4: Some say that economists do not use graphs because their problems are different, and they cannot afford to model the entire economy. Do you agree with this explanation?
Ans.4: No way! Mathematically speaking, economic problems are no different from those faced by epidemiologists (or other social scientists), for whom graphical models have become a second language. Moreover, epidemiologists have never complained that graphs force them to model the entirety of the human anatomy. Graph-avoidance among (some) economists is a cultural phenomenon, reminiscent of telescope-avoidance among Church astronomers in 17th-century Italy. Bottom line: epidemiologists can judge the plausibility of their assumptions; graph-avoiding economists cannot. (I have offered them many opportunities to demonstrate it in public, and I don’t blame them for remaining silent; it is not a problem that can be managed by an unaided intellect.)
Q5: Isn’t deep learning more than just glorified curve-fitting? After all, the objective of curve-fitting is to maximize “fit”, while in deep learning much effort goes into minimizing “over-fit”.
Ans.5: No matter what acrobatics you go through to minimize overfitting or other flaws in your learning strategy, you are still optimizing some property of the observed data while making no reference to the world outside the data. This puts you right back on rung-1 of the Ladder of Causation with all the limitations that rung-1 entails.
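A small sketch of why fitting the observed data, however well, stays on rung-1: two causal models can agree on every observational probability and still disagree about an intervention. The two binary models and all numbers below are hypothetical, chosen only so the arithmetic is exact.

```python
from fractions import Fraction as F


def joint_from_chain(p_parent1, p_child1_given_parent):
    """Joint distribution over (parent, child) for a binary model parent -> child."""
    joint = {}
    for parent in (0, 1):
        p_parent = p_parent1 if parent == 1 else 1 - p_parent1
        for child in (0, 1):
            p1 = p_child1_given_parent[parent]
            joint[(parent, child)] = p_parent * (p1 if child == 1 else 1 - p1)
    return joint


# Hypothetical mechanism: the child equals 1 with probability 4/5 when the
# parent is 1, and with probability 1/5 when the parent is 0.
COND = {0: F(1, 5), 1: F(4, 5)}

# Model A: X -> Y. Keys are already (x, y).
joint_a = joint_from_chain(F(1, 2), COND)

# Model B: Y -> X. Keys come out as (y, x), so flip them to (x, y).
joint_b = {(x, y): p for (y, x), p in joint_from_chain(F(1, 2), COND).items()}

# Rung-1 view: the observational joints are identical, so no amount of
# fitting to samples of (X, Y) can tell the two models apart.
same_observations = joint_a == joint_b

# Rung-2 view: P(Y=1 | do(X=1)).
p_y1_do_x1_a = COND[1]   # in A, setting X=1 feeds into Y's mechanism
p_y1_do_x1_b = F(1, 2)   # in B, setting X cuts Y -> X, so Y keeps its marginal

if __name__ == "__main__":
    print("same observational joint:", same_observations)
    print("P(Y=1 | do(X=1)) in model A:", p_y1_do_x1_a)
    print("P(Y=1 | do(X=1)) in model B:", p_y1_do_x1_b)
```

Regularization, held-out validation, and similar devices only improve the estimate of the shared joint distribution. The interventional answer differs between the two models, and that difference comes from the assumed arrow, not from the data.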
If you have additional questions on these or other topics, feel free to post them here on our blog, causality.cs.ucla.edu/blog (anonymity will be respected), and I will try my best to answer them.
Enjoy,
Judea