{"id":2024,"date":"2019-08-14T23:26:46","date_gmt":"2019-08-14T23:26:46","guid":{"rendered":"http:\/\/causality.cs.ucla.edu\/blog\/?p=2024"},"modified":"2021-03-18T16:10:55","modified_gmt":"2021-03-18T16:10:55","slug":"a-crash-course-in-good-and-bad-control","status":"publish","type":"post","link":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/2019\/08\/14\/a-crash-course-in-good-and-bad-control\/","title":{"rendered":"A Crash Course in Good and Bad Control"},"content":{"rendered":"<p style=\"text-align: right;\"><em><strong>Carlos Cinelli, Andrew Forney and Judea Pearl<\/strong><\/em><\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/ftp.cs.ucla.edu\/pub\/stat_ser\/r493.pdf\"><span style=\"text-decoration: underline;\"><span style=\"background-color: #ffffff; color: #ff0000;\"><em><strong>Update: check the updated and extended version of the crash course here.<\/strong><\/em><\/span><\/span><\/a><\/p>\n<h2 style=\"text-align: left;\"><strong>Introduction<\/strong><\/h2>\n<p><span style=\"font-weight: 400;\">If you were trained in traditional regression pedagogy, chances are that you have heard about the problem of &#8220;bad controls&#8221;. The problem arises when we need to decide whether adding a variable to a regression equation helps bring estimates closer to the parameter of interest. Analysts have long known that some variables, when added to the regression equation, can produce unintended discrepancies between the regression coefficient and the effect that the coefficient is expected to represent. 
Such variables have become known as &#8220;bad controls&#8221;, to be distinguished from &#8220;good controls&#8221; (also known as &#8220;confounders&#8221; or &#8220;deconfounders&#8221;), which are variables that must be added to the regression equation to eliminate what came to be known as &#8220;omitted variable bias&#8221; (OVB).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Recent advances in graphical models have produced a simple criterion to distinguish good from bad controls, and the purpose of this note is to provide practicing analysts with a concise and visible summary of this criterion through illustrative examples. We will assume that readers are familiar with the notions of &#8220;path-blocking&#8221; (or d-separation) and back-door paths. For a gentle introduction, see<\/span><a href=\"https:\/\/ucla.in\/2KlPWPc\"><i><span style=\"font-weight: 400;\"> d-Separation without Tears<\/span><\/i><\/a><span style=\"font-weight: 400;\">.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In the following set of models, the target of the analysis is the average causal effect (ACE) of a treatment X on an outcome Y, which stands for the expected increase of Y per unit of a controlled increase in X. Observed variables will be designated by black dots and unobserved variables by white empty circles. Variable Z (highlighted in red) will represent the variable whose inclusion in the regression is to be decided, with &#8220;good control&#8221; standing for <\/span><i><span style=\"font-weight: 400;\">bias reduction<\/span><\/i><span style=\"font-weight: 400;\">, &#8220;bad control&#8221; standing for <\/span><i><span style=\"font-weight: 400;\">bias increase<\/span><\/i><span style=\"font-weight: 400;\"> and &#8220;neutral control&#8221; standing for the case in which the addition of Z <\/span><i><span style=\"font-weight: 400;\">neither increases nor reduces bias<\/span><\/i><span style=\"font-weight: 400;\">. 
For this last case, we will also make a brief remark about how Z could affect the <\/span><i><span style=\"font-weight: 400;\">precision<\/span><\/i><span style=\"font-weight: 400;\"> of the ACE estimate.<\/span><\/p>\n<h2 style=\"text-align: left;\"><strong>Models<\/strong><\/h2>\n<p><strong><i>Models 1, 2 and 3 \u2013 Good Controls\u00a0<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2007\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_1-229x300.png\" alt=\"\" width=\"229\" height=\"300\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_1-229x300.png 229w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_1-768x1007.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_1-781x1024.png 781w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_1.png 850w\" sizes=\"auto, (max-width: 229px) 100vw, 229px\" \/><\/a> <a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2045 size-medium\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_2-229x300.png\" alt=\"\" width=\"229\" height=\"300\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_2-229x300.png 229w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_2-768x1007.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_2-781x1024.png 781w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_2.png 850w\" sizes=\"auto, (max-width: 229px) 100vw, 229px\" \/><\/a> <a 
href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2044 size-medium\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_3-228x300.png\" alt=\"\" width=\"228\" height=\"300\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_3-228x300.png 228w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_3-768x1012.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_3-777x1024.png 777w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_3.png 850w\" sizes=\"auto, (max-width: 228px) 100vw, 228px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">In model 1,\u00a0 Z stands for a common cause of both X and Y. Once we control for Z, we block the back-door path from X to Y, producing an unbiased estimate of the ACE.<\/span><i><span style=\"font-weight: 400;\">\u00a0<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">In models 2 and 3, Z is not a common cause of both X and Y, and therefore, not a traditional \u201cconfounder\u201d as in model 1. 
Nevertheless, controlling for Z blocks the back-door path from X to Y due to the unobserved confounder U, and again, produces an unbiased estimate of the ACE.<\/span><\/p>\n<p><strong><i>Models 4, 5 and 6 &#8211; Good Controls<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2021\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_4-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_4-300x249.png 300w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_4-768x637.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_4-1024x850.png 1024w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_4.png 1350w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> <a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2020\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_5-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_5-300x249.png 300w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_5-768x637.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_5-1024x850.png 1024w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_5.png 1350w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> <a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_6.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2019\" 
src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_6-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_6-300x249.png 300w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_6-768x637.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_6-1024x850.png 1024w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_6.png 1350w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">When thinking about possible threats of confounding, one needs to keep in mind that common causes of X and any <\/span><i><span style=\"font-weight: 400;\">mediator<\/span><\/i><span style=\"font-weight: 400;\"> (between X and Y) also confound the effect of X on Y. Therefore, models 4, 5 and 6 are analogous to models 1, 2 and 3 &#8212; controlling for Z blocks the backdoor path from X to Y and produces an unbiased estimate of the ACE.<\/span><\/p>\n<p><strong><i>Model 7 &#8211; Bad Control<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_7.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2018\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_7-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_7-300x249.png 300w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_7-768x637.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_7-1024x850.png 1024w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_7.png 1350w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">We now encounter our first 
\u201cbad control\u201d. Here Z is correlated with the treatment and the outcome and it is also a \u201cpre-treatment\u201d variable. Traditional econometrics textbooks would deem Z a \u201cgood control\u201d. The backdoor criterion, however, reveals that Z is a \u201cbad control\u201d. Controlling for Z will <\/span><i><span style=\"font-weight: 400;\">induce bias<\/span><\/i><span style=\"font-weight: 400;\"> by opening the backdoor path X \u2190 U<\/span><sub><span style=\"font-weight: 400;\">1<\/span><\/sub><span style=\"font-weight: 400;\">\u2192 Z\u2190 U<\/span><sub><span style=\"font-weight: 400;\">2<\/span><\/sub><span style=\"font-weight: 400;\">\u2192Y, thus spoiling a previously unbiased estimate of the ACE.<\/span><\/p>\n<p><strong><i>Model 8 &#8211; Neutral Control (possibly good for precision)<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_8.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2017\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_8-261x300.png\" alt=\"\" width=\"261\" height=\"300\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_8-261x300.png 261w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_8-768x882.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_8-891x1024.png 891w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_8.png 975w\" sizes=\"auto, (max-width: 261px) 100vw, 261px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Here Z is not a confounder nor does it block any backdoor paths. Likewise, controlling for Z does not open any backdoor paths from X to Y. Thus, in terms of <\/span><i><span style=\"font-weight: 400;\">bias<\/span><\/i><span style=\"font-weight: 400;\">, Z is a \u201cneutral control\u201d. 
Analysis shows, however, that controlling for Z <\/span><i><span style=\"font-weight: 400;\">reduces the variation of the outcome<\/span><\/i> <i><span style=\"font-weight: 400;\">variable Y<\/span><\/i><span style=\"font-weight: 400;\">, and helps improve the <\/span><i><span style=\"font-weight: 400;\">precision<\/span><\/i><span style=\"font-weight: 400;\"> of the ACE estimate in finite samples.<\/span><\/p>\n<p><strong><i>Model 9 &#8211; Neutral control (possibly bad for precision)<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_9.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2016\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_9-262x300.png\" alt=\"\" width=\"262\" height=\"300\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_9-262x300.png 262w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_9-768x878.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_9-895x1024.png 895w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_9.png 975w\" sizes=\"auto, (max-width: 262px) 100vw, 262px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Similar to the previous case, here Z is &#8220;neutral&#8221; in terms of bias reduction. 
However, controlling for Z <\/span><i><span style=\"font-weight: 400;\">will reduce the variation of the treatment variable X<\/span><\/i><span style=\"font-weight: 400;\"> and so may <\/span><i><span style=\"font-weight: 400;\">hurt <\/span><\/i><span style=\"font-weight: 400;\">the <\/span><i><span style=\"font-weight: 400;\">precision<\/span><\/i><span style=\"font-weight: 400;\"> of the estimate of the ACE in finite samples.\u00a0\u00a0<\/span><\/p>\n<p><strong><i>Model 10 &#8211; Bad control<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_10.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2015\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_10-261x300.png\" alt=\"\" width=\"261\" height=\"300\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_10-261x300.png 261w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_10-768x882.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_10-891x1024.png 891w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_10.png 975w\" sizes=\"auto, (max-width: 261px) 100vw, 261px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">We now encounter our second \u201cpre-treatment\u201d \u201cbad control\u201d, due to a phenomenon called \u201cbias amplification\u201d <\/span><a href=\"https:\/\/arxiv.org\/pdf\/1203.3503\"><span style=\"font-weight: 400;\">(read more here)<\/span><\/a><span style=\"font-weight: 400;\">. 
Naive control for Z in this model will not only fail to deconfound the effect of X on Y, but, in linear models, <\/span><i><span style=\"font-weight: 400;\">will amplify any existing bias.<\/span><\/i><\/p>\n<p><strong><i>Models 11 and 12 &#8211; Bad Controls<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_11.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2014\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_11-300x161.png\" alt=\"\" width=\"300\" height=\"161\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_11-300x161.png 300w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_11-768x412.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_11-1024x550.png 1024w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_11.png 1350w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> <a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_12.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2043 size-medium\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_12-300x229.png\" alt=\"\" width=\"300\" height=\"229\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_12-300x229.png 300w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_12-768x586.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_12-1024x781.png 1024w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_12.png 1350w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">If our target quantity is the ACE, we want to leave all channels through which the causal effect flows 
\u201cuntouched\u201d. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">In Model 11, Z is a mediator of the causal effect of X on Y. Controlling for Z will block the very effect we want to estimate, thus biasing our estimates.\u00a0 <\/span><\/p>\n<p><span style=\"font-weight: 400;\">In Model 12, although Z is not itself a mediator of the causal effect of X on Y, controlling for Z is equivalent to partially controlling for the mediator M, and will thus bias our estimates. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Models 11 and 12 violate the backdoor criterion, which excludes controls that are descendants of the treatment along paths to the outcome.<\/span><\/p>\n<p><strong><i>Model 13 &#8211; Neutral control (possibly good for precision)<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_13.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2042 size-medium\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_13-300x228.png\" alt=\"\" width=\"300\" height=\"228\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_13-300x228.png 300w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_13-768x583.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_13-1024x777.png 1024w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_13.png 1350w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">At first look, model 13 might seem similar to model 12, and one may think that adjusting for Z would bias the effect estimate, by restricting variations of the mediator M. However, the key difference here is that Z is a <\/span><i><span style=\"font-weight: 400;\">cause, not an effect, <\/span><\/i><span style=\"font-weight: 400;\">of the mediator (and, consequently, also a cause of Y). 
Thus, model 13 is analogous to model 8, and so controlling for Z will be neutral in terms of bias and may increase the precision of the ACE estimate in finite samples.<\/span><\/p>\n<p><strong><i>Models 14 and 15 &#8211; Neutral controls (possibly helpful in the case of selection bias)<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_14.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2011\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_14-219x300.png\" alt=\"\" width=\"219\" height=\"300\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_14-219x300.png 219w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_14-768x1053.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_14-747x1024.png 747w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_14.png 850w\" sizes=\"auto, (max-width: 219px) 100vw, 219px\" \/><\/a> <a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_15.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2041 \" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_15-575x1024.png\" alt=\"\" width=\"205\" height=\"365\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_15-575x1024.png 575w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_15-168x300.png 168w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_15-768x1369.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_15.png 850w\" sizes=\"auto, (max-width: 205px) 100vw, 205px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Contrary to econometrics folklore, not all \u201cpost-treatment\u201d variables are inherently bad 
controls. In models 14 and 15, controlling for Z <\/span><i><span style=\"font-weight: 400;\">does not<\/span><\/i><span style=\"font-weight: 400;\"> open any confounding paths between X and Y. Thus, Z is neutral in terms of bias. However, controlling for Z <\/span><i><span style=\"font-weight: 400;\">does<\/span><\/i><span style=\"font-weight: 400;\"> reduce the variation of the treatment variable X and so may hurt the precision of the ACE estimate in finite samples. Additionally, in model 15, suppose one has only samples with W = 1 recorded (a case of selection bias). In this case, controlling for Z can help obtain the W-specific effect of X on Y, by blocking the colliding path due to W.<\/span><\/p>\n<p><strong><i>Model 16 &#8211; Bad control<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_16.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2040 \" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_16-187x300.png\" alt=\"\" width=\"235\" height=\"377\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_16-187x300.png 187w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_16-768x1233.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_16-638x1024.png 638w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/redux_m_16.png 850w\" sizes=\"auto, (max-width: 235px) 100vw, 235px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Contrary to Models 14 and 15, here controlling for Z is no longer harmless, since it opens the colliding path X \u2192 Z \u2190 U \u2192 Y (Z is a collider on this path, so conditioning on it unblocks the path) and so biases the estimate of the ACE.<\/span><\/p>\n<p><strong><i>Model 17 &#8211; Bad Control<\/i><\/strong><\/p>\n<p><a href=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_17.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter 
size-medium wp-image-2008\" src=\"http:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_17-219x300.png\" alt=\"\" width=\"219\" height=\"300\" srcset=\"https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_17-219x300.png 219w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_17-768x1053.png 768w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_17-747x1024.png 747w, https:\/\/causality.cs.ucla.edu\/blog\/wp-content\/uploads\/2019\/08\/clear_m_17.png 850w\" sizes=\"auto, (max-width: 219px) 100vw, 219px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Here, Z is not a mediator, and one might surmise that, as in Model 14, controlling for Z is harmless. However, controlling for an effect of the outcome Y will induce bias in the estimate of the ACE, making Z a \u201cbad control\u201d. A visual explanation of this phenomenon using \u201cvirtual colliders\u201d can be <\/span><a href=\"http:\/\/bayes.cs.ucla.edu\/BOOK-09\/ch11-3-1-final.pdf\"><span style=\"font-weight: 400;\">found here<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The bias in Model 17 is usually known as \u201ccase-control bias\u201d or \u201cselection bias\u201d. Finally, although controlling for Z will generally bias<\/span><i><span style=\"font-weight: 400;\"> numerical estimates <\/span><\/i><span style=\"font-weight: 400;\">of the ACE, there is an exception when X has <\/span><i><span style=\"font-weight: 400;\">no causal effect<\/span><\/i><span style=\"font-weight: 400;\"> on Y. In this scenario, X is still d-separated from Y even after conditioning on Z. Thus, adjusting for Z is valid for <\/span><i><span style=\"font-weight: 400;\">testing<\/span><\/i><span style=\"font-weight: 400;\"> whether the effect of X on Y <\/span><i><span style=\"font-weight: 400;\">is zero. 
<\/span><\/i><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Carlos Cinelli, Andrew Forney and Judea Pearl Update: check the updated and extended version of the crash course here. Introduction If you were trained in traditional regression pedagogy, chances are that you have heard about the problem of &#8220;bad controls&#8221;. The problem arises when we need to decide whether the addition of a variable to [&hellip;]<\/p>\n","protected":false},"author":9,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,47,48,13,18],"tags":[46,44,43,45],"class_list":["post-2024","post","type-post","status-publish","format-standard","hentry","category-back-door-criterion","category-bad-control","category-econometrics","category-economics","category-identification","tag-back-door-criterion","tag-bad-control","tag-confounding","tag-econometrics"],"_links":{"self":[{"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/posts\/2024","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/users\/9"}],"replies":[{"embeddable":true,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=2024"}],"version-history":[{"count":19,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/posts\/2024\/revisions"}],"predecessor-version":[{"id":2373,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/posts\/2024\/revisions\/2373"}],"wp:attachment":[{"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=2024"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/causality.cs
.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=2024"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/causality.cs.ucla.edu\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=2024"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}