Causal Analysis in Theory and Practice

April 29, 2021

Personalized Decision Making

Filed under: Counterfactual — Scott Mueller @ 5:53 pm

Judea Pearl and Scott Mueller


The purpose of this note is to clarify the distinction between personalized and population-based decision making. The former concerns the behavior of a specific individual, while the latter concerns a subpopulation resembling that individual. Technically, the former optimizes the individual causal effect

$$\begin{equation}\text{ICE}(u) = Y(1,u) - Y(0,u)\end{equation}$$

where \(Y(x,u)\) stands for the outcome that individual \(u\) would attain had decision \(x \in \{1, 0\}\) been taken. In contrast, population-based decision making attempts to optimize the Conditional Average Causal Effect

$$\begin{equation}\text{CACE}(u) = E[Y(1,u') - Y(0,u') | C(u)]\end{equation}$$

where \(C(u)\) stands for a vector of characteristics observed on individual \(u\), and the average is taken over all units \(u'\) that share these characteristics.

We will show that the two objective functions lead to different decision strategies and that, although \(\text{ICE}(u)\) is in general not identifiable (see the note below), informative bounds can nevertheless be obtained by combining experimental and observational studies. We will further demonstrate how these bounds can improve decisions that would otherwise be taken using \(\text{CACE}(u)\) as an objective function.

Note: The example we will work out happens to be identifiable due to particular combinations of data, though, in general, the data may not permit point estimates of individual causal effects.

Motivating Example

A new drug is introduced, aimed at helping patients suffering from a deadly disease. A Randomized Controlled Trial (RCT) is conducted to evaluate the efficacy of the drug, and the drug is found to be \(28\%\) effective in both males and females. In other words, \(\text{CACE}(\text{male}) = \text{CACE}(\text{female}) = 0.28\). The drug is approved and, after a year of use, a follow-up randomized study is conducted, yielding the same results; namely, \(\text{CACE}\) remained \(0.28\), and men and women remained totally indistinguishable in their responses.

It thus appears reasonable to conclude that the drug has a net remedial effect on patients and that every patient, male or female, should be advised to take the drug and benefit from its promise of increasing one’s chances of recovery by \(28\%\).

Despite this advice, however, only \(70\%\) of men and \(70\%\) of women actually chose to take the drug; problems with side effects and rumors of unexpected deaths may have caused the other \(30\%\) to avoid it.

Strangely, a detailed analysis of the observational study revealed differences in recovery rates between men and women who chose to use the drug. The rate of recovery among drug-choosing men was exactly the same as that among drug-avoiding men (\(70\%\) for each), but the rate of recovery among drug-choosing women was \(43\) percentage points lower than among drug-avoiding women (\(0.27\) vs \(0.70\)). It appears as though many women who chose the drug were already in an advanced stage of the disease, which may account for their low recovery rate of \(27\%\).

At this point, having data from both experimental and observational studies, we can estimate the individual treatment effects for both a typical man and a typical woman. Quantitative analysis shows (see “How These Results Were Obtained”) that, with the data above, the drug affects men markedly differently from the way it affects women. Whereas a woman has a \(28\%\) chance of benefiting from the drug and no danger at all of being harmed by it, a man has a \(49\%\) chance of benefiting from it and as much as a \(21\%\) chance of dying because of it, a serious cause for concern. Note that, based on the experimental data alone, no difference at all can be noticed between men and women.
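The figures above can be reproduced, at least approximately, with a short sketch of the PNS bounds from Tian and Pearl (2000). This note reports only the difference between the RCT arms (\(0.28\)), not the arm-level recovery rates, so the values below (\(0.49\) treated and \(0.21\) untreated for men, \(0.489\) and \(0.21\) for women) are assumptions chosen to be consistent with the reported results. The function name `pns_bounds` and all variable names are likewise illustrative. The probability of harm is taken as PNS minus CACE, which holds for binary outcomes.

```python
def pns_bounds(p_y_do_x, p_y_do_not_x, p_x, p_y_given_x, p_y_given_not_x):
    """Tian-Pearl (2000) lower/upper bounds on PNS for binary treatment and outcome.

    p_y_do_x, p_y_do_not_x : experimental recovery rates with / without the drug
    p_x                    : observational probability of choosing the drug
    p_y_given_x, p_y_given_not_x : observational recovery rates by choice
    """
    # Joint observational probabilities P(x,y), P(x,y'), P(x',y), P(x',y').
    p_xy = p_x * p_y_given_x
    p_x_noty = p_x * (1 - p_y_given_x)
    p_notx_y = (1 - p_x) * p_y_given_not_x
    p_notx_noty = (1 - p_x) * (1 - p_y_given_not_x)
    p_y = p_xy + p_notx_y

    cace = p_y_do_x - p_y_do_not_x
    lower = max(0.0, cace, p_y - p_y_do_not_x, p_y_do_x - p_y)
    upper = min(p_y_do_x, 1 - p_y_do_not_x,
                p_xy + p_notx_noty,
                cace + p_x_noty + p_notx_y)
    return lower, upper, cace


# ASSUMPTION: the experimental arm-level rates are not given in the note;
# they are picked so that CACE is about 0.28 for both sexes.
groups = {
    "men": dict(p_y_do_x=0.49, p_y_do_not_x=0.21,
                p_x=0.70, p_y_given_x=0.70, p_y_given_not_x=0.70),
    "women": dict(p_y_do_x=0.489, p_y_do_not_x=0.21,
                  p_x=0.70, p_y_given_x=0.27, p_y_given_not_x=0.70),
}

results = {}
for name, data in groups.items():
    lower, upper, cace = pns_bounds(**data)
    # P(harm) = PNS - CACE for binary outcomes; clamp tiny negative rounding error.
    harm_lower = max(0.0, lower - cace)
    harm_upper = max(0.0, upper - cace)
    results[name] = (lower, upper, harm_lower, harm_upper)
    print(f"{name}: PNS in [{lower:.3f}, {upper:.3f}], "
          f"P(harm) in [{harm_lower:.3f}, {harm_upper:.3f}]")
```

Under these assumed inputs the lower and upper bounds coincide for each sex, which is why the example yields point estimates: roughly \(0.49\) benefit and \(0.21\) harm for men, and roughly \(0.28\) benefit and no harm for women. With other data the two bounds would generally differ.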

The ramifications of these findings on personal decision making are enormous. First, they tell us that the drug is not as safe as the RCT would have us believe; it may cause death in a sizable fraction of patients. Second, they tell us that a woman is totally clear of such dangers and should have no hesitation to take the drug, unlike a man, who faces a decision; a \(21\%\) chance of being harmed by the drug is cause for concern. Physicians, likewise, should be aware of the risks involved before recommending the drug to a man. Third, the data tell policy makers what the overall societal benefit would be if the drug were administered to women only; \(28\%\) of the drug-takers would survive who would die otherwise. Finally, knowing the relative sizes of the benefiting vs harmed subpopulations swings open the door for finding the mechanisms responsible for the differences as well as identifying measurable markers that characterize those subpopulations.

For example:

  • In the same way that our analysis has identified “Sex” to be an important feature, separating those who are harmed from those saved by the drug, so we can leverage other measured features, say family history, a genetic marker, or a side-effect, and check whether they shrink the sizes of the susceptible subpopulations. The results would be a set of features that approximate responses at the individual level. Note again that absent observational data and a calculus for combining them with the RCT data, we would not be able to identify such informative features. A feature like “Sex”, for example, would be deemed irrelevant, since men and women were indistinguishable in our RCT studies.
  • Our ability to identify relevant informative features as described above can be leveraged to amplify the potential benefits of the drug. For example, if we identify a marker that characterizes men who would die only if they take the drug and prevent those patients from taking the drug, the drug would cure \(62\%\) of male patients who would be allowed to use it. This is because we don’t administer the drug to the \(21\%\) who would have been killed by the drug. Those patients will now survive, so a total of \(70\%\) of male patients will be cured because of this combination of marker identification and drug administration. This unveils an enormous potential of the drug at hand, which was totally concealed by the \(28\%\) effectiveness estimated in the RCT studies.
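The arithmetic behind the \(62\%\) and \(70\%\) figures can be spelled out as follows. This is a sketch that assumes the \(49\%\) and \(21\%\) are fractions of all male patients and that the hypothetical marker identifies exactly the men who would be harmed:

$$\frac{0.49}{1 - 0.21} = \frac{0.49}{0.79} \approx 0.62, \qquad 0.49 + 0.21 = 0.70$$

The first expression is the share of benefiting men among those still allowed to take the drug; the second adds the men who benefit from the drug to the men who now survive by avoiding it.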

How These Results Were Obtained

The following is a list of papers that analyze probabilities of causation and lead to the results reported above.

  • (Tian and Pearl, 2000) develops bounds on individual-level causation by combining data from experimental and observational studies. This includes the Probability of Sufficiency (PS), the Probability of Necessity (PN), and the Probability of Necessity and Sufficiency (PNS). \(\text{PNS}(u)\), the probability that individual \(U=u\) survives if treated and does not survive if not treated, is related to \(\text{ICE}(u)\) (Eq. (1)) via the equation: $$\begin{equation}\text{PNS}(u) = P(\text{ICE}(u') > 0 | C(u))\end{equation}$$ In words, \(\text{PNS}(u)\) equals the proportion of units \(u'\) sharing the characteristics of \(u\) that would positively benefit from the treatment. The reason is as follows. Recall that (for binary variables) \(\text{ICE}(u)=Y(1,u)-Y(0,u)\) is \(1\) when the individual benefits from the treatment, \(0\) when the individual responds the same to either treatment, and \(-1\) when the individual is harmed by treatment. Thus, for any given population, \(\text{PNS} = P(\text{ICE}(u) > 0)\). Focusing on the sub-population of individuals \(u'\) that share the characteristics of \(u\), \(C(u') = C(u)\), we obtain Eq. (3). That is, \(\text{PNS}(u)\) is the fraction of indistinguishable individuals that would benefit from treatment. Note that whereas Eq. (2) can be estimated by controlled experiments over the population \(C(u')=C(u)\), Eq. (3) is defined counterfactually; hence, it cannot be estimated solely by such experiments and requires additional ingredients, as described in the text below.
  • (Mueller and Pearl, 2020) provides an interactive visualization of individual level causation, allowing readers to observe the dynamics of the bounds as one changes the available data.
  • (Li and Pearl, 2019) optimizes the societal benefit of selecting a unit \(u\), given the costs associated with the four different types of individuals: benefiting, harmed, always surviving, and doomed.
  • (Mueller, Li, and Pearl, 2021) takes into account the causal graph to obtain narrower bounds on PNS. The hypothetical study in this article was able to calculate point estimates of PNS, but often the best we can get are bounds.
  • (Pearl, 2015) demonstrates how combining observational and experimental data can be informative for determining Causes of Effects, namely, assessing the probability PN that one event was a necessary cause of an observed outcome.
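The link between PNS and the harm probabilities quoted in the motivating example can be made explicit with a short derivation. This is a sketch for binary outcomes, and \(P(\text{harm} | C(u))\) is shorthand introduced here for \(P(\text{ICE}(u') = -1 | C(u))\), not notation taken from the cited papers:

$$\begin{aligned}\text{CACE}(u) &= E[\text{ICE}(u') | C(u)] \\ &= (+1)\,P(\text{ICE}(u') = 1 | C(u)) + (-1)\,P(\text{ICE}(u') = -1 | C(u)) \\ &= \text{PNS}(u) - P(\text{harm} | C(u))\end{aligned}$$

Hence \(P(\text{harm} | C(u)) = \text{PNS}(u) - \text{CACE}(u)\). With \(\text{CACE} = 0.28\) for both sexes, a PNS of \(0.49\) for men gives \(0.49 - 0.28 = 0.21\), and a PNS of \(0.28\) for women gives \(0\), matching the figures in the example.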


  1. Jin Tian and Judea Pearl. Probabilities of causation: Bounds and identification. Annals of Mathematics and Artificial Intelligence, 28:287–313, 2000.
  2. Scott Mueller and Judea Pearl. Which Patients are in Greater Need: A counterfactual analysis with reflections on COVID-19.
  3. Ang Li and Judea Pearl. Unit selection based on counterfactual logic. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, pages 1793–1799, 2019. AAAI Press.
  4. Scott Mueller, Ang Li, and Judea Pearl. Causes of Effects: Learning individual responses from population data.
  5. Judea Pearl. Causes of Effects and Effects of Causes. Journal of Sociological Methods and Research, 44:149–164, 2015.
