### Logically equivalent yet way too different

Contributor: Judea Pearl

In comparing the tradeoffs between the structural and potential outcome frameworks, I often state that the two are logically equivalent yet poles apart in terms of transparency and computational efficiency. (See Slide #34 of the JSM tutorial). Indeed, anyone who examines how the two frameworks solve a specific problem from begining to end (See, e.g., Slides #35-36 ) would find the differences astonishing.

The question naturally arises: How can two equivalent frameworks differ so substantially in actual use.

The answer is that epistemic equivalence does not mean representational equivalence. Two representations of the same information may highlight different aspects of the problem and thus differ substantially in how easy it is to solve a given problem. This is a recurrent theme in complexity analysis, but is not generally appreciated outside computer science. We saw it in our discussions with Guido Imbens who could not accept the fact that the use of graphical models is a mathematical necessity not just a matter of taste. (http://causality.cs.

The examples usually cited in complexity analysis are combinatorial problems whose solution times depend critically on the initial representation. I hesitated from bringing up these examples, fearing that they will not be compelling to readers on this blog who are more familiar with classical mathematics.

Last week I stumbled upon a very simple example that demonstrates representational differences in no ambiguous terms; I would like to share it with readers.

Consider the age-old problem of finding a solution to an algebraic equation, say

y(x) = x^{3} + ax^{2} + bx + c = 0

This is a tough problem for those of us who do not remember Tartalia’s solution of the cubic. (It can be made much tougher once we go to quintic equation.)

But there are many syntactic ways of representing the same function y(x) . Here is one equivalent representation:

y(x) = x(x^{2}+ax) + b(x+c/b) = 0

and here is another:

y(x) = (x-x_{1})(x-x_{2})(x-x_{3}) = 0,

where x_{1}, x_{2}, and x_{3} are some functions of a, b, c.

The last representation permits an immediate solution, which is:

x=x_{1}, x=x_{2}, x=x_{3}.

The example may appear trivial, and some may even call it cheating, saying that finding x_{1}, x_{2}, and x_{3} is as hard as solving the original problem. This is true, but the purpose of the example was not to produce an easy solution to the cubic. The purpose was to demonstrate that different syntactic ways of representing the same information (i.e., the same polynomial) may lead to substantial differences in the complexity of computing an answer to a query (i.e., find a root).

A preferred representation is one that makes certain desirable aspects of the problem explicit, thus facilitating a speedy solution. Complexity theory is full of such examples.

Note that the complexity is query-dependent. Had our goal been to find a value x that makes the polynomial y(x) equal 4, not zero, the representation above y(x) = (x-x1)(x-x2)(x-x3) would offer no help at all. For this query, the representation

y(x) = (x-z_{1})(x-z_{2})(x-z_{3}) + 4

would yield an immediate solution

x=z_{1}, x=z_{2}, x=z_{3,}

where z_{1}, z_{2}, and z_{3} are the roots of another polynomial:

x^{3} + ax^{2} + bx + (c-4) = 0

This simple example demonstrates nicely the principle that makes graphical models more efficient than alternative representations of the same causal information, say a set of ignorability assumptions. What makes graphical models efficient is the fact that they make explicit the logical ramifications of the conditional-independencies conveyed by the model. Deriving those ramifications by algebraic or logical means takes substantially more work. (See http://ftp.cs.ucla.

A typical example of how nasty such derivations can get is given in Heckman and Pinto’s paper on “Causal Inference after Haavelmo” (Econometric Theory, 2015). Determined to avoid graphs at all cost, Heckman and Pinto derived conditional independence relations directly from Dawid’s axioms and the Markov condition (See https://en.wikipedia.org/

Of course, this and other difficulties will not dissuade econometricians to use graphs; that would rake a scientific revolution of Kuhnian proportions. (see http://ftp.

They eventually will.