Request for Collaboration
Thomas Colignatus writes:
I plan to write a book with the working title "Basics of causality, correlation, economics and epidemiology, using graphical models. Applications of Mathematica". This book would use Mathematica (www.wri.com) as an environment, so that the reader/user can directly experiment and simulate.
It would be handy to be in contact with other users of Mathematica and to have critical proof-readers in the process, to reduce confusion and increase user friendliness. If interested, send me an email at cool@dataweb.nl. If the input is important we might turn this into a collaboration.
I am currently following a course in graphical models, given by Richard Gill in Leiden, http://www.math.leidenuniv.nl/~gill/teaching/graphical/index.html, which together with Judea Pearl's "Causality" would give a good starting point.
My background is in econometrics, where I used structural equation models (SEM) at the Central Planning Bureau. I also worked at Erasmus MC (Medical Center) using an MCMC model. Pearl's discussion of SEM was an eye-opener: it explained the differences between the modeling communities that had been bugging me without my being able to put a finger on them.
I already have a lot of software in economics, statistics and epidemiology, collected in "The Economics Pack" (see my website http://www.dataweb.nl/~cool/). My book "Voting theory for democracy" (VTFD) http://www.dataweb.nl/~cool/Papers/VTFD/Index.html already uses adjacency matrices to translate vote margins into graphs. The Simpson paradox can occur in voting too (across districts).
Given all this material, a text should be available in September. Again, it would concern the basics only.
I recently completed a book "A logic of exceptions" (ALOE) in Mathematica, see the PDF at http://www.dataweb.nl/~cool/Papers/ALOE/Index.html. Pages 135-138 discuss induction, using the example of the causal relation that when it rains the streets are wet. I intend to proceed from here.
I would like to approach the matter starting with the simplest case, e.g. a 2 by 2 table as in logic. While a structural equation is quantitative, the causal relations that are captured in graphs might also be seen as 1/0 logical relations, giving some parallelism. Proceeding from that, a third variable can be introduced.
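To make the 2 by 2 starting point concrete, here is a minimal sketch in Python (not Mathematica, and not taken from ALOE or the planned book). It assumes the rain and wet-streets relation is read as a material implication on 1/0 values; the names `implies` and `table` are hypothetical and chosen only for this illustration.

```python
from itertools import product

def implies(cause: int, effect: int) -> int:
    """Material implication on 1/0 values: 0 only when cause=1 and effect=0."""
    return int((not cause) or effect)

# 2 by 2 table indexed as table[cause][effect];
# assumption: "rain" is the cause and "wet streets" the effect.
table = [[implies(c, e) for e in (0, 1)] for c in (0, 1)]

# List the four cells and whether the 1/0 relation allows them.
for c, e in product((0, 1), repeat=2):
    print(f"rain={c} wet={e} allowed={table[c][e]}")
```

Under this reading only the cell with rain and dry streets is excluded. A third variable would turn the table into a 2 by 2 by 2 array.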
I have already solved a next step concerning correlation; see the following paper on the correlation of nominal variables: http://mpra.ub.uni-muenchen.de/2383/. The paper discusses the Simpson paradox, using this measure of correlation.
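For readers unfamiliar with the Simpson paradox, the following Python sketch shows the reversal with plain success rates. It does not implement the correlation measure from the paper; the counts are illustrative only, and the names `counts`, `rate` and `pooled` are hypothetical.

```python
from fractions import Fraction

# Illustrative counts only: (successes, trials) per stratum and per option.
counts = {
    "stratum_1": {"A": (81, 87), "B": (234, 270)},
    "stratum_2": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes: int, trials: int) -> Fraction:
    """Exact success rate as a fraction."""
    return Fraction(successes, trials)

# Within each stratum, check whether option A has the higher rate.
within = {s: rate(*c["A"]) > rate(*c["B"]) for s, c in counts.items()}

def pooled(option: str) -> Fraction:
    """Success rate of one option after pooling all strata."""
    successes = sum(c[option][0] for c in counts.values())
    trials = sum(c[option][1] for c in counts.values())
    return rate(successes, trials)

print("A better within each stratum:", within)
print("A better after pooling:", pooled("A") > pooled("B"))
```

With these counts A has the higher rate in both strata, yet B has the higher rate after pooling, because the two options are spread unevenly over the strata. In the voting case the strata would be districts.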
The latter paper also mentions this parallel: in logic, the distinction between implication and inference can be clarified by the distinction between statics and dynamics (taken from economics). Pearl makes the same distinction with identity (equality) versus assignment, or with the calculation order. In both we see Time's Arrow. What causality is to Nature, inference is to the Mind.
Thus, it would be logical to first discuss a y[t] = f[x[t-1], eps] model, and only later simultaneous equations, and to resolve the question of what the time index of eps is.
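A lagged model of this kind can be simulated in a few lines. The Python sketch below rests on assumptions not made in the text: f is taken to be linear, eps is Gaussian, and eps is dated t (the same period as y), which is only one possible answer to the time-index question. The function name `simulate` and the parameters `beta`, `noise_sd` and `seed` are hypothetical.

```python
import random

def simulate(n_steps, beta=0.8, noise_sd=0.1, seed=0):
    """Simulate y[t] = beta * x[t-1] + eps[t] for t = 1 .. n_steps-1.

    Assumptions for illustration: linear f, Gaussian eps dated t.
    """
    rng = random.Random(seed)
    # Exogenous input series x[0], ..., x[n_steps-1].
    x = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
    # y[0] is undefined because there is no x[-1].
    y = [None]
    for t in range(1, n_steps):
        eps_t = rng.gauss(0.0, noise_sd)
        # Assignment, not identity: y[t] is computed from the earlier x[t-1].
        y.append(beta * x[t - 1] + eps_t)
    return x, y

x, y = simulate(5)
print(x)
print(y)
```

The loop makes the calculation order explicit: each y[t] is assigned after x[t-1] is known, which is the direction of Time's Arrow in this setup.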
This link discusses ALOE and the 2nd edition of VTFD with some comments on causality, in anticipation of this planned other book: http://www.dataweb.nl/~cool/Papers/ALOE/CommunicationALOEandVTFD.pdf