Forensic Statistics and Graphical Models

Autumn Semester, 2011

"FS", Tuesdays, 11:15--13:00, Snellius 401, Nielsbohrweg 1, Leiden.

Advanced Bachelor's level --- Master's level.

Forensic statistics is the art and science of doing statistics in the context of criminal investigation or prosecution. Especially in the latter context, it makes particular demands on the statistician, who is called to communicate to the court the meaning of statistical data with respect to the questions of interest to the court. Judges, jury, defence, prosecution, ... all have different interests and different information. Testimony of a scientific expert, such as a statistician, has to be neutral and ... scientific. Many of the consumers (jury, public, lawyers) have no prior understanding of probability or statistics at all.

The present "dogma of forensic statistics", as I would call it, contends that the task of the statistician is to impart to the court the meaning of a piece of evidence, thought of as statistical, i.e., partly formed by chance processes. The statistician does this by stating its likelihood ratio with respect to two (typically) important and competing hypotheses, usually referred to as the hypothesis of the prosecution and the hypothesis of the defence. For instance, if we see a measured DNA profile, obtained from some trace of human cells at the scene of a crime, as the result of a chance process involving measurement errors, the probability laws of genetics, and so on, we might like to report the ratio:

Prob(observed profile | the organic material comes from the suspect) : Prob(observed profile | the organic material comes from an unknown person, thought of as a random member of the population at large)
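
As a minimal sketch of such a ratio, consider the most idealized situation: a single genetic locus, no measurement error, and Hardy-Weinberg proportions in the population. Under these assumptions (which are mine, for illustration only) the numerator equals 1 and the denominator is the population frequency of the observed genotype. The allele frequencies and the function names below are hypothetical.

```python
def genotype_frequency(p_first, p_second, homozygous):
    """Population frequency of a genotype under Hardy-Weinberg proportions."""
    if homozygous:
        return p_first ** 2
    return 2.0 * p_first * p_second


def single_locus_lr(p_first, p_second, homozygous=False):
    """Likelihood ratio for a matching single-locus profile.

    Numerator:   Prob(profile | material comes from the suspect) = 1,
                 assuming no measurement error.
    Denominator: Prob(profile | material comes from a random person),
                 i.e. the genotype frequency in the population.
    """
    return 1.0 / genotype_frequency(p_first, p_second, homozygous)


# Made-up allele frequencies 0.1 and 0.2 for a heterozygous profile:
# the genotype frequency is 2 * 0.1 * 0.2 = 0.04, so the ratio is about 25.
print(single_locus_lr(0.1, 0.2))
```

Real casework involves many loci, possible relatedness, and measurement error, all of which change both numerator and denominator.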

Graphical models or Bayes nets are probability models for the dependence structure of a collection of random variables, thought to be related to one another through a directed acyclic graph, each node or vertex representing one of the random variables in question. Their joint probability distribution is built up as follows. Arrange the graph in two dimensions with arrows (connections between nodes) only pointing downwards. First generate all variables corresponding to root nodes (nodes with no connections to them) by drawing them independently according to some specified marginal distributions. Then move down the graph, each time drawing the random variable corresponding to a given node from some specified conditional probability distribution, conditional on the values of the variables corresponding to that node's graph parents -- the nodes with arrows pointing directly to it.
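
The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-node network, its probabilities, and the function names are all invented for the example, and every variable is binary.

```python
import random

# Hypothetical toy network: A and B are root nodes, C has parents A and B.
# Each "cpt" maps a tuple of parent values to Prob(node is True).
NETWORK = {
    "A": {"parents": (), "cpt": {(): 0.3}},
    "B": {"parents": (), "cpt": {(): 0.6}},
    "C": {
        "parents": ("A", "B"),
        "cpt": {
            (False, False): 0.05,
            (False, True): 0.4,
            (True, False): 0.5,
            (True, True): 0.9,
        },
    },
}


def topological_order(network):
    """Order the nodes so that every node comes after all of its parents."""
    order, placed = [], set()
    while len(order) < len(network):
        progressed = False
        for name, spec in network.items():
            if name not in placed and all(p in placed for p in spec["parents"]):
                order.append(name)
                placed.add(name)
                progressed = True
        if not progressed:
            raise ValueError("graph has a cycle or refers to a missing parent")
    return order


def sample(network, rng):
    """Draw one joint realisation by moving down the graph, parents first."""
    values = {}
    for name in topological_order(network):
        spec = network[name]
        parent_values = tuple(values[p] for p in spec["parents"])
        values[name] = rng.random() < spec["cpt"][parent_values]
    return values


rng = random.Random(2011)
print(sample(NETWORK, rng))
```

Root nodes have an empty parent tuple, so their "conditional" table is just the marginal distribution, which matches the two-stage description above.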

Graphical models turn out to have wonderful probabilistic and computational properties. A beautiful algorithm, which will be one of the highlights of the course, helps us to rapidly and highly accurately compute conditional probability distributions of some of the variables in the model given values of some of the other variables. Because of the graphical representation, they lend themselves very well to communication between experts from different fields, and laypersons, about the model for the phenomenon at hand.

They are being used more and more in the forensic statistical context, for many reasons (good and bad), as we will see.

The course will be based on two main books, one on graphical models, the other on forensic statistics. These are:

Cowell, Dawid, Lauritzen and Spiegelhalter, Probabilistic Networks and Expert Systems

Aitken and Taroni, Statistics and the Evaluation of Forensic Evidence (2nd edition).

Computation and data analysis are important issues. I will make use of the graphical models package in R, as well as the specialized GeNIe and Hugin Lite packages (some links can be found below). In fact, it will be important to move back and forth between the possibilities for graphical interaction of the latter specialized applications, and the general statistical computational possibilities of R. Participants may prefer to use Matlab or yet other systems.

We will often make use of Bayes' rule in odds form: posterior odds equals prior odds times likelihood ratio, in the following way. Two separate graphical models are constructed which both contain nodes representing the evidence at hand. One of the models belongs to the prosecution case, the other to the defence case. By adding an artificial root node corresponding to the binary variable: "is the prosecution right, or the defence?" we merge the two models into one. As a matter of convenience, we assign the just mentioned root node the marginal (prior) probability distribution of equal odds, 50-50. We next use the graphical model to compute the conditional distribution of this variable given the actual evidence observed in the case. Because of Bayes' rule and because of our artificial choice of a uniform prior, the posterior odds equals the likelihood ratio, and that is the number which we must communicate to the court.
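
The following sketch checks this recipe numerically on a deliberately tiny example. Everything in it is an assumption made for illustration: each side's model has one hidden binary variable z and one binary evidence variable e, the probabilities are made up, and the conditional distribution is found by brute-force enumeration rather than by the efficient algorithm treated in the course.

```python
def bernoulli(p, value):
    """Probability that a binary variable with Prob(True) = p takes `value`."""
    return p if value else 1.0 - p


def prosecution_model(z, e):
    """Hypothetical joint probability of (hidden z, evidence e) under the prosecution."""
    return bernoulli(0.01, z) * bernoulli(0.5 if z else 0.99, e)


def defence_model(z, e):
    """Hypothetical joint probability of (hidden z, evidence e) under the defence."""
    return bernoulli(0.01, z) * bernoulli(0.5 if z else 0.001, e)


def merged_joint(h, z, e, prior_prosecution=0.5):
    """Merged model: the artificial root node h selects which sub-model applies."""
    model = prosecution_model if h else defence_model
    return bernoulli(prior_prosecution, h) * model(z, e)


def posterior_odds(e, prior_prosecution=0.5):
    """Odds of h given the evidence, summing the hidden variable out."""
    for_prosecution = sum(merged_joint(True, z, e, prior_prosecution) for z in (False, True))
    for_defence = sum(merged_joint(False, z, e, prior_prosecution) for z in (False, True))
    return for_prosecution / for_defence


def likelihood_ratio(e):
    """Prob(e | prosecution) / Prob(e | defence), computed from the two separate models."""
    numerator = sum(prosecution_model(z, e) for z in (False, True))
    denominator = sum(defence_model(z, e) for z in (False, True))
    return numerator / denominator


# With the 50-50 prior the two numbers agree (up to rounding).
print(posterior_odds(True), likelihood_ratio(True))
```

With any other prior on the root node, the posterior odds would be the likelihood ratio multiplied by the prior odds, which is why the 50-50 choice is convenient but artificial.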

Of course, it is not at all as simple as this... The defence is not obliged to offer a detailed theory "explaining" the evidence which is brought to the court; and anyway, both prosecution and defence will only rarely find themselves in the situation that they "know" their models exactly. At best, there will be unknown nuisance parameters all over the place. How to deal with that problem? So far, there is no answer...

Exercise. Given two graphical models, where we identify some of the nodes as representing the same random variables according to two different probabilistic mechanisms, show how to merge the two models into one, so that we can compute the likelihood ratio for evidence, namely the values of certain of the common variables, in the way just described.

Notes. The book of Cowell et al. does not contain the proof of the Hammersley-Clifford theorem but refers to Lauritzen's 1996 book. Because the proof is so neat I have written it out here.

* * * * * * * * * * * * * * * *

Below is some further material originally written for the 2007 version of the course, which concentrated on graphical models with some forensic applications. Now however I am going to start by explaining what the special nature of forensic statistics is in general, and then teach graphical models as just one popular and important tool in the field. But much of the information may still be useful.

Here are two sets of slides of introductory talks:

forensic_statistics.pdf, talk by RDG giving an overview of "what is forensic statistics"
Lauritzen_EMS.pdf, talk by Steffen Lauritzen at the European Meeting of Statisticians in Toulouse, 2009, about graphical models for analyzing DNA mixtures.

In the first (introductory) lecture of the present course I referred to two specific recent Dutch cases where the analysis of DNA mixtures was crucial: "The Deventer Murder Case (the widow Wittenberg)", and "The case of Tamara Wolvers (Alphen aan den Rijn)". In both cases I am pretty sure that a miscarriage of justice followed from a wrong analysis of a DNA mixture. To be more precise: I believe that the wrong conclusions were drawn from the DNA evidence.

Here is a link to a nearly finished paper by Lauritzen and collaborators on DNA mixture analysis.

Old course description, to be rewritten. In the course we will study theory and applications of graphical models (wikipedia/Graphical_model). In statistics, a graphical model specifies conditional independence relations among a set of random variables, some observable, some unobservable. It thereby provides statistical models for the joint distribution of the observed variables. The graph not only provides an attractive visual representation of the model but also serves as a computational tool.

For applications, we will focus on genetics and forensic science, where graphical models have proven to be particularly effective, since the laws of genetic inheritance are very neatly expressed in graphical models.
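
To illustrate how genetic inheritance fits this framework, here is a small sketch of the conditional probability table for a child's genotype given the genotypes of the two parents, which is exactly the kind of table attached to a node with two parents in a graphical model. The simplifying assumptions are mine: a single autosomal locus, no mutation, and each parent passing on either allele with probability 1/2. The function name and allele labels are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def child_genotype_distribution(mother, father):
    """Conditional distribution of a child's genotype given both parents.

    Each genotype is a pair of allele labels. The child receives one allele
    from each parent, each with probability 1/2 (Mendelian segregation).
    Genotypes are unordered, so they are stored as sorted tuples.
    """
    distribution = Counter()
    for maternal, paternal in product(mother, father):
        genotype = tuple(sorted((maternal, paternal)))
        distribution[genotype] += Fraction(1, 4)
    return dict(distribution)


# Mother carries alleles a and b, father carries a and c:
# four equally likely child genotypes, each with probability 1/4.
for genotype, probability in child_genotype_distribution(("a", "b"), ("a", "c")).items():
    print(genotype, probability)
```

A pedigree is then a directed acyclic graph in which every non-founder has this table, and founders are drawn from the population genotype distribution.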

From the point of view of probability theory, conditional independence is a Markov property, and graphical models are "just" Markov fields.

In computer science, the same graphs are used to represent causality and are there called Bayes nets.

Literature:

The definitive resource for the mathematical foundations of the theory of graphical models (a number of chapters of which are essential reading) is the book

S.L. Lauritzen (1996), Graphical Models, Clarendon Press, Oxford, United Kingdom.

A very nice introduction built around applications in genetics is

http://www.math.auc.dk/~steffen/papers/grgenet.ps, published as
Lauritzen and Sheehan (2003), Graphical Models for Genetic Analyses, Statistical Science 18, 489--514.

See also George and Thompson (2003), Discovering Disease Genes, Statistical Science 18, 515--531.

For a somewhat different but also very interesting approach see Judea Pearl (2000), Causality -- Models, Reasoning, and Inference, Cambridge University Press. Yet another excellent book is Probabilistic Networks and Expert Systems by Robert G. Cowell, A. Philip Dawid, Steffen L. Lauritzen and David J. Spiegelhalter (1999), Springer Verlag.

Slides of the lectures so far: html, pdf

Workform, examination:

The course will include assignments, papers, and presentations by the students; the final evaluation will be an oral ("mondeling") examination. Incidentally, since the topic allows many different emphases (probabilistic, statistical, algorithmic, ...), the participants will also be able to influence the choice of topics.

Web resources:

On the internet you will easily find a wealth of material on graphical models and/or Bayes nets. Here are just a few links.

Tutorial on Graphical Models: Kevin Murphy's tutorial.

Interesting course on graphical models, many useful links and resources: Helsinki course.

GeNIe: a free computer package for Bayes nets, unfortunately only available for Windoze.
GeNIe runs well via WINE on Linux (Intel machines), including the fantastic new Intel Macs: you can use DARWINE and stay inside OS X if you are not into Parallels Virtual Desktop or dual booting with Windoze.

Much work is being done to give us graphical models in R.

gill@math.leidenuniv.nl