Richard D. Gill’s home page

Mathematical Statistics

Mathematical Institute

Faculty of Mathematics and Natural Sciences

Leiden University

To contact me quickly, try email (surname at math dot leidenuniv dot nl), or mobile phone. Click here for my postal and visiting addresses, office and mobile phone numbers, email addresses, and further contact possibilities. If you are an Indian IT student looking for a summer internship then please read this.

Lucia de B at TEDxFlanders 2014 Naked; and remarks on Lucia de B: the movie

YouTube video. Slides of my talk Murder by Numbers. The naked truth about the case of Lucia de B.

A historical document: statement by Haga-Hospital, 2010, regarding the acquittal of Lucia de Berk: English translation, original.

Four days after the TEDx event, I saw the movie Lucia de B. on its premiere night in Amsterdam. Here is my personal film review: Splendid acting, very moving, beautifully told human story, centering around Lucia herself. Despite compression of the story line and focus on Lucia's personal experiences, it still contained such key features as: the close personal links between key people from hospital, justice and experts (image right); the mental illness and mental breakdown of the chief-paediatrician at JKZ ... There was a vain and ambitious hospital director. A bad statistician. Real life heroine Metta de Noo and hero Ton Derksen were concentrated in the film into the imaginary person of one imaginary whistleblower at the last place you might expect to find them: in the Public Prosecution service. But on the other hand: it wasn't black and white. There were good medics and bad medics, good nurses and bad nurses, good cops and bad cops ... Apparently, even some people in the Public Prosecution service found the witch hunt deeply disturbing.


Some years ago I offered a prize for the person who remasters the logo of the VVS: the Dutch statistical society (top image) in the most beautiful postscript. An exercise in curve fitting with splines, perhaps? Better still would be a mathematical/statistical story of the curves themselves, providing an elegant parametric family which reproduces the whole logo. Finally I decided to do it myself, and I think I am getting close with this perspective image of some very simple 3-dimensional curves, with indeed a statistical story behind them (bottom). The R script which draws this logo can be found here. It should generate a rotatable 3-d image...

See the slides of my Amst-R-dam R users group meetup 2011 (updated 2012) talk R-fun: part 1, the VVS logo in R; part 2, R on an iDevice. For the latest news on "R on an iDevice" see the 2014 talk R on an iDevice, given at a Data Science NL meetup.

VVS stands for "Vereniging voor Statistiek". SMS stands for "Section Mathematical Statistics". The VVS also has an OR section, hence the common alternative name VVS-OR.


Spring 2014

Forensic Statistics and Graphical Models

FS: Tuesdays 11:00--12:30 room 401

Master's level (or advanced Bachelor's)

Statistical Science for the Life and Behavioural Sciences

The master specialization Statistical Science for the Life and Behavioural Sciences is a collaboration of our group with others in biomedical statistics, biostatistics, and psychometrics.

Past courses

Here you can find links to various courses I have given in the past, in particular quantum statistics, statistics for astronomers, HOVO courses (adult education courses, in Dutch) on use and abuse of statistics, forensic science (Hovo-criminalistiek-statistiek-1, Hovo-criminalistiek-statistiek-2, Hovo-criminalistiek-statistiek-3).


Interests, most active marked *

  • causality, graphical models, forensic statistics, forensic DNA, statistics and law, scientific integrity and scientific fraud, science and society ***
  • quantum statistics, probability, information, foundations *
  • statistical and computational learning *
  • missing data, censoring *
  • survival analysis, semiparametric models, martingale methods, counting processes, non parametric maximum likelihood
  • product integration, compact differentiation, the delta method, the bootstrap, empirical processes in statistics
  • spatial statistics and image analysis
  • random number generation
  • mathematical typography
  • foundations of statistics, probability, mathematics, quantum theory (see middle of this page) *
Recent talks and papers

    Scientific integrity
  • Worst Practices in Statistical Data Analysis, talk at Willem Heiser farewell symposium (includes material on Smeesters affair, and on Geraerts "Memory paper" affair)

    Forensic Statistics
  • "What is the chance that the match is a coincidence?" Talk (Sept 2013) on two problems from forensic statistics: the rare haplotype problem, and mobile phone colocation analysis
  • Talk on forensic statistics at Nordic Meeting of Statisticians, Umea, 2012
  • Talk on forensic statistics at Statistics Day conference, 2010

    Quantum foundations
  • Schroedinger's cat meets Occam's razor: lecture slides, draft paper: quant-ph/0905.2723
  • A proof of Bell’s inequality in quantum mechanics using causal interactions, with James Robins and Tyler VanderWeele
  • Refutation of Joy Christian's refutation of Bell's theorem
  • Statistics, causality and Bell's theorem, including a mathematical challenge in the design of Quantum Randi Challenges

    Coarsening at Random
  • Algorithmic and Geometric characterization of CAR (slides)
  • Algorithmic and Geometric characterization of CAR, math.ST/0510276; appeared in Annals of Statistics, with Peter Grünwald

    Generalized Bell Inequalities
  • On the maximal violation of the CGLMP inequality for infinite dimensional states, with Stefan Zohren
  • Perfect Passion at a Distance: How to win Polish Poker (Use Quantum Dice!), pdf, html
  • Better Bell Inequalities (slides), at NATO Advanced Research Workshop on Quantum Communication and Security, Gdansk (among other places)
  • Better Bell Inequalities: Maximal Passion at a Distance, math.ST/0610115, to appear in Festschrift for Piet Groeneboom, IMS monographs series

    Optimal Quantum State Estimation
  • Local Asymptotic Normality in Quantum Statistics, Limerick research seminar, 2009; Stolen from Madalin Guta's 2009 Lunteren lectures: Part I and Part II and Part III (there is no part III).
  • Madalin Guta's Magic quantum statistics course
  • Jonas Kahn's PhD thesis Quantum Local Asymptotic Normality (and other questions of quantum statistics)
  • Conciliation of Bayes and Pointwise Quantum State Estimation, math.ST/0512443: pp. 239-261 in Quantum Stochastics and Information: Statistics, Filtering and Control, V.P. Belavkin and M. Guta (eds.), World Scientific (2008)
  • Asymptotic information bounds in quantum statistics, math.ST/0512443, to be revised and extended for Annals of Statistics (much delayed by my activities in the case of Lucia de B. - so a preliminary version appeared as the previous item in this list)
  • Conciliation of Bayes and Pointwise Quantum State Estimation (slides), at QUantum PRocess ESTimation 06, Budmerice, Slovakia
  • Optimal adaptive measurement of mixed qubits, Phys. Rev. Lett. 97 130501 (2006), quant-ph/0512177, with Manuel Ballester and Catalan friends Emili Bagan, Ramon Muñoz-Tapia, Oriol Romero-Isart
  • Optimal collective measurement of mixed qubits Phys. Rev. (A) 73 032301 (2006), quant-ph/0510158, with Manuel Ballester and Catalan friends Emili Bagan, Alex Monras, Ramon Muñoz-Tapia

    Lucia de Berk
  • Elementary statistics on trial (the case of Lucia de Berk) (joint with P. Groeneboom and Peter de Jong, rejected by Statistica Neerlandica ... new version for different journal in preparation).
  • On the (ab)use of statistics in the legal case against the nurse Lucia de B, preprint at, final version published (with discussion by David Lucy) in Law, Probability and Risk, 2007, joint with Marieke Collins, Michiel van Lambalgen, Ronald Meester.
  • Lucia talk at Vierhouten hackers conference: Lies, damned lies and legal truths
  • Astin day presentation: a story in a story Statistics and Ethics (Dutch outside, English inside)
  • Lies, damned lies, and legal truths (2010), pp. 39–50 in: L. Mommers, H. Franken, J. van den Herik, F. van der Klaauw and G.J. Zwenne, Het Binnenste Buiten (Liber Amicorum ter Gelegenheid van het Emeritaat van Aernout Schmidt), eLaw@Leiden, Law Faculty, University Leiden.
  • Remarks on the Lucia data - why the numbers keep changing (2009)

    The probiotica affair
  • Slides for talk at CCMO workshop, 11 December 2009
  • Statistics, ethics, and probiotica, (2009), Statistica Neerlandica
  • careless statistics costs lives, slides of talk
  • meten is weten, slides of a talk at the Science Cafe, Nijmegen, in Dutch
Publications

  • Papers in quantum statistics (arXiv:quant-ph)
  • Recent papers in mathematical statistics (arXiv:stat)
  • Publication list, including prepublications and unpublished work
  • Links to many of my older papers can be found via the MC and CWI repository
Collaboration

    Mădălin Guţă (Nottingham), Ole Barndorff-Nielsen (Aarhus), Jonas Kahn (Orsay), Peter Jupp (St. Andrews), Peter Grünwald (Amsterdam), Erik van Zwet (Leiden), Jan-Åke Larsson (Linköping), Ramon Muñoz-Tapia (Barcelona), Emili Bagan (Barcelona), Luis Artiles (Cuba/EURANDOM), Stefan Zohren (Leiden), Ronald Meester (Amsterdam), Marek Żukowski (Gdansk, Vienna), ...

    Foundational issues in quantum theory

    WARNING: Richard P. Feynman said that attempting to understand quantum mechanics causes you to fall into a black hole, never to be heard from again

    The past is particles, the future is a wave

    Bell’s fifth position

    During the academic year 2010-2011 I was Distinguished Lorentz Fellow (DLF) at the Netherlands Institute of Advanced Study in the Humanities and Social Science, NIAS. Here's my research proposal. The award ceremony was at NIAS, Wassenaar, late-afternoon of March 22, 2010. On the morning of the same day we held a complementary Breakfast symposium "Science, Media, Justice" at LUMC.

    The Smeesters affair (revised: July 4, 2013)

    Slides of talk on Smeesters case, and on the Geraerts-Merckelbach Memory paper affair. Talk originally given December 2012; slides updated March 2013; title "Integrity or fraud - or just questionable research practices?"
    Stimulated by media interest in the Geraerts-Merckelbach controversy on their "Memory" paper, I studied the published summary statistics in this paper using the same techniques as Simonsohn used for Smeesters, and found quite clear statistical evidence for "too good to be true". Without experimental protocols written up prior to the experiment, original data-sets, and laboratory log books detailing all the data selection and manipulation steps which resulted in the final data-set on which the summary statistics in the paper are based, one can only guess how these patterns arose. It certainly need not be fraud (fraud requires active intention to deceive).

    R-code for experiment with Simonsohn's fraud test (new version)
    Histogram of p-values of an honest researcher
    Histogram of p-values of a dishonest researcher

    Disclaimer: These notes formed an attempt to reconstruct the statistical analyses performed by Erasmus University Committee on Scientific Integrity, based initially only on the censored version of the report of the committee which Erasmus released at the start of the affair. Later an uncensored report was made available, and later still Uri Simonsohn has made a "working paper" available which reveals yet more details of his methodology. The uncensored Erasmus report still leaves, for me, many unanswered questions concerning their exact methodology.

    The main resource for these notes was therefore the report of the Erasmus University Committee on Scientific Integrity (commissie wetenschappelijk integriteit, CWI). I am very grateful for recent communication with fraud-buster Uri Simonsohn, social psychologist at the Wharton School (University of Pennsylvania), Department of Operations and Information Management, who pointed out a major error in an earlier version of my R experiment and also in my thinking! The results given here are still speculative and my understanding of the statistical procedures used by Erasmus-CWI (who refuse to comment) may be far from correct.

    Simonsohn's web page contains links to two interviews he has given in "Nature" and in the Dutch newspaper "Volkskrant" (English translation). For Smeesters' version of events see the interview with Smeesters in the Belgian (Flemish) newspaper De Standaard; part two of this interview is unfortunately only available for subscribers.

    Just recently Uri has released a new working paper called "Just post it: the lesson from two cases of fabricated data detected by statistics alone". A link can be found at his homepage. The paper contains another link to some useful supplementary material.

    According to the Erasmus-CWI report, Simonsohn's idea was that if extreme data has been removed in an attempt to boost significance, the variance of sample averages will decrease. Now researchers in social psychology typically report averages, sample variances, and sample sizes of subgroups of their subjects, where the groups are defined partly by an intervention (treatment/control) and partly by covariates (age, sex, education ...). So if some of the covariates can be assumed to have no effect at all, we effectively have replications: i.e., we see group averages, sample variances, and sample sizes, of a number of groups whose true means can be assumed to be equal. Simonsohn's test statistic for testing the null-hypothesis of honesty versus the alternative of dishonesty is the sample variance of the reported averages of groups whose mean can be assumed to be equal. The null distribution of this statistic is estimated by a simulation experiment, by which I suppose is meant a parametric bootstrap in the situation where we do not have access to the original (but post-massage) data of the experiment. If the "original" data is available we could use the full (non-parametric) bootstrap.

    In the present experiment I used a parametric bootstrap under an assumption of normality. Within the bootstrap, we pretend that the reported group variances are population values, we pretend that the actual sample sizes are the reported sample sizes, and we play being an honest researcher who takes normally distributed samples of these sizes and variances, with the same mean.

    I did 1000 simulations of honest and of dishonest researchers in the following scenario: each researcher takes a sample of size 20 from each of 5 groups (standard normal distributions, same means and variances). The dishonest researcher discards all observations larger than 0.5, and all observations smaller than -2.0. He or she is attempting to increase the significance of the difference between these combined groups of subjects with other groups, by making the mean value for the groups whose statistics we are studying a whole lot lower, by removal of a massive number of observations (everything bigger than 0.5). Moreover, variance is being further reduced by removal of a few very small observations (everything smaller than -2.0). The sample size is reduced to about two thirds of its original size in this way. Simonsohn's test statistic is the sample variance of the group means. Its null distribution is estimated in my experiment by parametric bootstrap. My bootstrap sample size was 1000 and each bootstrap consists of five normal samples with equal means, sample sizes equal to the reported sample sizes (varying in the dishonest case), and variances equal to the reported sample variances. So each bootstrap generates one observation of "variance of the five group means". The relative frequency of bootstrap test-statistics smaller than the actually observed test-statistic is the p-value according to the empirical bootstrap null distribution of the test statistic. The idea is that the faked data display less variation than their summary statistics would lead one to expect. The histograms displayed here are histograms of these bootstrap p-values for a one-sided test (reject for small values) for honest and for dishonest researchers. Fortunately for the honest researchers, their p-values are close to uniformly distributed. The dishonest researchers on the other hand tend to have p-values which are much smaller, as we had expected.

    This little experiment, which takes 80 seconds to run on my 2.4 GHz Intel Core 2 Duo MacBook Pro, shows that Simonsohn's test statistic could be a valuable tool. Indeed, this massive and one-sided data-amputation has decreased the variance of group averages, relative to what one would expect from the reported groups' sample sizes and standard deviations. To put it another way, the actual variation in the reported group averages is too good to be true relative to the reported standard deviations and reported sample sizes.

    In my experiment, the test does indeed have actual size very close to 5%, when used at nominal level 5%. Its power against the alternative which I took is about 12%. Not extremely exciting, but the idea is to combine many tests over many groups of groups of subjects, and possibly also over a number of experiments reported in the same paper, or even over a number of papers by the same researcher.

    It would be interesting to see if a non-parametric bootstrap gives similar results. The massaged data has a far from normal distribution.
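For readers who want to experiment, the procedure described above can be sketched in a few lines of Python. This is not the R script linked on this page: it is a minimal re-implementation under the stated assumptions (five groups of 20 standard normal observations, cut-offs at -2.0 and 0.5, a normal parametric bootstrap). All function names are hypothetical, and the default numbers of researchers and bootstrap replications are smaller than the 1000 used in the experiment, purely to keep the pure-Python run time short.

```python
import random
import statistics


def simulate_groups(rng, honest, n_groups=5, n=20, lower=-2.0, upper=0.5):
    """Draw n_groups standard normal samples of size n.

    A dishonest researcher discards every observation smaller than
    `lower` or larger than `upper` (the cut-offs from the text).
    """
    groups = []
    for _ in range(n_groups):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if not honest:
            sample = [x for x in sample if lower <= x <= upper]
        groups.append(sample)
    return groups


def summarize(groups):
    """Reported summary statistics per group: (mean, sample variance, size).

    statistics.variance needs at least two observations per group; with
    the default settings a smaller group is extremely unlikely.
    """
    return [(statistics.mean(g), statistics.variance(g), len(g)) for g in groups]


def bootstrap_p_value(summary, rng, n_boot=1000):
    """One-sided parametric bootstrap p-value (reject for small values).

    Test statistic: sample variance of the reported group means.
    Null model: normal samples with a common mean, the reported sample
    sizes, and the reported variances treated as population values.
    """
    observed = statistics.variance([mean for mean, _, _ in summary])
    smaller = 0
    for _ in range(n_boot):
        boot_means = []
        for _, var, size in summary:
            sd = var ** 0.5
            draws = [rng.gauss(0.0, sd) for _ in range(size)]
            boot_means.append(sum(draws) / size)
        if statistics.variance(boot_means) < observed:
            smaller += 1
    return smaller / n_boot


def rejection_rate(honest, n_researchers=200, n_boot=200, level=0.05, seed=0):
    """Fraction of simulated researchers whose p-value is at most `level`."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_researchers):
        summary = summarize(simulate_groups(rng, honest))
        if bootstrap_p_value(summary, rng, n_boot) <= level:
            rejections += 1
    return rejections / n_researchers
```

Calling `rejection_rate(honest=True)` gives a rough estimate of the actual size of the test at nominal level 5%, and `rejection_rate(honest=False)` a rough estimate of its power against this particular alternative. With so few replications both estimates are noisy, so they should not be expected to reproduce the figures quoted above exactly.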

    I have the following (provisional) four criticisms of the methodology and the reporting thereof.
    (1) We can be sure that honest or dishonest, the original data is not normally distributed. So statistical conclusions based on a parametric bootstrap are rather tentative.
    (2) The Erasmus report told us nothing about how Simonsohn got onto Smeesters' tail. Was he on a cherry-picking expedition, analysing hundreds of papers or hundreds of researchers, and choosing the most significant outcome for the follow-up? This is not revealed in the Erasmus report, but it is important to know in order to judge the significance of Erasmus' re-analysis of the same data. According to Simonsohn, someone drew his attention to one of Smeesters' papers in August 2011 (The effect of color (red versus blue) on assimilation versus contrast in prime-to behaviour effect, coauthor Jia Liu, University of Groningen). Why? Knowing Simonsohn's reputation, this was presumably someone who had his or her own suspicions. Why? According to the Erasmus-CWI report, Simonsohn concluded from his statistical methodology that the summary statistics in that paper were too good to be true. Simonsohn requested and obtained Smeesters' dataset and discovered more anomalies which confirmed his initial opinion. Smeesters tried to explain some of the anomalies but Smeesters' explanations would make the observed anomalies less likely, not more likely. At this point Smeesters' computer crashed and all original material of all of his experiments, ever, was lost. Other statistical analyses, replicated by CWI and described in Appendix 4 of the Erasmus report, confirmed that these further striking patterns in the data (of completely different nature to what we are studying here) are extremely unlikely to have arisen by chance. Smeesters' data has not been released so it is still not possible to replicate the most crucial parts of the analysis: those which, as it were, independently confirmed the initial suspicions of something worse than data-massage.
    (3) Erasmus-CWI uses the pFDR method (positive - False Discovery Rate) in some kind of attempt to control for multiple testing. In my opinion, adjustment of p-values by pFDR methodology is absolutely inappropriate in this case. It includes a guess or an estimate of the a priori "proportion of null hypotheses to be tested which are actually false". Thus it includes a "presumption of guilt"! The pFDR method, when correctly used, guarantees that of those results which are nominally significant at the pFDR-adjusted level 0.05, only 5% are false positives. Thus 95% of the "significant" results are for real - if the a priori guess/estimate is correct, and in the long run! This methodology was invented for massive cherry-picking experiments in genome wide association studies. It was not invented to correct for multiple testing in the traditional sense, when the simultaneous null hypothesis should be taken seriously. Innocent till proven guilty. Not proven guilty by an assumption that you are guilty some significant proportion of the times. In order that Smeesters himself is protected from cherry picking by Erasmus-CWI he should have insisted on a Bonferroni correction of the p-values which they report; a much stronger requirement than the pFDR correction.
    (4) The CWI report is itself not a good model of reporting statistical analyses. There are hundreds of different pFDR methods: which one was used? Simonsohn's paper is not only unpublished but still unobtainable, and the description in the Erasmus report is terse. One cannot reproduce their analyses from their description of their procedures. One might object that this report is the result of an internal investigation of an organisation carrying out disciplinary investigation of one of its employees, hence that Erasmus University is actually behaving with unusual and exemplary transparency. I would retort that by publishing this report Erasmus university is broadcasting a public condemnation of Smeesters as a scientist, which goes far beyond the internal needs of an organisation in the unfortunate situation when it has to terminate the employment of an employee for some misdemeanors. Erasmus university has withdrawn a number of Smeesters' publications; not the authors of those publications. I am amazed that the statisticians involved in the investigation at Erasmus are apparently forbidden to reveal the smallest technical details of their analyses to fellow scientists.
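To make the contrast in criticism (3) concrete, here is a small Python sketch comparing a Bonferroni correction with a generic q-value style (FDR) adjustment. It is only an illustration under loudly stated assumptions: the Erasmus report does not say which pFDR variant was used, so the formula below (the smallest value of pi0 * m * p / rank over all larger p-values) is a common textbook form, not a reconstruction of their procedure. The function names, the example p-values and the value of `pi0` (the a priori guess of the proportion of true null hypotheses) are all hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def q_values(p_values, pi0=1.0):
    """q-value style adjustment with a user-supplied guess pi0.

    pi0 is the assumed proportion of true null hypotheses; pi0 = 1.0
    gives the Benjamini-Hochberg adjusted p-values. This is an
    illustrative formula, not the (unspecified) Erasmus-CWI method.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, for illustration only.
example = [0.01, 0.02, 0.03, 0.04, 0.05]
bonf = bonferroni(example)
q_all_null = q_values(example, pi0=1.0)
q_half_null = q_values(example, pi0=0.5)
```

In this made-up example the q-value adjustment is never larger than the Bonferroni adjustment, and lowering `pi0` (assuming in advance that more of the null hypotheses are false) shrinks the adjusted values further. That dependence on the prior guess is exactly the "presumption of guilt" objected to above.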

    My overall opinion is that Simonsohn has probably found a useful investigative tool. In this case it appears to have been used by the Erasmus committee on scientific integrity like a medieval instrument of torture: the accused is forced to confess by being subjected to an onslaught of vicious p-values which he does not understand. Now, in this case, the accused did have a lot to confess to: all traces of the original data were lost (both original paper, and later computer files), none of his co-authors had seen the data, all the analyses were done by himself alone without assistance. The report of Erasmus-CWI hints at even worse deeds. However, what if Smeesters had been an honest researcher?

    Incidentally, in my opinion cherry-picking and data-massaging in themselves are not evil. What is evil, is not honestly reporting your statistical procedures, and that includes all selection and massaging. A good scientist reports an experiment in such a way that others can repeat it. That includes the statistical analyses; and that includes the methodology by which you choose which results of which experiments to report, and how your data has been massaged.

    In physics, the interesting experiments are immediately replicated by other research groups. Interesting experiments are experiments which push into the unknown, in a direction in which there are well-known theoretical and experimental challenges. Experiments are repeated because they give other research groups a chance to show that their experimental technique is even better, or to genuinely add new twists to the story. In this way, bad reporting is immediately noticed, because experiments whose results cannot be replicated immediately become suspect. Researchers know that their colleagues (and competitors) are going to study all the methodological details of their work, and are going to look critically at all the reported numbers, and are going to bother them if things don't seem to match or important info is missing. In particular, if the experiment turns out to be methodologically flawed, you can be sure someone is going to tell that to the world.

    The problem in social psychology (to reveal my own prejudices about this field) is that interesting experiments are not repeated. The point of doing experiments is to get sexy results which are reported in the popular media. Once such an experiment has been done, there is no point in repeating it.


    Biography and more ...

    First Leiden inaugural lecture

    Curriculum Vitae

    Past PhD students

    Just for fun: things you wish your computer had (including the classic clippy’s suicide note)

    The three doors problem

    A few years ago I discovered the enormous discussion on the Monty Hall (three doors) problem on wikipedia. My published writings on the subject are, in order of writing (and in order of insightfulness) an invited contribution to Springer's International Encyclopaedia of Statistical Science, 2010, a paper in Statistica Neerlandica, 2011, and contributions to peer reviewed internet encyclopedias. In this manuscript you will find an expanded version of the most recent published work, the article.
    In these works I distinguish between the original, somewhat ambiguous, real world question about a famous quiz show, and the many mathematizations of the question which have been proposed in the literature. Personally I prefer the lesser known game theoretic version. For me, the question is not "what is this probability?" or "what is that probability?", but: "what would you do?" And to me, the wikipedia controversy around the Monty Hall problems (concerning whether we should compute a conditional or unconditional probability of getting the car if we switch doors) is a warning against solution-driven science. I want to thank so many wikipedia editors for the inspiration they gave me.

    The holy grail of Monty Hall studies

    Suppose the car is hidden behind one of the three doors by a fair randomization. The contestant chooses Door 1. Monty Hall, for reasons best known to himself, opens Door 3 revealing a goat. We know that whatever probability mechanism is used by Monty for this purpose, the conditional probability that switching will give the car is at least 1/2. We know that the unconditional probability (ie not conditioning on the door chosen by the contestant, nor the door opened by Monty) is 2/3.
    Always switching gives the car with unconditional probability 2/3, always staying gives it with probability 1/3. Nobody in their right mind could imagine that there could exist some mixed strategy (sometimes staying, sometimes switching, perhaps with the help of some randomization device, and all depending on which doors were chosen and opened) which would give you a better overall (ie unconditional) chance than 2/3 of getting the car.
    This is true, of course. In fact, from the law of total probability, proving the optimality of (unconditional) 2/3 by always switching is equivalent to proving that all the six conditional probabilities of winning by switching, given door chosen and door opened, are at least 1/2. We can prove the latter using Bayes' theorem, or, better I think, using Bayes' rule in a smart way. However both these proofs require some sophistication.
    Is there an elementary proof? A short proof using words and ideas, no computations.
    Yes there is, and I learnt it from Sasha Gnedin.
    However you play there's always a door such that if the car is there, you'll miss it. Consider first deterministic strategies. We only need consider two cases: for "always switching" it's the door you initially chose, and for "sometimes switching" it's a door you won't switch to if you get the option. (If you never switch there are two such doors: just choose one). Ordinary readers won't be interested in randomized strategies but anyone who wants to include these will understand how to do it (now the door where you'd certainly miss a car has to be a random door, determined by the same coin tosses used to implement the random choices in your own strategy).
    Note that the door which has been indicated in this way does not depend on where the car is actually hidden or how the host plays: it just depends on how the player plays. Therefore if the car is initially equally likely to be behind any of the three doors, we run a 1/3 chance that the car will be missed because it's behind this door. Therefore the 2/3 success-chance of always switching can't be beaten.
    I would call this a proof by coupling.
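The claims above can also be checked by brute force. The following Python sketch uses exact fractions and assumes, as in the text, that the car is hidden uniformly at random. `conditional_switch_wins` gives the conditional probability that switching wins when the contestant chose Door 1 and the host opened Door 3, as a function of the host's bias. `best_over_all_strategies` enumerates every deterministic player strategy against three host biases and returns the best unconditional chance found. The function names and the parameterisation of the host's bias are illustrative choices, not taken from the articles mentioned above.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)
THIRD = Fraction(1, 3)


def conditional_switch_wins(q):
    """P(car behind Door 2 | contestant chose Door 1, host opened Door 3).

    q = P(host opens Door 3 | car behind Door 1), the host's bias.
    """
    return THIRD / (THIRD * q + THIRD)


def win_probability(first, switch_if_opened, host_bias):
    """Unconditional chance of the car for a deterministic player.

    first: door initially chosen.
    switch_if_opened: dict mapping each other door to True (switch) or
        False (stay) if the host opens that door.
    host_bias: probability that the host opens the lower-numbered of
        the two other doors when the car is behind `first`.
    """
    others = [d for d in DOORS if d != first]
    total = Fraction(0)
    for car in DOORS:
        if car == first:
            # Host may open either other door; the player wins only by staying.
            for opened, prob in zip(others, (host_bias, 1 - host_bias)):
                if not switch_if_opened[opened]:
                    total += THIRD * prob
        else:
            # Host must open the one door that is neither chosen nor the car.
            opened = next(d for d in others if d != car)
            if switch_if_opened[opened]:
                total += THIRD
    return total


def best_over_all_strategies():
    """Best win probability over all deterministic strategies and host biases tried."""
    best = Fraction(0)
    for first in DOORS:
        others = [d for d in DOORS if d != first]
        for rule in product((True, False), repeat=2):
            switch_if_opened = dict(zip(others, rule))
            for host_bias in (Fraction(0), Fraction(1, 2), Fraction(1)):
                best = max(best, win_probability(first, switch_if_opened, host_bias))
    return best
```

Since the win probability is linear both in the host's bias and in any randomization the player might use, checking the deterministic extremes is enough for this sketch. The elementary argument above remains the real proof; the code only confirms the arithmetic.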

    The two envelopes problem

    From Three Doors to Two Envelopes (what will be next? One Coffin, perhaps?).
    Here is my fourth draft of the definitive article on the infamous two envelopes problem. The problem which Martin Gardner could not solve, and which many other famous people got wrong. Studied by probabilists, logicians, economists, philosophers. Now studied by me ...

    The mathematical heart of all exchange paradoxes is encapsulated in a little theorem which I call my "unified solution". It seems to be new.

    Lucia: who did do it, then?

    This letter by me published by Dutch magazine 'Nursing', April 2012, -- here's a rough English translation -- summarizes what we now know about the causes of the case: namely a kind of subconscious mass conspiracy by a number of medical specialists at JKZ to connect their own errors and other unexpected mishaps to Lucia.
    This process started at least nine months before the "unexpected" death of baby Amber in the early hours of 4 September, 2001. One piece of evidence for this (there is lots more): within 24 hours of Amber's death, the hospital director reported five unnatural deaths at his hospital over the last year to the police and to the Dutch national health inspectorate. Yet none of the other deaths had previously been thought in any way to be unusual - at least, not officially. There had been no reports of incidents to the authorities. Even Amber's death was first registered as natural; this was only converted to unnatural later in the day.
    It took JKZ's director less than fifteen minutes, from the time he was informed on the afternoon of 4 September of the latest death, to report the five unnatural deaths to police and to the health inspectorate. Within a day, Lucia was put on "non-active". All staff were informed of the upcoming police investigation and of the implication of an (unnamed) nurse. A press conference was held in which it was also mentioned that the murder investigation (by the hospital???) was being extended to other hospitals at which the same nurse had been earlier employed. Coincidentally, these hospitals were at that very moment in the process of being merged with JKZ.
    Implication: four dossiers were ready and waiting. The doctors had simply been waiting for one more, and the time between the death and informing the director, was simply used to complete the last dossier.
    During these few hours a number of important medical facts about Amber's death were altered on the patient's dossier by the specialists concerned: most crucially, the sequence of events leading to death. Lung failure before heart failure, suggestive of a natural death by exhaustion, was changed into heart failure before lung failure, suggestive of poisoning. Yet at the trial, the hospital director and most (but not all!) specialists claimed that there had been no suspicion at all of anything, or of anyone, till the death of Amber.
    Interestingly, the investigation by the Dutch national Health Inspectorate (IGZ), at the same time as a police investigation and then a murder trial was going on, found no evidence of anything remarkable! Their report has never been made public. Even the mere fact of their investigation seems completely unknown.
    Also interestingly, the media has totally ignored all this news. Presumably, the Dutch population is "Lucia-tired" - this story won't sell newspapers or generate large viewing numbers on TV.
    The Dutch medical community, and people in the upper echelons of "justice", still believe "she did it". The legal system got the blame (but also takes any credit: "she got freed, didn't she?"). The medical world looks the other way.
    I sent a copy of my letter to the directors of Haga hospital, to Haga's lawyer, and to the coordinator of research at Haga (who had earlier been forbidden by her bosses to communicate with me). I was expecting at least a response from the lawyer, but nothing came.

    Lessons from Lucia

    Lucia interviewed (in English) on CNBC

    Learning from Lucia, slides from my lecture at ATSTATS 2010, video of the lecture

    "Learning from Lucia" is the title of many talks I am giving these days. I believe that there is so much of value that we can learn from understanding the Lucia case, which in one sense was concluded with the "not guilty" verdict given at her retrial in 2010, but in another sense is still completely open: what really did happen? Why was there ever a case at all?
    Below is my present list of recommendations. In my opinion a number of "system faults" were exposed by the catastrophe, some of them specific to the Dutch situation (with its specific and fascinating culture, history, ...).

    Actually, the Dutch legal system has already learnt a great deal. The same goes for the Dutch scientific world and in particular, the statistical world and the world of forensic science. What remains is for the medical world to learn. However, as long as it denies any responsibility, that learning process cannot start, and unfortunately that is definitely still the case.

    Here are some of the hard facts of the case and the hard facts of the Dutch situation.

    1) The fact that there was a case at all (2001) and the outcome of the first round of court cases (Supreme Court, 2006) was - it seems to observers with access to the dossiers and who study the reports of CEAS (judicial review committee reporting to the Public Ministry), of the Advocate General to the Supreme Court, of Prof Meulenbelt to the court in Arnhem and of the conclusions of that court itself - strongly determined by the interaction between the chef-de-clinique of the Juliana Children's Hospital, a well-known and respected paediatrician, and the director of that hospital. Their actions were influenced by malicious gossip about Lucia, and moreover we now know that the paediatrician was mistaken in some diagnoses, so that she herself could well have been surprised when some of her patients suddenly died. Her brother-in-law, a theoretical computer scientist with no experience whatever in applied statistics, supported her amateur statistical conclusions based on her own data-gathering (the data of the amazing coincidence of Lucia so often being on duty whenever strange things happened). These were the statistics at the outset of the case.

    Together these two persons reported a number of deaths and other incidents to the police as being highly suspicious. Each medical dossier was accompanied by the chief paediatrician's one-page summary explaining why the incident was suspicious. She was made hospital coordinator and liaison person for the subsequent police investigation, yet she never testified before the court of appeal, and only briefly at the lower court.

    Two extremely hierarchical and powerful organisations (a large hospital and the public ministry) had to be linked up for a murder investigation.

    So we have some medical errors and, it seems to me, some managerial errors. The hospital director was responsible for a number of very far-reaching decisions. The fact that the hospital had become aware of a serial killer in the nursing staff, and a suspect had been suspended from duty, was communicated in a succession of three internal memos to the hospital staff and in a press conference, including TV appearance, before the police had started an investigation; before any external investigation had taken place at all. To this day, hospital staff is forbidden to talk about the affair to outsiders (the board of the Dutch society of paediatricians has also forbidden discussion of the case by its members). The director was an authoritarian manager; perhaps necessarily so, since at the time he was charged with the merger of three badly functioning hospitals in a bad financial situation. He was focussed on processes and on the reputation of his hospital(s). He is known for, and proud of, making rapid critical decisions and never looking back on decisions once made; see the interview with him (in Dutch) in Skipr, the Dutch magazine for health care managers.

    These two key persons consider that they acted completely properly and state they would do exactly the same again if the same circumstances arose again today -- which further underlines the point I am trying to make: the "integrity" of the system in which these individual human beings were embedded needs to be investigated, since it so easily allowed the chance interaction of their personalities together with some bad luck to spark a catastrophe - of which they too are the victims. And by the way, such medical and managerial personalities are not rare, nor - as I will explain - is the "run of bad luck" which hit Lucia.

    Compare this with an investigation into an air disaster. The direct cause might be the chance mechanical failure of some bolt or electrical failure of some wiring followed by some errors of judgement of pilots faced with what seems like a dangerous situation. My wife usually says: "I know why the plane came down: because of gravity"; but sometimes she lays the blame on the hubris of man(kind). These are two extremes of causation, and since we cannot do anything about either they are not really interesting, however true; we should look somewhere in between the immediate and the ultimate root cause. The point of investigations into air disasters is to make air travel safer for you and me in the future, and for pilots and maintenance engineers too for that matter, by uncovering opportunities to improve training or maintenance procedures or emergency procedures or engineering standards.

    So at this stage, we have just found that some persons took some decisions which were, in retrospect, unfortunate when confronted with a chance situation which to them appeared sinister. These things happen, and with the benefit of hindsight it is all so easy. But I am not talking about blame. I want to understand.

    2) In most modern countries where this sort of case arises the very first thing that happens is not a police investigation following a press-release, but a *confidential* and *independent* medical investigation.

    3) In most modern countries, whenever statistical data like this is involved, an external professional statistician is involved. And the first thing that that person does is to go back to the original data, I mean back to original hospital records and back to the persons who gathered and compiled the data. How did they do it, what definitions did they use, what were they looking for? As Willem van Zwet had always said: when you see such extreme data as the little contingency table of shifts of Lucia and shifts with incidents which led to Elffers' infamous "one in 342 million", the first thing you can be sure of as a statistician is that the data is wrong. He turns out to have been completely right. A better number might be something like "one in a hundred".

    4) In the UK and in many other modern countries the nursing staff is much better organized and harder to ignore. Florence Nightingale? In NL, nurses have only had a single organisation representing them for a couple of years. They are largely ignored in hospital management decisions and certainly by medical specialists. They are less well paid and consequently less well educated than in quite a few other countries. A colleague of mine was in hospital for 6 weeks with a severe heart condition and took great care to note exactly what medication he was supposed to be having and what he actually got. He was given the wrong pills on 8 occasions. He told this to his heart-surgeon who exclaimed "oh those careless sluts". This shocked my colleague to the core, since he could see that a dedicated and overworked nursing staff was doing an almost impossible job to the very best of their ability. Mismanagement and understaffing, mistakes by specialists and pharmacists, illegible prescriptions, were the order of the day.

    So my recommendations are:

    1) Strengthening of the role and prestige (hence improvement of level of education, level of training, hence level of salary) of nursing staff in hospitals.

    2) More scientific diagnostic reporting ("differential diagnostics"). In the medical-legal situation the medical specialist must discard his role of God who knows the right decision to make and never makes a mistake (in life and death situations), and adopt a more humble scientific attitude, concordant with the facts that even after post-mortem examination cause of death is not really known in 30% of deaths, and that three people a day die in Dutch hospitals because of avoidable medical errors (compare this to two a day in road accidents). But admitting individual medical errors is taboo. In the Lucia case, none were admitted, but finally many were revealed. This problem is so severe (for the many victims of medical errors) that from June 16, 2010, a new "code of practice" has been introduced, which allows medical practitioners to apologize for mistakes, without thereby admitting legal responsibility!

    3) External and independent and confidential medical investigations in Lucia scenarios, before calling in the police. Probably this will often need non-Dutch speaking experts and more openness concerning health care in hospitals.

    4) In the court situation, written scientific expert evidence needs to be put into the public domain as far as possible, so that the scientific methodology used can be openly discussed in the scientific community.

    5) A multidisciplinary and in particular statistical and epidemiological analysis should be made of data on medical incidents at JKZ, say 1995--2005. We can be pretty certain that there was no serial killer active during this period, yet we know that over time there were huge oscillations in the numbers of incidents on at least one particular ward. But no professional statistician has ever had access to more than the most summary of biased summary statistics (no professional statistician was ever heard in court or consulted by the hospital or police).

    Just like sun-spots, earthquakes, or volcanic eruptions, long periods of almost total quiescence were interspersed with short bursts of intense activity: unexplained clusters of events. This phenomenon is seen worldwide. It has numerous times led to wild-goose-chase murder investigations which always end up ruining quite a few lives, even if at the end of the day there is no reason whatever to suppose that anybody did anything wrong at all. In fact, some investigators (who have built an academic career with lucrative media opportunities out of HCSKs or "Health Care Serial Killers") report a worldwide epidemic.

    Simultaneously with studying patterns of incidents at JKZ, one should study patterns in nurses' shifts, so that we finally know what is the "normal situation", or more precisely, "a" normal situation. One thing that is for sure is that the time pattern of a nurse's shifts doesn't look anything like the outcome of a homogeneous Poisson process. Thus even if shifts and events are unrelated (which for many good reasons is not true either) we are going to see over-dispersion in the numbers of incidents experienced by each nurse, since time itself is a hidden confounder. Since the mean is low but the variance is rather large, many nurses will experience no events at all over long periods of time, while just a few will experience "surprisingly" many. All experienced nurses know this as an empirical fact of nursing life.
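
    The over-dispersion argument can be illustrated with a toy simulation. This is only a sketch under invented assumptions, not an analysis of any real data: the calendar of 1000 days, the three bursts of ten incident days, the 100 days worked per nurse and the 500 nurses are all hypothetical numbers chosen for illustration. Shifts are drawn independently of incidents in both scenarios; the only difference is whether a nurse's working days form one contiguous block or are scattered over the whole period.

```python
import random
from statistics import mean, pvariance

DAYS = 1000          # hypothetical length of the observation period
SHIFT_DAYS = 100     # hypothetical number of days each nurse works
# Hypothetical incident calendar: three bursts of ten consecutive days each.
INCIDENT_DAYS = {d for start in (200, 500, 800) for d in range(start, start + 10)}


def incidents_per_nurse(n_nurses, clustered, rng):
    """Count incident days falling on each nurse's working days.

    Shifts are chosen without looking at the incidents at all.
    clustered=True: one contiguous block of working days per nurse.
    clustered=False: working days scattered uniformly over the period.
    """
    counts = []
    for _ in range(n_nurses):
        if clustered:
            start = rng.randrange(0, DAYS - SHIFT_DAYS + 1)
            worked = range(start, start + SHIFT_DAYS)
        else:
            worked = rng.sample(range(DAYS), SHIFT_DAYS)
        counts.append(sum(1 for day in worked if day in INCIDENT_DAYS))
    return counts


def dispersion_index(counts):
    """Variance divided by mean; close to 1 for Poisson-like counts."""
    return pvariance(counts) / mean(counts)


rng = random.Random(2010)  # fixed seed so the sketch is reproducible
block_counts = incidents_per_nurse(500, clustered=True, rng=rng)
scattered_counts = incidents_per_nurse(500, clustered=False, rng=rng)

block_index = dispersion_index(block_counts)
scattered_index = dispersion_index(scattered_counts)

print("contiguous shifts: dispersion index", round(block_index, 2),
      "- nurses with no incidents:", block_counts.count(0))
print("scattered shifts:  dispersion index", round(scattered_index, 2),
      "- nurses with no incidents:", scattered_counts.count(0))
```

    In this toy setting the nurses with contiguous shifts show a dispersion index well above 1, with most nurses seeing nothing and a few seeing a whole burst, although no nurse has any influence on any incident.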

    It is so important to study this scientifically and empirically and in a multidisciplinary framework, not just to gain knowledge into a fascinating but never studied phenomenon, but also in order to protect hospital workers by avoiding future red-herring-witch-hunts generated by ignorance and prejudice and the irrelevant statistics of amateur statisticians. We have seen in the cases of Sally Clark, Lucia de Berk, O.J. Simpson, and in so many others, that when lawyers and medics pretend to be able to do statistics, truth flies out of the window. Lord Rutherford said "if you need statistics, you did the wrong experiment". I beg to submit that "if they use statistics in court, someone will be screwed".

    I reported my recommendations, and requested scientific collaboration, in an email to the chairman of the board of Haga (pdf), Autumn 2010. The response was a notice from a lawyer acting on behalf of Haga hospital, that (civil) legal action would be taken against me for slander of one of Haga hospital's respected medical specialists, unless I acceded to certain demands. My (university's) lawyer's advice was to yield to just one of Haga's demands, namely to remove statements by me on various internet discussion blogs concerning this person, which might be seen to be too personal and hence beyond the bounds of propriety. However, I stood on my position that the information which I had disseminated was in the public interest and that I had had no intention whatsoever of harming that person's reputation. I refused to sign a declaration that I would never ever publish material on these "personal" aspects of the case. Discussions with my lawyers and Haga's lawyers, and the long process of getting my offending comments removed from other people's blogs and discussion fora, cost the Dutch taxpayer a large amount of money (though probably peanuts for Haga hospital). The final letter from Haga's lawyers was the statement that Haga would immediately take legal action against me anytime one of my - by now removed - allegations was repeated on internet, or published in any other way, by third parties. Interestingly, one and a half years later, my earlier postings were cited in their entirety by third parties, who also informed Haga and Haga's lawyers of their actions. However there has been no response from Haga hospital.

    12 April, 2010: founding of the

    Bureau of Lost Causes

    This organisation has been set up inspired by the selfless efforts by so many people over the last six years, which only just now led to the extraordinary and total rehabilitation of Lucia de Berk. Now that the judicial authorities have apologized personally and publicly, it is time to start finding out where avoidable mistakes were made. It is hard to believe that these can only be attributed to police investigation and legal procedures. However that is the implication of the recent public statement by the board of Lucia's hospital, (unauthorized rough English translation).

    Lucia de Berk
    The tunnel-vision which characterized Lucia's case was cemented in the two weeks around "the" nine-eleven inside a hospital in the Hague. Once by the end of those two weeks a major medical institution had (by implication) told the world that it had caught a serial killer, it must have been very hard for those who brought charges - a few individuals at the very top of the very same institution - not to have had some large influence, deliberately or innocently, on the results of police investigation, and on the "medical" interpretation of the medical dossiers which went to the courts. The events of the past year which came up during those two weeks of internal investigation, and were suddenly associated with Lucia, had become unexpected and inexplicable, though previously every single one of them had been unremarkable.

    The hospital investigators into the crime were the same people earlier treating those patients, and making, as is completely natural, errors of diagnosis or treatment from time to time. The collegiality of the medical community means that mistakes by medical specialists within the Netherlands can hardly be admitted by others inside the same relatively tight, and extremely powerful, community. Highly placed medical authorities had to stand firm by their own previous and now provenly mistaken diagnoses. Others would be loath to criticise a highly regarded colleague's decisions in such a critical case.

    In the Netherlands, medical practitioners almost never admit to having made mistakes... consequently, they do not have to insure themselves against being sued for malpractice (which is good for their income), and in theory medical treatment should be less expensive than in other countries where lawyers and insurance companies profit from medical blunders. However the Dutch arrangement has led to increasing distress among all those "victims" of medical errors, many of whom would probably be satisfied just to have an "accident" admitted! This June 16, a new code of practice has been introduced, by which medical practitioners will in future be able to apologize for errors without thereby admitting legal responsibility. A giant step for the medical profession, though only a very small step for their patients. Better than nothing, or merely a crumb to keep us "consumers" (the ones who pay for health care) quiet?

    José Booij
    One of the cases we have just taken on board is to unravel the unbelievable story of the illegal kidnapping of José Booij's six week old daughter Julia by a local child protection agency (Assen), and the ensuing cover-up by silencing of the mother through fair means and foul, now in its sixth year. The kidnapping was judged illegal and a court order was given to return the child immediately. The judges of the courts for child-protection and family simply laughed, and did nothing. The child protection agency had acted on the basis of lies and insinuations of a jealous neighbour to local police and doctors. Her claims about José were believed. No attempt whatever was made to check these accusations, nor to hear José.

    In desperation, two and a half years ago, José wrote to the Cabinet of the Queen just before she was made homeless and all her remaining possessions were taken from her because she could no longer pay her bills (many of them fat bills from lawyers who did nothing except make a phone call and decide to keep out of this mess), after losing her job, house, and health. Here is José's letter to the queen in Dutch (original) and in English (first rough translation), written just before she went underground.

    The cabinet of the queen forwarded her plea to the Ministry of Family and Welfare.

    Nothing has been done for two and a half years now.

    Her case was also brought at the same time to the European Court of Human Rights in Strasbourg.

    Nothing has been done for two and a half years now.

    Here is an official report by psychiatrist Bram Bakker, Dutch original, and the report in rough English translation, written five years ago, when José was up and fighting, though already suffering post-traumatic stress syndrome. It still then seemed that it might not be difficult to get her baby back to her, provided she kept on fighting against the injustice which had been done her, and someone, somewhere, would stick out their neck for her. Still then, it could easily have been possible to save José's health and livelihood and future.

    Unfortunately, that would have required admitting that some mistakes had been made by some irresponsible local officials. Something which is Not Done in quaint Kafkhanistan-on-Rhinemouth - where the tulips are in flower, and the smell of fresh smoked nether-weed greets you as you wander along the pretty canals of the old cities, advisably keeping an eye open for dog shit below and pickpockets to the side, as well as for the splendid seventeenth century facades above you.

    The picture the Dutch like to project of themselves (indeed, they believe in it themselves!) to the outside world is sometimes discordant with the reality within. And, as we know from the case of Lucia de Berk, truth can be far stranger than fiction in the Netherlands. Incredible miscarriages of justice can be triggered when a chance event sets off a time bomb built from the interaction of personalities of a handful of people in some critical positions. Moreover, once the damage has been done, legal and bureaucratic thinking and the Dutch culture of "mind your own business" (cobbler stay at your last) traps the victim in a complex vicious circle of Catch-exponential-22 system-assumptions ensuring that escape is impossible.

    Resistance is futile. You will be assimilated. Read more at the Bureau of Lost Causes.

    Kevin Sweeney
    Another case we are studying, with all the same features, is the extraordinary story of Kevin Sweeney. More information on that case can be found below. The incredible similarities between the cases provide a worthy study in individual versus group mentality, and how a scape-goat is chosen when a society is feeling under threat. This will be researched by a multidisciplinary team of cultural anthropologists, ethologists, sociologists, historians, lawyers, psychologists and mathematicians during my DLF fellowship at NIAS and of course by the Bureau of Lost Causes.

    More Various

    Statistical ethics of the probiotica trial. This randomized triple-blind clinical trial of probiotics treatment for patients with predicted severe acute pancreatitis ended in controversy, when it transpired at the conclusion of the trial in December 2007, that rather more patients had died on the treatment arm of the trial than on the control arm.

    It seemed strange that the trial had not been terminated at the interim analysis. The researchers were using a stopping rule of S.M. Snapinn, by which the trial would be terminated early either if it were almost certain that the final result would be a significant positive effect of probiotica, or if it were almost certain that the final result would be insignificant. Here is a paper by myself, to appear in Statistica Neerlandica, and, in Dutch, a short article by probabilist Ronald Meester and microbiologist Pieter ter Steeg which appeared in the newspaper Trouw and an open letter to Meester and ter Steeg by biostatisticians Hans van Houwelingen and Theo Stijnen. Also in Dutch there is a series of interviews (early 2008) on the current affairs chat show “Pauw and Witteman”: chairman of the hospital board Geert Blijham, 23 January; patient Jochim Vromans, 24 January; probiotics expert Eric Claassen, 25 January; leader of the research team Hein Gooszen, 14 February.

    Later we obtained the data at the time of the interim analysis. It was given to journalists at a press conference on Feb. 13 2008, but never released to interested scientists. It turned out that the probiotica trial was not terminated for futility (following the Snapinn stopping rule) at the half way interim analysis, through a mis-reading of output of the SPSS package, which, without consulting the user, always reports the smaller p-value of the two one-sided Fisher's exact tests for equality of two binomial probabilities. Proper application of their own stopping rule would have led to early termination of the trial, since according to the criteria set in advance, there was no chance any more that it would result in a positive result for the probiotica treatment. The trial was de facto continued because there was a good chance that it would finally result in a negative result for probiotica. Here are slides of my talk careless statistics costs lives on the subject.
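
    To see why reporting only the smaller of the two one-sided p-values is dangerous, here is a minimal sketch which computes both one-sided Fisher's exact p-values for a 2x2 table directly from the hypergeometric distribution. The table used is entirely hypothetical (it is not the trial data), and the function name fisher_one_sided is my own invention for this illustration.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """Both one-sided Fisher exact p-values for the table [[a, b], [c, d]].

    Rows are the two trial arms, the first column counts deaths.
    Returns (P(X <= a), P(X >= a)), where X is the number of deaths in
    the first row under the null hypothesis with all margins fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def pmf(k):
        # Hypergeometric probability of k deaths in the first row.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_le = sum(pmf(k) for k in range(lowest, a + 1))
    p_ge = sum(pmf(k) for k in range(a, highest + 1))
    return p_le, p_ge


# Hypothetical interim table: 8 deaths out of 50 on treatment, 3 out of 50 on control.
p_le, p_ge = fisher_one_sided(8, 42, 3, 47)

print("one-sided p, fewer deaths on treatment (benefit):", round(p_le, 4))
print("one-sided p, more deaths on treatment (harm):    ", round(p_ge, 4))
print("smaller of the two:                              ", round(min(p_le, p_ge), 4))
```

    In this invented table the smaller p-value belongs to the test in the direction of harm. Anyone who reads it as the p-value for benefit gets the state of the evidence exactly backwards, which is the kind of mis-reading described above.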

    Kevin Sweeney ... recently left a Dutch jail at the end of his sentence for murder of his wife by arson. He has always claimed innocence. Here is a link to his own site, Justice for Kevin Sweeney, here is a short synopsis of the case, and here is my blog entry Justice in the Netherlands: Guilty until Proven Innocent. In May, 2008, he put in an application to revise the case (English translation) to the Supreme Court. The application is based on an analysis of the fire evidence by Fred Vos, entitled Het vergeten tijdspad (the forgotten timeline). This is the first time a careful reconstruction of the course of the fire has taken place, taking account of all evidence available to the courts. The evidence seems totally consistent with a fire accidentally started by smoking in bed; and is totally inconsistent with the prosecution’s claim of arson using large quantities of white spirits (Dutch: terpentine). Vos is careful to distinguish observed facts from interpretations thereof. Many writers on the case, including myself, have been misled by such misinterpretations.

    Mathematical Centre (Amsterdam) publications are now available on internet. Here are two early works which had quite some impact, including the reprint of my 1979 PhD thesis:
    R.D. Gill (1980), Censoring and Stochastic Integrals, MC Tract 124.
    R.D. Gill (1983), The sieve method as an alternative to dollar-unit sampling: the mathematical background, Report SN 12
    Another useful link is to my Saint Flour lectures on survival analysis.

    Product-integrals are to products, as integrals are to sums. Though they have been around for more than a hundred years, they never became part of the standard toolbox, possibly because no-one invented the right mathematical symbol for them. I made a try quite some years ago, though they still have not caught on yet. With the crucial help of JC Loredo, my efforts resulted in files for getting beautiful \prodi and \Prodi and \PRODI symbols in your LaTeX, and Loredo.ttf, a TrueType font for ordinary word processing. It is not that difficult these days to get new fonts into your LaTeX, see for instance TUG's font installation instructions.
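
    For readers who have never met a product-integral, here is a small numerical sketch of the idea, using the standard survival analysis example: the product-integral of (1 - hazard(s) ds) over (0, t] is the limit of ordinary finite products over finer and finer partitions, and for a continuous hazard it equals exp(-cumulative hazard). The constant hazard rate 0.5, the time point 2.0 and the function name product_integral are arbitrary choices made for this illustration.

```python
from math import exp, prod


def product_integral(hazard, t, n_steps=100_000):
    """Approximate the product-integral of (1 - hazard(s) ds) over (0, t].

    The interval is cut into n_steps equal pieces and the hazard is
    evaluated at the midpoint of each piece.
    """
    ds = t / n_steps
    return prod(1.0 - hazard((i + 0.5) * ds) * ds for i in range(n_steps))


rate = 0.5   # hypothetical constant hazard rate
t = 2.0      # hypothetical time point

approx = product_integral(lambda s: rate, t)
exact = exp(-rate * t)   # survival probability for a constant hazard

print("finite product approximation:", approx)
print("exp(-rate * t):              ", exact)
```

    The finite product plays the role that a Riemann sum plays for an ordinary integral.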

    Nurse Lucia de Berk, victim of gossip and bad statistics

    The Dutch nurse Lucia de Berk has been completely exonerated. Not only is there no proof that she committed any murders, there is no reason whatsoever to suppose that any of the deaths and other incidents with which she was connected were in any way unnatural. Lucia had been given a life sentence for seven murders and three attempted murders of patients in her care. Statistical reasoning played a central role in her case, first explicitly but later, after an appeal court confirmed the sentence, implicitly: it was converted into irrefutable medical evidence, in a completely circular and seemingly unbreakable chain of legal reasoning. An official judicial review committee uncovered many irregularities in the handling of the case, in which the rapid response of hospital authorities led to tunnel-vision and bias from the earliest stages of the case. A new medical investigation commissioned by the supreme court has removed the linchpin of the prosecution case, the only death “proven” to be a murder, and “proven” to have been committed by Lucia, on its own merits. There is no reason now not to suppose that this was a natural death. Evidence of any wrong-doing in any of the cases is totally nonexistent. There was however the usual amount of medical blunders and mistaken diagnoses, but at least the professional behaviour of the nurses was exemplary. The statistical evidence - which is all that remains - has been totally discredited. The data was seriously biased, a meaningless statistic was computed, and the model used was completely inappropriate. A cluster of incidents on this hospital ward was actually a common occurrence. The presence of Lucia at many of the incidents in one cluster was not terribly unlikely, though striking enough to have drawn attention to her. Neither shifts nor incidents occur uniformly at random. Half of the incidents were repeated events associated with a small number of particularly sick children. Shifts and incidents are not independent of one another, since a more observant nurse notices problems with a patient earlier than a less careful nurse. Lucia had more weekend shifts than most of the nurses (less qualified part-timers, trainees, and temporary employees), while incidents typically occurred in the weekends. Neither fact is surprising; neither was ever reported.

    I have written more on the case on my pages Lying Statistics Damn Nurse Lucia de B, and you can also find much information (Dutch and English) at

    My sanskrit name

    Sarasvati Leela dasa (dasa: a devotee; Leela: games; Sarasvati: goddess of science, music, self-knowledge)

    My Korean signature

    (Last updated: 27 January 2014)