Richard D. Gill’s home page
To contact me quickly, try email
(surname at math dot leidenuniv dot nl), or mobile phone.
Click here for my postal and visiting addresses,
office and mobile phone numbers, email addresses, and further contact possibilities.
If you are an Indian IT student looking for a summer
internship then please read this.
YouTube video. Slides of my talk
Murder by Numbers.
The naked truth about the case of Lucia de B.
A historical document: statement by Haga-Hospital, 2010,
regarding the acquittal of Lucia de Berk: English translation, original.
Four days after the TEDx event, I saw the movie Lucia de B.
on its premiere night in Amsterdam. Here is my personal
film review: Splendid acting, very moving, beautifully told human story,
centering around Lucia herself. Despite compression of the story line and focus
on Lucia's personal experiences, it still contained such key features as:
the close personal links
between key people from hospital, justice and experts (image right);
the mental illness and mental breakdown of the chief-paediatrician at JKZ ...
There was a vain and ambitious hospital director. A bad statistician.
Real life heroine Metta de Noo and hero Ton Derksen
were concentrated in the film into the imaginary person
of one imaginary whistleblower at the last place you might expect
to find them: in the Public Prosecution service. But on the other hand:
it wasn't black and white. There were good medics and bad medics,
good nurses and bad nurses, good cops and bad cops ... Apparently,
even some people in the Public Prosecution service found the witch hunt
Some years ago
I offered a prize for the person who remasters the logo of the VVS: the Dutch statistical society (top image) in the most beautiful postscript. An exercise in
curve fitting with splines, perhaps?
Better still would be a mathematical/statistical story of the curves
themselves, providing an elegant parametric family which reproduces the
whole logo. Finally I decided to do it myself, and I think I am getting
close with this perspective image of some very simple 3-dimensional
curves, with indeed a statistical story behind them
The R script which draws this logo can be found here.
It should generate a rotatable 3-d image...
See the slides of my Amst-R-dam R users group meetup 2011 (updated 2012) talk R-fun: part 1, the VVS logo in R; part 2, R on an iDevice.
For the latest news on "R on an iDevice" see the 2014 talk R on an iDevice, given at a Data Science NL meetup.
VVS stands for "Vereniging voor Statistiek". SMS stands for "Section Mathematical Statistics". The VVS also has an OR section, hence the common alternative name VVS-OR.
FS: Tuesdays 11:00--12:30 room 401
Master's level (or advanced Bachelor's)
The master specialization Statistical
Science for the Life and Behavioural Sciences is a collaboration of our group with others in biomedical statistics, biostatistics, and psychometrics.
Here you can find links to various courses I have given in the past, in particular quantum statistics, statistics for astronomers, HOVO courses (adult education courses, in Dutch) on use and abuse of statistics, forensic science
Interests, most active marked *
Recent talks and papers
WARNING: Richard P. Feynman said
that attempting to understand quantum mechanics causes you
to fall into a black hole, never to be heard from again
The past is particles, the future is a wave
Bell’s fifth position
During the academic year 2010-2011 I was Distinguished Lorentz Fellow (DLF) at the Netherlands Institute of Advanced Study in the Humanities and Social Science, NIAS.
Here's my research proposal.
The award ceremony was at NIAS, Wassenaar, late-afternoon of March 22, 2010.
On the morning of the same day we held a complementary
Breakfast symposium "Science, Media, Justice"
The Smeesters affair (revised: July 4, 2013)
Slides of talk on Smeesters case, and on the Geraerts-Merckelbach Memory paper affair.
Talk originally given December 2012; slides updated March 2013; title "Integrity or fraud - or just questionable research practices?"
Stimulated by media interest in the Geraerts-Merckelbach controversy on their
"Memory" paper, I studied the published summary statistics in this paper using the same techniques as Simonsohn used for Smeesters, and found quite clear
statistical evidence for "too good to be true". Without
experimental protocols written up prior to the experiment, original data-sets,
and laboratory log books detailing all the data selection and manipulation steps
which resulted in the final data-set on which the summary statistics
in the paper are based, one can only guess how these patterns arose.
It certainly need not be fraud (fraud requires active intention to deceive).
R-code for experiment with Simonsohn's fraud test (new version)
Histogram of p-values of an honest researcher
Histogram of p-values of a dishonest researcher
Disclaimer: These notes formed an attempt to reconstruct the statistical analyses
performed by Erasmus University Committee on Scientific Integrity, based
initially only on the censored version of the report of the committee
which Erasmus released at the start of the affair.
Later an uncensored report was made available, and
later still Uri Simonsohn has made a "working paper" available which reveals yet more details of his methodology. The uncensored Erasmus report still leaves,
for me, many unanswered questions concerning their exact methodology.
The main resource for these notes was therefore the report of the Erasmus University
Committee on Scientific Integrity (commissie wetenschappelijk integriteit, CWI). I am very grateful for recent communication with fraud-buster Uri Simonsohn,
social psychologist at Wharton University
School of Operations and Information Management, who pointed
out a major error in an earlier version of my R experiment and
also in my thinking!
The results given here are still speculative and my understanding
of the statistical procedures used by Erasmus-CWI (who refuse to comment)
may be far from correct.
Simonsohn's web page contains links to two interviews he has given in "Nature"
and in the Dutch newspaper "Volkskrant" (English translation).
For Smeesters' version of events see the interview with Smeesters
in the Belgian (Flemish) newspaper De Standaard; part two of
this interview is unfortunately only available for subscribers.
Just recently Uri has released a new working paper called "Just post it: the lesson from two cases of fabricated data detected by statistics alone".
A link can be found at his homepage. The paper contains another link
to some useful supplementary material.
According to the Erasmus-CWI report,
Simonsohn's idea was that if extreme data has been removed in an attempt to
boost significance, the variance of sample averages will decrease.
Now researchers in social psychology
typically report averages, sample variances,
and sample sizes of subgroups of their subjects, where the groups are
defined partly by an intervention (treatment/control) and partly by
covariates (age, sex, education ...). So if some of the covariates can be assumed to have no effect at all, we effectively have replications: i.e., we
see group averages, sample variances, and sample sizes,
of a number of groups whose true means can be assumed to be equal.
Simonsohn's test statistic for testing the null-hypothesis of honesty versus the alternative
of dishonesty is the sample variance of the reported averages of groups
whose mean can be assumed to be equal. The null distribution of this
statistic is estimated by a simulation experiment, by which I suppose is
meant a parametric bootstrap in the situation where we do not have
access to the original (but post-massage) data of the experiment. If
the "original" data is available we could use the full (non-parametric) bootstrap.
In the present experiment I used a parametric bootstrap under an
assumption of normality. Within the bootstrap, we
pretend that the reported group variances
are population values, we pretend that the actual sample sizes are the
reported sample sizes,
and we play being an honest researcher who takes
normally distributed samples of these
sizes and variances, with the same mean.
I did 1000 simulations of honest and of dishonest
researchers in the following scenario: each researcher takes a sample
of size 20 from each of 5 groups (standard normal distributions, same means and variances). The dishonest researcher discards all observations larger than 0.5, and all observations smaller than -2.0. He or she is attempting to increase the significance of the difference between these combined groups of subjects with other groups, by making the mean value for the groups whose statistics we
are studying a whole lot lower, by removal of a massive number of observations
(everything bigger than 0.5). Moreover, variance
is being further reduced by removal of a few very small observations (everything smaller than -2.0). The
sample size is reduced to about two thirds of its original size in this way. Simonsohn's test statistic is the sample variance of the group means.
Its null distribution is estimated in my experiment by parametric
bootstrap. My bootstrap sample size was 1000 and each bootstrap consists
of five normal samples with equal means, sample sizes equal to the reported
sample sizes (varying in the dishonest case), and variances equal to the reported sample variances. So each bootstrap generates one observation of "variance of the five group means".
The relative frequency of bootstrap test-statistics smaller than the actually observed test-statistic is the p-value according to the empirical bootstrap null distribution of the test statistic. The idea is that the faked data display less variation than their summary statistics would lead one to expect.
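The experiment just described can be sketched in a few lines of Python. This is not my R script but a reduced-scale illustration using only the standard library. The function names, the seed, the number of researchers (200) and the bootstrap size (300) are choices made for this sketch. One shortcut is used: under normality the average of a sample of size n with variance s² is itself normal with variance s²/n, so the bootstrap draws the five group averages directly instead of drawing whole samples.

```python
import random


def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def simulate_researcher(rng, dishonest, n_groups=5, n_per_group=20):
    """Return the reported group means, sample variances and sample sizes."""
    means, variances, sizes = [], [], []
    for _ in range(n_groups):
        while True:
            x = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
            if dishonest:
                # Discard everything larger than 0.5 and everything smaller than -2.0.
                x = [v for v in x if -2.0 <= v <= 0.5]
            if len(x) >= 2:  # a sample variance needs at least two observations
                break
        means.append(sum(x) / len(x))
        variances.append(sample_variance(x))
        sizes.append(len(x))
    return means, variances, sizes


def bootstrap_p_value(rng, means, variances, sizes, n_boot=300):
    """Fraction of bootstrap statistics smaller than the observed statistic."""
    observed = sample_variance(means)
    # Pretend the reported variances and sizes are the truth; all groups share one mean.
    sds = [(s2 / n) ** 0.5 for s2, n in zip(variances, sizes)]
    smaller = 0
    for _ in range(n_boot):
        boot_means = [rng.gauss(0.0, sd) for sd in sds]
        if sample_variance(boot_means) < observed:
            smaller += 1
    return smaller / n_boot


def run(dishonest, n_researchers=200, seed=1):
    """Bootstrap p-values for a population of honest or dishonest researchers."""
    rng = random.Random(seed)
    return [bootstrap_p_value(rng, *simulate_researcher(rng, dishonest))
            for _ in range(n_researchers)]


honest_p = run(dishonest=False)
dishonest_p = run(dishonest=True)
for label, p in (("honest", honest_p), ("dishonest", dishonest_p)):
    rate = sum(v <= 0.05 for v in p) / len(p)
    print(f"{label}: fraction of p-values <= 0.05 = {rate:.3f}")
```

The printed fractions depend on the seed and on the reduced simulation sizes, so they should not be read as a reproduction of the figures I report from the R experiment.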
The histograms displayed here are histograms of these bootstrap p-values
for a one-sided test (reject for small values) for honest and for dishonest
researchers. Fortunately for the honest researchers, their p-values are close
to uniformly distributed. The dishonest researchers on the other hand tend
to have p-values which are much smaller, as we had expected.
This little experiment, which takes 80 seconds to run
on my 2.4 GHz Intel Core 2 Duo MacBook Pro,
shows that Simonsohn's test statistic could be a valuable tool.
Indeed, this massive and one-sided
data-amputation has decreased
the variance of group averages, relative
to what one would expect from the reported groups' sample
sizes and standard deviations.
To put it another way, the actual variation in the reported group averages is too good to be true relative to the reported standard deviations and
reported sample sizes.
In my experiment, the test does indeed have
actual size very close to 5%, when used at nominal level 5%.
Its power against the alternative which I took is about 12%.
Not extremely exciting, but the idea is to combine many tests over
many groups of groups of subjects, and possibly also over a number
of experiments reported in the same paper, or even over a number
of papers by the same researcher.
It would be interesting to see if a non-parametric bootstrap gives similar
results. The massaged data has a far from normal distribution.
I have the following (provisional) four
criticisms of the methodology and the reporting thereof.
(1) We can be sure that honest or dishonest,
the original data is not normally distributed.
So statistical conclusions based on a parametric bootstrap are
(2) The Erasmus report told us nothing about how Simonsohn got onto Smeesters' tail.
Was he on a cherry-picking expedition, analysing hundreds of papers or
hundreds of researchers, and choosing the most
significant outcome for the follow-up?
This is not revealed in the Erasmus report, but it is important to
know in order to judge the significance of Erasmus' re-analysis
of the same data. According to Simonsohn, someone drew his
attention to one of Smeesters' papers in August 2011
(The effect of color (red versus blue) on assimilation versus contrast in prime-to behaviour effect, coauthor Jia Liu, University of Groningen).
Why? Knowing Simonsohn's reputation, this was presumably someone
who had his or her own suspicions. Why?
According to the Erasmus-CWI report, Simonsohn concluded
from his statistical methodology that the summary statistics in that paper
were too good to be true. Simonsohn requested and obtained
Smeesters' dataset and discovered more anomalies which confirmed his
initial opinion. Smeesters tried to explain some of the anomalies but Smeesters'
explanations would make the observed anomalies less likely, not more likely.
At this point Smeesters' computer crashed and all original material
of all of his experiments, ever, was lost.
Other statistical analyses, replicated by CWI and described
in Appendix 4 of the Erasmus report, confirmed that these further striking
patterns in the data (of completely different nature to what we are studying here)
are extremely unlikely to have arisen by chance. Smeesters' data has not
been released so it is still not possible to replicate the most crucial parts
of the analysis: those which, as it were, independently confirmed the
initial suspicions of something worse than data-massage.
(3) Erasmus-CWI uses the pFDR method
(positive - False Discovery Rate) in some kind of attempt to control for multiple testing.
In my opinion, adjustment of p-values by pFDR methodology is absolutely inappropriate in this case.
It includes a guess or an estimate of the a priori "proportion of null hypotheses to be tested which are actually false".
Thus it includes a "presumption of guilt"!
The pFDR method, when correctly used, guarantees that of those results which are nominally significant at the pFDR-adjusted level 0.05, only 5% are false positives.
Thus 95% of the "significant" results are for real - if the a priori guess/estimate is correct, and in the long run!
This methodology was invented for massive cherry-picking
experiments in genome wide association studies.
It was not invented to correct for multiple testing in the traditional sense, when the simultaneous null hypothesis should be taken seriously.
Innocent till proven guilty.
Not proven guilty by an assumption that you are guilty some significant proportion of the times.
In order that Smeesters himself is protected from cherry picking by Erasmus-CWI
he should have insisted on a Bonferroni correction of the p-values which they report; a much stronger requirement than the pFDR correction.
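To make the contrast concrete, here is a small Python sketch comparing Bonferroni-adjusted p-values with q-values of the kind used in positive-FDR methodology. Since the report does not say which pFDR method was used, the q-value formula below is only one common Storey-style form, and it is an assumption of this sketch. The parameter pi0 is the assumed proportion of true null hypotheses (so 1 - pi0 is the a priori guess of the proportion of false ones), and the four p-values are invented purely for illustration.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def storey_style_q_values(p_values, pi0):
    """q-values q_(i) = min over j >= i of pi0 * m * p_(j) / j (pi0 = 1 gives Benjamini-Hochberg)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        q[i] = running_min
    return q


# Hypothetical p-values, not taken from any report.
p = [0.001, 0.012, 0.03, 0.2]
print("Bonferroni:      ", bonferroni(p))
print("q-values, pi0=1: ", storey_style_q_values(p, pi0=1.0))
print("q-values, pi0=.5:", storey_style_q_values(p, pi0=0.5))
```

In this form the q-values are never larger than the Bonferroni-adjusted values, and they shrink further as pi0 decreases, that is, as more of the hypotheses are presumed false in advance.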
(4) The CWI report is itself not a good model of reporting statistical analyses.
There are hundreds of different pFDR methods; which one was used? Simonsohn's paper is not only unpublished but still unobtainable, and the description in the
Erasmus report is terse. One cannot reproduce their
analyses from their description of their procedures. One might object that
this report is the result of an internal investigation of an organisation
carrying out disciplinary investigation of one of its employees, hence that Erasmus
University is actually behaving with unusual and exemplary transparency.
I would retort that
by publishing this report Erasmus university is broadcasting a
public condemnation of Smeesters as a scientist,
which goes far beyond the internal needs of an
organisation in the unfortunate situation when it has to terminate
the employment of an employee for some misdemeanors.
Erasmus university has withdrawn a number of Smeesters' publications;
not the authors of those publications. I am amazed that the statisticians
involved in the investigation at Erasmus are apparently forbidden to
reveal the smallest technical details of their analyses to fellow scientists.
My overall opinion is that Simonsohn has probably found a useful investigative
tool. In this case it appears to have been used by the Erasmus committee on scientific
integrity like a medieval instrument of torture: the
accused is forced to confess by being subjected to an onslaught of vicious
p-values which he does not understand.
Now, in this case, the accused did have a lot to
confess to: all traces of the original data were lost (both original
paper, and later computer files), none of his co-authors
had seen the data, all the analyses were done by himself alone without assistance. The report of Erasmus-CWI hints at even worse deeds.
However, what if Smeesters had been an honest researcher?
Incidentally, in my opinion cherry-picking and data-massaging in themselves
are not evil. What is evil, is not honestly reporting your statistical procedures, and that includes all selection and massaging. A good scientist
reports an experiment in such a way that others can repeat it.
That includes the statistical analyses; and that
includes the methodology by which you choose which results of which
experiments to report, and how your data has been massaged.
In physics, the interesting experiments are immediately replicated by other
research groups. Interesting experiments are experiments which push into the
unknown, in a direction in which there are well-known theoretical and
Experiments are repeated because they give other research groups
a chance to show that their experimental technique is even better,
or to genuinely add new twists to the story.
In this way, bad reporting is immediately noticed, because
experiments whose results cannot be replicated immediately become suspect.
Researchers know that their colleagues (and competitors) are going to study all
the methodological details of their work, and are going to look critically
at all the reported numbers, and are going to bother them if things don't
seem to match or important info is missing. In particular, if the experiment
turns out to be methodologically flawed, you can be sure someone is
going to tell that to the world.
The problem in
social psychology (to reveal my own prejudices about this field)
is that interesting experiments are not repeated. The point of
doing experiments is to get sexy results which are reported in the popular
media. Once such an experiment has been done, there is no point in repeating
Biography and more ...
First Leiden inaugural lecture
Past phd students
Just for fun:
things you wish your computer had
(including the classic clippy’s suicide note)
A few years ago I discovered the enormous discussion on the Monty Hall (three doors) problem on wikipedia.
My published writings on the subject are, in order of writing (and in order of
insightfulness) an invited contribution to
Springer's International Encyclopaedia of Statistical Science, 2010,
a paper in Statistica Neerlandica, 2011,
and contributions to the peer reviewed internet encyclopedias
citizendium.org and StatProb.com.
In this manuscript you will find
an expanded version of the most recent published work, the
In these works I distinguish between the original, somewhat
ambiguous, real world question about a famous quiz show, and the many
mathematizations of the question which
have been proposed in the literature. Personally I prefer the lesser
theoretic version. For me, the question is not "what is this
probability?" or "what is that probability?", but: "what would you do?"
And to me, the wikipedia controversy around
the Monty Hall problems (concerning whether we should compute a
conditional or unconditional probability of getting the car if we
switch doors) is a warning against solution-driven science.
I want to thank so many wikipedia editors for the inspiration they gave
Suppose the car is hidden behind one of the three
doors by a fair randomization. The contestant chooses Door 1. Monty
Hall, for reasons best known to himself, opens Door 3 revealing a goat.
We know that whatever probability mechanism is used by Monty for this
purpose, the conditional probability that switching will give the car
is at least 1/2. We know that the unconditional probability (i.e., not
conditioning on the door chosen by the contestant, nor the door opened
by Monty) is 2/3.
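These two numbers can be checked with exact arithmetic. The sketch below is in Python; the function names and the parameter q are introduced here for illustration. It assumes the usual rules (Monty never opens the contestant's door and never reveals the car) and writes q for the probability that Monty opens Door 3 when the car is behind Door 1, which is the only case in which he has a choice.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)  # the car is equally likely to be behind each door


def p_switch_wins_given_door3_opened(q):
    """Contestant chose Door 1, Monty opened Door 3; q = P(opens Door 3 | car behind Door 1)."""
    joint_car2 = THIRD * 1  # car behind Door 2: Monty is forced to open Door 3
    joint_car1 = THIRD * q  # car behind Door 1: Monty opens Door 3 with probability q
    return joint_car2 / (joint_car1 + joint_car2)  # Bayes: equals 1 / (1 + q)


def p_switch_wins_overall(q):
    """Unconditional probability that always switching wins (law of total probability)."""
    p_open3 = THIRD * q + THIRD
    p_open2 = THIRD * (1 - q) + THIRD
    cond3 = p_switch_wins_given_door3_opened(q)
    cond2 = p_switch_wins_given_door3_opened(1 - q)  # same formula with Doors 2 and 3 swapped
    return p_open3 * cond3 + p_open2 * cond2


for q in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(q, p_switch_wins_given_door3_opened(q), p_switch_wins_overall(q))
```

For q = 0, 1/2 and 1 the conditional probabilities are 1, 2/3 and 1/2, never below 1/2, while the unconditional probability is 2/3 in every case.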
Always switching gives the car with unconditional probability 2/3,
always staying gives it with probability 1/3. Nobody in their right
mind could imagine that there could exist some mixed strategy
(sometimes staying, sometimes switching, perhaps with the help of some
randomization device, and all depending on which doors were chosen and
opened) which would give you a better overall (i.e., unconditional) chance
than 2/3 of getting the car.
This is true, of course. In fact, from the law of total
probability, proving the optimality of (unconditional) 2/3 by always
switching is equivalent to proving that all the six conditional
probabilities of winning by switching, given door chosen and door
opened, are at least 1/2. We can prove the latter using Bayes' theorem,
or, better I think, using Bayes' rule in a smart way. However both
these proofs require some sophistication.
Is there an elementary proof? A short proof using words and ideas, no computations.
Yes there is, and I learnt it from Sasha Gnedin.
However you play there's always a door such that if the car is
there, you'll miss it. Consider first deterministic strategies. We only
need consider two cases: for "always switching" it's the door you
initially chose, and for "sometimes switching" it's a door you won't
switch to if you get the option. (If you never switch there are two
such doors: just choose one). Ordinary readers won't be interested in
randomized strategies but anyone who wants to include these will
understand how to do it (now the door where you'd certainly miss a car
has to be a random door, determined by the same coin tosses used to
implement the random choices in your own strategy).
Note that the door which has been indicated in this way does not
depend on where
the car is actually hidden or how the host plays: it just depends on
how the player plays. Therefore if the car is initially equally likely
to be behind any of the three doors, we run a 1/3 chance that the car
will be missed because it's behind this door. Therefore the 2/3
success-chance of always switching can't be beaten.
I would call this a proof by coupling.
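For the deterministic strategies the argument can be verified by brute force. The following Python sketch enumerates all twelve deterministic strategies (initial door, plus a stay-or-switch decision for each door Monty might open) and lists, for each, the doors where the car would be missed whatever Monty does. The representation of a strategy as a pair (first, rule) is my own choice for this sketch, and the usual rules are assumed: Monty opens a door that is neither the contestant's nor the car's.

```python
from itertools import product

DOORS = (1, 2, 3)


def final_door(first, opened, switch):
    """Door the player ends up with after Monty opens a door."""
    if not switch:
        return first
    return next(d for d in DOORS if d not in (first, opened))


def host_options(first, car):
    """Doors Monty may open: not the player's door and not the car."""
    return [d for d in DOORS if d != first and d != car]


def missed_doors(first, rule):
    """Doors where the car is lost whatever Monty does; rule[opened] is True for 'switch'."""
    return [car for car in DOORS
            if all(final_door(first, h, rule[h]) != car
                   for h in host_options(first, car))]


strategies = []
for first in DOORS:
    others = [d for d in DOORS if d != first]
    for choices in product((False, True), repeat=2):
        strategies.append((first, dict(zip(others, choices))))

for first, rule in strategies:
    print(first, rule, "->", missed_doors(first, rule))
```

Every strategy has at least one such door, and the door depends only on the strategy, so with a uniformly hidden car no deterministic strategy wins with probability above 2/3.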
From Three Doors to Two Envelopes (what will be next? One Coffin, perhaps?). Here
is my fourth draft of the definitive article on the infamous two
envelopes problem. The problem which Martin Gardner could not solve,
and which many other famous people got wrong. Studied by probabilists,
logicians, economists, philosophers. Now studied by me ...
heart of all exchange paradoxes is encapsulated in a little
theorem which I call my "unified solution".
It seems to be new.
This letter by me
published by Dutch magazine 'Nursing', April 2012, -- here's a rough English translation -- summarizes what we now know about the causes of the case: namely a kind of subconscious mass conspiracy by a number of medical specialists at JKZ to connect their own errors and other unexpected mishaps to Lucia.
This process started at least nine months before the "unexpected" death of baby Amber in the early hours of 4 September, 2001.
One piece of evidence for this (there is lots more): within 24 hours of Amber's death, the hospital director reported five unnatural deaths at his hospital over the last year to the police and to the Dutch national health inspectorate.
Yet none of the other deaths had previously been thought in any way
to be unusual - at least, not officially. There had been no reports of incidents
to the authorities. Even Amber's death was first registered as natural,
and this was only changed to unnatural later in the day.
It took JKZ's director less than fifteen minutes, from the time he was informed
on the afternoon of 4 September of the latest death, to report
the five unnatural deaths to police and to the health inspectorate.
Within a day, Lucia was put on "non-active". All staff was informed of the upcoming police investigation and of the implication of an (unnamed) nurse.
A press conference was held in which it was also mentioned that the murder investigation (by the hospital???) was being extended to other hospitals
at which the same nurse had been earlier employed.
Coincidentally, hospitals at this very moment in the process of
being merged with JKZ.
Implication: four dossiers were ready and waiting. The doctors had simply
been waiting for one more, and the time between the death and informing
the director, was simply used to complete the last dossier.
During these few hours a
number of important medical facts about Amber's death were altered
on the patient's dossier by the specialists concerned: most crucially, the sequence of events leading to death. Lung failure before heart failure, suggestive of a natural death by exhaustion, was changed into heart failure
before lung failure, suggestive of poisoning.
Yet at the trial, the hospital director and most (but not all!) specialists claimed that there had been no suspicion at all of anything,
or of anyone, till the death of Amber.
Interestingly, the investigation by the Dutch national Health Inspectorate (IGZ), at the same
time as a police investigation and then a murder trial was going on, found no evidence of anything remarkable! Their report has never been made public.
Even the mere fact of their investigation seems completely unknown.
Also interestingly, the media has totally ignored all this news. Presumably, the
Dutch population is "Lucia-tired" - this story won't sell newspapers or generate large viewing numbers on TV.
The Dutch medical community, and people in the upper
echelons of "justice" still believe "she did it". The legal system
got the blame (but also takes any credit: "she got freed, didn't she?").
The medical world looks the other way.
I sent a copy of my letter to the directors of Haga hospital,
to Haga's lawyer, and
to the coordinator of research at Haga (who had earlier been forbidden
by her bosses to communicate with me). I was expecting at least a response
from the lawyer, but nothing came.
Lucia interviewed (in English) on CNBC
Learning from Lucia, slides from my lecture at ATSTATS 2010, video of the lecture
"Learning from Lucia" is the title of many talks I am giving these days.
I believe that there is so much of value that we can learn from understanding the Lucia case, which in one sense was concluded with the "not guilty" verdict given at her retrial in 2010, but in another sense is still completely open: what really did happen? Why was there ever a case at all?
is my present list of recommendations.
In my opinion a number of "system faults" were exposed by the catastrophe,
some of them specific to
the Dutch situation (with its specific and fascinating culture, history, ...).
Actually, the Dutch legal system has already learnt a great deal. The
same goes for the Dutch scientific world and in particular, the
statistical world and the world of forensic science.
What remains is for the medical world to learn. However, as long as it
denies any responsibility, that learning process cannot start, and
unfortunately that is definitely still the case.
Here are some of the hard facts of the case and the hard facts of the
1) The fact that there was a case at all (2001) and the
outcome of the first round of court cases (Supreme Court, 2006)
was - it seems to observers with access to the dossiers
and who study the reports of CEAS
(judicial review committee reporting
to the Public Ministry), of the Advocate General to the Supreme
Court, of Prof Meulenbelt to the court in Arnhem
and of the conclusions of that court itself -
strongly determined by the interaction between
the chef-de-clinique of the Juliana Children's Hospital,
a well-known and respected paediatrician,
and the director of that hospital.
Their actions were influenced by malicious
gossip about Lucia, and moreover we now know that the
paediatrician was mistaken in some diagnoses,
so that she herself could well have been surprised when
some of her patients suddenly died.
Her brother-in-law, a theoretical computer scientist with no experience
whatever in applied statistics, supported her amateur
statistical conclusions based on her own data-gathering
(the data of the amazing coincidence of Lucia
so often being on duty whenever strange things happened).
These were the statistics at the outset of the case.
Together these two
persons reported a number of deaths and other incidents to the
police as being highly suspicious. Each medical dossier
was accompanied by the chief paediatrician's one-page summary
explaining why the incident was suspicious.
She was made hospital coordinator and liaison person for the subsequent
police investigation, yet she never gave witness to the court of
appeal, and only
briefly at the lower court.
Two extremely hierarchical and powerful organisations (a large
hospital and the public ministry) had to be linked up for a murder
So we have some medical errors and, it seems to me, some
errors. The hospital director was responsible for a number of very
far-reaching decisions. The fact that the hospital had become aware of
a serial killer in the nursing
staff, and a suspect had been suspended from duty, was communicated in
a succession of three internal memos to the hospital staff and in a
including TV appearance, before the police had started an
investigation; before any external investigation had taken place at
all. To this day, hospital staff is forbidden to talk about the affair
(the board of the Dutch society of paediatricians has also
forbidden discussion of the case by its members).
The director was an authoritarian manager; perhaps necessarily so,
since at the time he was charged with the merger of three badly
functioning hospitals in a bad financial situation. He was focussed on
processes and on
the reputation of his hospital(s). He is known for, and proud of,
making rapid critical decisions and never looking back on decisions
see the interview with him (in Dutch) in Skipr, the Dutch magazine for
health care managers.
These two key persons consider that they acted completely properly and
state they would do exactly the same
again if the same circumstances arose again today -- which further
underlines the point I am trying to make: the "integrity" of the system
in which these individual human beings were embedded needs to be
investigated, since it so easily allowed the chance
interaction of their personalities together with some bad luck to spark
a catastrophe - of which they too are the victims. And by the way, such
and managerial personalities are not rare, nor - as I will explain - is
the "run of bad luck" which hit Lucia.
Compare this with an investigation into an air disaster. The direct
cause might be the chance mechanical failure of some bolt or electrical
failure of some
wiring followed by
some errors of judgement of pilots faced with what seems like a
dangerous situation. My wife usually says: "I know why the plane came
down: because of gravity"; but sometimes she lays the blame on the
hubris of man(kind). These are two extremes of
causation, and since we cannot do anything about either they are not
interesting, however true; we should look somewhere in between the
immediate and the ultimate root cause. The point of investigations into
air disasters is to make air travel safer for you and me in the future,
and for pilots and maintenance engineers too for that matter, by
uncovering opportunities to improve training or maintenance procedures
or emergency procedures or engineering standards.
So at this stage, we have just found that some persons took some
in retrospect unfortunate decisions when confronted with a chance
situation which to them appeared sinister. These things happen, and
with the benefit of hindsight it is all so easy. But I am not talking
about blame. I want to understand.
2) In most modern countries where this sort of case arises the very
first thing that happens is not a police investigation following a
press-release, but a *confidential* and *independent* medical investigation.
3) In most modern countries, whenever statistical data like this is
involved, an external
professional statistician is brought in. And the first thing that that
statistician does is to go back to the original data, I mean back to the original hospital
and back to the persons who gathered and compiled the data. How did they
do it, what definitions did they use, what were they looking for?
As Willem van Zwet had always said: when you see such extreme data as
the little contingency table of shifts of Lucia and shifts with
incidents which led to Elffers' infamous "one in 342 million",
the first thing you can be sure of as a statistician is
that the data is wrong. He turns out to have been completely right.
A better number might be something like "one in a hundred".
4) In the UK and in many other modern countries the nursing staff
is much better organized and harder to ignore. Florence Nightingale? In
NL, nurses have only had a single organisation representing them for a
couple of years. They are largely ignored in hospital management
decisions and certainly by medical specialists. They are less well-paid
and consequently less well educated than in quite a few other
countries. A colleague of mine was in hospital for 6 weeks with a
severe heart condition and took great care to note exactly what
medication he was supposed to be having and what he actually got. He
was given the wrong pills on 8 occasions. He told this to his
heart-surgeon who exclaimed "oh those careless sluts". This shocked my
colleague to the core, since he could see that a dedicated and
overworked nursing staff was doing an almost impossible job to the very
best of their ability. Mismanagement and understaffing, mistakes by
specialists and pharmacists, illegible prescriptions, were the order of the day.
So my recommendations are:
1) Strengthening of the role and prestige (hence improvement of level of
education, level of training, hence level of salary) of nursing staff in hospitals.
2) More scientific diagnostic reporting ("differential diagnostics").
In the medical-legal situation the medical specialist must discard his
role of God who knows the right decision to make and never makes a
mistake (in life and death situations), and adopt a more humble
scientific attitude, concordant with the
facts that even after post-mortem examination cause of death is not
really known in 30% of deaths, and that three people a day die in Dutch
hospitals because of avoidable medical errors (compare this to two a
day in road accidents). But admitting individual medical errors is taboo.
In the Lucia case, none were admitted, but finally many were revealed.
This problem is so severe (for the many victims of medical errors) that
from June 16, 2010, a new "code of practice" has been introduced, which
allows medical practitioners
to apologize for mistakes, without thereby admitting legal responsibility.
3) External and independent and confidential medical investigations
in Lucia scenarios, before calling in the police. Probably this will
often need non-Dutch speaking experts and more openness concerning
health care in hospitals.
4) In the court situation, written scientific expert evidence needs
to be put into the public domain as far as possible, so that the
scientific methodology used can be openly discussed in the scientific community.
5) A multidisciplinary and in particular statistical
analysis should be made of data on medical incidents at JKZ, say
1995--2005. We can be pretty certain that there was no serial killer
active during this period, yet we know that over time there were huge
oscillations in the numbers of
incidents on at least one particular ward. But no professional
statistician has ever had access to more than the most summary of
biased summary statistics (no professional statistician was ever heard
in court or consulted by the hospital or police).
Just like sun-spots, earthquakes, or volcanic eruptions,
long periods of almost total
quiescence were interspersed with short
bursts of intense activity: unexplained clusters of events. This phenomenon is
seen world-wide. It has numerous times led to wild-goose-chase murder
investigations which always end up ruining quite a few lives,
even if at the end of the day there is no reason whatever to suppose that anybody did anything wrong at all.
In fact, some investigators (who have built an academic career with lucrative
media opportunities out of HCSKs or "Health Care Serial Killers"),
report a worldwide epidemic.
Simultaneously with studying patterns of incidents at JKZ one should study
patterns in nurses' shifts, so that we finally know what is the
"normal situation", or more precisely, "a" normal situation. One thing
that is for sure, is that the time pattern of
a nurse's shifts doesn't look anything like the outcome of a homogeneous
Poisson process. Thus even if shifts and events are unrelated (which
for many good reasons is not true either) we are going to see
over-dispersion in the numbers of incidents experienced by each nurse,
since time itself is a hidden confounder. Since the mean is low but the
variance is rather large, many nurses will experience no events at all
over long periods of time, while just a few will experience
"surprisingly" many. All experienced nurses know this as an empirical fact
of nursing life.
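The over-dispersion argument can be illustrated with a small simulation. This is only a sketch under invented assumptions: the number of nurses, the incident rates, the two 50-day "burst" windows and the block-shaped rosters are all hypothetical and are not JKZ data. Shifts and incidents are generated independently of one another, so any excess variance comes purely from the time pattern they share.

```python
import random
from statistics import mean, pvariance

# All numbers below are hypothetical, chosen only for illustration.
N_DAYS = 1000            # length of the observation period
SHIFTS_PER_NURSE = 100   # every nurse works the same number of shifts
N_NURSES = 200


def bursty_rate(day):
    """Daily incident probability: two short bursts, long quiet periods."""
    in_burst = 200 <= day < 250 or 700 <= day < 750
    return 0.5 if in_burst else 0.01


# Constant rate with the same long-run average as the bursty one.
MEAN_RATE = sum(bursty_rate(d) for d in range(N_DAYS)) / N_DAYS


def flat_rate(day):
    """Daily incident probability of a homogeneous process."""
    return MEAN_RATE


def scattered_schedule(rng):
    """Shifts scattered uniformly at random over the whole period."""
    return set(rng.sample(range(N_DAYS), SHIFTS_PER_NURSE))


def block_schedule(rng):
    """Shifts concentrated in one contiguous block (e.g. a temporary contract)."""
    start = rng.randrange(N_DAYS - SHIFTS_PER_NURSE + 1)
    return set(range(start, start + SHIFTS_PER_NURSE))


def dispersion_index(rate_for_day, make_schedule, seed):
    """Variance-to-mean ratio of the number of incidents seen per nurse.

    Incident days are drawn first; rosters are then drawn without any
    reference to them, so shifts and incidents are unrelated by construction.
    """
    rng = random.Random(seed)
    incident_days = {d for d in range(N_DAYS) if rng.random() < rate_for_day(d)}
    counts = [len(make_schedule(rng) & incident_days) for _ in range(N_NURSES)]
    return pvariance(counts) / mean(counts)


if __name__ == "__main__":
    print("homogeneous:", dispersion_index(flat_rate, scattered_schedule, seed=1))
    print("clustered:  ", dispersion_index(bursty_rate, block_schedule, seed=1))
```

In the homogeneous setting the ratio stays close to one. In the clustered setting a few nurses whose block happens to cover a burst see many incidents while most see almost none, and the ratio is far larger, even though no nurse has any influence on any incident.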
It is so important to study this scientifically and empirically and in a
multidisciplinary framework, not just to gain knowledge into a fascinating but never studied phenomenon,
but also in order to protect hospital
workers by avoiding future red-herring-witch-hunts
generated by ignorance and prejudice and
the irrelevant statistics of amateur statisticians.
We have seen in the cases of Sally Clark, Lucia de Berk, O.J. Simpson,
and in so many others, that when lawyers and medics
pretend to be able to do statistics, truth flies out of the window.
Lord Rutherford said "if you need statistics, you
did the wrong experiment". I beg to submit that "if they use
statistics in court, someone will be screwed".
I reported my recommendations, and requested scientific collaboration, in an email to the chairman of the board of Haga (pdf), Autumn 2010. The response was a notice from a lawyer acting on behalf of Haga hospital, that (civil) legal action would be taken against me for slander of one of Haga hospital's respected medical specialists, unless I acceded to certain demands. My (university's) lawyer's advice was to yield to just one of Haga's demands,
namely to remove statements
by me on various internet discussion blogs concerning this person, which might be seen to be too personal and hence beyond the bounds of propriety. However, I stood by my position that the information which I had disseminated was in the public interest and that I had had no intention whatsoever of harming that person's reputation. I refused to sign a declaration that I would never ever publish material on these "personal" aspects of the case. Discussions with my lawyers and Haga's lawyers and the long process of getting my
offending comments removed from
other people's blogs and discussion fora, cost the Dutch taxpayer a large amount of money (though probably peanuts for Haga hospital). The final letter from Haga's lawyers was the statement that Haga would immediately take legal action against me anytime one of my - by now removed - allegations was repeated on internet, or published in any other ways, by third parties. Interestingly, one and a half years later, my earlier postings were cited in their entirety
by third parties, who also informed Haga and Haga's lawyers of their actions.
However there has been no response from Haga hospital.
12 April, 2010: founding of the
This organisation has been set up, inspired by the selfless efforts of so many
people over the last six years, which only just now
led to the extraordinary and total rehabilitation of Lucia de Berk.
Now that the judicial authorities have apologized personally and publicly,
it is time to start finding out where avoidable mistakes were made. It is hard
to believe that these can only be attributed to police investigation
and legal procedures. However that is the implication of the recent
public statement by the board of Lucia's hospital, (unauthorized rough English translation).
Lucia de Berk
The tunnel-vision which characterized Lucia's case was cemented in the two
weeks around "the" nine-eleven inside a hospital in the Hague.
Once by the end of those two weeks a major
medical institution had (by implication)
told the world that it had caught a serial killer,
it must have been very hard for those who brought
charges - a few individuals at the
very top of the very same institution -
not to have had some large influence, deliberately or innocently,
on the results of police investigation,
and on the "medical" interpretation of the medical dossiers which
went to the courts. The events of the past year
which came up during those two weeks of internal investigation
and were suddenly associated with Lucia
had become unexpected and inexplicable,
though previously every single one of them had been unremarkable.
The hospital investigators into
the crime were the same people earlier treating those patients, and making,
as is completely natural, errors of diagnosis or treatment from time to time.
The collegiality of the medical community means
that mistakes by medical specialists within the Netherlands
can hardly be admitted by
others inside the same relatively tight, and extremely powerful, community.
Highly placed medical authorities had to stand firm
by their own previous and now provenly mistaken diagnoses. Others
would be loath to criticise a highly regarded colleague's decisions
in such a critical case.
In the Netherlands, medical practitioners almost never admit to having made
mistakes... consequently, they do not have to insure themselves against being sued
for malpractice (which is good for their income), and in theory
medical treatment should be less expensive than in other countries
where lawyers and insurance companies profit from medical blunders.
However the Dutch arrangement has led to increasing distress among all
those "victims" of medical errors, many of whom would probably be satisfied just
to have an "accident" admitted! This June 16, a new code of practice has been
introduced, by which medical practitioners will in future be able to apologize for errors
without thereby admitting legal responsibility. A giant step for the medical profession,
though only a very small step for their patients. Better than nothing, or merely a crumb
to keep us "consumers" (the ones who pay for health care) quiet?
One of the cases we have just taken on board is to unravel
the unbelievable story
of the illegal kidnapping of José Booij's
six week old daughter
Julia by a local child protection agency (Assen),
and the ensuing cover-up by silencing of the mother through fair means
and foul, now in its sixth year. The kidnapping was judged illegal and
a court order was given to return the child immediately. The judges of
the courts for child-protection and family simply laughed, and did nothing.
The child protection agency had acted on the basis of lies and insinuations
of a jealous neighbour
to local police and doctors. Her claims about Jose
were believed. No attempt whatever was made to check these
accusations, nor to hear Jose.
In desperation, two and a half years ago,
Jose wrote to the Cabinet of the Queen just before she was made
homeless and all her remaining possessions were taken from her
because she could no longer
pay her bills (many of them fat bills from lawyers who did nothing except make
a phone call and decide to keep out of this mess), after losing her job, house, and health.
Here is Jose's letter to the queen in Dutch (original) and in English (first
rough translation), written just before she went underground.
The cabinet of the queen forwarded her plea to the Ministry of Family and Welfare.
Nothing has been done
for two and a half years now.
Her case was also brought at the same time to the
European Court of Human Rights.
Nothing has been done
for two and a half years now.
Here is an official report by psychiatrist Bram Bakker, Dutch original, and the report in rough English translation,
written five years ago, when Jose was up and fighting,
though already suffering post-traumatic stress syndrome.
It still then seemed that it might
not be difficult to get her baby back to her, provided she kept on fighting
against the injustice which had been done her, and someone, somewhere,
would stick out their neck for her.
Still then, it could easily have been
possible to save José's health and livelihood and future.
It would have required admitting that some mistakes had been made by some
irresponsible local officials.
Something which is Not Done in quaint
Kafkhanistan-on-Rhinemouth - where the tulips are in flower,
and the smell of fresh smoked nether-weed greets you as you wander
along the pretty canals of the old cities, advisably keeping
an eye open for dog shit below and pickpockets to the side,
as well as for the splendid seventeenth century facades above you.
The picture the Dutch like to project of themselves (indeed, they believe
in it themselves!) to the outside world
is sometimes discordant with the reality within. And, as we know from
the case of Lucia de Berk, truth can be far stranger than fiction in
the Netherlands. Incredible miscarriages of justice can be triggered
when a chance event
sets off a time bomb built from the interaction of personalities of a
number of people in some critical positions. Moreover, once the damage has
been done, legal and bureaucratic thinking and the Dutch culture of "mind your own
business" (cobbler stay at your last) traps the victim in a complex
vicious circle of Catch-exponential-22 system-assumptions ensuring that
escape is impossible.
Resistance is futile. You will be assimilated. Read more at the Bureau of Lost Causes.
Another case we are studying, with all the same features,
is the extraordinary story of Kevin Sweeney.
More information on that case can be found below.
The incredible similarities between the cases provide a worthy study in individual
versus group mentality, and how a scape-goat is chosen when a society is feeling
under threat. This will be researched by a multidisciplinary team of
cultural anthropologists, ethologists, sociologists, historians, lawyers, psychologists
and mathematicians during my DLF fellowship at NIAS and of course by
the Bureau of Lost Causes.
Statistical ethics of the probiotica trial.
This randomized triple-blind clinical trial of probiotics treatment
for patients with predicted severe acute pancreatitis ended in controversy,
when it transpired at the conclusion of the trial in
December 2007, that rather more patients had died on the treatment
arm of the trial than on the control arm.
It seemed strange that the trial had not been terminated at the
interim analysis. The researchers were using a stopping rule
of S.M. Snapinn, by which the trial was
to be terminated early either if it were almost certain that the
final result would be a significant positive effect of probiotica,
or if it were almost certain that the final result would be insignificant.
Here is a paper by myself, to appear
in Statistica Neerlandica,
and, in Dutch, a short article by
probabilist Ronald Meester and microbiologist Pieter ter Steeg which
appeared in the newspaper Trouw and an open letter to Meester and ter Steeg
by biostatisticians Hans van Houwelingen and Theo Stijnen. Also in Dutch there
are a series of interviews (early 2008)
on the current affairs chat show
“Pauw and Witteman”:
chairman of the hospital board Geert Blijham, 23 January;
patient Jochim Vromans, 24 January;
probiotics expert Eric Claassen, 25 January;
leader of the research team Hein Gooszen, 14 February.
Later we obtained the data at the time of the interim analysis.
It was given to journalists at a press conference on Feb. 13 2008,
but never released to interested scientists.
It turned out that the probiotica trial was
not terminated for futility (following the Snapinn stopping
rule) at the half way interim analysis,
through a mis-reading of output of the SPSS package,
which, without consulting the user,
always reports the smaller p-value of the
two one-sided Fisher's exact tests for
equality of two binomial probabilities. Proper application of their
own stopping rule would have led to early termination of the trial,
since according to the criteria set in advance,
there was no chance any more that it would result in a positive
result for the probiotica treatment. The trial
was de facto continued because there was a good chance that it
would finally result in a negative result for probiotica.
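Why the direction of a one-sided test matters here can be shown with a small worked example. The sketch below computes both one-sided Fisher exact p-values for a two-by-two table directly from the hypergeometric distribution. The counts are hypothetical and chosen only for illustration; they are not the interim data of the trial, and the function names are my own.

```python
from math import comb


def hypergeom_pmf(x, total, group_size, events):
    """P(X = x): x of the `events` fall in a group of `group_size` out of `total`."""
    return comb(events, x) * comb(total - events, group_size - x) / comb(total, group_size)


def fisher_one_sided(events_a, n_a, events_b, n_b):
    """Return both one-sided Fisher exact p-values for arm A versus arm B.

    p_fewer: probability of at most `events_a` events in arm A (evidence
             that arm A does better than arm B).
    p_more:  probability of at least `events_a` events in arm A (evidence
             that arm A does worse than arm B).
    Both are computed conditionally on the table margins.
    """
    total = n_a + n_b
    events = events_a + events_b
    lowest = max(0, events - n_b)
    highest = min(events, n_a)
    pmf = {x: hypergeom_pmf(x, total, n_a, events) for x in range(lowest, highest + 1)}
    p_fewer = sum(p for x, p in pmf.items() if x <= events_a)
    p_more = sum(p for x, p in pmf.items() if x >= events_a)
    return p_fewer, p_more


if __name__ == "__main__":
    # Hypothetical interim table: 12 deaths out of 100 on treatment,
    # 5 deaths out of 100 on control.
    p_fewer, p_more = fisher_one_sided(12, 100, 5, 100)
    print("one-sided p, treatment better:", p_fewer)
    print("one-sided p, treatment worse: ", p_more)
    print("smaller of the two:           ", min(p_fewer, p_more))
```

With more deaths on the treatment arm, the smaller of the two p-values is the one for "treatment worse". A reader who takes that single reported number to be the test for a benefit of treatment will think a positive final result is still within reach, when the p-value in the relevant direction is in fact above one half.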
Here are slides
of my talk careless statistics costs lives
on the subject.
Kevin Sweeney ...
recently left a Dutch jail at the end of his sentence
for murder of his wife by arson. He has always claimed innocence.
Here is a link to his own site,
Justice for Kevin Sweeney, here is a
of the case, and here is my blog entry
in the Netherlands: Guilty until Proven Innocent.
In May, 2008, he put in an application
to revise the case (English translation)
to the Supreme Court. The application is based on an analysis of the fire
evidence by Fred Vos, entitled
Het vergeten tijdspad
(the forgotten timeline). This is the first time a careful
reconstruction of the course of the fire has taken place,
on the basis of all evidence available to the courts.
The evidence seems totally consistent with a fire accidentally
started by smoking in bed; and is totally inconsistent with the prosecution’s
claim of arson using large quantities of white spirits (Dutch: terpentine).
Vos is careful to distinguish observed facts from interpretations thereof.
Many writers on the case, including myself, have been misled by
Mathematical Centre (Amsterdam) publications are now available on internet.
Here are two early works which had quite some impact, including
the reprint of my 1979 PhD thesis:
R.D. Gill (1980),
Censoring and Stochastic Integrals, MC Tract 124.
R.D. Gill (1983),
The sieve method as an alternative to dollar-unit
sampling: the mathematical background, Report SN 12
Another useful link is to my Saint Flour lectures on survival analysis.
Product-integrals are to
products, as integrals are to sums. Though they have been around for
more than a hundred years, they never became part of the standard
toolbox, possibly because no-one invented the right mathematical symbol
for them. I made a try quite some years ago, though they have not
caught on yet. With the crucial help of JC Loredo, my efforts resulted in
prodint.zip, files for getting beautiful \prodi and \Prodi and \PRODI symbols in your LaTeX, and Loredo.ttf,
a TrueType font for ordinary word processing. It is not that difficult
these days to get new fonts into your LaTeX, see for instance TUG's font installation instructions.
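For readers who have not met product-integrals before, here is a minimal numerical sketch of the analogy with ordinary integrals. An integral is a limit of sums of small increments; the product-integral of (1 + dA) is the corresponding limit of products of the factors 1 + A(t_i) - A(t_{i-1}) over ever finer partitions of the interval. The function names and the constant hazard below are my own illustrative choices. The example uses the standard survival analysis fact that the survival function is the product-integral of (1 - dΛ), where Λ is the cumulative hazard; for a continuous Λ this equals exp(-Λ(t)).

```python
from math import exp


def product_integral(A, t, n_steps):
    """Approximate the product-integral of (1 + dA) over (0, t].

    The interval is cut into n_steps equal pieces and the factors
    1 + A(t_i) - A(t_{i-1}) are multiplied together, just as a Riemann
    sum adds up the increments A(t_i) - A(t_{i-1}).
    """
    result = 1.0
    previous = A(0.0)
    for i in range(1, n_steps + 1):
        current = A(t * i / n_steps)
        result *= 1.0 + (current - previous)
        previous = current
    return result


HAZARD = 0.5  # hypothetical constant hazard rate, for illustration only


def minus_cumulative_hazard(s):
    """A(s) = -Lambda(s) for a constant hazard, so that 1 + dA = 1 - dLambda."""
    return -HAZARD * s


if __name__ == "__main__":
    t = 2.0
    for n in (10, 1000, 100000):
        print(n, product_integral(minus_cumulative_hazard, t, n))
    print("exp(-Lambda(t)):", exp(-HAZARD * t))
```

As the partition gets finer the product settles down to exp(-Λ(t)), the familiar exponential survival function.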
The Dutch nurse Lucia de Berk has been completely exonerated. Not only
is there no proof that she committed any murders, there is no reason
whatsoever to suppose that any of the deaths and other incidents with
which she was connected were in any way unnatural.
Lucia had been given a life sentence for seven murders and three
attempted murders of patients in her care. Statistical reasoning played
a central role in her case, first explicitly but later,
after an appeal court confirmed the sentence,
implicitly: it was converted into irrefutable medical evidence, in a
completely circular and seemingly unbreakable chain of legal reasoning.
An official judicial review committee
uncovered many irregularities in the handling of the case,
in which the rapid response of hospital authorities led to
tunnel-vision and bias from the earliest
stages of the case. A new medical investigation
commissioned by the supreme court has removed the linch-pin
of the prosecution case, the only death “proven” to be a murder, and
“proven” to have been committed by Lucia, on its own merits. There is
no reason now not to suppose that this was a natural
death. Evidence of any wrong-doing in any of the cases
is totally nonexistent. There was however the usual amount of medical
blunders and mistaken diagnoses, but at least the professional
behaviour of the nurses was exemplary. The statistical evidence - which
is all that remains - has been totally discredited. The data was
seriously biased, a meaningless statistic was computed, and the model
used was completely inappropriate. A cluster of incidents
on this hospital ward was actually a common occurrence. The
presence of Lucia at many of the incidents in one cluster was not
terribly unlikely, though striking enough to have drawn attention to
her. Neither shifts nor incidents occur uniformly at random. Half of
the incidents were repeated events associated with a small number of
particularly sick children. Shifts and incidents are not independent of
one another, since a more observant nurse notices problems with a
patient earlier than a less careful nurse. Lucia had more weekend
shifts than most of the nurses (less qualified part-timers, trainees,
and temporary employees), while incidents typically occurred at
weekends. Neither fact is surprising; neither was ever reported.
I have written more on the case on my pages
Damn Nurse Lucia de B, and you can also find much information
(Dutch and English)
My Sanskrit name
Sarasvati Leela dasa (dasa: a devotee; Leela: games; Sarasvati: goddess of science, music, self-knowledge)
My Korean signature
(Last updated: 27 January 2014)