The Bristol Royal Infirmary Inquiry



Hearing summary

3rd November 1999

Today, analysts commissioned by the Inquiry and members of the Inquiry’s Expert Group, who have assisted with the evaluation, fed back their findings from the analysis of six relevant data sources.

Brian Langstaff QC, Counsel to the Inquiry, opened the hearing this morning by explaining that the statistical review, analysis and synthesis of six key data sources relevant to the work of the Inquiry is part of the exploratory phase of the Inquiry's strategy for using statistics to inform its investigation of children's heart surgery at Bristol. He highlighted the need to place figures in context and stressed that today’s evidence is one part in the jigsaw of the Inquiry’s investigation.

The following analysts and members of the Inquiry’s Expert Group presented their findings to the Panel during the course of the day’s hearings:

Professor Michael Campbell, Professor of Medical Statistics, University of Sheffield, explained how to convey complex statistical concepts and identified the techniques used during the analyses.

Professor Stephen Evans, Principal Consultant Statistician, Quintiles, reported on the analysis of local data relating to children who received cardiac surgery under the terms of reference of the Bristol Royal Infirmary Inquiry.

Dr Paul Aylin, Senior Clinical Lecturer, Imperial College School of Medicine, London, reported on the analysis of Hospital Episode Statistics.

Professor Gordon Murray, Professor of Medical Statistics, University of Edinburgh, reported on the analysis of the UK Cardiac Surgical Register and the South West Congenital Heart Register.

Dr David Spiegelhalter, Senior Scientist, MRC Biostatistics Unit, University of Cambridge, presented a synthesis of all the statistical sources concerning the nature of the outcomes of paediatric cardiac surgical services at Bristol relative to other specialist centres from 1984 to 1995.

Dr Eric Silove, Paediatric Cardiologist, Birmingham Children’s Hospital, and Mr Leslie Hamilton, Paediatric Cardiac Surgeon, Newcastle Upon Tyne Hospitals, both members of the Inquiry’s Expert Group, also attended the hearings to comment on the presentation of the reports.

FULL TRANSCRIPT

   1              Day 70, Wednesday, 3rd November 1999
   2   (10.00 am)
   3   THE CHAIRMAN: Good morning, everyone. Good morning,
   4     Mr Langstaff. An apology is due from me that we are
   5     starting somewhat later than we should. I accept
   6     responsibility for that. We had a number of things we
   7     had to deal with elsewhere. Forgive me. We are now
   8     ready to hear you, Mr Langstaff.
   9          INTRODUCTION TO TODAY'S EVIDENCE
  10   MR LANGSTAFF: Sir, thank you. Sir, today, as I have
  11     indicated on more than one occasion, we are renewing
  12     the acquaintance in evidence with statistics. Back in
  13     July we examined the way in which this Inquiry would
  14     deal with statistics and to an extent the part that
  15     statistics would play in the Inquiry, and outlined
  16     a strategy which involved an exploratory phase. This is
  17     today to be the report of that exploratory phase.
  18        At the outset the Inquiry outlined the process
  19     which statistical analysis would take. Figures can be
  20     seductive. It is tempting to regard the numbers of
  21     operations and the numbers of deaths recorded in any
  22     statistical analysis of Bristol as answering
  23     the question whether care was adequate or not. That
  24     would be entirely wrong.
  25        After all, if 40 years ago 8 out of 10 patients
0001
   1     suffering from a particular condition died, that might
   2     be regarded if the condition were particularly complex
   3     as not being a cause for investigation and
   4     recrimination, but rather a cause for saying, "In this
   5     operation for this condition in this hospital we managed
   6     to save 2 out of 10", and be a cause for celebration.
   7     Figures have to be placed in context.
   8        Although the evidence you will hear today will
   9     emphasise statistics and analyses of figures and
  10     databases, it is important from first to last that
  11     the context be remembered. The statistics are only part
  12     of the jigsaw. Figures alone can never say whether they
  13     represent an adequate state of affairs or not. They can
  14     only be part of the picture. They can suggest that an
  15     explanation is called for; they cannot themselves give
  16     the explanation.
  17        From first to last today it must be remembered
  18     that the Inquiry will approach its task set out in its
  19     terms of reference of assessing the adequacy of care in
  20     four principal ways. First is the evidence, written and
  21     oral. Part of the adequacy of care is, for instance,
  22     the way in which parents and children were treated as
  23     people. How can figures tell us that?
  24        Just as figures can tell us little about
  25     the approach of clinical staff to patients, because that
0002
   1     is a matter for qualitative assessment by the Panel, so
   2     too a qualitative assessment can be made on the basis of
   3     the evidence provided as to the level of care and its
   4     adequacy delivered at Bristol.
   5        Second, there is evidence given by experts and
   6     others as to the adoption of procedures and policies
   7     which in their expert view from long experience in
   8     the field might have made a difference. This too is
   9     part of the picture.
  10        Third, there is a mosaic of individual cases and
  11     individual stories which makes up the pattern of care.
  12     The Inquiry has already heard from many parents, each
  13     telling from their own individual perspective of
  14     the care given to their son or daughter. This is
  15     invaluable. It is part of the picture.
  16        In each of those three areas which I have already
  17     dealt with statistics play no part.
  18        Next the clinical case note review which we will
  19     look at in evidence tomorrow forms a basis for
  20     a representative snapshot of the care given to children
  21     at Bristol over the years with which the Inquiry is
  22     concerned. We shall hear more of that.
  23        Finally, and you will by now I hope appreciate
  24     that it is part and part only of the picture, part of
  25     the complex jigsaw that makes up the picture of adequacy
0003
   1     of care, there is that which the figures can tell us; if
   2     you excuse the expression, "a health warning".
   3        Figures in this area must be treated with
   4     caution. If I take a coin and toss it once, there is
   5     a 50 per cent chance that it will come down heads and a
   6     50 per cent chance that it will come down tails. If
   7     I toss it twice therefore it should come down heads on
   8     one occasion and tails on the other. If it however came
   9     down twice as heads, one could not safely conclude that
  10     I was using a double-headed coin. That is life.
  11        The true chance of it coming down heads or tails
  12     would still be 50 per cent and it would remain 50
  13     per cent on the next occasion, no matter what the run of
  14     heads or the run of tails had been. If I take the coin
  15     and toss it ten times, then on average, but only on
  16     average, you might expect it to come down heads five
  17     times and tails five times. But, if it came down eight
  18     times one way and twice the other, you would not
  19     immediately say there is something wrong with the coin.
  20        Everyone has in daily life experience of tossing
  21     a coin. If, however, all the evidence that we had to go
  22     on to compare two coins was that one when tossed ten
  23     times came down heads on eight occasions and the other
  24     when tossed ten times came down heads up on four
  25     occasions, if that was all we had to go on, it might be
0004
   1     tempting to think there was some real difference between
   2     the coins that made one more likely to come down heads.
   3        Yet the truth, as we know from everyday
   4     experience, will be that there was actually no
   5     difference at all between them, despite the twofold
   6     difference on the figures.
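Illustrative note, not part of the transcript: the coin example above can be checked with a short calculation. The sketch below assumes a fair coin and independent tosses, and uses the exact binomial formula; the function names are invented for this illustration.

```python
from math import comb


def prob_exactly(k: int, n: int = 10, p: float = 0.5) -> float:
    """Probability of exactly k heads in n independent tosses, P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k: int, n: int = 10, p: float = 0.5) -> float:
    """Probability of k or more heads in n independent tosses."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


# A fair coin showing eight or more heads in ten tosses.
p_eight_or_more = prob_at_least(8)

# A fair coin splitting at least as unevenly as 8-2, in either direction.
# By symmetry, "two or fewer heads" is as likely as "eight or more heads".
p_lopsided_either_way = 2 * p_eight_or_more

# A fair coin showing four or fewer heads in ten tosses.
p_four_or_fewer = 1 - prob_at_least(5)

print(f"8+ heads in 10 tosses:        {p_eight_or_more:.3f}")
print(f"8-2 split or worse, any side: {p_lopsided_either_way:.3f}")
print(f"4 or fewer heads in 10:       {p_four_or_fewer:.3f}")
```

Under these assumptions a perfectly fair coin gives an 8-2 split or worse roughly once in every nine runs of ten tosses, which is why such a result alone does not show that the coin is unusual.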
   7        I should say that I am grateful to our experts for
   8     giving me this everyday simple example in order to
   9     make two serious points about figures. First of all
  10     there is an inevitable variability about figures which
  11     can be very misleading. Secondly, any picture which is
  12     painted by figures must necessarily be general.
  13        We in this Inquiry must never forget when
  14     a general picture is painted, and perhaps particularly
  15     when it is painted by statistics, that at the centre of
  16     the figures are cases of individual children, some of
  17     whom will have been treated and will thankfully have
  18     survived, some of whom will have been treated and sadly
  19     will either have suffered subsequently in consequence of
  20     the treatment or of the condition which led to the
  21     treatment and others who will have lost their lives,
  22     whether because of the underlying condition for which
  23     they suffered or because the hospital failed to care for
  24     them as it might have done.
  25        Statistics are only part of the picture and any
0005
   1     figures are variable. It is false to treat a particular
   2     figure as being any more than within a range of likely
   3     figures. They can distract attention from
   4     the individual case which matters.
   5        But perhaps most important of all is the message
   6     which our statisticians have given us repeatedly:
   7     statistics on their own cannot show by demonstrating let
   8     us suppose low mortality rates at Bristol that surgical
   9     competence or any other factor was the cause of that
  10     success. If it shows the opposite, it cannot show what
  11     the cause was of the high mortality rate. What
  12     the statistics can do is show that there was
  13     a difference. They cannot on their own show what was
  14     the reason for that difference.
  15        We will explore in evidence today six data sets.
  16     The evidence you will hear will first of all be that of
  17     Professor Campbell, who will explain far better than
  18     I can possibly hope to do and much more authoritatively
  19     some of the statistical concepts which underlie
  20     the examination which others have conducted.
  21        Secondly, we will hear from Professor Stephen
  22     Evans as to his analysis of three local data sets:
  23     the patient administration system in Bristol;
  24     the Inquiry's own examination of the clinical case
  25     records and what they have shown -- bear in mind that
0006
   1     some 2,000 cases have been looked at, coded and
   2     analysed, and his is that analysis -- and the surgeons'
   3     logs from Mr Wisheart and Mr Dhasmana.
   4        We shall then hear from Dr Aylin as to
   5     the hospital episode statistics, a national data set.
   6     That will be related by him and by Professor Gordon
   7     Murray to the Cardiothoracic Surgical Register of
   8     the United Kingdom. Both those data sets provide
   9     something of a national picture. Professor Murray will
  10     also compare the Cardiothoracic Register with a register
  11     kept locally by cardiologists, the South Western
  12     Congenital Heart Register.
  13        Each of them in turn will present their findings
  14     to you, each of them will be asked some questions at
  15     the end of their presentation, and each of them expects
  16     to show you various slides to demonstrate the points
  17     that they have to make.
  18        At the end of the day we shall have a synthesis of
  19     those various different data sources prepared for us by
  20     Dr David Spiegelhalter. He will indicate what in his
  21     view, and I expect the other statistical analysts to
  22     contribute to this in discussion, is the way forward and
  23     what the Inquiry needs to do next in order to examine
  24     what it is that the statistics are actually telling us
  25     so far as they can.
0007
   1        One of the difficulties with having six data sets
   2     is that there are inevitably different ways in which
   3     each of the groups that compile those data sets have
   4     gone about their task. If you take six people looking
   5     at the same data but separately, they will approach it
   6     in different ways. They will group, for instance,
   7     certain ages, they will take certain age ranges, they
   8     will look at certain operations in a certain way and
   9     immediately you have a problem of comparing one data set
  10     with another because the classification, the approach of
  11     each to the underlying data, is different.
  12        So if any message is to be learned from
  13     the available data, it is necessary that they look at
  14     that data in the same way if they can. The problems
  15     that occurred to the Inquiry in, as it were, getting
  16     the data sets to speak the same language so that they
  17     could usefully be compared -- and comparison is
  18     important if one is to gain credibility from the data,
  19     given the particular problems that a number of the data
  20     sets had that were identified to us back in July --
  21     the essential problems were those of definition: how
  22     does one define the operative procedure that is used?
  23     Each operation is, as Professor Anderson indicated to
  24     us, to an extent unique because the anatomy of each
  25     heart will be different from the anatomy of each other
0008
   1     heart. So although one may say, "This is a truncus,
   2     this is a case of transposition of the great arteries",
   3     it is never quite so simple.
   4        One of the problems therefore is comparing an
   5     analysis, for instance, of the PAS system locally which
   6     was made by administrative clerks with no particular
   7     clinical expertise looking at the discharge letters and
   8     working out from those what the nature of the operation
   9     was and under which heading which group it should be
  10     placed, comparing that with the surgeons' logs, where
  11     the surgeons have their own view as to how the operation
  12     should be classified and, when one comes to look at
  13     the whole position nationally, assessing how on earth
  14     one compares Bristol with the rest of the country
  15     without knowing for certain that people in one centre
  16     compared to people in another have adopted exactly
  17     the same pattern and the same degree of approach,
  18     the same approach, to similar data.
  19        What the Inquiry decided to do in order to allow
  20     a synthesis was to decide that it was necessary first of
  21     all to class procedures as either open or closed, and to
  22     class them as open if the procedure was one which was
  23     performed on cardiopulmonary bypass.
  24        Secondly, it was necessary to group the procedures
  25     in a valid way, valid both clinically and
0009
   1     statistically. Grouping is important for statistical
   2     purposes. If you go back to my example of each heart
   3     being unique, it would be unhelpful in any form of
   4     analysis or comparison to say, "The most one has in any
   5     particular data set is one case, because every case is
   6     unique." Although it is right, there has to be a measure
   7     of similarity -- only a measure, not complete
   8     similarity -- to enable one to have a sufficient number
   9     to conduct an analysis.
  10        You will hear in a moment or two that the Inquiry
  11     decided, on having taken advice -- and it has to be said
  12     that the advice was by no means entirely consistent,
  13     entirely with one voice -- to rank the procedures under
  14     13 headings; classify them under 13 headings; but,
  15     having classified them, to rank them. Let me
  16     demonstrate what I mean by showing you the first slide
  17     of the day, INQ 13/54. What you have here is a list of
  18     the groupings which the Inquiry has adopted for
  19     the purpose of the various analyses. You will see that
  20     each of the analyses approaches the data sets in
  21     the same way.
  22        There are 13 groups, "G" for group, and then
  23     the number. The second box, OPCS 4, procedure code,
  24     a word of explanation: throughout much of the period
  25     with which we have been concerned the large national
0010
   1     data set hospital episode statistics has described
   2     operations by a mixture of letters and numbers, an
   3     "operation code". The relevant codes are K and L
   4     covering cardiac and vascular conditions.
   5        In the list of codes a separate number is given
   6     for the description of different procedures. Bear in
   7     mind these are procedures and not diagnoses. That is
   8     a distinction which again has to be borne in mind in
   9     comparing data sets, some of which look at diagnoses and
  10     some of which look at procedures.
  11        What is under OPCS 4 procedure codes you can see,
  12     and the detail for the moment does not matter, is
  13     attributed to one or other group and then a description
  14     is given. G2 and G3 demonstrate, as you will hear in
  15     evidence, some of the difficulties of allocating
  16     a particular operation to a particular group. They
  17     demonstrate one of the problems with data sets, because
  18     of necessity once there has been a description in order
  19     to understand data over a period of time the same
  20     description has to be applied to the same procedure
  21     throughout history.
  22        The problem is that life is not like that.
  23     Medicine moves on. Procedures change. They develop.
  24     For instance, when the data sets reported upon
  25     the switch operation in the early 1980s, they would have
0011
   1     been reporting much more on what we know as Mustard and
   2     Senning operations, interatrial switches for the same
   3     condition than the data set of the 1990s which will be
   4     reporting much more on arterial switch operations about
   5     which we have already heard in evidence.
   6        You will see when the data is examined that it
   7     appears that there is quite a grey area around
   8     the margins as to which operation, which procedure,
   9     should in the data sets that we will look at be put
  10     under which heading. To that extent the data has to be,
  11     it is, uncertain. To that extent, any conclusions that
  12     might be drawn in looking simply at group 2 in isolation
  13     or group 3 in isolation would have to be very cautious
  14     indeed.
  15        It may be that sense can be made by grouping both
  16     or looking at group 2 and group 3 together. But bear in
  17     mind the purpose of dividing it up into 13 groups was to
  18     look at each of the 13 groups individually at
  19     the outset.
  20        The fourth heading across the page, "Primary
  21     procedure ranking", has a number of numbers attributed
  22     to the various operations. The problem here is
  23     essentially a practical one. When the clerk,
  24     the administrative clerk or the person compiling
  25     the data set, comes to record his particular operation
0012
   1     as one or the other, and he looks at the list
   2     of procedures which may have been followed, if he has
   3     one code and one code alone to use, which does he
   4     choose? If you might, for instance, have something
   5     which will be classed as a Fontan type operation, G9,
   6     and if in association with that there needed to be
   7     a closure of an atrial septal defect, which of those two
   8     procedures would be coded?
   9        Here the Inquiry had to take clinical advice from,
  10     it has to be said, the bulk of our expert team of
  11     cardiologists, cardiac surgeons and others who were
  12     identified in a note giving full details of this
  13     procedure published today, had to identify which should
  14     take precedence. That is the ranking procedure that is
  15     indicated by the list which is third from the right on
  16     the chart as you look at it.
  17        So that, if there was a truncus arteriosus, that
  18     would in precedence to any other condition identified be
  19     the grouping that was chosen. You will see that
  20     therefore, if one comes down to the last of the open
  21     operations, there are 13 groupings in total, two closed,
  22     11 open; 11, group 6, closure of secundum and sinus
  23     venosus atrial septal defects. But that is not going to
  24     be coded, except where it occurs on its own as an open
  25     condition.
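Illustrative note, not part of the transcript: the ranking rule described above (when an operation involves several procedures but only one may be coded, the one with the highest precedence is chosen) can be sketched as follows. Only two facts are taken from the hearing: truncus arteriosus takes precedence over everything else, and group 6 (closure of secundum and sinus venosus atrial septal defects) is ranked 11, last of the open groups. The dictionary keys and the rank given to the Fontan-type group are hypothetical.

```python
from typing import Iterable

# Precedence ranks: 1 is the highest precedence.
PRECEDENCE = {
    "truncus_arteriosus": 1,  # from the hearing: outranks any other condition
    "G9_fontan_type": 5,      # hypothetical rank, chosen only for illustration
    "G6_asd_closure": 11,     # from the hearing: last of the open groups
}


def primary_group(recorded_groups: Iterable[str]) -> str:
    """Return the single group to code: the one with the highest precedence."""
    return min(recorded_groups, key=lambda group: PRECEDENCE[group])


# A Fontan-type operation performed together with closure of an atrial septal
# defect is coded as the Fontan-type group, because group 6 ranks last.
print(primary_group(["G6_asd_closure", "G9_fontan_type"]))

# Group 6 is coded only when it occurs on its own.
print(primary_group(["G6_asd_closure"]))
```

Because group 6 ranks last among the open groups, the outcome of the first example does not depend on the hypothetical rank assigned to the Fontan-type group.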
0013
   1        Of course one of the further problems that anyone
   2     may have in trying to group operations under one of
   3     these headings, the primary procedure groupings, is that
   4     it may just not fit. Take a transplant -- not in fact
   5     performed in Bristol, but bear in mind that there will
   6     be comparisons with other centres where transplants may
   7     have been performed. There are three in the country.
   8     There is no heading under which something such as
   9     a transplant can be placed. As a result, there has to
  10     be a residual coding, and there was indeed in
  11     the national data sets a residual grouping for
  12     procedures which could not be coded under one of
  13     the more precise letters and numbers used by OPCS.
  14        So for our purposes the statistical analysts have
  15     not only looked at 13 procedure groups, but also looked
  16     at open operations, all open operations, whether the 13
  17     or more, 14, 15, 16, 20, whatever it might be, and
  18     compared those with open operations elsewhere, and all
  19     closed operations with all closed operations elsewhere.
  20        When comparing one centre with another there can
  21     of course be a comparison made across each of these
  22     individual groupings, each of the 13, or indeed across
  23     the general open and general closed categories. But
  24     please remember that the boundaries may not always be
  25     clear between one group and another, and there may be
0014
   1     some groups in some centres which are not shown here.
   2        One of the reasons why these particular groupings
   3     were chosen is explained by the far right-hand column on
   4     the chart, "Map to UKCSR". UKCSR stands for United
   5     Kingdom Cardiac Surgical Register. As we already heard
   6     in the Inquiry, this register was throughout the period
   7     that the Inquiry is concerned with, except for a few
   8     months in the middle 1990s, collecting data sent into it
   9     by centres and allowing any one centre to compare its
  10     performance against the national performance of the year
  11     or two before.
  12        The precise data which gave rise to those reports
  13     is of course not now readily available. What we have is
  14     simply the groupings which the Cardiothoracic Register
  15     had. It is a valuable data source, particularly since
  16     it was reported to the Cardiothoracic Register by or on
  17     behalf of the clinicians most closely connected with
  18     the care of the particular child at the time.
  19        In order to make use of that data set there has to
  20     be a means of bringing, if you like, the other data sets
  21     into line with it. This, in my simple understanding, is
  22     what is meant by "mapping", trying to fit the procedures
  23     identified in the other data sets to the diagnoses with
  24     which the Cardiothoracic Surgical Register deals. That
  25     is some explanation as to how one can take the groupings
0015
   1     which you see here and make sense of them in
   2     a comparison with the Cardiothoracic Surgical Register.
   3        The statistical evidence that you will hear was
   4     essentially of two sorts. One will establish Bristol's
   5     performance as best the data can tell us. Secondly, it
   6     will compare that performance with elsewhere.
   7        In looking at the local data, there is some
   8     difference of numbers; bear in mind that the PAS,
   9     patient administration system, is coded by
  10     administrative clerks who are not medically qualified,
  11     the surgeons' logs are not formally validated, but
  12     the data may be in small groups and therefore subject to
  13     large statistical variation -- go back to my example of
  14     tossing the coin -- and much of the data was not
  15     collected for the purpose of assessing the adequacy of
  16     care. That is the local figures.
  17        In comparing data with elsewhere it needs to be
  18     borne in mind that the data may not have been validated
  19     and may therefore be inaccurate. Where a comparison is
  20     based on data from the Cardiac Surgical Register for
  21     the period from 1985 to 1991, there is no cross-check
  22     with the national HES, hospital episode statistics,
  23     system.
  24        Professor Murray will tell you that it is clear
  25     that the primary data quality of the UKCSR is poor, and
0016
   1     accordingly the ordinary degree of variability,
   2     the range within which you might expect figures to fall,
   3     is all the greater. Professor Murray will emphasise
   4     that he had been unable to visit cardiac units elsewhere
   5     in the country to assess the quality of their primary
   6     data, and further work to assess that quality will be
   7     needed before one could assess the weight which should
   8     be placed on the results of comparative analyses.
   9        When comparing Bristol with other centres
  10     a twofold comparison is adopted. A word of explanation
  11     of this before the evidence is called. First, there is
  12     a comparison of Bristol children with all children
  13     elsewhere. This, you will hear, compares the rates
  14     of mortality which appear on the face of the data with
  15     those collected from the whole of the rest of
  16     the country grouped together.
  17        Secondly, Bristol as a centre will be compared
  18     with other centres. For this purpose the average of
  19     the performance by centre was determined. It is obvious
  20     that no centre will be exactly average. Some will, by
  21     reason of simple chance variability, in any one year
  22     have more or less deaths in any given operation than any
  23     other centre. The greater the number of operations and
  24     the periods considered, the greater the chance that
  25     the centre will both appear to have above average number
0017
   1     of deaths in some procedures and below average in
   2     others.
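Illustrative note, not part of the transcript: the point that more comparisons make a mixed picture more likely can be put in rough numbers. The sketch below makes a deliberately crude assumption that is not taken from the evidence: a centre whose true performance is exactly average is treated as equally likely, by chance alone, to appear above or below average in each comparison, independently of the others.

```python
def prob_mixed_picture(comparisons: int) -> float:
    """Chance that an average centre appears above average in at least one
    comparison and below average in at least one other, assuming each
    comparison is an independent 50/50 outcome (an illustrative assumption)."""
    if comparisons < 1:
        raise ValueError("at least one comparison is required")
    # The only non-mixed outcomes are "all above" and "all below".
    return 1 - 2 * 0.5**comparisons


for k in (1, 2, 4, 12):
    print(f"{k:2d} comparisons: {prob_mixed_picture(k):.4f}")
```

With a single comparison a mixed picture is impossible; with a dozen it is all but certain under this assumption.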
   3        It may be distressing to many parents to hear
   4     the terms in which our statistical analysts will express
   5     this. They will talk of excess deaths. It is essential
   6     to realise that this is simply a way of expressing an
   7     apparent difference, and it is important to remember
   8     from everything that I have already said that
   9     the difference may not be real and that some difference
  10     is inevitable. It will be most unfortunate if a simple
  11     and ordinary way in which statisticians express
  12     themselves may seem to be callous, unfeeling and
  13     upsetting.
  14        To say, for instance, that in any particular
  15     centre there have been, say, six excess deaths is not
  16     passing a judgment as one would in a court of law
  17     assessing whether there had been clinical incompetence
  18     or not. If this is not understood, there is a very
  19     great risk that someone might ask, "Which were the six
  20     deaths?" or even worse, "Was it my child?" To ask those
  21     questions will be to misunderstand the purpose of
  22     the statisticians expressing themselves as they do.
  23     They are painting a picture with a variability, with
  24     data which may not be entirely reliable in order to
  25     provide a generalised comparison.
0018
   1        If the point needed to be made, you will see that
   2     when they present their papers quite often
   3     the excess deaths in one or other centre are presented
   4     with a decimal point. To say that there were six excess
   5     deaths in this or that centre might understandably give
   6     rise to the question, "Which were they?" But if the
   7     expression used is, for instance, 6.9, then I hope it
   8     will be clear to all that this is one way of expressing
   9     a difference. It is not an attempt to draw conclusions
  10     in any individual case.
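Illustrative note, not part of the transcript: the decimal point arises naturally if "excess deaths" is read in the usual statistical sense, as the observed number of deaths minus the number that would be expected if a reference mortality rate applied. The sketch below assumes that definition; the figures in it are invented for illustration and are not taken from the Inquiry's data.

```python
def excess_deaths(observed_deaths: int, operations: int, reference_rate: float) -> float:
    """Observed deaths minus the deaths expected at the reference mortality rate."""
    expected_deaths = operations * reference_rate
    return observed_deaths - expected_deaths


# Invented example: 100 operations, 15 deaths observed, and a reference
# mortality rate of 8.1 per cent, giving 8.1 expected deaths.
excess = excess_deaths(observed_deaths=15, operations=100, reference_rate=0.081)
print(f"Excess deaths: {excess:.1f}")
```

The expected figure is an average over many hypothetical repetitions and is seldom a whole number, so the excess is a summary of a difference between rates and cannot be matched to particular children.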
  11        A word about the groupings which make it almost
  12     inevitable that there is likely to be an expression of
  13     excess deaths used in the statistical sense that I have
  14     mentioned in almost any centre. The data have been
  15     grouped in epochs, taking us from January 1984 to
  16     December 1987, epoch number 1; January 1988 to December
  17     1990, number 2; January 1991 to March 1995, number 3;
  18     and April 1995 to December 1995, number 4.
  19        Bear in mind, please, when we come to epoch number
  20     4 that the nature of the operations performed may not
  21     have been of the same type of case, say, mixture as in
  22     the preceding two or three epochs. The data have also
  23     been grouped by age. The ages chosen for the purposes
  24     of analyses are nought to 90 days, 90 days to 365 days,
  25     and 1 year to 15 years.
0019
   1        Again, you come back to points of difference
   2     between the data sets. The Cardiothoracic Register, for
   3     instance, has no cut off in age. There is no way of
   4     knowing simply from the figures presented whether it is
   5     looking at any particular number of children over
   6     the age of 16.
   7        There are therefore in total four epochs, three
   8     age groups, that itself will give you 12 comparisons,
   9     and we have 13 procedure groupings plus the open/closed
  10     difference.
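Illustrative note, not part of the transcript: the grouping into four epochs and three age groups can be sketched as below. The epoch dates are those given in the hearing. The treatment of the boundaries is an assumption (a child aged exactly 90 days is placed in the youngest group, and the oldest group is cut off at roughly the sixteenth birthday), and the function names are invented for this illustration.

```python
from datetime import date
from itertools import product

# Epochs as described in the hearing: (number, first day, last day).
EPOCHS = [
    (1, date(1984, 1, 1), date(1987, 12, 31)),
    (2, date(1988, 1, 1), date(1990, 12, 31)),
    (3, date(1991, 1, 1), date(1995, 3, 31)),
    (4, date(1995, 4, 1), date(1995, 12, 31)),
]

AGE_BANDS = ["0-90 days", "90-365 days", "1-15 years"]


def epoch_of(operation_date: date):
    """Return the epoch number for a date, or None if outside 1984 to 1995."""
    for number, first_day, last_day in EPOCHS:
        if first_day <= operation_date <= last_day:
            return number
    return None


def age_band(age_in_days: int):
    """Return the age band, or None if outside the bands (boundaries assumed)."""
    if age_in_days < 0:
        return None
    if age_in_days <= 90:
        return AGE_BANDS[0]
    if age_in_days <= 365:
        return AGE_BANDS[1]
    if age_in_days < 16 * 365:  # rough upper limit, ignoring leap days
        return AGE_BANDS[2]
    return None


# Every epoch combined with every age band: the 12 comparisons mentioned.
strata = list(product([number for number, _, _ in EPOCHS], AGE_BANDS))
print(len(strata))
print(epoch_of(date(1995, 3, 31)), age_band(30))
```

Each of the 12 strata can then be examined for each procedure grouping and for the open and closed categories, which is why the number of separate comparisons grows so quickly.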
  11        You will hear that there was no significant
  12     difference between Bristol and children elsewhere in
  13     England, and indeed between Bristol as a centre and
  14     centres elsewhere in England so far as closed procedures
  15     are concerned.
  16        So far as open procedures are concerned, you will
  17     hear a considerable amount of evidence. The overall
  18     picture is perhaps best summarised by showing you my
  19     second slide taken from the reports, INQ 15-4. Can we
  20     scroll up a bit, thank you. This comes from David
  21     Spiegelhalter's paper. It shows that over the period
  22     1988 to March 1995 there appears to be approximately two
  23     times the rate of death in open operations across all
  24     age groups combined. You will see immediately that that
  25     is expressed in the third column as excess deaths. You
0020
   1     will notice, to emphasise the point I have made that
   2     this is simply a way of demonstrating a difference, that
   3     the excess deaths recorded have decimal points to them.
   4        You will see from Dr Aylin's evidence, when he
   5     gives it, that the mortality is highest in the youngest
   6     age group, nought to 90 days. Bear in mind that, since
   7     there is also evidence that Bristol operated on a lower
   8     percentage of young babies than other centres did and on
   9     a greater proportion of Down's Syndrome children than
  10     other centres did, these factors may mean that there has
  11     to be further examination of the difference. It would
  12     not be right without further careful consideration to
  13     assume that the fact of difference implied a reason for
  14     the difference.
  15        Dr Aylin will present a comparison of the apparent
  16     mortality rate of Bristol compared to the 11 other
  17     principal centres in England. Bristol is significantly
  18     different. It is an outlier compared with all other
  19     centres, save one which is an outlier in the 1 to 15
  20     year age group. That other centre is known as centre
  21     number 10 in the analyses you will see.
  22        When our statistical analysts did the work, they
  23     did not know which centre it was. However, this Inquiry
  24     has made a commitment to openness. Since it may well be
  25     there are problems and difficulties with the figures,
0021
   1     even beyond those that have so far been identified, and
   2     they need to be stressed, it is important that anyone in
   3     a position to contribute should know the identity of all
   4     the other centres referred to by number in the figures
   5     and diagrams that you will see.
   6        Accordingly let me give you alphabetically
   7     the centres and their identifying numbers. Birmingham
   8     Children's Hospital will be known in the figures that
   9     you will see as centre 11; Freeman Hospital, which is
  10     now part of the Newcastle upon Tyne Hospitals NHS Trust
  11     as centre number 9; Great Ormond Street Hospital, which
  12     is now part of the Great Ormond Street Hospital for
  13     Children NHS Trust, as centre number 8; Guy's Hospital,
  14     now part of the Guy's and St Thomas' Hospital NHS Trust,
  15     number 5. I should emphasise that these numbers are
  16     arbitrary numbers given by the Inquiry to these
  17     centres. They do not imply any form of ranking
  18     whatsoever. It is not a question of number 1 being
  19     best, number 11 being worst or anything of that sort.
  20        Harefield Hospital, now part of the Royal Brompton
  21     and Harefield NHS Trust, is centre number 10;
  22     Killingbeck Hospital, Leeds, part of the Leeds Teaching
  23     Hospitals NHS Trust, is centre number 3; the Royal
  24     Brompton SHA, special health authority, now part of
  25     the Royal Brompton and Harefield NHS Trust, is centre
0022
   1     number 12; the Royal Liverpool Children's Hospital, part
   2     of the Alder Hey Children's Hospital and Royal Liverpool
   3     Children's NHS Trust, as centre number 6; Southampton
   4     University Trust, Southampton University Hospitals NHS
   5     Trust, centre number 7; Glenfield Hospital, Leicester,
   6     not a designated centre, but it did a significant number
   7     of operations, now part of the Glenfield Hospital NHS
   8     Trust, centre number 2; the Radcliffe Infirmary at
   9     Oxford, the Oxford Radcliffe Hospital NHS Trust, centre
  10     number 4; the United Bristol Healthcare NHS Trust,
  11     Bristol Royal Infirmary, as you might expect in this
  12     Inquiry, centre number 1.
  13        You will see that centre number 10 is now part of
  14     the Royal Brompton and Harefield NHS Trust. It was
  15     Harefield Hospital. A word of caution: it would be
  16     unduly alarmist to conclude that Harefield was a poor
  17     hospital or to suggest that this Inquiry had uncovered
  18     something that had not been appreciated to some extent
  19     before.
  20        Moreover, in drawing comparisons between Bristol
  21     and centre 10, it is known that centre 10 performed a
  22     much higher proportion of its procedures on children
  23     over one year of age, a much higher proportion of its
  24     work was outside the 13 procedure groups that I have
  25     mentioned and fell into the other category, and both of
0023
   1     those features are potentially associated with higher
   2     risks and hence with poorer outcomes.
   3        This makes a point which needs to be emphasised
   4     throughout. The purpose of statistics at their best is
   5     to compare like with like. The statistical analysts
   6     have in the case of Bristol done what they can to deal
   7     with the question of case mix. There is not much
   8     information to be given. Because the subject of our
   9     Inquiry is not centrally centre 10 or any other centre,
  10     it would be wrong to assume without Inquiry that
  11     the case mix was the same.
  12        Let me give an example. If a particular hospital
  13     had -- this is purely a hypothetical example, I hasten
  14     to add -- a policy of treating any case, however poor
  15     the outcome was likely to be, even though it might be as
  16     low as 1 per cent, the mortality figures produced by
  17     that hospital would be very different from the other
  18     hospitals faced with the same type of patient which
  19     declined to operate, perhaps on perfectly good clinical
  20     grounds, upon the same group of patients.
  21        The case mix of the patients would be different.
  22     The case mix would in such an example be an obvious
  23     possible explanation for any statistical difference
  24     the figures showed.
  25        I do not suggest that this is essentially
0024
   1     the difference between Harefield and Bristol. I am not
   2     in a position to make a suggestion one way or
   3     the other. But it must be borne in mind that there may
   4     well prove to be a difference which is of the greatest
   5     significance in interpreting these figures.
   6        I should emphasise that the statistical analysts
   7     have not been able to visit the other cardiac units to
   8     assess their primary data quality. Further work to
   9     assess that quality would be required before one could
  10     assess the weight which could be placed upon it.
  11        Moreover, as you will come to see from
  12     the figures, centre 10 is not the only centre which has
  13     some outlying results. Accordingly, the figures should
  14     be treated with some caution. Anyone listening must
  15     remember that the figures at Bristol have been examined
  16     across a range of data sets and in much greater detail
  17     than has been possible with other centres. That said,
  18     if there is to be any explanation as to the reasons why
  19     Harefield appears to be an outlier as well as Bristol,
  20     that is a matter for Harefield to explain.
  21        The Department of Health knows today this Inquiry
  22     will reveal the performance of the various centres
  23     relative to each other. It is likely to be a matter of
  24     interest in the locality in which each centre is placed
  25     as to how that centre has performed. I would hope that
0025
   1     any comment which is made, perhaps in the press, perhaps
   2     by others who read this on the Internet, must be subject
   3     to the caveats which I have already entered and which
   4     will be entered in the course of the evidence today both
   5     orally and in the written papers submitted.
   6        The Royal Brompton and Harefield Trust has been
   7     told of the figures, as have the other centres.
   8     I should emphasise as one would hope to be the case that
   9     it is open and forthcoming about its figures.
  10        With that introduction, which I hope is not so
  11     obvious as to be simplistic, I shall call the evidence
  12     before you to put further detail on that which I have
  13     given to you in outline.
  14        Sir, the way in which we propose to arrange
  15     matters is that we shall begin with Professor Campbell,
  16     if he would come up to the desk, and for the purpose of
  17     enabling discussion if necessary between all
  18     statisticians, we shall arrange two chairs at the front
  19     and he will be joined by Professor Evans, whose
  20     presentation will be next.
  21        If the expert table to my right could then be
  22     filled, please, by Dr Aylin, Dr Spiegelhalter and
  23     Professor Murray. Sir, it would be convenient I think
  24     to swear each in turn as they come to give their
  25     presentation rather than all now. If therefore we may
0026
   1     begin, please, with Professor Campbell.
   2   THE CHAIRMAN: I think that is right. Thank you for that.
   3     I was just making sure everyone was comfortable and able
   4     to see. Thank you.
   5        PROFESSOR MICHAEL JOSEPH CAMPBELL (SWORN):
   6            Examined by MR LANGSTAFF:
   7   Q. Professor Campbell, your full name, please?
   8   A. Michael Joseph Campbell.
   9   Q. Can you tell us a bit about yourself and why you are an
  10     expert, briefly?
  11   A. I am Professor of Medical Statistics in the University
  12     of Sheffield. Previously I used to work in
  13     the University of Southampton. I have no idea why I am
  14     an expert, except I have written two books. One is
  15     called "Medical Statistics - a Commonsense Approach" and
  16     one is called "Statistics from Square One".
  17   Q. What I am going to ask you to do, you have read through
  18     the statistical evidence which is later to be presented
  19     to us, I think, and you have formed a view of those
  20     matters which may need to be explained first of all in
  21     greater detail to what one might call the wider audience
  22     and any explanation that may be needed for a more
  23     scientific and technical audience. You are in
  24     a position to do that for us, are you?
  25   A. I am.
0027
   1   Q. What role, as you see it, does statistics play in this
   2     Inquiry?
   3   A. I think statistics has a number of roles. I think
   4     the first and probably most important one is to
   5     ascertain the data validity and to make sure that
   6     the data that we are actually examining is
   7     representative of what really actually happened in
   8     the various centres.
   9        Data validity and design of studies is central in
  10     statistics to make sure that we are actually basing our
  11     conclusions on solid evidence. Having collected
  12     the data, the next question that statisticians have to
  13     ask is how these differences could have arisen and
  14     whether the differences are purely due to chance or
  15     whether there is some inherent difference between
  16     different centres that cannot be attributed to chance,
  17     and that is where the statistical analysis comes in.
  18        Finally, I think quite importantly, statistics
  19     need to be able to present these results in a way that
  20     is easily understood by the general public. I think
  21     data presentation is also very important.
  22   Q. So collection, analysis and presentation?
  23   A. Correct.
  24   Q. So far as the analysis is concerned, to what extent in
  25     that analysis does the statistician try to demonstrate
0028
   1     whether the results are beyond what one would expect as
   2     a matter of chance or not?
   3   A. Well, statisticians generally try and set up a null
   4     hypothesis which they then try and ascertain whether
   5     the data are congruent with this particular hypothesis.
   6     The example that you gave earlier on tossing a coin, for
   7     example, you toss the coin a number of times. Your
   8     null hypothesis might be that the coin is unbiased, and
   9     this would be your normal null hypothesis, that
  10     the probability of a head is a half. Then you collect
  11     your data and then you decide, "I assume that the coin
  12     is unbiased. What is the probability of getting
  13     the observed number of heads that I have seen?" This is
  14     where assessing whether the probability is due to chance
  15     or not.
  16   Q. By unbiased we mean that the coin is unweighted or it is
  17     not a double headed coin; there is nothing odd about it?
  18   A. Correct.
  19   Q. The word "bias" is something we may come across. Would
  20     you like to say a couple of words about how
  21     statisticians read the word?
  22   A. There are a large number of ways that you can interpret
  23     bias. I think one of the most important ones in
  24     the Bristol Health Inquiry is whether the way that
  25     the data were presented in Bristol is different from
0029
   1     the way it was presented or collected in other areas.
   2     So, for example, were Bristol more or less scrupulous in
   3     reporting their mortality data and were they more or
   4     less scrupulous in collecting all the babies that came
   5     to the hospital? I think that is where you can get
   6     bias, is where you have differential reporting of
   7     results from different centres.
   8   Q. Suppose, for example, that Bristol were scrupulous about
   9     collecting data on deaths and other centres were not.
  10     What effect would that have upon the apparent
  11     performance of Bristol compared to these other centres?
  12   A. Clearly if Bristol was scrupulous and reported all their
  13     deaths and the other centres did not report all their
  14     deaths, then in fact it would make Bristol appear worse
  15     than the other centres.
  16   Q. And is there, as it happens, some suggestion in what you
  17     have seen of the statistics that Bristol were
  18     scrupulous?
  19   A. There is some suggestion, but it seems difficult to
  20     understand how -- very difficult to understand how there
  21     could be loss from the other centres to explain
  22     the differences that we have actually observed.
  23   Q. We will come back to that in a little while, if we may.
  24     You mentioned that the statistician begins by taking a
  25     null hypothesis, and you demonstrated that in terms of
0030
   1     the coin and say, "What we want to assume is that
   2     the coin is a perfectly ordinary coin with heads on one
   3     side and tails on the other." What conclusion would you
   4     be likely to draw if, let us say, you tossed the coin
   5     100 times and on 98 occasions out of the 100 it came
   6     down heads?
   7   A. You could work out the probability of observing 98 heads
   8     or 99 heads or 100 heads assuming that the probability
   9     of a head was still a half, and you would find that
  10     this probability was actually very small and therefore
  11     you might conclude that the coin was in fact not an even
  12     balanced coin, that in fact it was biased in some way.
  13        So you make up the hypothesis, which is that
  14     the coin is unbiased, and then you look at the data and
  15     then you reject the hypothesis. That is essentially
  16     the statistical method which is very comparable to
  17     the scientific method where you make predictions from a
  18     theory, you then observe the data and then, if your data
  19     and the theory do not correspond, you reject your
  20     theory.
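
Illustrative sketch (not part of the evidence): the coin example above can be worked through directly. The code below assumes a fair coin as the null hypothesis and adds up the binomial probabilities of 98, 99 and 100 heads in 100 tosses. The function name `upper_tail` is chosen here for illustration only.

```python
from math import comb


def upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Probability of k or more heads in n tosses when P(head) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Null hypothesis: the coin is unbiased, so p = 0.5.
p_value = upper_tail(100, 98)
print(f"P(98 or more heads in 100 tosses | fair coin) = {p_value:.3g}")
```

The result is so small that, under the reasoning the witness describes, the hypothesis of an unbiased coin would be rejected.
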
  21   Q. You have a slide for us. I think it is INQ 17-1.
  22   A. I wanted to emphasise here, and in fact since having
  23     discussions with Dr Spiegelhalter I think perhaps
  24     the emphasis is not necessarily warranted too much, in
  25     the different types of statistical inference there are
0031
   1     two basic approaches to statistical inference which are
   2     called the Bayesian approach and the frequentist
   3     approach. I put Thomas Bayes here to show that in fact
   4     Bayesian methods have a long and well-established
   5     tradition and that Thomas Bayes published his results
   6     posthumously and it has been said that it is a shame a
   7     lot of other statisticians did not follow his example.
   8        The important difference is that frequentists tend
   9     to think in terms of repeatedly running a study and
  10     the long-term proportion of successes. So this is
  11     the way you tend to think about probability. You tend
  12     to think of the probability of a boy being born, you
  13     observe a large number of successes or failures or
  14     whatever, and you end up with say 52 boys out of 100.
  15     This means that the probability of a boy being born is
  16     52 out of 100 or 0.52.
  17        What a frequentist thinks after an investigation
  18     is exactly what I said about the coin tossing. They
  19     ask, "What is the probability of getting these data if
  20     my model, if my hypothesis, is correct?"
  21        Bayesians tend to give probabilities of things
  22     that are not directly observable and that they can
  23     attach probabilities to hypotheses such as the true
  24     underlying mortality rate being 10 per cent.
  25     Essentially they have what are called prior probability
0032
   1     before the experiment and posterior probabilities after
   2     the experiment.
   3        They tend to attach probabilities like the
   4     probability that it is going to rain tomorrow is 20
   5     per cent. You attach a probability to that. But you
   6     cannot imagine running tomorrow 100 times to see if it
   7     rains ten times.
   8        They are able to give -- after an investigation
   9     they say, "Given what I believed about my model before
  10     the investigation and given the data I have collected,
  11     what is the probability that my model is now correct?"
  12     This is a very common way of arguing in medicine as well
  13     where a patient comes in to a surgery and the doctor
  14     will assess their possibility of having a certain
  15     disease. They will then question the patient and elicit
  16     certain symptoms and then they will modify their
  17     probability that the patient has a particular disease.
  18     If their probability is sufficiently high, they might
  19     refer them on for investigation.
  20   Q. In the surgeon's case or the GP's case, he may have what
  21     is called a differential diagnosis, have a number of
  22     ideas in his mind and begin to discard them as he gets
  23     more evidence?
  24   A. Yes, correct. He will discard them depending on
  25     the probabilities that he has -- the symptoms he has
0033
   1     elicited from the patient.
   2   Q. You have used words on your slide "prior probabilities"
   3     and "posterior probabilities". The posterior
   4     probability is what?
   5   A. The reason I put this on the slide is the posterior
   6     probability is the measure of how much you believe your
   7     model is actually correct, your hypothesis is actually
   8     true, given that you have collected some data to either
   9     refute or accept this particular hypothesis.
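
Illustrative sketch (not part of the evidence): the move from a prior to a posterior probability can be shown with a simple grid calculation. The figures used here (20 deaths in 100 operations, a flat prior, a 10 per cent threshold) are hypothetical and are not taken from any data set before the Inquiry. The function names are likewise assumptions made for this example.

```python
def posterior_on_grid(deaths, operations, steps=10_000):
    """Belief about the mortality rate on a grid, starting from a flat prior."""
    rates = [(i + 0.5) / steps for i in range(steps)]
    # Binomial likelihood of the data at each candidate rate (flat prior = 1).
    weights = [r**deaths * (1 - r) ** (operations - deaths) for r in rates]
    total = sum(weights)
    return rates, [w / total for w in weights]


def prob_rate_above(threshold, rates, probs):
    """Probability attached to the hypothesis 'true rate exceeds threshold'."""
    return sum(p for r, p in zip(rates, probs) if r > threshold)


# With no data the result is simply the flat prior.
rates, prior = posterior_on_grid(deaths=0, operations=0)
# Hypothetical data: 20 deaths in 100 operations.
_, posterior = posterior_on_grid(deaths=20, operations=100)

prior_belief = prob_rate_above(0.10, rates, prior)
posterior_belief = prob_rate_above(0.10, rates, posterior)
print(f"before the data: {prior_belief:.3f}, after the data: {posterior_belief:.3f}")
```

The same hypothesis is given one probability before the data are seen and a revised probability afterwards, which is the sense in which the witness uses "prior" and "posterior".
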
  10   Q. The statisticians in their reports to us have used
  11     a Bayesian approach, not a frequentist approach. Does
  12     it, do you think, make a difference? If so, to what
  13     extent?
  14   A. I have had long, long discussions, especially with
  15     Dr Spiegelhalter on this particular issue. Although
  16     the method is ostensibly Bayesian, in fact you can
  17     demonstrate that in fact it is largely a frequentist
  18     approach.
  19   Q. Best of both worlds.
  20   A. In fact if we go on to the next overhead.
  21   Q. Number 2, please.
  22   A. I said we have two hypotheses, H0, which is that Bristol
  23     is no different from the other centres, and H1 that
  24     Bristol is quantitatively different from the other
  25     centres. The data that was collected is the mortality
0034
   1     from 12 centres.
   2        So what the frequentist can answer is, "What is
   3     the probability of getting our data" -- and
   4     unfortunately I have to put in brackets (or data more
   5     extreme from H0), which means further away from the null
   6     hypothesis -- "assuming that Bristol is in fact no
   7     different from the other centres?"
   8        In the tables that are given in the evidence this
   9     is the probability -- that big P means
  10     the probability -- of the data given the
  11     null hypothesis, or it is called the P value. In some
  12     of the earlier tables you will find mentions of P
  13     values. That is what the P value is telling us.
  14        What the Bayesian inference can answer is, "What
  15     is the probability that Bristol is different from
  16     the other centres given the data?" That is
  17     the probability of H1, which is the hypothesis that
  18     Bristol is different, given the data. That is also
  19     given in the tables as well, of the probability attached
  20     to the excess deaths or the deaths greater than
  21     expected.
  22        In fact it turns out, because the way the prior
  23     distributions were chosen and also because of the way
  24     that the hypothesis was set up, which was that we take
  25     Bristol out of the equation and we then try to predict
0035
   1     from the other centres what we should have expected in
   2     Bristol, which is the null hypothesis, we can in fact
   3     interpret the probabilities given in the Bayesian method
   4     in a very similar way to the probabilities we get under
   5     the frequentist approach, which I think is a very nice
   6     synthesis.
   7   Q. So essentially whichever approach had been taken is
   8     likely to have produced pretty much the same results,
   9     given the data?
  10   A. That is my opinion, yes.
  11   Q. You mention in the middle of that slide the P value.
  12     I wonder if you would like to say some words by way of
  13     explanation as to what a P value is and what, if any,
  14     comfort or the opposite we may take from it?
  15   A. If we take your example before of tossing a coin 100
  16     times and observing 98 heads, we can work out either by
  17     computer simulation or by mathematics the probability of
  18     observing this particular run of events if we assume
  19     the coin is unbiased. This probability is the P value.
  20     It is convention to reject the null hypothesis, which is
  21     the coin is unbiased, if the P value is sufficiently
  22     small. The usual accepted level for rejection is 5
  23     per cent. So we say that, if there is only a 1 in 20
  24     chance of getting our observed data or, more extreme, if
  25     the null hypothesis is true, then we are going to reject
0036
   1     the null hypothesis.
   2   Q. Would it be the case that if the coin had been tossed 90
   3     to 100 times and come down heads on 94 occasions, not
   4     95, that you could not safely assume statistically that
   5     the coin was biased?
   6   A. It could be that if you tossed the coin 94 times your P
   7     value might be 0.06 -- sorry, if you tossed it 100 times
   8     and it came up heads 94 times, your P value would be
   9     0.06. If you tossed it 100 times and it came up 95
  10     times, your P value would be 0.04. Therefore you could
  11     reject it on the latter case but not on the former.
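
Illustrative sketch (not part of the evidence): the 5 per cent convention can be checked for the coin example. The values 0.06 and 0.04 in the answer above appear to be illustrative rather than calculated. Assuming a fair coin, 100 tosses and a one-sided test (all assumptions made for this sketch), the code below finds the smallest number of heads whose P value falls below 0.05.

```python
from math import comb


def upper_tail(n, k, p=0.5):
    """Probability of k or more heads in n tosses when P(head) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def smallest_significant_count(n, alpha=0.05):
    """Smallest number of heads whose one-sided P value is below alpha."""
    for k in range(n + 1):
        if upper_tail(n, k) < alpha:
            return k
    return None  # reached only if alpha is too small for n tosses


threshold = smallest_significant_count(100)
print(f"heads needed in 100 tosses for P < 0.05: {threshold}")
print(f"P value for 94 or more heads: {upper_tail(100, 94):.3g}")
```

On these assumptions the threshold is reached well before 94 heads, so both 94 and 95 heads would count as statistically significant for a real coin.
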
  12   Q. If we think in terms of tossing a coin, this is the
  13     example we have used, if anything has or is given
  14     statistical significance, does it mean that as a matter
  15     of numbers and mathematical calculation it is
  16     the equivalent of, as it were, tossing the coin and
  17     finding that on less than 5 occasions out of 100 it came
  18     down one way as opposed to the other?
  19   A. Sorry, could you repeat that?
  20   Q. Yes, I was trying to make the example simple. It is my
  21     fault. When we see in the course of the analyses
  22     statistical significance or a P value of less than 0.05
  23     ascribed to a particular data set, does that mean that
  24     you can be as confident that that is not due to chance
  25     as you would be in the case of a coin which had been
0037
   1     tossed and on 95 occasions out of 100 it came down on
   2     one side only?
   3   A. You would have to -- I do not know what the actual
   4     probability of getting a 95 -- perhaps one of my learned
   5     colleagues here could tell me. Very, very small. But
   6     essentially you interpret the probability of -- if you
   7     have a P value of less than 5 per cent, it would mean
   8     that if you were to run the whole scenario 100 times you
   9     would only expect to see this particular event 5 times
  10     out of 100 or less.
  11   Q. Thank you. You heard what I said about excess deaths in
  12     the course of the introductory remarks that I have
  13     made. Broadly, and please do not be frightened to offend
  14     me, was I right in the approach that I took to it?
  15   A. Yes, I believe you were. One important thing to realise
  16     is that statistically speaking excess deaths could be
  17     negative as well as positive. It just means that they
  18     are different from predicted.
  19   Q. So by "excess" we should really read "different number
  20     of" rather than "excessive" in the ordinary colloquial
  21     sense?
  22   A. Yes.
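
Illustrative sketch (not part of the evidence): "excess deaths" in this statistical sense is the observed number of deaths minus the number predicted from a reference rate. The figures below are hypothetical, and the function name is an assumption made for this example. The sketch shows why the result can carry decimal points and can be negative.

```python
def excess_deaths(observed_deaths, operations, reference_rate):
    """Observed deaths minus the number expected at the reference rate."""
    expected = operations * reference_rate
    return observed_deaths - expected


# Hypothetical centre: 143 operations, reference mortality rate of 11 per cent,
# so 15.73 deaths would be expected.
higher = excess_deaths(30, 143, 0.11)  # more deaths than predicted
lower = excess_deaths(10, 143, 0.11)   # fewer deaths than predicted
print(f"{higher:.2f} and {lower:.2f}")
```
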
  23   Q. Through the statistics which we will see, then we may
  24     have talk of P values and we have talk of excess
  25     deaths. So far as P values are concerned, there is some
0038
   1     reference that we will see to 95 per cent intervals.
   2     Could you give us an explanation of that?
   3   A. Yes. The authors have been quite careful to refer to 95
   4     per cent intervals. The frequentist approach is to call
   5     things 95 per cent confidence intervals. Essentially
   6     that enables us to -- if we have a 95 per cent
   7     confidence interval, it says -- let us go back to
   8     the coin tossing experiment again, since it seems to be
   9     generally understood.
  10        We observe a certain number of heads, say 80
  11     per cent, 80 out of 100. We can calculate about that 80
  12     per cent an interval. If we actually ran this
  13     experiment 100 times we could each time we ran
  14     the experiment calculate this interval, then 95 per cent
  15     of the intervals that we calculate would include
  16     the true probability of the coin coming up. So if
  17     the coin was actually unbiased, then we would expect
  18     that 95 per cent of these intervals would include 50
  19     per cent or P as 0.5 as the null hypothesis. That is
  20     a 95 per cent confidence interval.
  21        When you go on to 95 per cent interval, then this
  22     is a slightly different approach, but it is essentially
  23     the same. But it means that we have 95 per cent
  24     confidence that the true value, the value underlying
  25     the hypothesis, is somewhere within that interval;
0039
   1     although it is important to realise that most of
   2     the emphasis will lie in the centre of the interval and
   3     not at the extremes.
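
Illustrative sketch (not part of the evidence): the repeated-experiment reading of a 95 per cent confidence interval can be simulated. The code assumes the common normal-approximation interval (estimate plus or minus 1.96 standard errors), which is one of several ways to calculate such an interval and is not necessarily the method used in the reports. The function names and the fixed random seed are chosen for this example.

```python
import random
from math import sqrt


def wald_interval(heads, tosses, z=1.96):
    """Normal-approximation 95 per cent interval for the proportion of heads."""
    p_hat = heads / tosses
    half_width = z * sqrt(p_hat * (1 - p_hat) / tosses)
    return p_hat - half_width, p_hat + half_width


def coverage(true_p, tosses, experiments, seed=0):
    """Share of repeated experiments whose interval contains the true value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        heads = sum(rng.random() < true_p for _ in range(tosses))
        low, high = wald_interval(heads, tosses)
        hits += low <= true_p <= high
    return hits / experiments


# The witness's example: 80 heads out of 100 tosses.
low, high = wald_interval(80, 100)
print(f"interval for 80 heads in 100: {low:.3f} to {high:.3f}")

# Repeating the experiment many times with a genuinely unbiased coin.
share = coverage(true_p=0.5, tosses=100, experiments=2000)
print(f"share of intervals containing 0.5: {share:.3f}")
```

The interval for 80 heads in 100 does not include 0.5, while for a genuinely unbiased coin roughly 95 per cent of the simulated intervals do include it.
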
   4   Q. Excess deaths was the other part that I was going to ask
   5     you about. Is there any sense in which statistically
   6     one can identify which deaths are excess?
   7   A. No, no. Essentially, as I was saying before about
   8     the frequentist approach to probability, out of 100
   9     people you could say that 10 per cent are likely to die,
  10     but there is no way of identifying which particular
  11     10 per cent. It is a bit like the lottery. We cannot
  12     identify in advance who is going to win the lottery.
  13   Q. I said in my introductory remarks that if you take
  14     a comparison amongst 12 centres, given any particular
  15     performance table, one is bound to be best and the other
  16     is bound to be worst. That as a matter of logic must be
  17     right. How do statistics help in describing
  18     the difference so that one may be satisfied either that
  19     the apparent top or bottomness of one centre is chance
  20     or that it is more than chance, probably?
  21   A. This is a relatively new area of statistics, and
  22     Dr Spiegelhalter has been fundamental in doing research
  23     on this. But essentially we can say that if you rank
  24     the centres and one of them is well below what would
  25     have been predicted from the others, then it is unlikely
0040
   1     to have happened by chance and we can put
   2     the probability on it being at the bottom, even
   3     though -- so you might observe something which is at
   4     the bottom, but you might get a confidence interval
   5     which is between say 8 and 12. So it could have been
   6     that in fact in reality it was only 8 out of 12. But
   7     sometimes you can get the confidence interval is only
   8     12, in which case we can be 95 per cent sure that if we
   9     had run this thing a large number of times it would have
  10     come out at the bottom 95 per cent of the time.
  11   Q. Because the gap or the difference is so great that it is
  12     likely to be repeated on that number of occasions?
  13   A. Correct, yes.
  14   Q. Professor Campbell, I have asked you a number of
  15     questions. Is there anything that you would wish to add
  16     by way of introductory remarks, familiar as you are with
  17     the data which is going to be introduced, so that we
  18     have a proper perspective of what is going to be
  19     presented?
  20   A. I suppose I would re-emphasise the remarks that you made
  21     earlier, that the statistics and the probability are
  22     just part of a jigsaw and that one of the most important
  23     things is that we have very few explanatory variables
  24     because of the way the data was collected; we do not
  25     have things such as the clinical condition of
0041
   1     the babies. So we cannot provide any explanation as to
   2     why these -- the way that the data have evolved. Simply
   3     we can point to the fact that it is most unlikely to
   4     have arisen purely by chance. So I think it needs to be
   5     just part of a general picture.
   6   MR LANGSTAFF: What I am going to do now is to suggest to
   7     our chairman that we have a short break, perhaps 10 or
   8     15 minutes, and then Professor Evans will introduce
   9     the Bristol picture and the Bristol rates and present
  10     that to us.
  11   THE CHAIRMAN: Yes, Mr Langstaff. Just before we break for
  12     15 minutes, that introduction was extremely helpful
  13     I think in setting out the language, the terms of
  14     reference by reference to which we have to read
  15     the material before us. I hope it was helpful to
  16     everyone to hear what these various terms mean so that
  17     we can translate what we see. It was extremely helpful;
  18     thank you. Shall we now break and reconvene at 11.30.
  19   (11.15 am)
  20   (11.40 am)
  21   MR LANGSTAFF: Sir, Professor Evans, may he take the oath?
  22          PROFESSOR STEPHEN EVANS (SWORN):
  23            Examined by MR LANGSTAFF:
  24   Q. Professor Evans, your full names, please, and essential
  25     qualifications?
0042
   1   A. I am Stephen James Weston Evans. I have a BSc and MSc.
   2     I am a chartered statistician. I was Professor of
   3     Medical Statistics at the London Hospital Medical
   4     College, part of London University. I was there for 25
   5     years, and then I was head of epidemiology for
   6     Medicines Control Agency. I now work locally, not far
   7     from Tunbridge Wells, and I would describe myself as
   8     a statistical epidemiologist.
   9   Q. Did you, for the purposes of this Inquiry, prepare
  10     a report which we have, the first page presently on the
  11     screen, beginning at INQ 12/1?
  12   A. I did.
  13   Q. And does the text of the report finish at INQ 12/33?
  14   A. Yes.
  15   Q. Let us look at this a minute, the very bottom of the
  16     page. Then that is followed by a number of tables and
  17     figures and an annex. The report as a whole ends, does
  18     it, at page 49, INQ 12/49?
  19   A. Yes.
  20   Q. You are prepared to present your findings, are you, to
  21     the Inquiry, and would you perhaps like to do so?
  22   A. Yes. I am happy to do that. I would like to in some
  23     senses preface my remarks, well aware of the parents,
  24     those who are concerned about this Inquiry, concerned
  25     health professionals, and say to them -- I am sure I can
0043
   1     speak on behalf of my colleagues -- we as statisticians
   2     are very aware of the personal trauma that many of you
   3     have gone through and when we talk about children and
   4     procedures and deaths and numbers, we do so in the
   5     knowledge that each one of those is someone who is
   6     important. When we talk about surgeons and centres and
   7     health professionals, we are aware that they are
   8     important.
   9        It may seem that sometimes our numbers are
  10     expressed in a way that is cold and uncaring. That to
  11     some degree is the lot of the statistician, that they
  12     will be seen in that way, but we would wish to say that
  13     we are aware of your hurts; we are aware of your current
  14     concerns, and we are doing our best to try and help this
  15     Inquiry in the terms of the purposes of the Inquiry, to
  16     come to answers that are as helpful as possible.
  17        So please do remember, when we use terms that may
  18     be distressing, I am sure that we would be happy to
  19     accept from the Panel or others, advice as to how
  20     perhaps we can ameliorate that distress, but we are
  21     conscious of it and we will do our best. But
  22     nevertheless, we are dealing in numbers and these things
  23     may appear hard. You said this yourself at the
  24     beginning, Mr Langstaff, but we wish to reiterate it.
  25        I think that we have to remember that when the
0044
   1     Inquiry was set up, there were about 2,000 sets of
   2     medical records, and each of these consisted of perhaps
   3     one folder, perhaps a number of folders, that related to
   4     the children who had received cardiac surgery between
   5     1st January 1984 and 31st December 1995. It was clearly
   6     impossible to scrutinise all of these records -- they
   7     form a very, very large volume in great detail -- using
   8     teams of medical, surgical and nursing experts, but at
   9     the same time, it was very important to be able to take
  10     into account the records of every child who had received
  11     care under the terms of the Inquiry. This again is
  12     important to the parents. All of the children have been
  13     considered and we have analysed data that relates to all
  14     of them.
  15        The Panel decided to take a sample of these
  16     records and more details, as you have heard, will be
  17     given of this tomorrow. However, if we are to take
  18     a representative sample that reflects the concerns that
  19     led to the Inquiry, it is important to have summary
  20     information available on every child. This was done
  21     using a team of people who are used to summarising
  22     medical records.
  23        One has to remember that the language in which
  24     a medical record is written may vary from time to time
  25     with the same clinician, describing the same sort of
0045
   1     conditions, and will vary between one clinician and
   2     another for the same conditions. They will write down
   3     and use words slightly differently.
   4        But when we come to try and summarise the data, we
   5     have to use a language that is common to all. It may be
   6     that this language is not the most appropriate one for
   7     ordinary conversation. The team of people who are doing
   8     the coding of the records in the hospitals use
   9     nationally and internationally agreed terms for these
  10     operations and diagnoses, and the groupings, as you have
  11     been shown, are labelled using a code number with
  12     a letter and this process is described as "coding". The
  13     people who carry out the task are called "coders". This
  14     process was described in evidence given in July.
  15        In generating a summary for each child, we entered
  16     this summary on to a computer and it is described as the
  17     "clinical coded record" database, the CCR.
  18        To check that these records did cover the relevant
  19     children and to compare the quality of the data recorded
  20     there, we have made comparison with two other sources
  21     and the first, as we have heard, is the Patient
  22     Administration System, known as PAS, a routine computer
  23     system used by the local Bristol Health Trust for
  24     administrative purposes.
  25        It is also the basis for preparing returns to be
0046
   1     submitted to the Department of Health of England,
   2     notably for the production of hospital episode
   3     statistics, and more details of this will be provided by
   4     Dr Aylin.
   5        The third source of local data is the logs of the
   6     operations done at the Bristol Royal Infirmary by
   7     Mr Wisheart and Mr Dhasmana, and they have been
   8     described by them in evidence given to the Inquiry.
   9        These logs used words that were typed or
  10     handwritten and they have also been coded by a coder
  11     from the team who coded the clinical records.
  12        Each of the sources has some basic information on
  13     the children and if we look at my report, INQ 0012/18,
  14     a summary of the information that is in common across
  15     the sources is given at the top there. This has patient
  16     name, date of operation, BRI number, whether the patient
  17     died, the surgeon, the date of death, the diagnosis, the
  18     age (derived from the date of birth that we recorded for
  19     the CCR and the PAS) and OPCS codes for operative
  20     procedures.
  21        So there is other information in each of the
  22     sources, but we have concentrated on the information
  23     that is in common across the sources because we want to
  24     carry out comparisons.
  25        So we had three purposes of bringing together
0047
   1     these sets of data, and I think my first overhead slide,
   2     which is number 50, described the purposes.
   3        If we can have that rotated and enlarged, the
   4     purposes of our looking at these sources of data was
   5     firstly, as we have said, to describe the overall care
   6     and to describe the children receiving that care in
   7     Bristol. That was an important purpose.
   8        The second one was to allow us to carefully select
   9     a sample to allow for detailed examination of the
  10     medical records in that sample, because we could not do
  11     it for all 2,000.
  12        If we are to do it for a sample, we have to select
  13     that in such a way that every child concerned has an
  14     equal chance, as every other similar child, of being
  15     included in the sample.
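As an illustration only of the sampling principle just described, the following Python sketch draws the same fraction from each group of similar children, so that every child has the same chance of selection as every other child in that group. The group labels, the 10 per cent fraction, the function name and the records are all hypothetical; the evidence here does not say how the Inquiry's sample was actually drawn.

```python
import random
from collections import defaultdict


def stratified_sample(children, stratum_of, fraction, seed=0):
    """Draw the same fraction from every stratum (group of similar children)."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    strata = defaultdict(list)
    for child in children:
        strata[stratum_of(child)].append(child)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = max(1, round(len(members) * fraction))
        # random.sample picks k distinct members, each equally likely
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical records (child id, stratum label); not Inquiry data.
children = [(i, "open" if i % 4 else "closed") for i in range(200)]
sample = stratified_sample(children, stratum_of=lambda c: c[1], fraction=0.1)
```

Sampling within strata, rather than from the whole list at once, is one common way of making sure that small but important groups are represented.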
  16        The other thing that is important is that when we
  17     begin to make national comparisons, and Dr Aylin and
  18     Professor Murray will be making national comparisons, we
  19     need to be sure that the local data really reflect what
  20     actually is recorded nationally as far as we can. So we
  21     want to check that the Patient Administration System is
  22     a reasonable reflection of the care given and the
  23     outcome of the care.
  24        We also want to make sure that the surgeons' logs,
  25     which were the underlying source for the UK cardiac
0048
   1     surgeons' register, should also have similar results as
   2     the medical records.
   3        If we look at the processes of care, I think it is
   4     probably at this point helpful to look at Annex 1 of my
   5     report on pages 47 and 48. We will look at page 47
   6     first of all.
   7        If we look here, we see that there can be
   8     a child -- and each child must have been admitted to
   9     hospital to be included in any of the sets of data.
  10     They have to be admitted to the hospital.
  11        A child may have a single admission. Within that
  12     admission, for all practical purposes, they must also
  13     have an operation. Within one operation, when they go
  14     to the operating theatre, they may have more than one
  15     operative procedure carried out by the surgeon.
  16        These operative procedures are what are coded
  17     using the Office of Population Censuses and Surveys coding
  18     system. They use labels. This is the language that is
  19     used to describe the operations that the children had.
  20        If you go to the next page, we see that some
  21     children, of course, did not only have one operation, or
  22     one admission. Within an operation, they have more than
  23     one procedure. During a single admission to hospital,
  24     they may actually have more than one operation. You
  25     have here an example with two operations, and in each of
0049
   1     those two operations, if we scroll down a little
   2     further, we have two procedures.
   3        So the question is, what should we count? It is
   4     not going to be easy to count children in a way that is
   5     comparable nationally. In the hospital episode
   6     statistics, what they count is essentially an episode of
   7     care, and Dr Aylin will explain a little more about
   8     that. In terms of the cardiac surgeons' register, they
   9     record operations. They do not identify individual
  10     children. So we have to make a decision as to how we
  11     are going to do our counting. It is very difficult to
  12     count only children. We could count admissions, except
  13     that, in the clinical coded notes, we do not have dates
  14     of admission and discharge. Nor do we have them in the
  15     surgeons' logs, but we do have operations, and so what
  16     we have done is, where we can count admissions, we count
  17     them; where we can only count operations, we count
  18     them. Hence, as Mr Langstaff was explaining, in
  19     a situation here, where we have two operations and in
  20     each of them we have two procedures, we do not want to
  21     count both those procedures, so we need to have a system
  22     whereby we decide which of these procedures we should
  23     count.
  24        So we can see a slightly more complex situation
  25     further down, where there are two admissions and, if we
0050
   1     scroll down, each of those can have two operations and
   2     two procedures.
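The counting problem described above can be sketched in Python. This is a minimal illustration, assuming a nested layout of child, admissions, operations and procedures that mirrors the Annex 1 diagrams; the dictionary layout, the function name and the codes attached to the example child are hypothetical and are not Inquiry data.

```python
# One hypothetical child with two admissions; each admission has two
# operations and each operation has two coded procedures.
child = {
    "admissions": [
        {"operations": [["K04.1", "L01.1"], ["K02.1", "K15.1"]]},
        {"operations": [["K04.2", "K16.1"], ["K15.2", "K01.8"]]},
    ]
}


def count_units(children):
    """Count the same data at each level of the hierarchy."""
    admissions = sum(len(c["admissions"]) for c in children)
    operations = sum(
        len(a["operations"]) for c in children for a in c["admissions"]
    )
    procedures = sum(
        len(op)
        for c in children
        for a in c["admissions"]
        for op in a["operations"]
    )
    return {
        "children": len(children),
        "admissions": admissions,
        "operations": operations,
        "procedures": procedures,
    }


counts = count_units([child])
```

The same child contributes one, two, four or eight to the total depending on the unit chosen, which is why the unit of counting has to be fixed before sources are compared.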
   3        If we look at the language that has been used for
   4     doing the coding of these operations, as Mr Langstaff
   5     said, K codes are for the heart and the L codes are for
   6     arteries and veins, what he described as "vascular"
   7     operations. For example, I have not got an immediate
   8     example for you, but K01 stands for transplantation of
   9     heart and lung. Then there are a series of individual
  10     codes under that that go K01.1, which is
  11     allotransplantation of heart and lung; K01.2 is
  12     revision of transplantation of heart and lung; K01.8 is
  13     "other specified"; K01.9 is unspecified.
  14        We have these detailed codes. The language that
  15     is then used for that is not necessarily exactly the
  16     same language that a surgeon will write down. They do
  17     not write these codes down. They use their medical
  18     terminology, and somebody has to translate that medical
  19     terminology into these codes.
  20        For example, if we look at K04, that is the
  21     overall code for correction of tetralogy of Fallot. But
  22     K04.1 is correction of tetralogy of Fallot, using valve
  23     right ventricle outflow conduit, so there is a level of
  24     detail there.
  25        We will come to some of these groups of operations
0051
   1     where we have to use the greatest detail, and some where
   2     we can use a broader category.
   3        The consequence is, in doing this translation
   4     between whatever is written, on the surgeons' logs or in
   5     the medical notes, or perhaps in a discharge letter that
   6     is coded by the Patient Administration System, there
   7     will be inevitable differences in the way the
   8     translation happens. If I was speaking in French and
   9     somebody were carrying out simultaneous translation, you
  10     would realise that my use of French could be translated
  11     in more than one way in English and in some instances in
  12     French we have "tu" and "vous", both of which in English
  13     are "you", but they have subtly different meanings in
  14     French. We do not have any longer in English "thee" and
  15     "thou" that would reflect the "tu" of French. So we
  16     may find that there are things written in the record
  17     that are incapable of being translated.
  18        Having gone on at considerable length about that,
  19     we need to see that the Cardiac Surgeons' Register does
  20     not have those kind of operation codes. If we look at
  21     table 2.1 in Dr Aylin's report, INQ 13,
  22     page 54, which we have already seen, and there is
  23     a similar table 6 of Professor Murray's report, we will
  24     see these operation codes. I hope this has perhaps made
  25     it a little clearer. So group one is K04, tetralogy of
0052
   1     Fallot, and that means K04.1, point 2, point 3 and so
   2     on, and point 8 and point 9, and that is a broad
   3     grouping. If we look down to group 8 for truncus, that
   4     is specifically L01.1. So that if something was coded
   5     L01.8, which is an unspecified operation, it will not
   6     then get counted.
   7        So you can see the subtle differences here.
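The difference between a broad grouping and a specific one can be sketched in Python. This is an illustration only, covering just the two groups described in the evidence (group 1 taking any sub-code under the tetralogy of Fallot heading, group 8 taking one sub-code only); the rule table and function name are hypothetical, and the codes are written in the usual OPCS-4 form with a zero.

```python
# Only the two groups described in the evidence are shown here.
GROUP_RULES = {
    1: ("prefix", "K04"),   # tetralogy of Fallot: K04 and every K04.x
    8: ("exact", "L01.1"),  # truncus: this single sub-code only
}


def group_of(code):
    """Return the group number for a code, or None if it is not grouped."""
    code = code.replace(" ", "").upper()
    for group, (kind, pattern) in GROUP_RULES.items():
        if kind == "prefix" and (code == pattern or code.startswith(pattern + ".")):
            return group
        if kind == "exact" and code == pattern:
            return group
    return None


grouped = {c: group_of(c) for c in ["K04.1", "K04.9", "L01.1", "L01.8"]}
```

Under these rules a truncus operation recorded as L01.8 falls outside group 8 and is not counted there, whereas any of the tetralogy sub-codes still lands in group 1.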
   8        This makes it amazing that there can be any
   9     comparability between the records. I expected, when
  10     I looked at these data, to find that when I started
  11     looking at the numbers, I would not get much agreement.
  12        There are also a larger group of procedures, as
  13     Mr Langstaff said, classified by whether they were open
  14     or closed.
  15        Again, if we look at INQ 13/82, again from
  16     Dr Aylin's report, here we see a list of procedures
  17     beginning at K02.1 which was an open procedure, and
  18     K15.1 and K15.2 that were closed procedures, and then
  19     on the right-hand side is K16.1 and point 2 and so on
  20     all the way to point 8 which are excluded. It does not
  21     mean the children were not counted, we have looked at
  22     those, but in terms of classifying whether an operation
  23     was open or closed, we were unable to decide on the
  24     advice of the clinicians whether it was unequivocally
  25     open or closed. It could be either.
0053
   1        So, if we classed it as open we may make
   2     a mistake, if we class it as closed we may make
   3     a mistake. If we leave it out, we are reasonably sure
   4     we are comparing similar things.
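The effect of leaving the equivocal codes out can be sketched in Python. This is a minimal illustration using only the codes named in the evidence; the function names and the four example operations are hypothetical, and how an operation with several procedures was classified in practice is not described here.

```python
# Codes named in the evidence, written in the usual OPCS-4 form.
OPEN_CODES = {"K02.1"}
CLOSED_CODES = {"K15.1", "K15.2"}
EXCLUDED_CODES = {"K16.1", "K16.2", "K16.8"}  # could be open or closed


def classify(code):
    """Classify one procedure code as open, closed, excluded or not listed."""
    if code in OPEN_CODES:
        return "open"
    if code in CLOSED_CODES:
        return "closed"
    if code in EXCLUDED_CODES:
        return "excluded"
    return "not listed"


def death_rates(operations):
    """operations: iterable of (code, died). Rates for open and closed only."""
    totals = {"open": [0, 0], "closed": [0, 0]}  # [operations, deaths]
    for code, died in operations:
        kind = classify(code)
        if kind in totals:  # equivocal codes never enter either rate
            totals[kind][0] += 1
            totals[kind][1] += int(died)
    return {k: (d / n if n else None) for k, (n, d) in totals.items()}


# Hypothetical operations; not Inquiry data.
ops = [("K02.1", True), ("K02.1", False), ("K15.1", False), ("K16.1", True)]
rates = death_rates(ops)
```

The excluded operation is still present in the data, but it does not affect either the open or the closed rate.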
   5        That explains, when we then move on to my next
   6     slide, essentially, which comes from my report 0012/43,
   7     in the top half of that page, we will see a figure and
   8     this shows us something of the data that we have.
   9        We see here a diagram that shows to us that in the
  10     Patient Administration System, which only ran from
  11     1st January 1988 -- it does not cover the whole period
  12     of the Inquiry -- we have nearly 2,000 children who were
  13     identified as being admitted under the care of
  14     cardiologists or cardiac surgeons, or paediatric
  15     cardiologists, or paediatric cardiac surgeons, and we
  16     have 2,000 children.
  17        We see there that they have nearly 4,000
  18     admissions. You can see that the number of operations
  19     and the number of admissions is rather similar; this
  20     number here is similar to this number here. So that
  21     when we decide to count operations or admissions, we are
  22     not making a big error. The number of operations per
  23     admission is close to 1 in the children we have.
  24        But in each operation, there are perhaps 1.5 or
  25     a little more procedures: nearly 6,000 procedures
0054
   1     received by those 2,000 children. Of course some
   2     children received only one procedure and some received
   3     a very large number indeed. There is enormous
   4     variability in that number.
   5        When we come to group these, if we group them by
   6     open and closed, it turns out that a lot of the
   7     admissions cannot be grouped by either open or closed in
   8     an unequivocal way.
   9        Most of them are open. There are a smaller number
  10     that are closed. If we look at the procedures, we can
  11     classify those by, for example, in the black here, these
  12     are non-cardiac procedures, so that a child who is
  13     admitted for cardiac surgery may have a procedure that
  14     is not cardiac.
  15        We also have a large number of procedures that are
  16     not grouped, which are nevertheless cardiac, and, for
  17     example, cardiac catheterisation, or contrast radiology
  18     of the heart, were not grouped, but yet those are
  19     recorded in the records and recorded in the Patient
  20     Administration System, and many of them will have had
  21     those, but they are not sufficiently serious procedures,
  22     and will nearly always be accompanied by other
  23     procedures, and then this rather smaller number, which
  24     makes it look as though we are only looking at a few,
  25     but we are looking at the principal procedures that
0055
   1     children who had heart surgery had. This is the pattern
   2     we see for the Patient Administration System.
   3        If we now move on down to the next page, page 44,
   4     to the top half of that, we see the same pattern for the
   5     clinically coded records. Here we cannot have
   6     admissions, we can only have operations. We see down
   7     here in procedures we have quite a reasonably large
   8     number that are non-cardiac, a considerable number that
   9     are not grouped and then still a large number that are
  10     grouped.
  11        If we move on down to the bottom of that page, we
  12     will see the same thing for the surgeons' logs, on
  13     a slightly different scale. We have got rather smaller
  14     numbers; we are talking about only something like 1,300
  15     children, with slightly more operations, but we now see
  16     that among these the classification is that the vast
  17     majority are open operations, and so they should be,
  18     because to go to the BRI meant that you were going there
  19     to have open-heart surgery; you did not need to go to
  20     the BRI if you were only having closed surgery.
  21        Here we see the procedures, the distribution of
  22     them is rather different; we have a very small number of
  23     non-cardiac procedures that are recorded. We have
  24     relatively few not grouped because the catheterisation
  25     and the contrast radiology of the heart would not have
0056
   1     been done there generally, and most of our procedures
   2     are grouped.
   3        So we end up comparing them and on the surface of
   4     it, it looks as though they are very different.
   5        The other thing we need to be aware of, and if we
   6     can go back a little to the previous page, page 43, to
   7     the bottom half, we need to think about how we have
   8     classified whether a child was alive or dead. Many of
   9     the children will have died relatively quickly after an
  10     operation, and this curve here shows us that immediately
  11     after the operation, there is a certain mortality. This
  12     shows us that all the children are alive here, but by
  13     this point (indicating) 95 per cent; 5 per cent have
  14     died quite rapidly.
  15        Then the deaths go on occurring during the first
  16     30 days. They do not stop occurring at the end of 30
  17     days; they go on happening, and you can see that the
  18     curve continues there. But we, in most of the sources
  19     of data, do not have long-term follow-up of the
  20     children. We have that in the medical notes, we have it
  21     to a degree in the Patient Administration System, and we
  22     have it in Bristol rather better than we have it
  23     elsewhere.
  24        But we have chosen to make sure that we can make
  25     comparisons between children in one centre and another.
0057
   1     We have chosen the 30-day. This is something that
   2     surgeons do not only for children but also in other
   3     operative areas, not only in paediatric cardiac surgery
   4     but elsewhere. But we need to be aware that sometimes
   5     we will be talking about children alive, meaning they
   6     had survived for 30 days. Sadly, they may have died
   7     a short time after that or in some instances they may
   8     have survived until 15 and have died at that point.
   9        So when we talk about that, we need to be aware
  10     that we have perhaps left some things out.
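The 30-day definition can be sketched in Python. This is an illustration only: it assumes that a death on day 30 itself counts as within 30 days, which the evidence does not specify, and the function names and records are hypothetical.

```python
from datetime import date


def died_within_30_days(operation_date, death_date):
    """True if the child died on or before day 30 after the operation."""
    if death_date is None:  # no death recorded
        return False
    return 0 <= (death_date - operation_date).days <= 30


def thirty_day_mortality(records):
    """records: list of (operation_date, death_date or None)."""
    deaths = sum(died_within_30_days(op, dd) for op, dd in records)
    return deaths / len(records)


# Hypothetical records; not Inquiry data.
records = [
    (date(1990, 1, 1), date(1990, 1, 4)),   # died 3 days after the operation
    (date(1990, 2, 1), date(1990, 3, 18)),  # died 45 days after the operation
    (date(1991, 5, 1), None),
    (date(1992, 6, 1), None),
]
rate = thirty_day_mortality(records)
```

The second child counts as alive under this definition even though a death is recorded, which is the point made above about what a 30-day figure leaves out.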
  11        Remember the Patient Administration System covers
  12     1st January 1988 to the end of 1995, but the coded
  13     records and the surgeons' logs cover the whole period.
  14     As Mr Langstaff said, we divided time into epochs and if
  15     we look at page 45, we see only open operations.
  16        If we go to page 39, which is table 5.1, if we go
  17     to the top, we see here this is from the surgeons'
  18     logs. We have age at which the operation took place,
  19     grouped into 0 to 90 days and so on, and 1 to 15 years
  20     and we see that the death rate is really a great deal
  21     higher, very sick children are operated on between 0 and
  22     90 days, so their death rate is much higher than older
  23     ages.
  24        So when we make our comparisons, it is important
  25     not only to make comparisons that are for similar
0058
   1     operations; they have to be for similar ages, because
   2     the death rate is varying.
   3        If we move further down to table 5.1 -- and I am
   4     coming shortly to an end, Mr Langstaff -- we see here
   5     comparisons for the three major data sources, the
   6     Patient Administration System, the clinical coded
   7     records, and the surgeons' logs.
   8        We find here that although the period of time is
   9     really rather different, for tetralogy of Fallot we have
  10     as it happens exactly the same death rate, even though
  11     the numbers differ somewhat. The Patient Administration
  12     System we would expect to be less in that it covers
  13     a shorter period of time.
  14        We look at the intra-atrial transposition
  15     operations. We have 10 per cent, 5 per cent and 1 per
  16     cent. There is a bit of a difference there, and we may
  17     need to examine that. You may recall that there is some
  18     difficulty in distinguishing between those and the
  19     arterial switches that people do not necessarily code
  20     them correctly and we see here also differences, 38 per
  21     cent, 30 and 41 per cent. If we were to average those,
  22     it turns out that we would have rather better
  23     comparability of the data.
  24        The other problem is that these are not making
  25     a comparison for the same epochs. If we can go to my
0059
   1     second overhead, which is my last one for the moment,
   2     which is, I think, page 51, if we can rotate that, we
   3     have here a similar table where we have the 11 groups
   4     only. We are just looking at the open operations
   5     because the surgeons' logs do not cover the closed ones,
   6     and we look at the Patient Administration System, the
   7     clinical coded records and the surgeons' logs, we find
   8     that the death rates are relatively similar, except in
   9     groups 2 and 3 and we find considerable agreement and
  10     down the bottom here, in terms of the totality, we have
  11     similar numbers of deaths and similar numbers of
  12     operations.
  13        The general conclusion from this is that we cannot
  14     use any of these sources of data to decide exactly what
  15     happened to individual children. It would be dangerous
  16     to go to one of those sources and say, "That gives us
  17     the truth about that".
  18        The medical records should give us a full idea,
  19     but do not forget, we have had to translate from a pile
  20     that may be one foot high into one or two sheets of
  21     information using our translation process to get the
  22     codes. That process is inevitably subject to error.
  23        When we look at it, the comparability of these
  24     different sources -- I think we can probably finish with
  25     the overheads now -- is really remarkably good. We will
0060
   1     see that later on. If we return to our purposes and we
   2     perhaps -- I am sorry, we could perhaps go to my
   3     conclusions which are on page 32, I can read out
   4     a little bit from paragraph 6.3 there in the middle:
   5        "The exercise of coding the medical records was
   6     a considerable task, and although the individual coding
   7     has been of high quality, there have been many minor
   8     problems with the data.
   9        "The Patient Administration System does seem to
  10     provide an adequate method for an overview of the amount
  11     of care and the mortality."
  12        It is very difficult to cross-check these. One of
  13     the things that came across to me was the number of
  14     changes of names of children. The family tragedies that
  15     are involved in that are really very considerable, and
  16     I think that one has to be aware of that. We are
  17     talking about children and families here that are very
  18     affected, but nevertheless, when we look at the whole
  19     thing, the coding and entry and so on has not been
  20     perfect.
  21        From a clinical perspective, as I have written at
  22     the bottom of the page there, each child is unique and
  23     not simply to be pigeon-holed, but we have to make
  24     comparisons, either over time or between surgeons in
  25     a unit, or between units, and if we are to do that, we
0061
   1     have to put things into categories. We cannot make
   2     progress without it and there will be imperfection. The
   3     great majority of the children who received care did not
   4     die, but there is also very strong evidence, to me, that
   5     these different sources of data suggest that none of
   6     them is perfect and none of them has a major problem in
   7     the way that it has described the deaths in the care.
   8     I am sorry to go on rather.
   9   THE CHAIRMAN: Thank you very much, Professor Evans.
  10   MR LANGSTAFF: Professor Evans, to what extent do the
  11     figures produced from one data source, in your view,
  12     support and give strength to the figures from another
  13     data source?
  14   A. I think the first thing to remember is that if they did
  15     not, then the exercise of national comparison would be
  16     a waste of time. So it is a necessary condition, but
  17     not a sufficient condition that the comparison is valid.
  18        They do complement one another. In looking at the
  19     details those numbers have errors in them. There are
  20     errors when people have typed them into the computers.
  21     There are errors in programming, doing the grouping,
  22     that sort of thing, and we still have to try and get
  23     that better. They are not perfect, even though they
  24     have been put in that report. But as I have looked at
  25     it more and more, the more detail I have, the more they
0062
   1     agree with one another and that is encouraging. If they
   2     did not, we would be wasting our time doing an analysis
   3     of the data, but it has not said that we should believe
   4     the numbers perfectly.
   5   Q. An alternative way of looking at it might be that the
   6     data that we had simply was not good enough and had not
   7     been collected in a way that was good enough to enable
   8     any valid comparison to be made?
   9   A. I certainly approached the data with that as
  10     a possibility, but I do not hold that view any longer.
  11     I am amazed at how consistent they are and this will
  12     come in Professor Murray's and Dr Spiegelhalter's view
  13     of these things. The consistency is very much greater
  14     than I would have expected.
  15   Q. Can I ask you about specific tables? If we go to
  16     INQ 12/35, table 4.3, what we are looking at there is
  17     from the PAS system so it only covers 3 epochs, because
  18     the Patient Administration System as you point out gives
  19     us data only from January 1988 onwards.
  20        If one compares the death rate from 1988 to 1995,
  21     13 per cent and 11 per cent, with the death rate from
  22     April 1995 to December 1995, 2 per cent, there is a very
  23     obvious drop. First of all, is that a drop in rate
  24     which is consistent across the data sources?
  25   A. Yes.
0063
   1   Q. Secondly, in looking at the material before you, did you
   2     notice any difference in the mix or nature of the
   3     operations being performed during that last period
   4     compared to the two earlier epochs?
   5   A. Yes. There were certainly very many fewer operations
   6     overall. There are obviously only 132 in total in the
   7     PAS for that period. We are only talking about a short
   8     period. The earlier epochs cover several years and this
   9     only covers part of the year, so the overall numbers are
  10     smaller, and some of the high risk operations almost
  11     disappeared in 1995; they were not there. So we are not
  12     comparing like with like across those epochs, the last
  13     epochs.
  14   Q. Does it follow that it would not be a safe conclusion to
  15     say, "April 1995 the surgeon was changed, and look what
  16     a difference it made to the death rate"?
  17   A. I think that one ought to examine the question of
  18     whether that is so, but I think that those figures on
  19     their own should certainly not be used to draw that
  20     conclusion.
  21   Q. If you would turn, please, to INQ 12/41, this is
  22     a comparative table which you have derived from the
  23     surgeons' logs which compares the rate attributable to
  24     operations which were conducted by Mr Dhasmana and
  25     Mr Wisheart.
0064
   1        The first question needs to be asked: is there any
   2     statistical basis for ascribing any particular success
   3     in the sense of survival, or failure in the sense of
   4     death, to the surgeon as opposed to the whole process
   5     involving the whole of the surgical team?
   6   A. I do not think that I have data to answer that
   7     question. If we look at the overall rates, they are
   8     similar for the two. There are individual operations
   9     where there is a difference, for example, with truncus
  10     arteriosus, just over halfway down the table, but these
  11     are based on very small numbers.
  12        The consequence is that you cannot be certain of
  13     the differences. They are compatible with differences,
  14     but overall, there is no evidence for a systematically
  15     higher rate with one surgeon than another.
  16        That would be compatible with some system failure,
  17     if Bristol were shown to be different to other centres,
  18     but it does not mean it is system failure and you cannot
  19     use these data alone to draw that conclusion.
  20   Q. Can I take you to INQ 12/38? The last table we looked
  21     at, as indeed this table, comes as the "SL" suggests
  22     from the surgeons' logs. Am I right in thinking from
  23     your earlier description that these descriptions are not
  24     lifted straight, as it were, from the wording in the
  25     surgeons' logs, but go through a process whereby the
0065
   1     information in the logs has been coded and the same
   2     coding process applied to that as to other data?
   3   A. Yes.
   4   Q. So the surgeon himself may -- and may with force -- say
   5     "That is not how I would have described this particular
   6     operation"?
   7   A. Yes. I think that that is so. What we have here in
   8     this table, of course, though, is totals, and perhaps
   9     the interesting thing is that during the first three
  10     epochs there is no evidence of any particular trend, and
  11     while in the fourth epoch it appears to be much lower,
  12     we are based on very small numbers and there is some
  13     suggestion that the type of operations changed in that
  14     period.
  15   Q. You are talking of table 4.10 there, I think?
  16   A. I am sorry, yes.
  17   Q. Can we scroll down so we see what you are looking at
  18     there?
  19   A. I am saying that in those three periods, the 12, the
  20     15 per cent and the 13 per cent, they are from
  21     a statistical point of view rather similar.
  22   Q. Can I ask you to go back up the top of the page, which
  23     deals with individual operations? What I was asking
  24     you, and I think you were accepting, is that the
  25     description which an individual surgeon might give to
0066
   1     whether a particular operation should be classed as one
   2     or other of these groups, that may very well be said by
   3     a surgeon, might it, with force, because the nature of
   4     your data is to put it through a coding process where
   5     coders look at the data and make of it as best they can
   6     from their particular perspective?
   7   A. Yes. I think that one of the things is that in the
   8     surgeons' logs there we see a very different death rate
   9     for inter-atrial transposition of the great arteries,
  10     the second line of that table, from that for other
  11     transposition which is the arterial switches.
  12        To me, that demonstrates that the coding of the
  13     switches within the surgeons' logs was very much better
  14     than it was in, for example, the Cardiac Surgeons'
  15     Register, where the difference between the second and
  16     the third group, there was some confusion between them.
  17        My view is that this is reasonably reliable. It
  18     has not been done for the surgeons' logs, but for the
  19     clinically coded records, we actually carried out
  20     a re-sample and a re-look at the data, and found
  21     considerable consistency.
  22        So while a surgeon may say, for any one case this
  23     is not reliable, the overall pattern, I think, would
  24     have to be said to be reliable.
  25   THE CHAIRMAN: Mr Langstaff, I for one did not quite
0067
   1     understand that last answer. I wonder whether Professor
   2     Evans could go through it again? I am looking at "that
   3     demonstrates that the coding of the switches within the
   4     surgeons' logs was very much better than it was, even
   5     for example if the Cardiac Surgeons' Register, where the
   6     difference between the second and the third group, there
   7     was some confusion between them."
   8        That is what you said.
   9        That induces in me a degree of confusion also.
  10   A. Yes, I am sorry, because I have read the report on the
  11     Cardiac Surgeons' Register that has not yet been given
  12     in evidence, and I think it will become clearer when
  13     that does, but what was said by Mr Langstaff in his
  14     opening remarks was that the coding of the switches
  15     between the Mustard and Senning and the arterial
  16     switches, sometimes got confused. The consequence of
  17     a confusion of that kind would lead to the death rate in
  18     each of the groups ending up looking similar. Whereas
  19     in fact, the death rate in the inter-atrial
  20     transposition is very low, and the death rate in the
  21     arterial switches is rather high, and that is shown by
  22     the coding of the surgeons' logs, implying that that has
  23     been done relatively reliably, rather than muddling
  24     between the two.
  25   THE CHAIRMAN: That is very helpful, I am grateful.
0068
   1   MR LANGSTAFF: You say in fact, it is as we see in the
   2     surgeons' log. How can you say that when all you have
   3     to go on is figures?
   4   A. I think that the point is that if the death rates in the
   5     two had both been 8 per cent, which you might see if
   6     there was a lot of error, if you are randomly putting
   7     down one of those codes and assigning deaths to it, we
   8     would then see that the difference in the risk of those
   9     two different operations would be blurred and they would
  10     move together.
  11        Just from the figures, I can see a big separation
  12     in those, and in those it looks that the coding of those
  13     particular operations was done reasonably in the
  14     surgeons' logs and the surgeons' logs, in their words,
  15     also, were clear. Your problem of translation can be
  16     caused by failure to describe clearly, but it looks as
  17     though the combination of the translation, the coding
  18     and the original statement in the surgeons' logs, was
  19     sufficiently clear for distinctions to be made between
  20     them.
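
The blurring effect described in this answer can be shown with a short calculation. This is a minimal sketch: the group sizes and the death rates of 2 per cent and 14 per cent are hypothetical, chosen only so that complete confusion yields the 8 per cent mentioned in the answer; they are not taken from the surgeons' logs or any other Inquiry data source.

```python
def recorded_rates(n_a, rate_a, n_b, rate_b, swap_fraction):
    """Expected recorded death rates for two operation codes when a given
    fraction of the cases in each group is recorded under the other code."""
    stay = 1.0 - swap_fraction
    # Recorded group A = correctly coded A cases + miscoded B cases.
    cases_a = n_a * stay + n_b * swap_fraction
    deaths_a = n_a * rate_a * stay + n_b * rate_b * swap_fraction
    # Recorded group B = correctly coded B cases + miscoded A cases.
    cases_b = n_b * stay + n_a * swap_fraction
    deaths_b = n_b * rate_b * stay + n_a * rate_a * swap_fraction
    return deaths_a / cases_a, deaths_b / cases_b


# Hypothetical figures: 100 cases per group, true rates 2% and 14%.
for swap in (0.0, 0.25, 0.5):
    low, high = recorded_rates(100, 0.02, 100, 0.14, swap)
    print(f"swap fraction {swap:.2f}: {low:.1%} versus {high:.1%}")
```

As the swap fraction rises towards one half, the two recorded rates move together; a wide gap between the recorded rates is therefore consistent with the two codes having been kept apart.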
  21   Q. You, from your perspective, are reporting upon death
  22     rates here at Bristol shown by the process that you have
  23     gone through by a comparison of three different sets of
  24     data.
  25        Am I right in thinking that you, in your report,
0069
   1     do not make any comparison with national rates and so we
   2     have no way of knowing from your data alone how these
   3     particular rates may compare or do compare with
   4     elsewhere?
   5   A. No, you are absolutely correct: no national comparison
   6     was made.
   7   Q. The next area which I want to explore with you is
   8     whether you, as a statistician, had any input into the
   9     90 day cut-off, the grouping of the age from 0 to 90
  10     days, and then from 91 to 365?
  11   A. I had a little input into it, in that, within our data,
  12     we could group it by individual day for each of these
  13     sources, because we had exact dates of death and we had
  14     exact dates of operation. So we could look at that.
  15        Certainly my looking at the data was that in the
  16     first month, the first 30 days, which is traditionally
  17     regarded by many as being the highest risk time, this
  18     was not substantially different from the risks in the
  19     first 90 days, in the data that I looked at.
  20        So it seemed more sensible to compare that.
  21        Outside paediatric cardiac surgery, death rates
  22     fall very dramatically after the first week of life. So
  23     there is a tendency to want to group things in the first
  24     week, or at least the first month of life and not the
  25     rest. In paediatric cardiac surgery, in looking at the
0070
   1     data -- and I think perhaps others may be able to
   2     comment on that -- there is a pattern that suggests that
   3     the death rate in operations that are during the first
   4     90 days are relatively similar to one another, rather
   5     than just in the first 30 days.
   6   Q. I do not know whether there is a comment from my right,
   7     and our panel of experts, as to the degree of
   8     justification there is for having a 0 to 90 day
   9     category, which plainly on the figures that you produce,
  10     shows a much higher death rate, and therefore, perhaps,
  11     may have a tendency to mislead by skewing, given what
  12     you say about early deaths being inevitably more likely
  13     than later ones.
  14        Do we have a comment that any of you would wish to
  15     make?
  16   THE CHAIRMAN: I think the way you ended that sentence
  17     slightly itself skewed our understanding, Mr Langstaff,
  18     with respect. Professor Evans, would you comment on the
  19     question?
  20   A. Can I just comment that the alternatives were to group
  21     0 to 30 days and 30 days to 1 year. Had we done that,
  22     we would have found a rate in the 0 to 30 days that was
  23     high, and perhaps slightly higher than the 0 to 90 days,
  24     but only slightly higher. If we had the 30 days to
  25     1 year, that would muddle a group who had a high rate
0071
   1     from 31 days to 90 with a group that had a notably lower
   2     rate from 91 days to the end of the year.
   3        So grouping it in the way that says "I will only
   4     put 30" ends up muddling things and not exaggerating the
   5     effect. We have not, by having the group at 90 days,
   6     exaggerated any effect.
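
The grouping argument in this answer can be worked through with invented numbers. The sketch below assumes hypothetical counts of operations and deaths in three age bands; none of the figures comes from the Bristol data, and the function name is illustrative only.

```python
# Hypothetical (operations, deaths) per age band in days; not Inquiry data.
bands = {
    (0, 30): (100, 30),
    (31, 90): (100, 28),
    (91, 365): (200, 20),
}


def pooled_rate(bands, lo, hi):
    """Death rate over all bands lying wholly inside the range lo..hi days."""
    ops = sum(n for (a, b), (n, d) in bands.items() if lo <= a and b <= hi)
    deaths = sum(d for (a, b), (n, d) in bands.items() if lo <= a and b <= hi)
    return deaths / ops


# Grouping used in the report: 0 to 90 days, then 91 to 365 days.
print(pooled_rate(bands, 0, 90), pooled_rate(bands, 91, 365))
# Alternative grouping: 0 to 30 days, then 31 to 365 days.
print(pooled_rate(bands, 0, 30), pooled_rate(bands, 31, 365))
```

With these invented counts the 0 to 30 day rate is only slightly above the 0 to 90 day rate, while the 31 to 365 day group mixes a high-rate band with a low-rate band and lands in between.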
   7   MR LANGSTAFF: I see nods from my right so that is
   8     a comment, I think, in itself.
   9        Professor Evans, thank you very much indeed for
  10     your presentation. Sir, I am in your hands. I think it
  11     may be convenient to proceed with Dr Aylin's
  12     presentation following on from that, as it does quite
  13     naturally, and review where we are at, say, 1 o'clock to
  14     a quarter past 1?
  15   THE CHAIRMAN: Yes.
  16   MR LANGSTAFF: I do not know if you are happy to change
  17     places, so Dr Aylin is nearer the screen, if he wishes
  18     to use it?
  19        Dr Aylin, can you remain standing so you can take
  20     the oath?
  21             DR PAUL AYLIN (SWORN):
  22            Examined by MR LANGSTAFF:
  23   Q. Dr Aylin, you have given evidence to us before. Have
  24     you, for the purposes of this part of the Inquiry's
  25     proceedings, prepared a report which we find at
0072
   1     INQ 13/1?
   2   A. Yes, I have.
   3   Q. Does it conclude with a number of tables and figures
   4     which begin at 13/53 --
   5   A. Yes, that is correct.
   6   Q. -- as the screen now shows us, and ends with
   7     a statistical appendix setting out the details of the
   8     statistical methodology you used, ending at page 86?
   9   A. Yes, it does.
  10   Q. You, I think, have prepared a presentation for us of
  11     your investigations. Would you like to give us the
  12     benefits of that, please?
  13   A. Yes. I would like to talk about our analysis of
  14     Hospital Episode Statistics, which was commissioned by
  15     the Inquiry. The report that we presented is in five
  16     sections, essentially. First of all we looked at the
  17     quality of hospital activity data available to us. Then
  18     we looked at outcomes of surgery in terms of mortality
  19     and one or two other outcomes that we looked at,
  20     comparing the United Bristol Healthcare NHS Trust with
  21     the rest of England.
  22        We also looked at comparisons of UBHT with
  23     individual centres in England as well.
  24        We were asked also to look at activity rates and
  25     referrals, and patterns of these, and also we attempted
0073
   1     to examine co-morbidity or co-existing disease and case
   2     mix in Bristol, and comparing it with the rest of
   3     England.
   4        I will just briefly go over the review of hospital
   5     activity data. There are two main bodies of data that
   6     we could have looked at. The first was HIPE, the
   7     Hospital Inpatient Enquiry, which ran from 1985 up until
   8     1986. This was only based on a 10 per cent sample, and
   9     we felt that the numbers involved in the very small part
  10     of the time period that was covered by the Inquiry was
  11     not suitable for analysis.
  12        The next major set of data was the Hospital
  13     Episode Statistics, which was brought in in 1987, and it
  14     is running to this day.
  15        During its introduction in 1987 and for the
  16     following few years, the literature that we reviewed on
  17     this suggested that the data quality was very poor.
  18        The other problem with the data quality was that
  19     the surgical codes that we were interested in, OPCS 4
  20     codes, were not fully established nationally until 1991.
  21        So we felt that because of this patchy
  22     implementation of the coding for operations and
  23     procedures and because of the poor quality of data
  24     before 1991, that the HES data before 1991 was not
  25     suitable to inform the Inquiry. So this analysis
0074
   1     concentrates on Hospital Episode Statistics, beginning
   2     from the financial year 1991 to 1992, and going up to
   3     December 1995.
   4        I am just going to tell you a little bit about the
   5     methods we used for pulling this data off the national
   6     data system.
   7        We extracted all episodes between the period of
   8     1st April 1991 to 31st December 1995 for all children
   9     aged under 16, with a mention of a K or an L code.
  10     I can illustrate the process of this, if we go to
  11     INQ 13/96, I can show you some of the figures involved
  12     in this selection process.
  13        Can we just move down the page a little bit?
  14        If we look at the top, the first extract was based
  15     on years and as I say, we picked out all episodes in
  16     children aged under 16 which had a mention of a K and L
  17     procedure, and we also looked at all other episodes that
  18     belonged to children which we had pulled out beforehand.
  19        I ought to explain about the Hospital Episode
  20     Statistics, that they are based on episodes of care, and
  21     an episode of care is a continuous spell -- a continuous
  22     episode of care spent under a consultant.
  23        It may be possible that you have one or more
  24     episodes of care within an admission.
  25        Can I just show you slide 97? This just
0075
   1     illustrates how we managed to bring episodes together.
   2        Can we scan down a little bit? This is just
   3     a little diagram that may help to explain about episodes
   4     of care.
   5        One of the problems with Hospital Episode
   6     Statistics is that we do not have a patient identifier
   7     on the data that we had obtained, so we were not able to
   8     identify, through an identifying code, individual
   9     children.
  10        What we had to use instead was to use date of
  11     birth, sex and postcode, to try and link episodes of
  12     care to children, and link them together. We also had
  13     the date of admission, and we were therefore able to
  14     link episodes of care into admissions.
  15        This is a hypothetical example of one admission
  16     with three episodes of care, and there may have been an
  17     operation in the first episode, and a subsequent
  18     episode 2 and episode 3. That series of episodes, which
  19     we will call an "admission", ended in either a discharge
  20     home, a transfer to another hospital, or, in some cases,
  21     a death. Our linkage system was not perfect because we
  22     used date of birth, sex and postcode. During the course
  23     of an admission there may be a postcode change or there
  24     may be an error in data entry of a date of birth or sex
  25     or postcode which would make us unable to
0076
   1     link episodes together to form an admission.
   2        In certain admissions, we were unable to find the
   3     final episode and therefore we were unable to ascertain
   4     whether a patient went home or was transferred or died
   5     and in that case, the outcome of the admission was
   6     unknown.
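
The linkage described above can be sketched as follows. This is an illustration under stated assumptions, not the analysts' actual procedure: the field names (`dob`, `sex`, `postcode`, `admitted`, `order`, `outcome`), the example records and the rule that the outcome sits on the last linked episode are all hypothetical.

```python
from collections import defaultdict


def link_admissions(episodes):
    """Group episodes into admissions on (dob, sex, postcode, admission date)."""
    groups = defaultdict(list)
    for ep in episodes:
        key = (ep["dob"], ep["sex"], ep["postcode"], ep["admitted"])
        groups[key].append(ep)

    admissions = []
    for key, eps in groups.items():
        eps.sort(key=lambda e: e["order"])
        # Assumption: the outcome is recorded only on the episode that ends
        # the admission; if that episode failed to link, it stays unknown.
        outcome = eps[-1]["outcome"] or "unknown"
        admissions.append({"key": key, "n_episodes": len(eps), "outcome": outcome})
    return admissions


def ep(dob, sex, postcode, admitted, order, outcome=None):
    """Build one hypothetical episode record."""
    return {"dob": dob, "sex": sex, "postcode": postcode,
            "admitted": admitted, "order": order, "outcome": outcome}


episodes = [
    ep("1990-05-01", "F", "BS1 1AA", "1992-03-10", 1),
    ep("1990-05-01", "F", "BS1 1AA", "1992-03-10", 2),
    ep("1990-05-01", "F", "BS1 1AA", "1992-03-10", 3, "discharged"),
    ep("1991-07-15", "M", "BS2 2BB", "1993-01-20", 1),
    # Same child, but the date of birth was mistyped, so this does not link.
    ep("1991-07-16", "M", "BS2 2BB", "1993-01-20", 2, "died"),
]
admissions = link_admissions(episodes)
for adm in admissions:
    print(adm["n_episodes"], adm["outcome"])
```

The first child's three episodes link into one admission with a known outcome. The data-entry error splits the second child's admission in two, leaving the first fragment with an unknown outcome.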
   7        I just want to go back to that page 96 now,
   8     please, and take you through this a little bit further.
   9        For this first extract, we pulled out all episodes
  10     of care which were linked to episodes which had a K or
  11     an L procedure in children aged under 16, for the
  12     financial years 1991/92 to 1995/96.
  13        For most of the tables and the analyses that we
  14     are going to present, they relate to epoch 3, which is
  15     1st April 1991 to 31st March 1995 and so this
  16     subselection here refers to this particular period in
  17     time.
  18        We identified 216,000 episodes out of those 289,000
  19     episodes.
  20        At this point, we started to link the episodes
  21     together, to form admissions. We identified 41,000
  22     admissions, with a mention of a K or an L procedure.
  23        There were many other episodes that may have been
  24     related to the same child but did not have a mention of
  25     a K and L procedure, so we may have an admission in the
0077
   1     early part of the period of a child who had a procedure
   2     with a K or an L code mentioned.
   3        Subsequently there may have been an admission for
   4     asthma or another surgical procedure later on in this
   5     period which we were not interested in, so we discarded
   6     those, so we kept all admissions with a mention of K or
   7     L procedures and that came to 41,000.
   8        We were then able to divide these up into
   9     admissions to UBHT, Bristol, and those admissions to the
  10     rest of the English admissions.
  11        Then, at this lowest point, we used the procedure
  12     groupings which we have talked about a little bit
  13     earlier, and the broad classes of procedures, the open
  14     and closed classes of procedures, to extract these more
  15     relevant procedures from the overall numbers of
  16     admissions with K and L codes.
  17        So these admissions that were discarded had
  18     mentions of K and L codes that were not included in our
  19     13 key procedure groups, or in our open and closed
  20     operations.
  21        I must say, I just want to make a point of
  22     clarification on the way in which the procedure
  23     groupings were actually used. On the HES database,
  24     there are four fields for operation codes and these use
  25     the OPCS 4 operation coding systems. There may be just
0078
   1     one procedure mentioned or there may be two, three or
   2     four procedures mentioned. If we are looking at an
   3     admission with a number of episodes, there may be
   4     a number of procedures in each of the episodes.
   5        The grouping systems were devised by our team and
   6     the other teams that are talking today, in order to make
   7     sense of these OPCS 4 coding systems.
   8        So the 13 procedure groupings and the broad class
   9     of "open" and "closed" were not known to the coders when
  10     coding the data; they just put the individual OPCS 4
  11     code in there. It was us that used the groupings in
  12     order to be able to have broadly clinically similar groups
  13     of operations so that we could make comparisons.
  14        The ranking system was devised by us in
  15     collaboration with the surgical experts, to enable us to
  16     pick a primary procedure out of a number of procedures,
  17     so the most important procedure out of a number of
  18     procedures in a particular admission.
  19        So these were developed by us for the analysis,
  20     and are not known to the coders in the hospitals who
  21     code these things up.
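
Choosing a primary procedure by rank might look like the sketch below. The group names and their ranks are placeholders invented for illustration; the actual 13 procedure groups and their ranking are not reproduced in this part of the transcript.

```python
# Hypothetical ranking: 1 is the most important procedure group.
RANK = {"group_a": 1, "group_b": 2, "group_c": 3}


def primary_procedure(groups, rank=RANK):
    """Return the highest-ranked procedure group recorded in an admission,
    or None if none of the recorded procedures is in a ranked group."""
    ranked = [g for g in groups if g in rank]
    if not ranked:
        return None
    return min(ranked, key=rank.__getitem__)


print(primary_procedure(["group_c", "other", "group_a"]))
print(primary_procedure(["other"]))
```

An admission whose procedures all fall outside the ranked groups returns `None`, which corresponds to the admissions described earlier as discarded.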
  22        Once we had our extract of data, we analysed the
  23     data in a number of different ways. If we could go to
  24     slide 100. I am going to present to you, if we could
  25     move up a little bit, four or three main kinds of
0079
   1     analysis, and I am going to talk a little bit about
   2     confidence intervals again, although we had a very good
   3     explanation earlier this morning.
   4        We looked at mortality rates for individual
   5     operations in individual age groups for our procedure
   6     groups of operations. We looked at the proportion of
   7     admissions which ended in a death as far as we could
   8     tell.
   9        You remember I talked to you about linking
  10     episodes together and in a few percentages of episodes,
  11     we were unable to find out what actually happened at the
  12     end of the admission. We were unable to find out
  13     whether they were discharged home or transferred or
  14     died.
  15        In calculating our mortality rates, we have
  16     excluded those unknown outcomes in our analysis, so we
  17     simply, in calculating mortality rates or the proportion
  18     of admissions that ended in death, looked at those
  19     admissions where we have known the outcome.
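
A minimal sketch of a mortality rate that leaves out admissions with unknown outcomes, as described above. The outcome labels and the counts are hypothetical.

```python
def mortality_rate(outcomes):
    """Proportion of admissions ending in death, counting only admissions
    whose outcome is known. Returns None if no outcome is known."""
    known = [o for o in outcomes if o != "unknown"]
    if not known:
        return None
    return sum(o == "died" for o in known) / len(known)


# Hypothetical admissions: 10 deaths, 90 other known outcomes, 4 unknown.
outcomes = (["died"] * 10 + ["discharged"] * 85
            + ["transferred"] * 5 + ["unknown"] * 4)
print(mortality_rate(outcomes))
```

Here the denominator is the 100 admissions with a known outcome, not all 104.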
  20        We have calculated these rates both for UBHT and
  21     for the rest of England as a whole, and I will present
  22     those in a minute.
  23        We also looked at the ratio between mortality at
  24     UBHT and the rest of England. For instance, if we had
  25     10 deaths out of 100 admissions in the UBHT, and, say,
0080
   1     50 deaths out of 1,000 admissions in the rest of England,
   2     the ratio of the mortality rates between the UBHT and
   3     the rest of England would be 2, that is, 10 per cent
   4     mortality in UBHT and 5 per cent in the rest of
   5     England. So we gave the ratio of that and expressed
   6     that as a mortality ratio.
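
The arithmetic of this example can be written out as a short function. The function name is illustrative; the figures are those given in the answer, read as 10 deaths in 100 admissions against 50 deaths in 1,000 admissions.

```python
def mortality_ratio(deaths_a, admissions_a, deaths_b, admissions_b):
    """Ratio of the mortality rate in centre A to the rate in comparator B."""
    rate_a = deaths_a / admissions_a
    rate_b = deaths_b / admissions_b
    return rate_a / rate_b


# 10 per cent against 5 per cent gives a ratio of 2.
print(mortality_ratio(10, 100, 50, 1000))
```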
   7        We also calculated excess deaths according to the
   8     Bayesian principles that have been described before, and
   9     if we could go to slide 102, we looked at the difference
  10     in performance or in mortality between UBHT and
  11     a typical centre, quantified by predicting the numbers
  12     of deaths expected if UBHT had the typical mortality
  13     rate, or the mortality rate of a typical centre, and
  14     compared this to the observed numbers of deaths, so we
  15     worked out the numbers of deaths we would expect if UBHT
  16     had the mortality rate of a typical centre in the rest
  17     of En