A Colossus come to judgment:
GIO's expert system on general damages
Graham Greenleaf, Faculty of Law, University of New South Wales
(published in the Law & Information Technology column, Australian Law Journal)
26 November 1992
Inside Colossus
Artificial intelligence suffers whiplash
An opaque oracle?
Verdicts on Colossus
Jobsearch prognosis

Can a computer program which knows little about the Motor Accidents Act 1988 (NSW), and nothing of comparable verdicts, be relied upon to determine the assessment of bodily injury claims in New South Wales? Colossus, an ‘expert system’ developed jointly by GIO Australia and Continuum Australia, has that function, and the software has now been exported to overseas insurers. With around 15,000 rules in its knowledge-base, Colossus is one of the world's largest commercial expert systems.
The GIO had problems. From 1984 to 1988, its claims costs had been increasing by 14% annually. At least part of the fault seemed to lie within. An actuarial study of GIO assessment practice prior to 1988 found that claims were being assessed and settled by assessors and lawyers with very variable expertise and experience. A re-assessment of randomly selected completed files found that among less experienced officers assessments would typically vary by 100%, but the variation decreased (not surprisingly) with the experience of the assessor.
Alternative solutions were rejected. More extensive training of assessors was regarded as impractical because of the difficulty of providing sufficiently comprehensive guidelines in an understandable form, and of ensuring they were correctly followed by over 250 assessors. A database of comparable verdicts and settlements, based on the many thousands of closed claims in GIO files, was considered but rejected because of the difficulty in designing it so that genuinely comparable claims would be retrieved, given the complexity of variables in a claim. Development and updating costs would also be very high. Furthermore, such a system could easily involve the entrenchment of bad decisions.
The preferred solution was to attempt to capture the accumulated expertise of GIO's best claims assessors in what is usually called an ‘expert system’, so as to attempt to mimic the reasoning processes of an expert assessor. A retired District Court judge also worked for six months to ensure that Colossus was considering the correct heads of damages. The objectives, common in the development of expert systems, were to enable the organisation to speak with a consistent voice, and to capture the knowledge of experts so that it was available for use by all staff (even after the experts have left the organisation).
Colossus has been used in all GIO Funds Administration Offices since July 1989, and about 75% of all personal injuries claims are now processed through Colossus. It does not handle claims concerning spinal cord injuries, brain damage or nervous shock.

Inside Colossus

A Colossus consultation requires an assessor to answer up to 700 questions (but usually far fewer than this) before it makes an assessment. Age and gender of the claimant are the only purely personal details considered. Each injury is classified according to 600 injury codes. The initial, subsequent and future treatment of each injury is ascertained. Numerous questions are asked to determine the level of trauma: ‘In the case of broken bones it will ask if there were problems of bony union (the system knows the normal union time for all bones by age and sex). If a lower extremity is injured, it will enquire about the type and time walking aids were used. It will want to know if traction was needed...’ and so on (B Hornery and P Beinat ‘Colossus’, Judicial Officers Bulletin, V3, No 2, March 1991). The system attempts to give its users considerable help in answering these questions, providing on-line medical encyclopedias and the like. Each impairment is assessed, and the level of impairment of the whole person calculated. Disabilities are assessed, taking into account such factors as duties performed under duress, ages of children, domestic assistance received, family status, and types of loss of enjoyment of life. An assessment of the value of disfigurement is left to the assessor.
The system then combines five elements (trauma, impairment, disability, loss of enjoyment of life and disfigurement), each of which is constructed from this fact gathering, into one overall seriousness percentage figure, the percentage that this claim is of a catastrophic injury (eg quadriplegia).
For common law cases, the only ‘comparable verdict’ that the system knows is the highest verdict that has been awarded. It calculates the range of ‘Colossus assessments’ as a simple percentage of this maximum amount. All other ‘comparable verdicts’ or previous assessments are ignored in this simple linear scaling. For example, if the general damage is considered to be 36.65% of a most severe case, Colossus' range of common law assessment is $77,900 to $91,600.
The logic of Colossus' approach now has a close parallel in s79(2) of the Motor Accidents Act 1988 (NSW) which provides that ‘The amount of damages to be awarded for non-economic loss [as a consequence of a motor accident] shall be a proportion determined according to the severity of the non-economic loss, of the maximum amount which may be awarded’. The maximum amount, which may only be awarded in ‘a most extreme case’ (s79(5)), is $180,000, indexed for inflation (s80), so the maximum amount on which Colossus bases its linear scaling for motor vehicle accidents at present is $211,000. In motor accident cases, no damages are to be awarded for non-economic loss unless the injured person's ability to lead a normal life is significantly impaired (s79(1)), and only then if the loss is assessed as being greater than $15,000. In the above example, the Motor Accidents Act assessment would be $77,300. For motor accident claims, Colossus also calculates what its common law assessment would have been, as shown above.
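The arithmetic of this linear scaling can be sketched in a few lines of Python. This is an illustration only, not GIO's code: the function name is hypothetical, the rounding to the nearest $100 is an assumption inferred from the $77,300 figure, and so is the choice to test the $15,000 threshold before rounding. The maximum and the threshold are the figures quoted in the text.

```python
from decimal import Decimal, ROUND_HALF_UP

# Figures quoted in the text (late 1992).
MAA_MAXIMUM = Decimal("211000")    # $180,000 indexed for inflation
MAA_THRESHOLD = Decimal("15000")   # no award unless the loss exceeds this


def scaled_assessment(severity_percent, maximum=MAA_MAXIMUM,
                      threshold=MAA_THRESHOLD):
    """Scale a severity percentage linearly against a maximum award.

    Hypothetical helper: the severity percentage itself comes from
    Colossus' undisclosed combination of its five elements.
    """
    raw = maximum * Decimal(str(severity_percent)) / Decimal("100")
    if raw <= threshold:
        # At or below the threshold: nothing for non-economic loss.
        return Decimal("0")
    # Assumption: the published figure is rounded to the nearest $100.
    hundreds = (raw / Decimal("100")).quantize(Decimal("1"),
                                               rounding=ROUND_HALF_UP)
    return hundreds * Decimal("100")


if __name__ == "__main__":
    print(scaled_assessment(36.65))  # the example in the text
```

On the example in the text, 36.65% of $211,000 is $77,331.50, which this sketch rounds to the $77,300 reported. The common law range cannot be reproduced in the same way, because the highest verdict on which it is based is not stated.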
Between the poles of ‘a most extreme case’ and ‘no significant impairment’, how Colossus reaches its recommended figure is subject to three broad sets of variables. The validity of any of these steps could be open to reasonable argument. First is how the individual assessor has interpreted the source documents (medical reports etc) in answering Colossus' questions. Disagreement here could arise from differing interpretations of the facts of the injury and treatment, even though Colossus has extensive medical ‘help’ facilities. Second is how Colossus uses this information to construct each of its five basic categories (trauma, impairment, disability, loss of enjoyment of life and disfigurement). GIO has based its measures of relative injury severity on the American Medical Association's ‘Guides for the Evaluation of Permanent Impairment’ and its own claims experience. Even so, disagreements about the relative severity of certain categories of injury could arise. Third is the algorithm used to combine these categories into one composite percentage figure. Disagreements about the importance of, say, trauma compared with loss of enjoyment of life could arise here. The choices that the system designers have made in determining these last two matters are, in large part, the GIO's embodiment of its expertise, based on its past claims experience. But expertise is not, of course, infallibility.
Before a Colossus assessment is used as the basis for settlement offers, the whole file including the Colossus report is reviewed by an authorised officer, to reduce any dangers of misinterpretation of the facts or inadequate weight being given to very unusual factors. In this sense, the Colossus assessment is used as a guide to the eventual settlement offer, not its final arbiter.

Artificial intelligence suffers whiplash

An interesting wrinkle in the early performance of Colossus was that this linear scaling of injury seriousness resulted in Colossus making offers for whiplash which were far lower than the ‘going rate’ in the late 1980s. The GIO considered that this reflected previous artificially high settlements for whiplash in New South Wales, but business reality dictated that it adjust Colossus so that it would offer somewhat more than a pure linear scaling where whiplash was involved. It may be easier to change Colossus to reflect subtle commercial realities, or statutory changes (such as the introduction of the Motor Accidents Act), than it would be to train numerous assessors to reflect such factors in their assessments.

An opaque oracle?

For each assessment it makes, Colossus produces a report of a couple of pages. The report is a comprehensive and structured listing of all of the data that Colossus has been ‘fed’ in order to make its calculations, which should make it easier for both sides to determine whether they are negotiating on a common factual basis.
However, the report then merely states the overall percentage of a most severe case that it has calculated the injuries to be. No explanation is given of how the various components (trauma, impairment, disability, loss of enjoyment of life and disfigurement) are each measured, or how they are then combined with each other by Colossus. This limited explanation must reduce the assistance that the report gives to an assessor or lawyer representing the GIO who is attempting to negotiate a settlement. When faced with the other side's calculations of the main elements of the claim, it is not likely to be convincing to say ‘Colossus disagrees’. GIO is currently examining ways to provide a report giving an outline of the logic applied in arriving at a conclusion.
It is GIO policy to give a copy of the current Colossus report, such as it is, to the claimant's lawyers on request. On occasions, it has also allowed claimants' solicitors to put their own answers through a Colossus consultation. Could a more comprehensive explanation of the logic applied by the system also be provided to the claimant? GIO would no doubt regard precise details of how the various elements were calculated, and how they are combined (the ‘assessing rules’), as confidential, the key to Colossus' competitive performance as a means for settling claims. More meaningful explanations in the report should assist negotiations, but there will always be some inherent tension in how much it is safe to disclose.

Verdicts on Colossus

How do you measure the effectiveness or value of a system like Colossus? A GIO test showed a standard deviation of 80% from the mean in the assessment of the same bodily injury claim by different assessors, but this deviation was reduced to 15% when Colossus was used (the deviation is because different assessors still interpret the input data differently). GIO is very satisfied with its overall performance in that the average general damages component per claim has not risen since 1988, compared with the previous annual rate of 14%.
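A ‘standard deviation of 80% from the mean’ is most naturally read as the standard deviation of the assessments expressed as a percentage of their mean. A minimal Python sketch of that measure follows. The function name is hypothetical, the use of the population rather than the sample standard deviation is an assumption (the GIO test does not say which was used), and the dollar figures are invented purely to show the calculation.

```python
from statistics import mean, pstdev


def relative_spread(assessments):
    """Standard deviation of the assessments as a percentage of their mean."""
    return pstdev(assessments) / mean(assessments) * 100


if __name__ == "__main__":
    # Invented figures: several assessors valuing the same claim.
    unaided = [20_000, 45_000, 90_000, 150_000]
    print(f"{relative_spread(unaided):.0f}% of the mean")
```

On this reading, the fall from 80% to 15% means that assessments of the same claim cluster far more tightly around their average when Colossus is used.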
Is it meaningful to ask how ‘accurate’ Colossus is? One measure of the ‘accuracy’ of Colossus could be the extent to which claimants accept ‘its’ offers. GIO says that, in the past year, 72% of claims were settled, of which the 74% managed by Colossus were settled for within 9% of the Colossus assessment. Of course, many matters go to trial for reasons which have nothing to do with general damages, such as questions of liability and future economic loss. Another measure could be the extent to which Courts reach the same measure of damages as Colossus does, with the attendant risk of costs against a claimant who chooses to litigate when an offer of compromise is made and rejected. Of those claims that are fully litigated, GIO figures indicate that the Supreme Court awards verdicts which, for general damages, are on average 20% higher than the Colossus estimate, and the difference in relation to verdicts by the District Court is even greater. However, there may be many explanations for these differences. There may be some distortion at present by the ‘tail’ of difficult common law motor accident claims in New South Wales. There will also always be claims which it is better to litigate than to settle at the figure Colossus would recommend, either because there are special circumstances affecting the plaintiff which Colossus is unable to take into account, or because the plaintiff is likely to present particularly well in court.
Whatever the best measure is, it is important that Colossus makes ‘fair’ settlement offers. Even though the GIO now operates in a competitive environment, which sometimes lessens dangers of unfairness, long run commercial imperatives are not sufficient to ensure justice in individual cases. Of course, the claimant can always go to Court, but it is in the public interest that the maximum number of claims be settled, and that settlements be fair.
Colossus is just one instance of an important challenge of the information age: how to ensure that computer-based decision-making is fair and non-discriminatory. Australia is at the forefront of the development of such systems, with other examples such as the NSW Sentencing Information System, Commonwealth data matching schemes, and the computerisation of social security administration having been the subjects of past columns in this series. In many cases they lead to improved consistency, and equity, in overall decision making. However, we need to develop measures to better ensure that they are also fair in individual cases. Administrative law, and competition, are valuable in eventually revealing system deficiencies, but maximum openness (‘transparency’) in how such systems operate, with the consequent opportunities for informed public debate, is the best preventative measure.
GIO has adopted measures such as extensive medical help systems, and checking of assessments by authorised officers, to try to ensure that Colossus operates fairly, and it is considering others such as more explanation of Colossus recommendations. It is equally important that the legal profession engages in an informed dialogue with GIO and the operators of similar systems.

Jobsearch prognosis

Another GIO expert system, only recently in use, calculates economic loss for adults. It examines the impact of the accident on the claimant's employability by comparing the prognosis of each doctor who reported on the claim with the permanent impairment ratings generated by the general damages component of the system and the requirements of the claimant's pre-injury occupation and other occupations, calculating its own ‘level of belief’ in each prognosis depending on how well it matches this other data.
It uses the Australian Standard Classification of Occupations (ASCO) to determine the relevant educational, mobility, dexterity etc criteria for 1600 occupations, as well as pay rates. This data is used to find post-accident occupations open to the claimant, and Commonwealth Employment Service (CES) data on the regional availability of employment in the relevant occupations is then used to determine the likelihood of employment. Labour market analysis is to be added. Assessors can over-ride any of the conclusions drawn by the economic loss system because of the likelihood that they will have local knowledge unknown to the system.
(Much of the above description is derived from B Hornery and P Beinat ‘Knowledge-Based Systems in Insurance’, a paper published by Housley Communications Consultants, 1991, and the other paper by those authors cited above. Barry Hornery and Geoff Meadows of GIO also assisted with very helpful discussions. The opinions expressed are, of course, mine and not theirs.)