FINDER: an expert system
Introduction
FINDER is an "expert system" which is able to solve a limited class of legal problem, giving its reasons in the form of a legal opinion. It is currently written in Basic for the IBM PC.
FINDER's domain of knowledge is a class of so-called finders cases. A typical case concerns a dispute between a person who finds a chattel and the occupier of the real estate on which it was found.1
This class of cases was chosen for several reasons. It is small and therefore suitable for a pilot study but it is by no means a simple area of the law. Many law schools use the cases as a vehicle for teaching case analysis. Indeed, one of the judges has said:
"These cases have long been the delight of professors and text writers whose task it is to reconcile the irreconcilable" [Hibbert v McKiernan [1948] 2 KB 142, per Lord Goddard].
I also chose the class because I had earlier done some analysis on the cases which provided the basis for the algorithms used in FINDER.2 This earlier analysis showed that the finder cases could be analyzed on the basis of factual description.
"Facts" have rather fallen from favour in modern jurisprudential thought because of the analysis of the "fact skeptics" of the American Realist school.3 The argument is that so-called "facts" are as much a product of the judge's decision as any other part of the case. I have no doubt that this is true, but the way that it is understood by many proves too much since the common understanding appears to be that the outcome of cases is wholly unpredictable.
Part of the entire training of a lawyer is to bring the student to the point where he or she can predict, indeed influence, the outcome of a case. Part of the way that this is done is to elicit certain "facts" from the client. These "facts" may depend upon the class of case which is under consideration, but are in some instances quite predictable. In mathematical parlance, there is a relatively small set of variables whose values contain enough information to predict the outcome of the case, a proposition which is supported by a substantial amount of research.4
Further, the fact variables may be used for more than just predicting the outcome of a new case.5 The important point is that they contain enough information to provide an important tool for the analysis of a body of case law. Standard statistical methods such as cluster analysis, non-metric scaling and discriminant analysis may be used to assist in the traditional methods of legal research.6
I do not know if this is true for all classes of cases. I doubt if it is. But it is true for the finder cases and this makes them a good choice as a knowledge base for an expert system.
FINDER: outline of operation
FINDER presumes that the client-operator is involved in a finder dispute. It asks a series of questions, currently ten, which are to be answered "yes" or "no". I will comment later in the paper on the deficiencies of this approach and how it may be changed.
Following the questioning, FINDER enters its computational mode using an algorithm developed from the work mentioned earlier.7 This algorithm has the important feature that the time it requires grows only in proportion to the size of the knowledge base, ie, doubling the size of the knowledge base will only double the amount of time required for the computational stage. In many expert systems, the time grows exponentially. In an exponential system toy programs work well, but it may prove impossible to scale them to a useful size.
The algorithm chooses a predicted outcome for the client's case on the basis of the outcome of the "nearest" case in the knowledge base. This is the case which provides the "authority" for the predicted outcome. The knowledge base is then searched for the case which is "nearest" to the client case but with the opposite outcome; this is the case which will be distinguished in the written opinion.
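The selection of the authority and of the case to be distinguished can be sketched as follows. This is a minimal illustration in Python, not FINDER's Basic code. It assumes that "nearest" means the smallest number of differing yes/no answers (Hamming distance); the paper does not state the measure actually used. The names `Case`, `distance`, `nearest` and `classify`, and the four cases in `KNOWLEDGE_BASE`, are hypothetical and are not drawn from FINDER's knowledge base.

```python
from typing import NamedTuple, Tuple


class Case(NamedTuple):
    name: str
    facts: Tuple[bool, ...]  # yes/no answers, one per question
    finder_wins: bool        # outcome of the decided case


def distance(a, b):
    """Number of questions answered differently (assumed measure)."""
    return sum(x != y for x, y in zip(a, b))


def nearest(cases, facts):
    """Single pass over the cases: time grows linearly with their number."""
    return min(cases, key=lambda c: distance(c.facts, facts))


def classify(cases, client_facts):
    """Return (authority, distinguished) for the client's facts.

    Raises ValueError if every case has the same outcome, since there
    is then nothing to distinguish.
    """
    authority = nearest(cases, client_facts)
    opposite = [c for c in cases if c.finder_wins != authority.finder_wins]
    distinguished = nearest(opposite, client_facts)
    return authority, distinguished


# Hypothetical knowledge base with four questions per case.
KNOWLEDGE_BASE = [
    Case("Case A", (True, True, False, False), True),
    Case("Case B", (True, False, False, True), True),
    Case("Case C", (False, True, True, True), False),
    Case("Case D", (False, False, False, False), False),
]

client = (True, True, True, False)
authority, distinguished = classify(KNOWLEDGE_BASE, client)
print("Authority:", authority.name)              # Case A (one difference)
print("To be distinguished:", distinguished.name)  # Case C (two differences)
```

Each call to `nearest` examines every case once, which is the linear growth described above. Ties are broken here by position in the list, which is an arbitrary choice.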
The written opinion follows a rigid format of the type which is often taught to first year students. First the outcome of the case and the most relevant authority are announced. A description of this authority is then given, followed by a noting of the "most important" similarities with the client's case.8 There is then an acknowledgement that the client's case is not "on all fours" with the authority and the "most important" differences are identified.
The process is then repeated with the case which must be distinguished except, of course, that only the "most important" differences are noted.
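The rigid format of the opinion lends itself to a simple template. The sketch below, in Python, is only an illustration under stated assumptions: it lists every similarity and difference, whereas FINDER selects the "most important" ones by a statistical definition that is not reproduced here. The function names, the fact descriptions in `LABELS` and the two cases are all hypothetical.

```python
def split_facts(labels, client, other):
    """Partition the fact descriptions into those shared and those differing."""
    same = [l for l, c, o in zip(labels, client, other) if c == o]
    diff = [l for l, c, o in zip(labels, client, other) if c != o]
    return same, diff


def listing(items):
    """Join descriptions for prose; say 'none' when the list is empty."""
    return "; ".join(items) if items else "none"


def write_opinion(labels, client, authority, distinguished):
    """Each case is a (name, facts, winner) tuple; returns the opinion text."""
    a_name, a_facts, a_winner = authority
    d_name, d_facts, _ = distinguished
    same, diff = split_facts(labels, client, a_facts)
    _, d_diff = split_facts(labels, client, d_facts)
    paragraphs = [
        f"In my opinion the {a_winner} will succeed. "
        f"The most relevant authority is {a_name}.",
        f"The present case resembles {a_name} on these points: {listing(same)}.",
        f"The present case is not on all fours with {a_name}. "
        f"It differs on these points: {listing(diff)}.",
        f"{d_name} must be distinguished. "
        f"It differs from the present case on these points: {listing(d_diff)}.",
    ]
    return "\n\n".join(paragraphs)


# Hypothetical fact descriptions; these are not FINDER's actual questions.
LABELS = (
    "whether the finder was lawfully on the premises",
    "whether the occupier lived on the premises",
    "whether the chattel was attached to the land",
    "whether the finder was employed by the occupier",
)

client = (True, True, True, False)
authority = ("Case A", (True, True, False, False), "finder")
distinguished = ("Case C", (False, True, True, True), "occupier")

text = write_opinion(LABELS, client, authority, distinguished)
print(text)
```

For the case to be distinguished only the differences are reported, mirroring the format described above.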
In order to prevent FINDER from making a complete fool of itself, there is an additional quick method used for predicting the outcome of the client's case.9 Should this alternative prediction prove to be different from the main classification, FINDER warns the user that the case is "difficult" and that the advice of a human lawyer should be sought.
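The cross-check can be sketched as follows, again in Python and only as an illustration. The centroid of each outcome group is computed, the quick prediction is the outcome of the group whose centroid is closest to the client's case, and the case is flagged as "difficult" when this disagrees with the nearest-case prediction. Coding the answers as 0 and 1 and measuring closeness by Euclidean distance are assumptions, as are all the names and the six cases in `CASES`.

```python
import math


def centroid(vectors):
    """Component-wise mean of a group of equal-length 0/1 fact vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quick_prediction(cases, client):
    """Outcome of the group whose centroid is closest to the client's case."""
    groups = {}
    for facts, outcome in cases:
        groups.setdefault(outcome, []).append(facts)
    return min(groups, key=lambda o: euclid(centroid(groups[o]), client))


def nearest_prediction(cases, client):
    """Outcome of the single nearest case (the main classification)."""
    _, outcome = min(cases, key=lambda c: euclid(c[0], client))
    return outcome


def advise(cases, client):
    """Return (predicted outcome, difficult?) for the client's case."""
    main = nearest_prediction(cases, client)
    return main, quick_prediction(cases, client) != main


# Hypothetical knowledge base: (facts, outcome) pairs with four questions.
CASES = [
    ((1, 1, 1, 1), "finder"),
    ((1, 1, 1, 0), "finder"),
    ((0, 0, 0, 1), "finder"),   # an outlier within its own group
    ((0, 1, 1, 0), "occupier"),
    ((1, 0, 1, 0), "occupier"),
    ((0, 0, 1, 1), "occupier"),
]

for client in [(1, 1, 0, 1), (0, 0, 0, 0)]:
    outcome, difficult = advise(CASES, client)
    note = " (difficult: consult a human lawyer)" if difficult else ""
    print(client, "->", outcome + note)
```

The second client lies next to the outlying finder case but closer to the occupier centroid, so the two methods disagree and the warning is raised.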
FINDER: some criticism
The client interface: it should not be necessary to pass sequentially through each of the questions every time. A more suitable approach would be to identify which questions are "most important" in some statistical sense and to ask only such questions as are necessary for the decision to be made. Experiments using a discriminant analysis algorithm indicate that sometimes as few as two questions will suffice and never more than seven.10 But then the opinion becomes more difficult to write.
The size of the data base: whoever heard of a knowledge base of only eight cases? There is at least the possibility that as the base grows FINDER does not become smarter, only more confused. My intuition is that it will continue to perform well as it grows, but only experience will confirm or refute this.
The form of the data base: This is currently a more important deficiency than the size. A fundamental concept of expert systems is that the data base should be separate from the "inference engine" part of the program and, more importantly, should be very easy to modify and update.11 FINDER could be that way, but is not now. Adding cases is easy, but changing variables and text is tedious.
The program: the existing program needs to be completely scrapped and a new one written. It is the product of too many hours of tinkering and too few of planning. When I first started the project the only machine available full time was a small portable that spoke only Basic. Memory was extremely limited and the data structure of the program reflects the early need to keep the size as absolutely small as possible. All of the rest of the program has suffered as a result.
The language: contrary to a lot of popular belief, modern Basic is not a bad language for a project such as this.12 Unfortunately, the program is not written in modern Basic. The "modern" part refers to the availability of true subroutines, CASE statements and recursion. There are a number of compilers available now for PCs, but I don't have one.
Relationship with other "Expert Systems"
Most successful expert systems in use at the current time are "production rule" systems.13 The knowledge base in such a model consists of a number, usually large, of rules of the form (IF A is true THEN take some action B). The truth or falsity of A may be established from other rules or, as a last resort, by asking the user of the system.
In the simplest systems, A itself may be indecomposable. Such a system may exhibit foolish behaviour if care is not taken in the construction of the production rules. For example, the system may know that "The Ball is Red" but still ask the user "Is the ball green". One solution is to include certain kinds of rules to keep the system better informed. In the example, this would include an additional rule of the form (IF the ball is red THEN it is false that the ball is green). Not only does this lack elegance, but the integrity of the knowledge base becomes difficult to police.
A better system is one in which each proposition is decomposable at least to the (subject predicate object) format. In the literature, these are known as associative triples and are usually denoted as (object attribute value), so that (Ball-1 colour green) would describe the fact that "Ball-1 is green".14 This representation allows the manipulative part of the system, the "inference engine", to draw inferences which make the system appear somewhat more intelligent.
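A small sketch shows how the triple representation avoids the foolish question about the ball. It is written in Python and assumes that each (object, attribute) pair has exactly one value, so that knowing the colour is red settles any question about green without an extra rule. The class `TripleStore` and its methods are hypothetical names chosen for the illustration.

```python
class TripleStore:
    """Stores (object attribute value) triples, one value per attribute."""

    def __init__(self, ask_user):
        self._values = {}
        self._ask_user = ask_user  # called only as a last resort

    def tell(self, obj, attribute, value):
        self._values[(obj, attribute)] = value

    def holds(self, obj, attribute, value):
        """Is (obj attribute value) true? Ask the user only if unknown."""
        key = (obj, attribute)
        if key not in self._values:
            self._values[key] = self._ask_user(obj, attribute)
        return self._values[key] == value


questions_asked = []


def scripted_user(obj, attribute):
    """Stands in for the human user; records each question put to it."""
    questions_asked.append((obj, attribute))
    return "large"


store = TripleStore(scripted_user)
store.tell("Ball-1", "colour", "red")

print(store.holds("Ball-1", "colour", "green"))  # False, and no question asked
print(store.holds("Ball-1", "size", "large"))    # True, after one question
print(questions_asked)                           # [('Ball-1', 'size')]
```

Because the attribute is decomposed from its value, the store can reject (Ball-1 colour green) from what it already knows, and the user is consulted only about the size.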
A production rule system then operates by cycling through the rules in some order until it is no longer able to draw any new conclusions.15 This "brute force" approach is both the strength and the weakness of production rule models. The strength is that the addition of knowledge to the system may be done with no understanding of how the system will eventually use it; we merely add a new rule to account for circumstances where the performance of the system is deficient.
The weakness is that search time becomes a problem when the number of rules grows. The MYCIN system has nearly one thousand rules now, but for systems which will be widely used the number must be substantially lower if the system is to perform on small machines. One of the strengths of the method used in FINDER is that time and machine requirements grow only linearly.
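The cycling through rules described in this section can be sketched in a few lines of Python. This is a bare forward-chaining loop under simplifying assumptions: every rule has the form (IF all premises are known THEN add the conclusion), facts are plain strings, and the user is never consulted. It is not a description of MYCIN or of any particular system, and the rules shown are hypothetical.

```python
def forward_chain(rules, facts):
    """Apply the rules repeatedly until a full pass adds nothing new.

    rules: list of (premises, conclusion), premises being a frozenset.
    Returns the set of all facts known when the cycling stops.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rules: IF B and C THEN D; IF A THEN B; IF E THEN F.
RULES = [
    (frozenset({"B", "C"}), "D"),
    (frozenset({"A"}), "B"),
    (frozenset({"E"}), "F"),
]

print(sorted(forward_chain(RULES, {"A", "C"})))  # ['A', 'B', 'C', 'D']
```

A new rule can be appended to `RULES` without regard to how the loop will use it, which is the strength noted above. Every pass examines every rule, and a conclusion may need several passes to emerge, which is why search time becomes a concern as the rule base grows.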
The Future
FINDER is an experimental vehicle which is intended to explore the possibility of automatic reasoning about case law. Its most important feature is that the output from the program is in a format which is most useful to lawyers.
FINDER is the first stage of a three stage project partially financed by the Australian Research Grants Scheme and the Law Foundation of New South Wales. The second stage of the project calls for the construction of a traditional production rule model for the finders cases. The third stage is the construction of a "deeper" conceptual model which uses more advanced artificial intelligence techniques.
There has been argument that the third type of model is the only one which is suitable for a legal expert system.16 The FINDER project should be an important contribution to this debate since it will provide a direct comparison of three very different methods.
The debate is by no means merely one of intellectual interest, for the three methods require very different amounts of resources to realize. If it is true that only "deep conceptual" systems can provide legal advice, then we shall not see useful large scale systems in the near future, for they are very much more expensive to build. If they can be built at all.
Uses of Expert Systems
It should not be thought that expert systems are useful only for giving advice. A slight change in the form of FINDER's output would change it into a very different sort of advisor.
As an information retrieval system, FINDER and similar systems could provide an alternative or an addition to the keyword systems which are currently in use.17
A second, and very exciting, possibility is the use of expert systems in the teaching of law. If FINDER knows so much about the law, why should it not be able to teach? This possibility has been followed with the MYCIN system. An expert tutorial system GUIDON has been added to the basic MYCIN system for use in teaching.18 The tutorial extension is itself an expert system which contains rules about good tutoring.
1 The most recent is Parker v British Airways. Parker is not a part of FINDER's knowledge base, which makes it a good test case.
2 Tyree, "The Geometry of Case Law", [1977] VUW L Rev 403.
3 Loevinger, "Jurimetrics: The Next Step Forward" (1948) 33 Minn L Rev 755; Frank, Law and the Modern Mind, Gloucester, 1970.
4 The earliest of this research appears to be Kort, "Predicting Supreme Court Decisions Mathematically: a Quantitative Analysis of the 'Right to Counsel' Cases" (1957) 51 Am Pol Sci Rev 1. See Tyree, "Fact Content Analysis: Methods and Limitations" 22 Jurimetrics 1 (1981) for a review of such work and suggested methods.
5 Although I believe that the arguments concerning the conflict between prediction and analysis are misguided: see Tyree, "Fact Content Analysis: Methods and Limitations" 22 Jurimetrics 1 (1981).
6 Ibid.
7 Tyree, "The Geometry of Case Law", [1977] VUW L Rev 403.
8 "Most important" is given a statistical definition within the program. It depends upon the contents of the data base, so that as more cases are added the "most important" facts may well change.
9 The "quick" method computes the centroid of both groups of cases in the knowledge base. The predicted outcome is that of the group whose centroid is "closest" to the client's case.
10 The discriminant analysis program is based on that found in Naylor, Build your own Expert System, Sigma Technical Press, 1984, a book which I recommend to those interested in exploring expert systems on micro-computers.
11 Davis, "Expert Systems: Where are we now? Where do we go from here?" AI Magazine, Spring 1982; Gevarter, "Expert Systems: Limited but Powerful" IEEE Spectrum, August 1983.
12 For the more traditional rule based expert system, some language such as Prolog or LISP may be more suitable, but there is a lot of time wasted learning new languages when most programs may be written in the language that you know, particularly when that language is very likely to be Basic.
13 Possibly the best known is MYCIN which diagnoses bacterial infections; see Buchanan and Shortliffe (eds), Rule-Based Expert Systems, Addison-Wesley, 1984. For a wide ranging discussion of expert systems and their construction, see Hayes-Roth, Waterman and Lenat, Building Expert Systems, Addison-Wesley, 1983.
14 For a full discussion of the representation problem, see Charniak and McDermott, Artificial Intelligence, Addison-Wesley, 1985.
15 Most expert systems use "backward chaining" which involves a substantial amount of recursion. Recursion is expensive in both memory and execution time.
16 McCarty, "Intelligent legal information systems: problems and prospects" in Campbell (ed), Data Processing and the Law, London, 1984.
17 One project in law has been designed from the beginning as an information retrieval system: Hafner, An Information Retrieval System based on a Computer Model of Legal Knowledge, UMI Research Press, Ann Arbor, 1981.
18 Clancey, "Use of MYCIN's Rules for Tutoring" in Buchanan and Shortliffe (eds), Rule-Based Expert Systems, Addison-Wesley, 1984.