Chapter 2 - Legal research via the Internet - Potential and problems
- There is already an abundance of legal materials, including legislative
materials, available on the Internet, if only they can be accessed effectively.
This material includes significant legislation collections from over 50
countries (some of which are comprehensive), case law from over 20 countries
with decisions from some major courts in the world being available within
hours, large collections of law reform reports, the texts of up to 200 law
journals, and a vast but as yet unquantifiable body of research reports from
specialist law research centres, legal academics and law firms.
- The quantity of available legal information on the Internet can be
expected to further expand at a rapid rate in the next few years, and it is
reasonable to expect that a large proportion of it will continue to be
available for free access as it is now. Even if more of it becomes `user pays',
users will still need to locate information in order to purchase it, and
vendors will encourage tools that facilitate this. The `world law library on
the Internet' is a realistic description.
- It is realistic to speak of the development of a `world law library' on
the Internet, but it is one in which it is at present very difficult to find
all of the available and relevant information. The Internet provides a unique
opportunity for lawyers and legal researchers, particularly those in the
developing member countries (DMCs) of the Bank who may not otherwise have
access to international legal resources in print, to obtain affordable access
to at least the basic elements of a world-wide law library.
The above conclusions support the approach which has been taken in the
development of the Project DIAL prototype, and in related facilities developed
for the prototype host, AustLII. This approach to reducing the problems of
legal research on the Internet rests on these propositions:
- Existing intellectual indexes are inadequate and provide only very limited
coverage of the available materials, and even the few that do provide
substantial coverage lack important features such as an ability to search for
- Intellectual indexes are also inherently shallow, expensive to develop in
any depth, and cannot provide the ability to search as comprehensively as
automated indexes which allow word occurrence searching.
- Internet-wide search engines (not specific to law) based on web spider
technology do provide an ability to search at word-occurrence level for
documents located across the Internet. However, their coverage of legal
materials (or other materials) is not comprehensive.
- Another problem with using Internet-wide search engines for legal research
is that it is very difficult (particularly for inexperienced searchers such as
are likely to be found in the audiences for this project) to formulate searches
which are specific enough to remove the `noise' of non-legal usages of search
terms, or which limit the items retrieved to those concerning a particular
country. Users risk not being able to find relevant items because of the bulk
of irrelevant items retrieved.
- Many valuable law sites, particularly in developing countries, do not have
their own search engines and so cannot be searched at word-occurrence level
even when located.
- Even where a particular site has been found by a user and does have a
search engine, the proliferation of different search engines that a user must
master, perhaps for only occasional use, is likely to discourage use and result
in poor quality searching.
- Centralised collections of world-wide legal information, such as GLIN, do
not at this stage provide a solution to the problems of Internet legal
research, but may well be complementary to the approach taken by Project DIAL.
The key to effective legal research on
the Internet may therefore be a tight integration of an intellectual index and
a search engine based on a web spider, a symbiotic relationship in which each
builds on the features provided by the other.
The Bank should consider a further Technical Assistance to provide better
access to legal information via the Internet to Bank DMCs because (i) the
current and potential value of the legal information available via the Internet
is clearly very high, and of particular value to DMCs; and (ii) the existing
tools for legal research on the Internet are inadequate, but methods to improve
them have been identified, and these improvements will create effective access
for DMC users.
- An intellectual index is essential to identify high value law sites and
legal resources, but cannot and should not aim to be comprehensive,
particularly in its depth of indexing particular sites once they have been
- Web spider indexing of remote law sites, and a sufficiently powerful
search engine, are necessary to provide the depth of search capacity that
intellectual indexing cannot provide.
- This is particularly so when many significant law sites do not have search
engines at all, and where there is no consistency among the search engines used
by those that do.
- Searching robot-indexed sites will work much better if (i) only law sites
are indexed (to remove non-legal `noise' and improve precision); and (ii) such
sites are indexed comprehensively (to improve recall). We call such a web
spider and search engine, dedicated to legal materials only, a `targeted' web
spider.
- Significant law sites which normally exclude robots may allow a `targeted'
law-oriented web spider to index them, by request. The number of requests may
- A comprehensive intellectual index is needed to identify the law sites
worth indexing, and therefore to `target' the robot. The intellectual index
therefore serves the double function of being a useful resource in itself and
the essential means of `feeding' the search engine.
- Once a law-oriented web spider has created a searchable index of key law
sites, specific searches over that index for various types of subject matter
can be `embedded' in the intellectual index, thereby making the intellectual
index `self-updating' to a certain extent, and so reducing its maintenance
costs. Such `embedded searches' also cater for inexperienced users who have
difficulty in formulating searches.
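
The relationship between the intellectual index, the `targeted' web spider and
`embedded searches' described in the propositions above can be sketched in a
few lines of Python. This is a minimal illustration only, not the Project DIAL
or AustLII implementation: the IndexEntry schema, the function names, the
example.org and example.com addresses, and the sample page text are all
assumptions made for the example, and the pages are supplied as pre-fetched
text rather than retrieved over the network.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class IndexEntry:
    """One hand-made entry in the intellectual index (hypothetical schema)."""
    title: str
    url: str
    country: str


def allowed_hosts(entries):
    """Hosts named in the intellectual index; the spider visits only these."""
    return {urlparse(e.url).hostname for e in entries}


def is_targeted(url, hosts):
    """True if a URL belongs to a site listed in the intellectual index."""
    return urlparse(url).hostname in hosts


def build_word_index(pages):
    """Map each lower-cased word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs containing every word of the query (boolean AND)."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical index entries and pre-fetched page text (no network access).
entries = [
    IndexEntry("Example Law Reports", "http://law.example.org/", "Exampleland"),
]
pages = {
    "http://law.example.org/cases/1.html": "Patent suit decided on appeal",
    "http://shop.example.com/suits.html": "Patent leather shoes and a suit",
}

# The intellectual index 'targets' the spider: only listed hosts are indexed.
hosts = allowed_hosts(entries)
law_pages = {u: t for u, t in pages.items() if is_targeted(u, hosts)}
word_index = build_word_index(law_pages)

# An 'embedded search': a stored query attached to an index heading, so the
# heading's results refresh whenever the spider re-indexes the sites.
embedded_searches = {"Patent litigation": "patent suit"}
for heading, query in embedded_searches.items():
    print(heading, sorted(search(word_index, query)))
```

In this sketch both sample pages contain the words "patent" and "suit", so an
Internet-wide index would return both. Restricting the word index to hosts
listed in the intellectual index leaves only the law site, which is the
precision gain the `targeted' approach aims for.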