2.2. Finding law on the net - Why is it so difficult?
Despite the abundance of valuable legal materials already on the web, and the
rapidity with which these materials are expanding, they are often very
difficult to find, principally because they are scattered across thousands of
web sites located all around the world. The World-Wide-Web is not `one big
database' or even `a few big databases' like LEXIS and WestLaw. Nor is it like
a library, because there is no central organisation of its contents. No one is
in charge of content.
The extent of legal information on the web summarised in the preceding overview
was not known to the authors of this Report, nor to the OGC of the Bank, when
discussions concerning Project DIAL commenced in April 1997. We estimated that
there might be useful legislation collections from 15-20 countries on the web,
and were uncertain about the quantity of case law and other materials. After
considerable research, it is apparent that much more information is available,
and it is rapidly growing. However, at the time Project DIAL commenced there
were no research tools available which could make this information easy to find.
There are essentially only two types of tools which help users find legal
materials on the Internet: `intellectual' indexes (also called `directories'
and `catalogues') and `robot' or automated indexes (and associated search
engines).
In `intellectual' indexes, or directories, each web site of interest across the
world is classified by a person according to various classificatory schemes,
once that web site has been found. It is usually the finding, rather
than the classification, which is the time-consuming process requiring
expertise, in contrast to the tasks of book indexing and library cataloguing
(the closest `paper-based' analogies). Usually, such directories only provide
the title, URL and perhaps a brief description of each site indexed. They
provide a list of hypertext links to the sites which are indexed. Yahoo!
[http://www.yahoo.com/] is a well known example of a general intellectual index
of the web (i.e. one which is not law-specific).
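
The structure of such a directory can be illustrated with a minimal sketch in Python. It is not drawn from any actual directory: the entry fields follow the description above (title, URL, brief description, plus the categories assigned by a person), while the class name, function names and example sites are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    """One hand-classified site: only title, URL and a brief description are kept."""
    title: str
    url: str
    description: str
    categories: tuple  # classification assigned by a human editor


def browse(directory, category):
    """Return the entries an editor has filed under the given category."""
    return [e for e in directory if category in e.categories]


def search_directory(directory, term):
    """Match a term against titles and descriptions only; page text is never seen."""
    term = term.lower()
    return [e for e in directory
            if term in e.title.lower() or term in e.description.lower()]


# Hypothetical entries, for illustration only.
DIRECTORY = [
    DirectoryEntry("Example Legislation Collection",
                   "http://legislation.example.org/",
                   "Consolidated statutes of an imaginary jurisdiction",
                   ("Legislation", "Country A")),
    DirectoryEntry("Example Court Reports",
                   "http://courts.example.org/",
                   "Decisions of an imaginary supreme court",
                   ("Case law", "Country A")),
]
```

In this sketch a search for a word that appears only in the text of a statute on the first site finds nothing, because the directory holds no more than the title and description that its editor wrote.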
In `robot' or automated indexes, a program (variously called a `web robot'
or `web spider') traverses the web, downloading every page it encounters, so
that every word on every page can be indexed by a remotely located search
engine. When the search engine displays a URL as a result of a search, that URL
points to the original site, not to a mirror held at the remote site. Alta Vista
is perhaps the best known general example of such an `Internet-wide' search
engine that searches an index created by a web spider. The principal advantage
of this approach is that it is possible to search every word that has been
indexed, not just the titles and a brief summary of what is on the site.
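
The approach described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of how any particular search engine works: the `web' is a hypothetical in-memory table of pages and links standing in for real downloads, and the names WEB, crawl_and_index and search are invented for the example.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "http://a.example.org/": ("Index of statutes and regulations",
                              ["http://a.example.org/act1", "http://b.example.org/"]),
    "http://a.example.org/act1": ("Bankruptcy Act: a debtor may petition the court",
                                  ["http://a.example.org/"]),
    "http://b.example.org/": ("Court decisions on bankruptcy and insolvency", []),
}


def crawl_and_index(start_url, web):
    """Follow links breadth-first from start_url and index every word found."""
    index = defaultdict(set)      # word -> set of original URLs
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = web[url]    # stands in for downloading the page
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Return the URLs of pages containing every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


INDEX = crawl_and_index("http://a.example.org/", WEB)
```

Here a search for `debtor' finds the page containing the Act even though the word appears in no title or summary, and the result is the URL of the original page rather than of any copy held by the indexer.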
The rest of this Chapter is mainly concerned with the extent to which these two
types of search tools are effective in finding legal materials on the Internet at
present, particularly legislation-related materials. This review is intended to
answer two questions: