Reading Guide: Hypertext and Retrieval
5. Global legal research on the Internet

[This Part is not yet complete for 2000 - it will be revised and expanded considerably.]

5.1. What makes internet research different

Some of the factors that make legal research on the internet different from legal research using a CD-ROM, a single dial-up system (eg Lexis), or a litigation support system are:

This Part first discusses internet-wide search and indexing tools, and then moves to a discussion of specialist tools dealing with legal information.

Two general resources referred to often in this Part are:

5.2. Web spiders - the basis of internet search engines

Internet-wide search engines are based on the interaction of at least three programs: a web spider (also known as web robot or web crawler), a program that creates a concordance (index) for searching, and a search engine. Here are some explanations of how they interact, as background to later more detailed discussion of search engines:
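As a rough illustration of how the three programs fit together, the following Python sketch runs a spider, a concordance builder and a search function over a tiny in-memory 'web'. The URLs, page texts and function names are all hypothetical and chosen only for illustration; a real spider would fetch pages over the network, and real search engines are far more elaborate.

```python
from collections import defaultdict, deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# This stands in for pages that a real spider would fetch over HTTP.
WEB = {
    "http://a.example/": ("Copyright law overview",
                          ["http://a.example/cases", "http://b.example/"]),
    "http://a.example/cases": ("Leading copyright cases", []),
    "http://b.example/": ("Domain names and trade mark law",
                          ["http://a.example/"]),
}


def spider(start_url, web):
    """Web spider: follow links breadth-first and return {url: text}."""
    seen, queue, pages = {start_url}, deque([start_url]), {}
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # dead link: nothing to collect
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_concordance(pages):
    """Indexer: map each lower-cased word to the set of URLs containing it."""
    concordance = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            concordance[word].add(url)
    return concordance


def search(concordance, query):
    """Search engine: return the URLs that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(concordance.get(words[0], set()))
    for word in words[1:]:
        results &= concordance.get(word, set())
    return results


pages = spider("http://a.example/", WEB)
concordance = build_concordance(pages)
print(search(concordance, "copyright law"))
```

The point of the separation is that searches are answered from the concordance alone: the spider and indexer run beforehand, so a search can only find pages the spider has already reached.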

5.3. Why is internet legal research difficult?

There is relatively little research as yet about the difficulties of legal research on the internet, as distinct from internet research generally (see below), on which much has been written.

See '2. The Problem - Internet Legal Research is Difficult' (in Greenleaf et al 1999) for a 1999 discussion of the difficulties of using internet-wide indexes and search engines for legal research.

For an older (1998) but in some respects more detailed discussion, see 'Chapter 2 - Legal research via the Internet - Potential and problems' (Greenleaf 1998). The parts of this Chapter that are most useful here are:

In Greenleaf et al 1999, these problems of general search engines for legal research were summed up as:

Are these reasons still correct, even just a year later?

5.4. General research on Internet-wide search engines and directories

There is a great deal of published research on the effectiveness of various internet-wide research tools (ie those which are not restricted to legal information).

Some of the more useful sources are discussed below, with the most recent information first:

Search Engine Showdown (2000)

Greg Notess's Search Engine Showdown site appears to attempt an independent evaluation of search engines' claims of coverage, as seen from the following pages (as at 7 July 2000):

A commonly quoted estimate of the total size of the web as at mid-2000 is 1 billion pages, which if correct would mean that the best search engines are now covering about 30-35% of the total web. In the Lawrence and Giles 1999 study (below), the estimate was that the best search engine then covered 15% of a total of 800M web pages.
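To make the coverage estimates concrete, the short Python sketch below converts them into implied index sizes. The function name is hypothetical, and the inputs are only the estimates quoted in this section, not measured values.

```python
def implied_index_size(total_pages, coverage_percent):
    """Number of pages indexed at a given percentage coverage of the web.

    Hypothetical helper: integer arithmetic on the estimates quoted in the text.
    """
    return total_pages * coverage_percent // 100


# Mid-2000 estimate: 1 billion pages, best engines covering 30-35%.
low_2000 = implied_index_size(1_000_000_000, 30)   # 300,000,000 pages
high_2000 = implied_index_size(1_000_000_000, 35)  # 350,000,000 pages

# Early 1999 estimate (Lawrence and Giles): 800M pages, best engine at 15%.
best_1999 = implied_index_size(800_000_000, 15)    # 120,000,000 pages

print(low_2000, high_2000, best_1999)
```

On these estimates the largest indexes would have grown from about 120M pages to 300-350M pages, while the web itself grew from 800M to 1 billion pages.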

Search Engine Watch (2000)

One of the most extensive general resources is Search Engine Watch, edited by Danny Sullivan. It is a huge site and it is easy to get lost in, so you might like to start with the guide for first-time visitors. It is updated constantly, and has a very informative monthly newsletter to which anyone seriously interested in search engines is likely to subscribe.

Aspects of the site that you might wish to look at include:

Lawrence and Giles study (1999) - 15% coverage only?

A study by Lawrence and Giles (Lawrence S and Giles C L (1999) 'Accessibility of information on the Web' Nature, Vol 400, 8 July 1999, pp 107-9 (Macmillan, UK)) argued that the best search engines (in early 1999) covered only 15% of the estimated 800M web pages, after adjustment for bad links. Northern Light, Snap and AltaVista were roughly equal. It is important that this research was based on the authors' independent assessments of what percentage of web pages each search engine covered, not on the search engines' self-reported assessments of what they cover.

The article is not available in full for free on the web, but see some of the following:

Feldman's survey (1998)

Susan Feldman 'Web search services in 1998: Trends and challenges' (1998) Searcher Vol 6 No 6 - A superb survey which includes criteria for evaluation of search engines, trends, and a comparative table of major search engines.

Some items to note from Feldman's paper:

These conclusions seem generally consistent with what is argued in the Project DIAL Report.

Feldman's Search Engine Feature Chart (at the end of the article) compares the features of eight search engines (and associated directories).

Other valuable articles by Susan Feldman:

Other resources on search engines

5.5. Targeted web spiders - A new approach to internet legal research

The problems of internet-wide research tools have led to a search for discipline-specific approaches to finding internet-wide information. One solution has been to combine an internet index for a discipline such as law with a web spider that only goes to sites listed in that index and does not 'wander off' to other sites. This is variously called a 'limited area search engine' (LASE) or a 'targeted web spider'.
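The essential difference from a general spider is the scope test applied before each link is followed. The Python sketch below shows one possible way to express it, assuming that 'listed in that index' is decided by host name. The function names, the get_links callback and the example hosts are hypothetical, and this is not a description of how any particular LASE is implemented.

```python
from collections import deque
from urllib.parse import urlsplit


def in_scope(url, listed_hosts):
    """True if the URL's host is one of the sites listed in the index.

    Assumption: listed_hosts is a set of lower-case host names.
    """
    host = urlsplit(url).hostname
    return host is not None and host in listed_hosts


def targeted_crawl(seed_urls, listed_hosts, get_links):
    """Visit pages reachable from the seeds without leaving the listed hosts.

    get_links(url) is a caller-supplied function returning the links found on
    a page; it stands in for real page fetching and HTML parsing.
    """
    queue, seen, visited = deque(), set(), []
    for url in seed_urls:
        if url not in seen and in_scope(url, listed_hosts):
            seen.add(url)
            queue.append(url)
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            # The spider never "wanders off": out-of-scope links are dropped.
            if link not in seen and in_scope(link, listed_hosts):
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical link graph: one listed law site that links to an unlisted site.
LINKS = {
    "http://law.example/": ["http://law.example/acts", "http://news.example/story"],
    "http://law.example/acts": ["http://law.example/"],
}
print(targeted_crawl(["http://law.example/"], {"law.example"},
                     lambda url: LINKS.get(url, [])))
```

Because the list of hosts comes from a human-edited index, the quality of the resulting search depends on the quality of that index as much as on the spider.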

3.5 Other Limited Area Search Engines (Greenleaf et al 1999) discusses a number of such LASEs.

[Required] 3.1 A New Technical Solution - A Limited Area Search Engine for Law (Greenleaf et al 1999) explains the approach taken in AustLII's World Law since 1997.

Danny Sullivan Getting off the beaten track: Specialized web search engines (Searcher October 1998) reviews various types of specialised search engines.

5.6. Case study - SOSIG Law Gateway

The Institute of Advanced Legal Studies in London and the University of Bristol Law Library started the SOSIG Law Gateway in late 1999 as part of the UK's Social Science Information Gateway (SOSIG) project:
'The SOSIG Law Gateway provides guidance and access to global legal information resources on the Internet. The service aims to identify and evaluate legal resource sites offering primary and secondary materials and other items of legal interest.'
The most detailed description of the Law Gateway is Steven Whittle 'A National Law Gateway: developing SOSIG for the UK Legal Community', Commentary, 2000 (2) The Journal of Information, Law and Technology (JILT).

Law Gateway home page - <http://www.sosig.ac.uk/law/>

Some notable aspects of the Law Gateway:

Whittle's article states that:
A Social Science Search Engine is also provided - this is a limited area search engine drawn from a supplementary database of over 50,000 links gathered by a web robot which visits each of the quality websites featured in the Catalogue (including all the Law section sites) and follows any links it finds for those pages and automatically indexes the content.
It seems that this works as follows: if a search from the search box on each page of the Law Gateway fails to find a record in the catalog, only then does the system offer another search over the full text of the law (and other) sites to which the web robot has been sent. For example, a search for 'domain names' gets no results from the catalog, but finds four items from the full text search.
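The two-stage behaviour conjectured above can be sketched in Python as follows. This only illustrates the 'catalog first, full text second' logic as it appears from the outside; the function name, the data structures and the sample records are hypothetical, and nothing here is taken from SOSIG's actual software.

```python
def gateway_search(query, catalog, full_text_index):
    """Search catalog records first; use the full text only if nothing is found.

    Both collections are hypothetical {url: text} dictionaries.
    Returns (source, matches), where source names the collection searched.
    """
    def matches(collection):
        needle = query.lower()
        return [url for url, text in collection.items()
                if needle in text.lower()]

    hits = matches(catalog)
    if hits:
        return "catalog", hits
    # Fallback: the pages gathered by the web robot from the listed sites.
    return "full text", matches(full_text_index)


# Hypothetical sample data.
CATALOG = {"http://law.example/": "Index of UK legislation and case law"}
FULL_TEXT = {"http://law.example/ip": "Disputes over domain names and trade marks"}

print(gateway_search("domain names", CATALOG, FULL_TEXT))
print(gateway_search("legislation", CATALOG, FULL_TEXT))
```

One consequence of this design is that a user who gets any catalog hit is never shown full text results for the same query, even where the full text search would have found more relevant pages.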

5.7. Case study - AustLII's World Law / Project DIAL

World / World Law is the global index and search facility on AustLII. Project DIAL is a part of World Law (the Asian Development Bank funds the legislation and Asian emphasis, and training in some Asian countries), but there is no significant technical difference between World Law and DIAL.

Graham Greenleaf et al 'Solving the Problems of Finding Law on the Web: World Law and DIAL', 2000 (1) The Journal of Information, Law and Technology (JILT) gives the most detailed recent (1999) account of World Law.

There is a User Guide and a Quick Guide, both of which help explain how World Law works.

The main features of World Law that are worth noting are:

Other materials on World Law / DIAL:

5.8. Case study - Law sections of ODP (Open Directory Project)

The Open Directory Project (ODP) explains in 'About the Open Directory Project' [Required] that its goal is 'to produce the most comprehensive directory of the web, by relying on a vast army of volunteer editors' because 'As the web grows, automated search engines and directories with small editorial staffs will be unable to cope with the volume of sites.'

Features of the ODP are:

Like any community, you get what you give. The Open Directory provides the opportunity for everyone to contribute. Signing up is easy: choose a topic you know something about and join. Editing categories is a snap. We have a comprehensive set of tools for adding, deleting, and updating links in seconds. For just a few minutes of your time you can help make the Web a better place, and be recognized as an expert on your chosen topic.
The Law subdirectories of ODP [Required] have over 11,000 entries (as at August 2000). The best way to judge whether ODP is likely to produce an effective legal research tool is to look at a selection of ODP entries and gain an idea of the range and quality of its coverage. Is it genuinely international? What do most of the 11,000 entries concern?

For further information, see:

5.9. Case study - Library of Congress' GLIN (Global Legal Information Network)

The US Library of Congress' GLIN (Global Legal Information Network) is not a limited area search engine, but is the most ambitious centralised collection of laws yet undertaken other than the commercial LEXIS system. Its home page explains its purpose:
The Global Legal Information Network (GLIN) provides a database of national laws from contributing countries around the world accessed from a World Wide Web server of the U.S. Library of Congress. The database consists of searchable legal abstracts in English and some full texts of laws in the language of the contributing country. It provides information on national legislation from more than 35 countries, with other countries being added on a continuing basis.
See 2.7.1. Global Legal Information Network (GLIN) (Greenleaf 1998) for a summary of GLIN (as at 1998).

5.10. Case study - 'WorldLII', a proposed standards-based network of PLIIs

[This part is not finished for 2000 - more to come]
