7. A new approach - A targeted web spider


Our approach to reducing the problems of internet legal research rests on these propositions:

The key to effective legal research on the internet may therefore be a tight integration of an intellectual index and a search engine based on a web spider, a symbiotic relationship in which each builds on the features provided by the other[30].

AustLII's web indexing and web spider software

AustLII personnel[31] have developed software to implement this approach:

AustLII's World Law - World Law Index & World Law Search

The first opportunity for extensive testing of the targeted web spider was in Project DIAL[33] [http://www.austlii.edu.au/au/special/dial/], a project for the Asian Development Bank which aims to increase the accessibility of legislation on the internet. Its search facility, DIAL Search[34] [http://www.austlii.edu.au/au/special/dial/DIALsearch.html], was released on the web in August 1997 and allows searches of legislation and legislation-related material from many countries.

The more general version of the targeted web spider is now about to play a significant role in other aspects of AustLII's future developments, and was made available from AustLII's home page as `World Law Search' on 5 February 1998.

The World Law facility includes both World Law Index and World Law Search, to emphasise the inter-relationship between the two components. At this stage, some aspects of the World Law pages (such as the search results page) still appear as Project DIAL pages, but their appearance will be made more consistent over time.

The World Law home page, accessible from AustLII's front page

A history of the development of this approach, and of the early stages of Project DIAL, can be found in `Future-proofing a global internet index by a targeted web spider and embedded searches'[35] [http://www2.austlii.edu.au/~graham/Futureproof/indexers.html]. The rest of this paper provides examples of the more interesting aspects of the project to date.

Other targeted web spiders

AustLII's World Law Search is one of a few examples of what we call `targeted web spiders' and others have called `Limited Area Search Engines'. There are a couple of examples of this technology being used in the field of law.

In January 1998 JURIST launched what it calls a `Limited Area Search Engine (LASE)' to make searchable all home pages of law professors and course pages of law subjects in its index.

JURIST Search page

JURIST uses the Argos LASE which, according to its developers, was `the first peer-reviewed, limited area search engine (LASE) on the World-Wide Web'[36] [http://argos.evansville.edu/about.htm] at the time of its release in October 1996. Argos was developed to provide a more precise means of searching for scholarly literature on the ancient and medieval world. Its development was prompted by the same shortcomings of Internet-wide search engines as we have identified in the preceding discussion[37] [http://argos.evansville.edu/about.htm]:

At the time of this writing, a search for "Plato" on the Internet search engine, Infoseek, returned 1,506 responses. Of the first ten of these, only five had anything to do with the Plato that lived in ancient Greece, and one of these was a popular piece on the lost city of Atlantis. ... Add to this broad range of responses the fact that Infoseek returns ten entries per page, making it necessary to examine one hundred and fifty one pages of entries, many of which are irrelevant to a scholarly search of "Plato," and the result is a process that is frustrating and inefficient.

They identify advantages of the alternative, `targeted' approach:

By limiting the range of the search engine, a LASE strips out many unwanted references... The result is a higher quality index built for a specific purpose and for a smaller audience. Furthermore, the quality of the index, its purpose and the level of specialization expected of its intended audience are variables that can be manipulated with LASE technology.

They also note that, because of its limited scope, Argos can update its indexes of all the sites it covers every week, whereas Internet-wide search engines take (in their experience) a couple of months to update sites. They estimate that, as a result, 98% of all links found by their search engine work at any given time.
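The idea of limiting a spider's range to the sites listed in an index can be illustrated with a short sketch. It is not the code of Argos, JURIST or AustLII; the index entries, function names and the host-plus-path-prefix rule are all assumptions made for illustration. The sketch shows only the scope test that decides which links a spider would be allowed to follow.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical index entries. In a targeted spider these would come from
# a human-maintained index of sites, not from a hard-coded list.
INDEX_ENTRIES = [
    "http://www.example.edu/law/",
    "http://courts.example.org/",
]


def scope_prefixes(index_entries):
    """Turn each index entry into a (host, path prefix) pair."""
    prefixes = []
    for entry in index_entries:
        parts = urlparse(entry)
        prefixes.append((parts.netloc.lower(), parts.path or "/"))
    return prefixes


def in_scope(url, prefixes):
    """True if url is on an indexed host, at or below the indexed path."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    path = parts.path or "/"
    return any(host == h and path.startswith(p) for h, p in prefixes)


def links_to_follow(page_url, hrefs, prefixes):
    """Resolve a page's links and keep only those inside the target area."""
    resolved = (urljoin(page_url, href) for href in hrefs)
    return [url for url in resolved if in_scope(url, prefixes)]
```

Under these assumptions, a link from an indexed law faculty page to another page under the same path would be kept, while a link to an unrelated part of the same host, or to a host not in the index, would be dropped. Because the set of pages that pass this test is small, revisiting all of them on a short cycle becomes practical.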

Knowledge Basket, the New Zealand Internet content provider, released Legal Search New Zealand[38] [http://202.37.88.18/search/legsearch.html] in December 1997. It uses the Verity web robot and Topic search engine, and makes searchable 25 New Zealand law sites at present. The advantages they claim for this approach are similar to those described above[39] [http://www.knowledge-basket.co.nz/kete/nzsearch.html].

[30] Aspects of such an approach, in a pre-Internet context, are explored in Greenleaf G, Mowbray A and van Dijk P (1995) 'Representing and using legal knowledge in integrated decision support systems - DataLex WorkStations' Artificial Intelligence and Law, Kluwer, Vol 3, Nos 1-2, 1995, 97-124

[31] In particular, Geoffrey King, Daniel Austin and Andrew Mowbray

[32] The names `Wendolyn' and `Wensleydale' have been reserved for future software developed for this project. For further details see

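[33] http://www.austlii.edu.au/au/special/dial/

[34] http://www.austlii.edu.au/au/special/dial/DIALsearch.html

[35] Australian Society of Indexers Annual Conference 'The Futureproof Indexer' 27-28 September 1997, Katoomba, Australia - http://www2.austlii.edu.au/~graham/Futureproof/indexers.html

[36] http://argos.evansville.edu/about.htm

[37] http://argos.evansville.edu/about.htm

[38] http://202.37.88.18/search/legsearch.html

[39] http://www.knowledge-basket.co.nz/kete/nzsearch.html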

