3. A new approach to finding law on the internet
The Australasian Legal Information Institute (AustLII) operates one of the
largest free access law sites on the internet[9]. Its personnel also carry out
research into the computerisation of law, and teach legal research and the
computerisation of law. AustLII is operated jointly by two university law
faculties[10]. The AustLII web
site contains over 3 gigabytes of Australian legal materials, including the
full texts of over 65,000 court and tribunal decisions and 500,000 sections of
legislation, with over 13 million automatically inserted hypertext links within
those materials. The site receives over 75,000 `hits' per working day, or nearly
2 million `hits' per month. A full description of AustLII and its research is
available on the web in `The AustLII Papers'[11]
(http://elj.warwick.ac.uk/jilt/LegInfo/97_2gree/default.htm).
Since 1995 AustLII has also developed its own `intellectual indexes' to law on
the internet. These now contain about 2,000 links, about half of which are to
Australian legal materials and the balance to legal materials around the world.
AustLII's approach to solving the problems of finding law on the internet rests
on these propositions:
- Web spider indexing of remote law sites, and a sufficiently powerful
search engine, are necessary;
- Searching robot-indexed sites will work much better if (i) only law sites
are indexed (to remove non-legal `noise'); and (ii) such sites are indexed
comprehensively;
- Significant law sites which normally exclude robots may allow a
law-oriented robot to index them, by request. The number of requests may be
manageable;
- A comprehensive intellectual index is needed to identify the law sites
worth indexing, and therefore to `target' the robot;
- Once a law-oriented robot has created a searchable index of key law sites,
specific searches over that index for various types of subject matter can be
`embedded' in the intellectual index, thereby making the intellectual index
`self-updating' to a certain extent, and so reducing its maintenance
costs.
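The last proposition, a search `embedded' in an index entry, can be sketched as follows. This is only an illustration under stated assumptions, not AustLII's implementation: the names `IndexEntry`, `search` and `render` are hypothetical, the `documents` mapping stands in for whatever the robot has indexed, and a plain AND match over words stands in for a real search engine.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IndexEntry:
    """One subject page in a hand-built (intellectual) index. Hypothetical."""
    title: str
    links: List[str] = field(default_factory=list)  # links chosen by an editor
    stored_query: Optional[str] = None               # search embedded in the entry


def search(documents: Dict[str, str], query: str) -> List[str]:
    """Return URLs whose text contains every query word (a plain AND search)."""
    terms = query.lower().split()
    return sorted(
        url for url, text in documents.items()
        if all(term in text.lower().split() for term in terms)
    )


def render(entry: IndexEntry, documents: Dict[str, str]) -> List[str]:
    """List the editor's links first, then any search hits not already listed."""
    hits = search(documents, entry.stored_query) if entry.stored_query else []
    return entry.links + [url for url in hits if url not in entry.links]
```

Because the stored query runs each time the entry is displayed, pages that the robot indexes later appear under the entry without an editor adding them by hand, which is the sense in which such an index is `self-updating' to a certain extent.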
AustLII's personnel have developed the necessary tools to carry out
these tasks:
- a suitable search engine (SINO), which has the full range of boolean and
proximity search commands, optional relevance ranking of search results, and a
facility for limiting the scope of searches to specific databases or
collections of databases;
- internet indexing software (`Feathers') with sufficient sophistication to
allow remote updating by multiple contributors to an index, full search
facilities over the index entries, and a facility to `target' a robot to fully
index specific sites identified in the index;
- a robot or `web spider' (called `Gromit'), and a `harness' or means of
controlling it (called `Wallace');
- a sufficiently extensive index of law on the internet (particularly, at
this stage, for Australian law) to provide a good starting point for the
targeting of the web spider.
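How an index might `target' a robot can be sketched in a few lines. This is a minimal illustration, not a description of Feathers, Gromit or Wallace: the helper names are hypothetical, and the use of a robots.txt record naming the robot is an assumption about one way a site could admit a law-oriented robot while excluding others.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that admits one named robot only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Gromit
Disallow:

User-agent: *
Disallow: /
"""


def target_hosts(index_links):
    """Collect the host names of the sites listed in the intellectual index."""
    hosts = set()
    for link in index_links:
        host = urlsplit(link).hostname
        if host:
            hosts.add(host)
    return hosts


def in_scope(url, hosts):
    """A targeted robot only follows links that stay on an indexed host."""
    return urlsplit(url).hostname in hosts


def may_fetch(robots_txt, agent, url):
    """Check a robots.txt body to see whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

Restricting the crawl to hosts drawn from the index keeps non-legal `noise' out of the searchable collection, and the per-robot record shows how a site that excludes robots generally could still admit one by name.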
The targeted web spider will soon play a significant role in AustLII's
development, in relation to both Australian and international legal materials.
Some of our research on internet law indexing is supported by an Australian
Research Council small grant for 1997.
The rest of this paper describes the components of AustLII's targeted web
spider, and its first major test, as one of the elements in the development of
Project DIAL.
[9] AustLII's web site is at http://www.austlii.edu.au/
[10] The University of Technology, Sydney and the University of New South Wales
[11] Published in the Journal of Information, Law & Technology (JILT)