
6. Why legal research on the internet is difficult

Despite the existence of these research aids, finding legal information on the internet is difficult, for at least the following reasons:

So the problems of finding legal materials world-wide are twofold: it is difficult to find which useful sites exist for a particular country or subject, and it is also difficult to find what is on the sites that are known. These research problems are very substantial even for the most expert `internet savvy' lawyers and law librarians. They are much worse for inexperienced users.

[23] See http://www.austlii.edu.au/links/World/Indices/ for many examples.

[24] Such factors led to estimates in 1996 that even the largest internet-wide search engines indexed only about 20% of the estimated 150 million web pages. However, the most recent figures published by Search Engine Watch include claims by Alta Vista to index 100 M pages, HotBot 80 M, Excite 55 M and others less than that. The estimated total number of web pages is now in excess of 200 M, so, whatever the exact situation may be, it is still the case that no search engine can claim to index all pages on the web.

[25] See `How big are the search engines?' (Search Engine Watch) and references linked therefrom, for detailed discussion of all these matters.

[26] The claim by John Pike, webmaster of the Federation of American Scientists, and the reply by Alta Vista are discussed in `The Alta Vista Size Controversy' on Search Engine Watch.

[27] See Martijn Koster, `The Web Robots Pages', for details of the operation of web robots.

[28] See the `Robots Exclusion' page, dealing with both the standard and the Meta Tag for robot exclusion.

[29] For example, on Alta Vista, a search for Vietnamese legal materials must be limited to materials that are located on a server in Vietnam (the `domain:vn' delimiter) or that contain `Vietnam or Viet Nam', and even this is still somewhat hit or miss.
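
As an illustration of the robot exclusion standard mentioned in notes [27] and [28], the following is a minimal sketch of how a well-behaved web robot might decide which pages of a site it may index. It uses Python's standard `urllib.robotparser` module. The site `www.example.org`, its paths, the exclusion rules and the robot name `ExampleRobot` are all hypothetical, chosen only to show how a site's own exclusion file can keep its legal databases out of a search engine's index. The Meta Tag form of exclusion is not covered here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative legal site.
# The paths below are assumptions, not taken from any real server.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /databases/
"""


def allowed_urls(robots_txt, agent, urls):
    """Return the subset of urls that the exclusion rules let `agent` fetch."""
    parser = RobotFileParser()
    # parse() accepts the lines of a robots.txt file; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(agent, url)]


# Hypothetical pages on the illustrative site.
urls = [
    "http://www.example.org/index.html",
    "http://www.example.org/databases/cases/1997/12.html",
    "http://www.example.org/cgi-bin/search",
]

visible = allowed_urls(ROBOTS_TXT, "ExampleRobot", urls)
print(visible)
```

Under these assumed rules only the home page remains open to the robot. The case-law page and the search script are excluded, so an internet-wide search engine that honours the standard would never index them.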

