8. World Law Index - the intellectual index
World Law Index, the intellectual index or catalogue aspect of World Law, has a
`Yahoo-like' interface. It is searchable for both categories and individual
entries. The editing interface is via the web, so contributing editors can be
located anywhere with web access.
All pages in the index have a [Translate] button that takes the user to Alta
Vista's automated translation service, which is provided by Systran translation
software. The button ensures that the Systran page already has inserted in it
the correct URL for the DIAL page that the user was just viewing (in the
example below, the DIAL Index page). The user then only has to select the
language into which the DIAL page is to be translated and press the
`Translate' button, and is then returned to the DIAL page translated into the
language of choice.
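
The way a [Translate] button can carry the URL of the current page into the translation form can be sketched as follows. This is only an illustrative sketch: the endpoint `http://translator.example/translate` and the parameter name `url` are hypothetical and are not the actual Alta Vista/Systran interface.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter name, for illustration only;
# the real translation service's interface is not described in the text.
TRANSLATE_ENDPOINT = "http://translator.example/translate"


def translate_link(page_url: str) -> str:
    """Return a link to the translation form with page_url already filled in."""
    # urlencode percent-escapes the page URL so it survives as one query value.
    return TRANSLATE_ENDPOINT + "?" + urlencode({"url": page_url})


# The page the user was viewing becomes the value of the `url` parameter.
link = translate_link("http://www.austlii.edu.au/links/new.html")
print(link)
```

Because the page URL is supplied by the button, the user only has to choose a target language on the translation page.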
The resulting translation seems adequate to convey the meaning of most of the
items on the page.
Figure: The DIAL Index page translated automatically to French by Systran
The Alta Vista/Systran translation facility is at present limited to
translations from English to French, Spanish, Portuguese, Italian, or German,
and vice-versa. This translation facility is also only a prototype, and
sometimes has inadequate processing power to translate very long pages. It is
also not recommended for documents with complex grammar, or where accuracy is
vital (such as legislation). However, for pages such as menus, or lists of
search results, it is usually extremely helpful. The result for a world index
is revolutionary: instead of being an `English only' facility, it is now
effectively available in six of the most pervasive European languages.
Editing entries in the index also involves the editor deciding whether to send
the Gromit web spider to index every word on the site which has been
`targeted'. The harness program (Wallace) reads the list of instructions from
the web indexing software, and then sends off multiple instances of the web
spider program, each to download the content of a particular web site. The
harness program ensures that only one instance of the web spider software is
ever downloading from a particular site, to avoid saturating that site with
spider requests and denying access to other users. The harness thus ensures
that the web spider is `well behaved', causing minimum impact on the sites from
which it downloads web pages. We call this a targeted web spider, as it is not
designed to traverse the web generally: its downloading is limited to the site
given in the URL with which it is invoked.
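
The `one spider per site' rule enforced by the harness can be illustrated with a minimal sketch. This is not Wallace's actual code: the class `Harness`, its methods, and the queueing of deferred requests are assumptions made for illustration, and the sketch is single-threaded bookkeeping rather than a real downloader.

```python
from urllib.parse import urlsplit


class Harness:
    """Hypothetical stand-in for the harness: at most one active spider per site."""

    def __init__(self):
        self.active_hosts = set()  # sites that currently have a spider running
        self.waiting = []          # start URLs deferred because their site is busy

    def submit(self, start_url: str) -> bool:
        """Start a spider for start_url unless its site already has one running."""
        host = urlsplit(start_url).hostname
        if host in self.active_hosts:
            self.waiting.append(start_url)  # defer rather than hit the site twice
            return False
        self.active_hosts.add(host)
        return True

    def finished(self, start_url: str):
        """Mark a spider as done; return the next deferred URL for that site, if any."""
        host = urlsplit(start_url).hostname
        self.active_hosts.discard(host)
        for i, url in enumerate(self.waiting):
            if urlsplit(url).hostname == host:
                self.active_hosts.add(host)  # the deferred spider now takes over
                return self.waiting.pop(i)
        return None
```

Spiders for different sites can run side by side, while a second request for a busy site waits until the first has finished.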
Targeting the web spider is a complex task. It must start indexing at the
correct page so that, when it indexes all other pages to which its starting
page is directly or indirectly linked and which are equal to or below the start
page in the server's file hierarchy, it indexes all and only the desired pages.
Some desired sets of data cannot be indexed because of the `noise' they will
bring with them. For others, it is impossible to find an appropriately located
`table of contents' page to use as the `start page'. Other `problems of
targeting' have also been identified [40] (http://www2.austlii.edu.au/~graham/Futureproof/indexers-6.html#Heading22).
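
The scope rule described above (pages on the same site that are equal to or below the start page in the server's file hierarchy) can be sketched as a simple URL test. This is an assumed reading of the rule, not Gromit's actual implementation; the function name `in_scope` and the example host `site.example` are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def in_scope(start_url: str, candidate_url: str) -> bool:
    """True if candidate_url is on the start page's site, at or below its directory."""
    start, cand = urlsplit(start_url), urlsplit(candidate_url)
    if cand.hostname != start.hostname:
        return False  # a targeted spider never leaves the original site
    # Directory holding the start page, e.g. "/law/" for "/law/index.html".
    base_dir = start.path.rsplit("/", 1)[0] + "/"
    # Resolve "." and ".." segments so a link cannot climb out of base_dir.
    cand_path = posixpath.normpath(cand.path or "/")
    # The trailing "/" lets the directory itself match without also
    # matching a sibling such as "/lawyers/".
    return (cand_path + "/").startswith(base_dir)


start = "http://site.example/law/index.html"
print(in_scope(start, "http://site.example/law/acts/page.html"))
print(in_scope(start, "http://site.example/news/page.html"))
```

The test also shows why the choice of start page matters: a start page too high in the hierarchy brings in `noise', while one too low excludes desired pages.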
Users can see at any time what content has been added recently to World Law
Index (and to World Law Search) by checking the `New Additions' page from the
World Law home page or from the [New] button on the button bars in the system.
below means that Gromit has been sent to index that site.
Figure: The `New Additions' page - http://www.austlii.edu.au/links/new.html