6.4. Display of search results
Search results for both Simple Search and Advanced Search are displayed as
shown below, simply as a numbered list of titles of the documents found, ranked
in order of likely relevance to the search terms. In this example (a search
over All Libraries), legislation is found from Mongolia, Zambia (the Value
Added Tax Act and Preferential Claims in Bankruptcy Act items), and
Vietnam (the `Enterprise', `Social Matters' and `Taxation and Fee' legislation
tables of contents - those items beginning `http://coombs'), as well as law
reform reports from other jurisdictions (Canada and Australia), and Internet
law index pages. The items are ranked in likely order of relevance.
Example of the current display of search results
The present method of displaying results relies upon the remote site attaching
informative titles to its HTML pages, as it is these titles that are displayed.
While most sites do achieve this to a reasonable degree (as can be seen from
the above example), some sites fail to provide any title at all, so that only
the URL of the site can be displayed by default (eg item 11 above). This is not
informative enough, so a number of alternatives are under consideration. One is
to display the first 50 words or so of the document, as Alta Vista does. A
second is to include the name of the database (added during
the indexing process) from which each entry comes, after the display of the
title, so it is at least easier to recognise which countries and systems
particular items are from. These choices have significant processing overheads,
and we are concerned not to unduly slow the delivery of search results.
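The fallback choices described above can be sketched as follows. This is only an illustration under stated assumptions, not the system's actual display code: the function name `display_label`, its parameters, and the example URL and database name are all hypothetical.

```python
def display_label(title, url, body_text="", database=None, max_words=50):
    """Return the text to show for one item in a results list.

    Preference order (an assumption for this sketch):
    the page title, else the first `max_words` words of the document,
    else the bare URL. A database name, if known, is appended.
    """
    title = (title or "").strip()
    if title:
        label = title
    else:
        words = body_text.split()
        if words:
            # Show an opening extract when the page has no title.
            label = " ".join(words[:max_words])
            if len(words) > max_words:
                label += " ..."
        else:
            # Nothing better is available, so fall back to the URL.
            label = url
    if database:
        label += " [" + database + "]"
    return label


# Hypothetical example values, for illustration only.
print(display_label("Value Added Tax Act", "http://example.org/vat",
                    database="Example Database"))
print(display_label(None, "http://example.org/untitled"))
```

Note that the extract option requires the document text at display time, which is one source of the processing overhead mentioned above.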
Further utilisation of database and Library headings will make possible
additional display options which allow search results to be displayed grouped
by database of origin rather than by relevance ranking, in either long form or
short form[133].
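Grouping by database of origin, rather than by relevance alone, might look like the minimal sketch below. The tuple layout `(score, database, title)` and the function name `group_by_database` are assumptions made for this illustration; the long and short display forms are not modelled.

```python
def group_by_database(results):
    """Group results by database of origin.

    `results` is a list of (score, database, title) tuples in any order.
    Within each group, items stay in descending score order; groups appear
    in order of their best-scoring item (dicts keep insertion order in
    Python 3.7+).
    """
    groups = {}
    for score, database, title in sorted(results, key=lambda r: -r[0]):
        groups.setdefault(database, []).append((score, title))
    return groups


# Hypothetical example data.
example = [(40, "A", "first"), (90, "B", "second"), (70, "A", "third")]
for database, items in group_by_database(example).items():
    print(database, items)
```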
Search results list every web page that satisfies the search request, in
estimated order of relevance to that request.
In order to choose good search terms, it is useful to understand how the
relevance ranking works:
- All commonly occurring words are ignored (eg but, the, an, of, not, in,
  and, or).
- All remaining (significant) words, and the documents in which they occur,
  are then assessed according to these criteria:
  - The number of significant words occurring in the document.
  - How rarely the word occurs in the whole document collection.
  - The number of times the word occurs in the document.
  - How close to the start of the document (eg the title) the word occurs.
- To the extent that these criteria are satisfied, the percentage 'score' of
  a document increases.
- Documents are then ranked in order of their scores.
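The criteria above can be illustrated with a small scoring sketch. The source does not give the actual formula, so the way the criteria are combined here (an equal-weight average of rarity, frequency and position, averaged over all significant search words) is purely an assumption, as are the function names and sample documents.

```python
# Common words taken from the examples listed above.
STOP_WORDS = {"but", "the", "an", "of", "not", "in", "and", "or"}


def tokenize(text):
    return text.lower().split()


def score(query, doc, collection):
    """Return an illustrative percentage score (0 to 100) for `doc`."""
    terms = {w for w in tokenize(query) if w not in STOP_WORDS}
    words = tokenize(doc)
    if not terms or not words:
        return 0.0
    n_docs = max(len(collection), 1)
    total = 0.0
    for term in terms:
        if term not in words:
            continue  # a missing search word contributes nothing
        docs_with = sum(1 for d in collection if term in tokenize(d))
        docs_with = min(max(docs_with, 1), n_docs)
        rarity = 1.0 - (docs_with - 1) / n_docs      # rarer in collection is better
        count = words.count(term)
        frequency = count / (count + 1.0)            # more occurrences is better
        position = 1.0 - words.index(term) / len(words)  # earlier is better
        total += (rarity + frequency + position) / 3.0
    # Dividing by all search words rewards documents containing more of them.
    return 100.0 * total / len(terms)


def rank(query, collection):
    return sorted(collection, key=lambda d: score(query, d, collection),
                  reverse=True)


docs = ["mortgage law of the bank",
        "bank holiday notice",
        "chattel mortgage and the bank"]
for d in rank("bank mortgage chattel", docs):
    print(round(score("bank mortgage chattel", d, docs), 1), d)
```

In this sketch the third sample document ranks first because it contains all three search words, including the rarest one.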
There is a significant difference between Simple Searches and Advanced Searches
in how the relevance ranking operates. In Simple Searches, any occurrence of
even one significant word means that the document will be in the list of found
documents. The results lists for Simple Searches are therefore likely to
contain thousands of documents, only the first few of which are very relevant
to the enquiry. A search for `bank and mortgage near chattel' will (since it
ignores `and' and `near') find over 2000 documents and try to rank them. In
Advanced Searches, the only documents in the list of found documents are those
that satisfy every aspect of the search enquiry, so only those are ranked. An
Advanced Search for `bank and (mortgage near chattel)' will find only 18
documents containing all three terms, and then rank them. They are not the same
18 documents that the Simple Search ranks first, because the Advanced Search
applies the `near' operator.
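The difference in which documents are found can be sketched as below. This is a simplified illustration, not the search engine's implementation: the Advanced Search is hard-coded rather than parsed from the query, and the proximity window for `near` (50 words here) is an assumed value that the source does not specify.

```python
# Words a Simple Search is assumed to ignore in this sketch.
IGNORED = {"but", "the", "an", "of", "not", "in", "and", "or", "near"}


def simple_match(query, doc):
    """Simple Search: found if ANY significant search word occurs."""
    terms = {w for w in query.lower().split() if w not in IGNORED}
    return bool(terms & set(doc.lower().split()))


def near(words, a, b, window=50):
    """True if words `a` and `b` occur within `window` words of each other."""
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


def advanced_match(doc, window=50):
    """Hard-coded equivalent of: bank and (mortgage near chattel)."""
    words = doc.lower().split()
    return "bank" in words and near(words, "mortgage", "chattel", window)


docs = ["the bank took a mortgage over the chattel",
        "bank holiday notice",
        "chattel auction results",
        "tax return form"]
query = "bank and mortgage near chattel"
print([d for d in docs if simple_match(query, d)])   # any significant word
print([d for d in docs if advanced_match(d)])        # every condition met
```

Only the documents that pass the relevant test would then go on to be scored and ranked.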
As a rough rule of thumb, items with a score of more than 50% are usually worth
looking at, and items with a lower score usually are not. However, exceptions
will occur, particularly with Advanced Searches.
[133] These display options are currently available on AustLII as Boolean
(Short Results) and Boolean (Long Results).