G.Greenleaf 'An emerging law of cyberspace?', IIR Conferences, 1996 - 6. Privacy

6. Privacy - is anyone liable for anything (II)?

6.1. What privacy laws will apply in cyberspace?
6.2. New privacy issues in cyberspace
- Search engines and robot exclusion standards
- Between you and your browser
6.3. New Privacy Principles for cyberspace?

It has been well said that `in cyberspace, everyone will be anonymous for 15 minutes'[27]. Most privacy issues that cyberspace will create have probably not been invented yet, but the privacy laws and practices we are now starting to develop will have to deal with them.

6.1. What privacy laws will apply in cyberspace?

At present, the only substantial Australian privacy law is the Privacy Act 1988 (Cth), dealing primarily with the Commonwealth public sector and with credit reporting practices. With no privacy laws to speak of dealing with the private sector, cyberspace is by and large a privacy vacuum in Australia.

The new Federal Coalition government's election policy requires reform of our privacy laws as 'a matter of the utmost priority' and requires 'a consistent Australia-wide approach'[28]. However, it is as vague as the previous Labor policy[29] on how this is to be achieved, simply stating that the Coalition will 'in consultation with the States and Territories, ensure the implementation of a privacy law regime in Australia comparable with best international practice', and that the Federal government will 'work with industry and the states to provide a co-regulatory approach to privacy within the private sector' (Justice Policy). The Online Services Policy says that the Online Government Council will consider 'the merits of a national Privacy Code of Practice, binding both public and private sectors'.

New South Wales Attorney-General, Jeff Shaw, has announced details of a `revolutionary' NSW Privacy and Data Protection Bill 1996 which he says will soon be introduced[30]. If enacted as promised, the NSW move will take the initiative from the Federal Government in setting privacy standards in the private sector in Australia, and will force a reaction from other Australian governments. It will also set the standard for privacy protection in the State and Territory public sectors. The new privacy law will be administered by a new NSW Human Rights Commission, which will merge the existing NSW Privacy Committee with the Anti-Discrimination Board, and will be headed by Mr Chris Puplick. The Bill has not yet been released, and not all details have yet been finally approved by Cabinet.

Data Protection Principles would apply directly to NSW government agencies, but could be modified for specific public sector bodies by regulations. Insofar as the private sector is concerned, the procedure would be different in that no enforceable principles would apply until the Commissioner drew up Codes of Conduct applying to various parts of the private sector. These Codes would be issued by means of regulations. The Commissioner would normally draw up such Codes in consultation with representative private sector bodies, but a Code could be imposed on a sector that was dragging its feet.

6.2. New privacy issues in cyberspace

The following are just a couple of cyberspace privacy problems that have been topical during the first few months of 1996.

Search engines and robot exclusion standards

One of the most difficult privacy problems of the internet is the power of search engines. One of the main protectors of privacy on the internet, as elsewhere, was inefficiency - it was very difficult to find anything unless someone told you where it was[31]. That protection weakened somewhat with comprehensive indexes of internet sites like Yahoo[32] (http://www.yahoo.com/), and has gone forever with the release in December 1995 of DEC's Alta Vista search engine[33] (http://www.altavista.digital.com/). John Hilvert explains the travails of one user of the Alta Vista search engine[34]:

When Internet user, Ed Chilton heard about the hot new search engine, Alta Vista, from Digital Equipment Corporation (DEC), he had to try it out. Alta Vista was introduced as a free service back in December last year to show-case DEC's ability to handle the Internet, no matter how it scaled. Using high end DEC Alpha systems and sophisticated software, Alta Vista gobbles and disgorges in a very accessible way, the entire catalogue of some 22 million web pages (11 billion words) and about the last two months of the content of 13,000 news groups. It handles 5 million search requests a day.

Impressed with Alta Vista's remarkable speed, Chilton tried Alta Vista on the news groups and was sickened. ''What I found with the newsgroups, using my name or email address as search parameters, was a copy of almost every post I've made to Usenet newsgroups since the first week in January,'' he wrote on 6 March. ''That includes my posts to these two newsgroups, and all rejoinders from anyone here who included my name in his or her reply. Make out of that what you wish. My reaction to it is somewhere between disgust and fury.''

Chilton said it was an important feature of newsgroups that users get to know each other's themes, axes to grind, and pet peeves. ''What I do not expect is that the newsgroup clubhouse is bugged, and that what is said there, by any of us, will be recorded and made available to any person on the Internet, for whatever reason persons might have.'' Chilton said DEC's Usenet search engine should be banned and its developers publicly brought to their knees.

The irony of all this is: I came across Chilton's privacy lament using the Alta Vista search engine.

Alta Vista uses robots (also known as spiders or webcrawlers)[35] (http://info.webcrawler.com/mak/projects/robots/robots.html) to trawl the internet, creating complete word occurrence indexes of every web page and every item posted to every News group that it is allowed to access. As a result it is now possible to search for any occurrence of a name or phrase anywhere in the text of any web page, or in any News posting.
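To make the idea of a `word occurrence index' concrete, the following is a minimal sketch in Python of how such an index can be built and queried. It is an illustration only, not a description of how Alta Vista works: the function names, the document identifiers and the sample postings are all invented for the example, and the search simply requires every word to occur somewhere in a document rather than matching an exact phrase.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, phrase):
    """Return the ids of documents that contain every word in the phrase."""
    words = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical postings and pages, invented purely for illustration.
documents = {
    "news:alt.example/1": "Jane Citizen wrote about her garden",
    "news:alt.example/2": "Reply to Jane Citizen on roses",
    "http://example.org/page": "A page about roses and gardens",
}

index = build_index(documents)
hits = search(index, "Jane Citizen")  # every item mentioning both words
```

Once the index exists, finding every item that mentions a person's name is a single inexpensive lookup, which is why the old protection of inefficiency disappears.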

As Mr Chilton lamented, the privacy issue here is that, although you must technically make such information available to all on the internet (either by posting it to a newsgroup or putting it in a public_html directory) before robots can index it, you do not necessarily expect that it will be read by anyone outside those with whom you have some common experience, or that it will be used for purposes completely outside those for which it was provided. For example, those involved in creating web pages, or involved in newsgroup discussions, concerning (say) gay and lesbian issues or issues relating to minority religious groups, could find that information about them was being systematically compiled and disseminated so as to harm them. Those who once valued the net as an escape from the values of small communities may find there is no longer any escape except behind barricades of secret communications.

Should there be some privacy right not to be indexed? It is a difficult issue which involves freedom of speech and freedom of the press considerations in a new context, and any legislative intervention could be dangerous indeed.

There is a very significant customary limitation on the operation of robots, which at present provides part of the answer to privacy problems here. This is the Robot Exclusion Standard[36] (http://128.163.69.70/notes/robots/norobots.htm), which is not an official internet standard but rather `a common facility the majority of robot authors offer the WWW community to protect WWW server against unwanted accesses by their robots'. The Standard allows a server administrator to define which parts of a web site are allowed to be indexed by robots[37], but the designers recognise that this has its limitations for privacy protection:

A possible drawback of this single-file approach is that only a server administrator can maintain such a list, not the individual document maintainers on the server. This can be resolved by a local process to construct the single file from a number of others, but if, or how, this is done is outside of the scope of this document.
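The `local process' mentioned in that passage is left unspecified by the Standard, but it could be sketched along the following lines. This is an assumption-laden illustration in Python: the `merge_fragments` function, the per-author lists and the example paths are all hypothetical, while the check that a well-behaved robot would perform uses the `RobotFileParser` class from Python's standard library.

```python
from urllib.robotparser import RobotFileParser


def merge_fragments(fragments):
    """Combine per-author lists of excluded paths into one robots.txt body.

    `fragments` maps an author name to the paths that author wants
    excluded. This merging step is hypothetical: the Standard leaves
    it outside its scope.
    """
    lines = ["User-agent: *"]
    for author in sorted(fragments):
        for path in fragments[author]:
            lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


# Hypothetical requests from two document maintainers on one server.
fragments = {
    "alice": ["/users/alice/diary/"],
    "bob": ["/users/bob/drafts/", "/users/bob/photos/"],
}
robots_txt = merge_fragments(fragments)

# What a robot that honours the Standard would do before fetching a page.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

diary_allowed = parser.can_fetch(
    "ExampleBot", "http://www.example.org/users/alice/diary/1996.html")
index_allowed = parser.can_fetch(
    "ExampleBot", "http://www.example.org/users/alice/index.html")
```

Here the diary page is excluded and the index page is not. The exclusion is only as strong as the robot's willingness to consult the file, since nothing in the mechanism prevents a robot from ignoring it.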

If the HTML mark-up standard were changed so that pages could contain information in their header that excluded robot indexing on a page-by-page basis, then such a technical solution would largely solve the problem - provided all robots obeyed the Robot Exclusion Standard. This would in effect be an `opt out' solution to the problem.
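A page-by-page `opt out' of this kind could work roughly as sketched below, in Python. The paper does not specify any particular mark-up, so the form assumed here (a `meta` element named `robots` whose content includes `noindex`) is an assumption made for illustration, as are the class and function names. The sketch shows only the check a co-operating robot would make before indexing a page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Look for an assumed <meta name="robots" content="noindex"> element."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        directives = {
            item.strip().lower()
            for item in attributes.get("content", "").split(",")
        }
        if "noindex" in directives:
            self.noindex = True


def may_index(html_text):
    """Return False if the page itself asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return not parser.noindex


# Hypothetical pages: the first opts out, the second says nothing.
private_page = ('<html><head><meta name="ROBOTS" content="NOINDEX, nofollow">'
                '</head><body>Personal notes</body></html>')
public_page = "<html><head><title>Welcome</title></head><body>Hello</body></html>"
```

Because the instruction travels with the page, its author can set it without asking the server administrator, which is the gap the single-file approach leaves open.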

Between you and your browser

Users of the world-wide-web sometimes thought that the fact that they did not have to enter their names or other details in order to access web pages meant that there was a high degree of privacy in the use of the web - that it was virtually anonymous. Few people who read newspapers would be likely to believe that any longer, but it is still worth cataloguing some of the information that your browser typically reveals about you.

With most web browsing software, such as Netscape or Microsoft Internet Explorer, any request to a web site discloses to the web server accessed[38]:

* the network identity of the machine you use to access the web (both its IP address and, if desired, a domain name look-up), thereby identifying geographically where the user has come from and (for single user machines) what is effectively the identity of the browser;

* the browser software used by you, and the operating system of the browser's host;

* the URL of the web page you immediately previously accessed (or other resource such as an ftp site) - the `HTTP_REFERER';

* `cookies' - information stored by the server on the host used by you[39] (see http://proxima.cs.purdue.edu:7000/remote.htm), such as a list of previously accessed web pages, or transactional information generated while accessing those web pages (eg what you bought on all the web pages for one store); however, the web site accessed can only retrieve such information from your host if it knows the storage format within the `cookie'[40].
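The items in this list correspond closely to values that a web server makes available to its own programs on every request. The following Python sketch assumes the conventional CGI variable names (`REMOTE_ADDR`, `REMOTE_HOST`, `HTTP_USER_AGENT`, `HTTP_COOKIE`, together with the `HTTP_REFERER` mentioned above). The function name and all of the sample values are invented for illustration.

```python
from http.cookies import SimpleCookie


def disclosed_by_request(environ):
    """Summarise what one request reveals, given CGI-style variables."""
    cookie = SimpleCookie()
    cookie.load(environ.get("HTTP_COOKIE", ""))
    return {
        "network_address": environ.get("REMOTE_ADDR"),
        "host_name": environ.get("REMOTE_HOST"),
        "browser_and_os": environ.get("HTTP_USER_AGENT"),
        "previous_page": environ.get("HTTP_REFERER"),
        "cookies": {name: morsel.value for name, morsel in cookie.items()},
    }


# Hypothetical request, as a server-side program might see it.
environ = {
    "REMOTE_ADDR": "192.0.2.17",
    "REMOTE_HOST": "pc17.example.edu.au",
    "HTTP_USER_AGENT": "Mozilla/2.0 (Win95; I)",
    "HTTP_REFERER": "http://www.example.org/previous.html",
    "HTTP_COOKIE": "basket=book42; visits=3",
}

for item, value in disclosed_by_request(environ).items():
    print(f"{item}: {value}")
```

None of these values requires the user to type anything. They arrive as a side effect of fetching the page, and a server can record them with a few lines of code.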

Current browsers don't allow these disclosure mechanisms to be turned off, although it is not obvious why users could not be given the option to turn off any other than the first one listed. Commenting on cookies, but with comments equally applicable to other forms of disclosure, Marc Rotenberg identifies the privacy issue as `data collection practices should be fully visible to the individual ... Any feature which results in the collection of personally identifiable information should be made known prior to operation and ... the individual should retain the ability to disengage the feature if he or she so chooses.'[41]

Another area where web users may have little awareness of who is capable of finding out details of their browsing habits is the use of proxy servers and proxy caches. An internet service provider (ISP), in order to preserve bandwidth and reduce costs, caches all pages accessed by its users, so that subsequent users access copies of the page in the ISP's cache, rather than on the `original' site. However, this means that an ISP which is potentially local to the user - and of which the user is a client - can record information about the user's browsing habits which the user would rather have known only by a server on the other side of the world. There are many other aspects of monitoring of network usage that also raise privacy issues.
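The privacy consequence of caching can be seen in a small model. The Python sketch below is not a working proxy and does not describe any real ISP's software: the `CachingProxy` class, the stand-in fetch function and the user names are hypothetical. It shows only that the component which saves bandwidth is also the one placed to log every user's requests, including those the original site never sees.

```python
class CachingProxy:
    """Toy model of an ISP proxy cache that also keeps an access log."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable taking a URL, returning the page
        self._cache = {}         # URL -> stored copy of the page
        self.access_log = []     # (user, url, served_from_cache)

    def get(self, user, url):
        served_from_cache = url in self._cache
        if not served_from_cache:
            self._cache[url] = self._fetch(url)
        self.access_log.append((user, url, served_from_cache))
        return self._cache[url]

    def history(self, user):
        """Every URL the given user has requested through the proxy."""
        return [url for who, url, _ in self.access_log if who == user]


origin_requests = []  # what the `original' sites actually receive


def fake_fetch(url):
    """Stand-in for a real network fetch, for illustration only."""
    origin_requests.append(url)
    return f"<html>content of {url}</html>"


proxy = CachingProxy(fake_fetch)
proxy.get("alice", "http://www.example.org/health.html")
proxy.get("bob", "http://www.example.org/health.html")
proxy.get("alice", "http://www.example.org/jobs.html")
```

In this run the original site receives only two requests and never learns of the second user's visit at all, while the proxy holds a complete per-user history.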

6.3. New Privacy Principles for cyberspace?

John Perry Barlow once described what he regarded as a European predilection for relying on governments to protect privacy by laws and codes as `like having a peeping Tom install your window blinds'[42] (http://www.eff.org/pub/Publications/John_Perry_Barlow/HTML/a_pretty_bad_problem_article.html). I will nevertheless suggest a brief shopping list of some candidates as potential Privacy Principles (or elaborations of existing Principles) which may be needed in cyberspace and the future. It would take a lot more than this paper to examine or justify these possibilities.

A right not to be indexed? - Technical extensions to the Robot Exclusion Standard might make this customary `right' able to be exercised by individuals, not just server administrators, but if a `rogue' robot indexer ignored the standard[43] (http://info.webcrawler.com/mak/projects/robots/active.html), should there be any legal right not to be indexed?

A right to effectively encrypt communications - This right is not included in existing Privacy Principles, but various privacy rights relating to communications now need to be included.

A right to fair treatment in Public Key Infrastructures - Unfair exclusion from PKIs will prejudice a person's ability to protect their privacy.

A right to anonymous and pseudonymous transactions - Physical and virtual transaction structures should be required to give the option of anonymous transactions, wherever reasonable.

A right to human checking of adverse automated decisions, and to understand them - These rights are already present in the EU Directive, deriving largely from French law.

A right not to be disadvantaged by exercise of basic privacy principles - Exercise of privacy rights is part of ordinary organisational running costs.

[27] I stole this quip from John Hilvert, via Andy Warhol and who knows who else ...

[28] See (1996) 3 PLPR 1 for details

[29] See 2 PLPR 161 on the 'Innovations' statement

[30] See (1996) 3 PLPR 17 for details

[31] A year ago I heard it compared to removing the spines from all the books in the Library of Congress, just as a tornado hit. You wouldn't say that any more.

[32] http://www.yahoo.com/

[33] http://www.altavista.digital.com/

[34] John Hilvert "Private Lies", Information Age, May 1996, pp 18-23

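[35] See `World Wide Web Robots, Wanderers, and Spiders' - http://info.webcrawler.com/mak/projects/robots/robots.html

[36] `A Standard for Robot Exclusion' - http://128.163.69.70/notes/robots/norobots.htm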

[37] In the file on the local URL "/robots.txt" - which only the server administrator could normally access.

[38] Geoff King, Manager of AustLII (http://www.austlii.edu.au/), assisted with some of these details.

[39] For example, with Netscape, by way of a file called COOKIES.TXT on Windows machines or "MagicCookie" on the Apple Macintosh; see http://proxima.cs.purdue.edu:7000/remote.htm to inspect the contents of the cookie on your host, and other information

[40] The most likely use of cookie information is therefore for sites you have previously visited to customise the appearance of their site to take into account what they already know about you.

[41] Quoted in John Hilvert, op cit

[42] John Perry Barlow 'A Pretty Bad Problem: Forward to PGP User's Guide by Phil Zimmerman'

[43] `Rogue' is used tongue-in-cheek because there are many robots that do not adhere to this voluntary standard at present, for technical reasons: see http://info.webcrawler.com/mak/projects/robots/active.html
