update: 12 May 2003!
14 March 2003
best practices for search usability and SEO
Creating Google-Free Space that Protects Your Privacy
Anonymity on the Internet is difficult to keep unless you 1) don't own a website; 2) don't post to newsgroups or provide any other form of online feedback (e.g. Amazon book reviews); 3) have an unlisted phone number; 4) have no friends or contacts who will mention you on their sites, their blogs, newspaper articles, or press releases; 5) don't participate in events that post a list of registrants online (e.g. marathons), and so on. If you don't want to be googled (that is, you don't want information about you or your web site to appear on Google or other search engines), what can you do?
Typically, I advise clients on how to rank higher on the search engines, not on how to leave no trace. However, the above question, particularly when it comes to privacy concerns, is becoming more and more relevant these days. If, for your own reasons, you prefer obscurity, you can try the techniques I'm about to discuss. While none of these techniques can guarantee anonymity, or that you won't appear in a search engine's database, they're worth implementing before you approach the search engines directly to attempt any sort of data removal. Keep in mind that most search engines cannot remove content that belongs to another site that mentions your information (unless you can demonstrate that the personal information posted is illegal). They can, however, remove a site that you own, at your request, from both their database and their cache.
Quick Note: In addition to a source such as the famous Google cache, there may also be companies, like Alexa, that provide 'snapshots' of the Web to external off-line sources such as the Library of Congress (for its "Digital Preservation Plan"). Such archives are certainly more difficult to expunge but, to my knowledge, they are not yet available online.
Protecting Your Privacy, Personal Information, or Web Site on the Search Engines
If you own a web site and don't wish any of its pages to be indexed by the search engines, you can try (in order of effectiveness):
1. Password-protecting your entire site (there are different ways to restrict access to your site, depending on your comfort level).
2. Using the Robots Exclusion Protocol (this option is open to you if you are the site administrator).
3. Using the Robots Meta Tag in the head section of your HTML code (anyone can do this to protect individual pages, or apply it to all the pages on their site):
<meta name="robots" content="noindex,nofollow">
This instructs the search engines not to index that particular page or follow its links. However, not all search engine robots will obey the above tag, although the major ones (e.g. Google, FAST) will. Also, this is not useful if that specific page has already been indexed by the search engine in question (it already appears in their database) or if you link to that page from another page which has itself already been spidered by the robots (and does not contain the above tag).
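If you maintain many pages, it can help to confirm that each one really carries the robots meta tag shown above. The following is only a minimal sketch using Python's standard library; the class name RobotsMetaParser, the function robots_directives, and the sample page are hypothetical and exist only for illustration. It reports what the tag says and cannot tell you whether a given robot will honor it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives declared in the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical sample page used only for this illustration.
page = (
    '<html><head><meta name="robots" content="noindex,nofollow"></head>'
    "<body>private notes</body></html>"
)
directives = robots_directives(page)
print("noindex" in directives, "nofollow" in directives)
```

A page that returns an empty set has no robots meta tag at all, which means a well-behaved robot is free to index it.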
You should also check the Internet Archive to make sure your pages have not already entered their database. If you wish to remove your pages, follow their instructions (which involve using the above-mentioned robots exclusion protocol). If you're unable to do that, they kindly include a contact email for you to try. The WebArchivist is also worth a mention although their "web spheres of interest" seem to be much more limited in scope currently.
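The Robots Exclusion Protocol mentioned above comes down to a plain-text file named robots.txt placed at the root of your site. As a rough sketch, the rules below can be checked with Python's standard urllib.robotparser module before you rely on them. The example.com URLs, the /private/ directory, and the helper name is_allowed are hypothetical. The crawler name ia_archiver is the one historically associated with the Internet Archive, but treat it as an assumption and confirm it against their current instructions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep every robot out of /private/,
# and keep the archive crawler (name assumed) out of the whole site.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent, url in [
    ("Googlebot", "http://example.com/index.html"),
    ("Googlebot", "http://example.com/private/diary.html"),
    ("ia_archiver", "http://example.com/index.html"),
]:
    print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Keep in mind that robots.txt is only a request: compliant robots read it and stay away, but nothing in the file technically prevents access, which is why password protection ranks above it in the list.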
What should you do if information about you is available on someone else's site?
Other things you can do:
- get an unlisted phone number (people's phone numbers and addresses are often easily accessible via online databases and sites such as AnyWho, Switchboard, WhoWhere, etc.)
- use a Domains-By-Proxy Registrar (when purchasing a domain name) so that your personal information is protected (e.g. GoDaddy)
- try anonymous web surfing (e.g. Anonymizer)
- check out the tools resource pages of these online Privacy Organizations:
- Tara Calishain and Rael Dornfest have published a wonderfully useful Google Hacks book which includes specific advice on removing personal material from Google's database.
- Google now provides specific info for removing your personal material from their database. (Added: 6 May 2003)
- How to remove your own newsgroup postings or prevent them from being archived on Google. (Added: 12 May 2003)
Free access to information is one of the things that I love best about the Internet, and I've benefited tremendously from it. However, like most things, that information can be abused. [On a meta-level, in anticipation of how the information on this page might be abused, I simply ask potential spammers or scammers to consider the effects of their actions on others (perhaps a naive request) and to allow karmic inheritance to apply.]
Each of us must find for ourselves the correct balance between engaging in online communities (and sharing/spreading personal info, data, opinions, or collected wisdom) and keeping a low or more anonymous profile. Tools such as the search engines can be both useful (helping us to find old friends and family, or protect ourselves from criminals, etc.) and harmful (in the promotion of unfounded gossip, preconceptions, pre-judgment, or the closing of doors and opportunities to persons with a public history on the Internet).
Search-engine-free space on the Internet requires some foresight and effort. In particular, it's important that any pages you wish to protect have no inbound links from pages that are either themselves unprotected or already indexed (e.g. online directories such as DMOZ or Yahoo). Internet "islands" are becoming rarer as robots become more efficient at spidering the web. However, most people online still prefer to be found (via the search engines or links from other sites). May each choice (to be found or to be anonymous) continue to be possible and to be respected.
© 2007 searchethos, all rights reserved