Previous Politech message:
http://www.politechbot.com/2004/07/06/boycott-msn-search/

-------- Original Message --------
Subject: Re: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Wed, 7 Jul 2004 07:52:55 +0200
From: David Haworth <david.haworth@private>
Organization: 3SOFT GmbH
To: Declan McCullagh <declan@private>
CC: matthew@private
References: <40EB7C47.6000106@private>

Declan, Matthew,

Re: Politech article "A Finger In The Eye Of Microsoft"

Unfortunately, the scheme won't work. At least, not for long. Why?
Because it relies on the MSN bot being honest & truthful. The
robots.txt file is a hint to visiting robots about where they might
look for files to index. There's no compulsion for a bot to take any
notice of the file, or even to read the file.

Admittedly, MSN might get some bad press if they started indexing
pages that should not be indexed at all. But there's nothing to stop
the MSN bot from using, say, the Google bot's entry in the file.

For more information go to http://www.robotstxt.org/wc/norobots.html

Dave Haworth

-------- Original Message --------
Subject: Re: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Wed, 7 Jul 2004 01:02:43 -0500
From: Matthew Hunter <matthew@private>
To: David Haworth <david.haworth@private>
CC: Declan McCullagh <declan@private>, matthew@private
References: <40EB7C47.6000106@private> <20040707055255.GA6206@private>

On Wed, Jul 07, 2004 at 07:52:55AM +0200, David Haworth <david.haworth@private> wrote:
> Re: Politech article "A Finger In The Eye Of Microsoft"
>
> Unfortunately, the scheme won't work. At least, not for long. Why?
> Because it relies on the MSN bot being honest & truthful. The
> robots.txt file is a hint to visiting robots about where they might
> look for files to index. There's no compulsion for a bot to take any
> notice of the file, or even to read the file.
>
> Admittedly, MSN might get some bad press if they started indexing
> pages that should not be indexed at all.
> But there's nothing to stop
> the MSN bot from using, say, the Google bot's entry in the file.
>
> For more information go to http://www.robotstxt.org/wc/norobots.html

I'm fully aware that the rules in that file are voluntarily
implemented, and practically unenforceable. I never expected it
to work if Microsoft chooses to be dishonest about it. However,
I do not expect them to be dishonest unless the idea really takes
off heavily -- frankly the amount of pressure that can be applied
this way is fairly low compared to the amount of bad press they
might get for violating it.

If Microsoft does decide to play dirty, that's going to have
consequences in terms of bad press -- and for that matter,
wouldn't ignoring the robots.txt file count as "bypassing an
access control device" per the DMCA? That might be fun.

--
Matthew Hunter (matthew@private)
Public Key: http://matthew.infodancer.org/public_key.txt
Homepage: http://matthew.infodancer.org/index.jsp
Politics: http://www.triggerfinger.org/weblog/index.jsp

-------- Original Message --------
Subject: RE: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Wed, 7 Jul 2004 13:12:08 +0100
From: M Grossmann
To: Declan McCullagh <declan@private>

Declan,

I'm kind of surprised to see a Slashdot fanboy type of post
resplendent in ad hominem, appeal-to-fear and appeal-to-belief
attacks sent out on this mailing list. On Slashdot, crack-smoking
moderators (of which I am one) might've moderated it up to
"Score:5, Informative", but on Politechbot?

Here's my translation and synopsis:

"This is my face. I wish to spite it. I shall cut off my nose. You
can do the same; here's how..."

Unless Matthew's domain is www.*, his "boycott" will do nothing but
harm himself, and it does nothing to slow Microsoft's entry into
search engines.
Perhaps he could take a few minutes to think of ways to
prevent MS from automatically hijacking searches with IE and
Explorer, then write a script to disable the various event handlers
and registry keys necessary for the constant redirection. It
shouldn't take much more time than his interesting proposal for link
tagging, and it would be much more effective.

Regards,
M W Grossmann

(If you publish this, please munge address or replace with
M_W_Grossmann@private)

-------- Original Message --------
Subject: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Wed, 7 Jul 2004 09:20:36 -0400
From: Richard W. DeVaul <rich@private>
To: declan@private
CC: politech@private
References: <40EB7C47.6000106@private>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>From: Matthew Hunter <matthew@private>
> ...
>So here's what I have to say: from now on, the Microsoft search
>bot will be specifically excluded from my web sites via the
>robots.txt file. The following entry should accomplish this:
>
>User-agent: msnbot/*
>Disallow: /
>
>What that means is that I won't allow Microsoft's search engine
>to index my websites. Google can, yahoo can... but Microsoft
>can't. It's the website owner's version of an invisible boycott.
>And if enough people do it, the Microsoft search engine will find
>itself unable to search large segments of the web. And that means
>that people who notice this won't want to search using
>Microsoft's search engine.

Matthew appears to understand Microsoft's threat without
understanding Microsoft's tactics. The robots.txt file does not
change the behavior of your web server; it is a convention that is
(supposed to be) respected by web crawler clients. Put another way,
it isn't a lock on the door, it is a note saying "dear Microsoft
robots, please don't come in."

If only a few people do this, you may block Microsoft bots but it
won't make much of a difference in their indexing.
If lots of people do this, I have no doubt that Microsoft will modify
the bots -- either by changing the name, or simply ignoring blanket
restrictions specific to only their bot.

There is no particular reason to expect that Microsoft bots will
respect robots.txt files any more than any one of the numerous other
standards Microsoft has bent or outright ignored over the years.

Cheers,

Rich

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.6 <http://mailcrypt.sourceforge.net/>

iD8DBQFA6/ihcEzhTv/Qc9oRAmCCAJ4jhhScIJvljKjuVpxsFug84kUEbgCfcl/B
5anaFTJqfPjaUeC7zy303gQ=
=C7tU
-----END PGP SIGNATURE-----

-------- Original Message --------
Subject: RE: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Wed, 7 Jul 2004 08:37:06 -0500
From: G. Waleed Kavalec <Greg@private>
Reply-To: <Greg@private>
To: 'Declan McCullagh' <declan@private>
CC: <matthew@private>

Declan & Matthew

This would be a wonderful idea if we could count on Microsoft playing
by the rules.

It's a no-brainer to build a bot that ignores or simply disobeys
robots.txt.

Peace

G. Waleed Kavalec

-------- Original Message --------
Subject: Re: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Wed, 07 Jul 2004 10:50:54 -0400
From: Jake Baillie <jake@private>
Organization: Priva
To: Declan McCullagh <declan@private>
References: <40EB7C47.6000106@private>

On Wed, 07 Jul 2004 00:29:59 -0400, Declan McCullagh <declan@private> wrote:
> What that means is that I won't allow Microsoft's search engine
> to index my websites. Google can, yahoo can... but Microsoft
> can't. It's the website owner's version of an invisible boycott.
> And if enough people do it, the Microsoft search engine will find
> itself unable to search large segments of the web. And that means
> that people who notice this won't want to search using
> Microsoft's search engine.
What that also does is cut out roughly 20-40% (depending on your
sector/industry) of the people on the Internet that use search
engines from finding you. MSN has a big share in the search space.

No commercial website in their right mind would EVER ban MSNBot from
indexing their site. Most non-commercials wouldn't either. You see,
most non-commercial sites either pay for themselves or make a profit
by advertising. Why would you drop 1/3 of your potential users on the
ground, and hand them off to a competitor?

Not very smart, folks.

--
jake

-------- Original Message --------
Subject: Re: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Wed, 07 Jul 2004 10:52:27 -0500
From: Michael Roberts <michael@private>
Organization: Vivtek
To: Declan McCullagh <declan@private>
References: <40EB7C47.6000106@private>

Declan McCullagh wrote:
> What that means is that I won't allow Microsoft's search engine
> to index my websites. Google can, yahoo can... but Microsoft
> can't. It's the website owner's version of an invisible boycott.
> And if enough people do it, the Microsoft search engine will find
> itself unable to search large segments of the web.

"Unable?" Come on. The robots.txt specification is just a
suggestion. It doesn't *prevent* Microsoft from indexing whatever
they want. If it's affecting their bottom line or strategy, does
anybody on God's green Earth think they won't violate your robots.txt
in a heartbeat? Heck, msnbot doesn't even have to identify itself in
User-Agent -- it could just as easily call itself googlebot. You
might be able to screen based on IP block, but even there, you're
playing with fire.

I agree with the sentiment. But for practical purposes, this
suggestion seems ... less than useful.
Michael

-------- Original Message --------
Subject: RE: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Thu, 8 Jul 2004 15:35:28 +1200
From: Chris Auld <chris@private>
To: Declan McCullagh <declan@private>
CC: <matthew@private>

> So here's what I have to say: from now on, the Microsoft
> search bot will be specifically excluded from my web sites
> via the robots.txt file. The following entry should accomplish this:
>
> User-agent: msnbot/*
> Disallow: /

> Matthew Hunter (matthew@private)

Declan,

What a fantastic idea! It's like search engine optimisation in
reverse... How to reduce traffic to your site. Hell, I think the
Progressives Against Progress would see it as helping to 'democratize
bandwidth' on the internet!

I shall immediately switch to MSN search in the hope that my searches
shall never again return results from sites run by people such as
Mr Hunter...

I wonder if the following might be useful in my robots file...

User-agent: OpenSourceBigotBot/*
Disallow: /

Chris

-------------------------------------------
Kognition Consulting Limited - Thought Meets Technology
Chris J.T. Auld - Managing Director
Microsoft MVP (Windows Mobile Devices)
Phone: +64 3 453 0064
Mobile: +64 21 500 239
Email: chris@private
MSN Messenger: chris@private
Skype: callto://chris_auld
Blog: http://www.syringe.net.nz

-------- Original Message --------
Subject: RE: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Sat, 10 Jul 2004 16:10:43 -0700
From: Ilya Haykinson <ilyah@private>
To: 'Declan McCullagh' <declan@private>

Or more likely what will happen is that the web sites (and points of
view) of Linux enthusiasts and advocates will simply not show up in
MSN Search and thus not be visible to the masses who will continue to
use that search engine.

-ilya haykinson

-------- Original Message --------
Subject: Re: [Politech] How to boycott Microsoft (or at least MSN search)
Date: Sun, 11 Jul 2004 14:22:06 -0500 (CDT)
From: J.A. Terranson <measl@private>
To: Declan McCullagh <declan@private>
CC: politech@private
References: <40EB7C47.6000106@private>

On Wed, 7 Jul 2004, Declan McCullagh wrote:
> So here's what I have to say: from now on, the Microsoft search
> bot will be specifically excluded from my web sites via the
> robots.txt file. The following entry should accomplish this:
>
> User-agent: msnbot/*
> Disallow: /
>
> What that means is that I won't allow Microsoft's search engine
> to index my websites. Google can, yahoo can... but Microsoft
> can't.

Not quite. What this does is politely *request* that Microsoft not
index your site; it cannot force them not to index your site. This is
an important difference. Currently, there is at least one search
engine spider that is well known for ignoring the robots.txt requests,
and I have no doubt that as this competition gets more and more
belligerent, this number will increase.

The only way to *guarantee* that Microsoft can't index your site is to
disallow their spidering programs at either the router or the web
server. Relying on a voluntary compliance policy - at least where
Microsoft is concerned - has not been shown to be efficacious in the
real world.

--
Yours,

J.A. Terranson
sysadmin@private

"...justice is a duty towards those whom you love and those whom
you do not. And people's rights will not be harmed if the opponent
speaks out about them." Osama Bin Laden

_______________________________________________
Politech mailing list
Archived at http://www.politechbot.com/
Moderated by Declan McCullagh (http://www.mccullagh.org/)
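
Editorial illustration (not part of any message above): several replies make the same technical point, that robots.txt is a request honoured by the client rather than a control enforced by the server. The sketch below tries to make that concrete with Python's standard urllib.robotparser. It rests on assumptions: the plain token "msnbot" stands in for the thread's "msnbot/*" entry (parsers differ in how they match that pattern), the URL is a placeholder, and the function names polite_crawler_may_fetch, rude_crawler_may_fetch and server_blocks are hypothetical, invented only for this example.

```python
from urllib.robotparser import RobotFileParser

# Assumption: "msnbot" replaces the thread's "msnbot/*" entry, because
# robots.txt parsers do not agree on how to match that pattern.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /
"""

# Placeholder URL, used only for illustration.
URL = "http://example.org/index.html"


def load_rules(text):
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def polite_crawler_may_fetch(rules, claimed_agent, url):
    """A compliant crawler asks the rules before fetching anything."""
    return rules.can_fetch(claimed_agent, url)


def rude_crawler_may_fetch(rules, claimed_agent, url):
    """A non-compliant crawler never consults the rules at all."""
    return True


def server_blocks(user_agent_header, banned=("msnbot",)):
    """Hypothetical server-side check: refuse by User-Agent header.

    This only sees the name the client chooses to send, so a crawler
    that renames itself slips through.
    """
    header = user_agent_header.lower()
    return any(token in header for token in banned)


RULES = load_rules(ROBOTS_TXT)

if __name__ == "__main__":
    # The honest MSN bot is asked to stay out; other bots are not.
    print(polite_crawler_may_fetch(RULES, "msnbot/1.0", URL))
    print(polite_crawler_may_fetch(RULES, "Googlebot/2.1", URL))
    # Ignoring the file, or borrowing another bot's name, bypasses it.
    print(rude_crawler_may_fetch(RULES, "msnbot/1.0", URL))
    print(server_blocks("Googlebot/2.1"))
```

The only place the exclusion takes effect is inside the crawler that chooses to call can_fetch. Filtering on the server by User-Agent moves the decision to the site owner, but, as Michael Roberts notes, the header is self-reported; filtering by IP block, as he and J.A. Terranson suggest, is the stricter option and is not shown here.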
This archive was generated by hypermail 2.1.3 : Mon Jul 12 2004 - 22:18:45 PDT