Look, I'm the Google bot.
What you'll need: Firefox and the Modify Headers extension for Firefox.
What you can do: Many sites let Google's crawler read their forums so that more of their pages show up in Google search results, which brings in more traffic. By pretending to be the Google search indexer (aka spider, bot), we can access parts of these sites that are off limits to guests.
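To see why this works, here is a minimal sketch of the kind of check a forum might run. This is an assumption for illustration only: the function `is_allowed` and the rule it applies are hypothetical, and I don't know how any particular site actually does it.

```python
# Hypothetical server-side check, not taken from any real forum software.
GOOGLEBOT_TOKEN = "Googlebot"


def is_allowed(user_agent: str, logged_in: bool) -> bool:
    """Let logged-in members in, plus anything that claims to be Googlebot."""
    if logged_in:
        return True
    # The User-Agent header is supplied by the client, so this check
    # trusts whatever string the visitor chooses to send.
    return GOOGLEBOT_TOKEN in user_agent


# Example strings: an ordinary Firefox guest and the Googlebot string.
guest_ua = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
bot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(is_allowed(guest_ua, logged_in=False))  # False
print(is_allowed(bot_ua, logged_in=False))    # True
```

A check like this only looks at a string the visitor controls, which is why changing the header is enough to get past it.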
Example URL: http://www.tuts4you.com/forum/index.php?showtopic=9959 I found this address today while doing a Google search, and noticed that although I didn't have access, Google had a cached version. Head over to this URL, and you should get the same error I did.
First, head over to Google and type in 'browser headers'. The first link should take you to a page that shows all the information your browser sends to web servers. This information, specifically the User-Agent field, is how the Google bot tells web servers what it is. Go back to the Google search, and click on the 'cached' link for that same result.
Notice that the 'User-Agent' field, which showed your browser info before, now says: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Download the Modify Headers extension (http://modifyheaders.mozdev.org/) and open it up. Next, make a rule that modifies the 'User-Agent' header to say 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'. Once you're done, make sure the rule is enabled, and browse over to the example URL. Even though you're not logged in, you should be able to browse their forums.
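Every HTTP client sends a User-Agent, not just browsers. As a rough illustration, here is what Python's standard `urllib` would announce itself as by default. Python is my choice for the sketch, not part of the steps above, and the exact version string depends on your install.

```python
import urllib.request

# build_opener() returns an OpenerDirector; its addheaders list holds
# the headers it attaches to every request by default.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)

# Print each default header the way a 'browser headers' page would list it.
for name, value in default_headers.items():
    print(f"{name}: {value}")
```

Nothing is sent over the network here; the snippet only inspects the defaults.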
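The same rule can be sketched outside the browser. The snippet below is a minimal example using Python's standard `urllib`, which is an assumption on my part (the steps above only use Firefox and Modify Headers). The helper name `build_request` is made up for illustration, and whether the example site still responds this way is not something the sketch can promise.

```python
import urllib.request

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
EXAMPLE_URL = "http://www.tuts4you.com/forum/index.php?showtopic=9959"


def build_request(url: str, user_agent: str = GOOGLEBOT_UA) -> urllib.request.Request:
    """Build a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request(EXAMPLE_URL)
# urllib stores header names capitalized, so the key is "User-agent".
print(req.get_header("User-agent"))

# Uncomment to actually send the request.
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```

Building the request is kept separate from sending it, so you can check the header before anything goes over the wire.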
Let me know what you think.
Digitalchameleon
bahpomet1105 8 years ago
Thanks for the cool article, and thanks for the user-agent. I added it to my crawler.