Fed up with bad bots?

4 November, 2014

Bad bots, i.e. robots and spiders that index your site aggressively with little benefit to you, can easily eat up your bandwidth. To ban the bots, you have a few options.

ROBOTS.TXT
Good bots should obey robots.txt, but bad bots usually don't. So while it's good practice to use this file, it won't help against the bad guys: they simply ignore it.
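For reference, a minimal robots.txt looks like this. It's purely advisory, so only well-behaved crawlers will honour it (the bot name here is just an example):

```
# politely ask "twenga" to stay out of the whole site
User-agent: twenga
Disallow: /

# and keep all compliant crawlers out of a private area
User-agent: *
Disallow: /private/
```

The file must live at the root of your site (e.g. /robots.txt) to be picked up.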

.HTACCESS RULES – USER AGENT
Banning a bot via an .htaccess rule based on its user agent name is a safe way to block it, as long as that user agent name doesn't keep changing.

To block a bot: [replace "twenga" with the bot name of your choice]

# let's block twenga
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^twenga
RewriteRule ^.*$ - [F]

To block multiple bots, add the [OR] flag after each condition except the last: [replace the names as necessary]

# let's block bad bots
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^twenga [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.*$ - [F]
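A common alternative (not in the original rules above, just a variant worth knowing) is to collapse the list into a single condition using regex alternation; the [NC] flag makes the match case-insensitive, which helps when bots vary their capitalisation:

```apacheconf
# block several bots with one case-insensitive condition
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^(twenga|Zeus) [NC]
RewriteRule ^.*$ - [F]
```

This keeps the ruleset shorter as your blocklist grows, at the cost of one longer regex.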

To send a bot to a specific page: [replace "twenga" with the bot name of your choice, and /bots.html with the page of your choice]

# let's redirect twenga
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^twenga
RewriteRule ^(.*)$ /bots.html [R=302,L]

Or for the ultimate bad bot blocking script, head over to the AskApache website: http://www.askapache.com/htaccess/blocking-bad-bots-and-scrapers-with-htaccess.html.
