Bad bots, i.e. robots and spiders that index your site aggressively with little benefit to you, can easily consume significant bandwidth. To ban them, you have a few options.
ROBOTS.TXT
Good bots should obey robots.txt, but bad bots usually don't, so whilst it's good practice to use this file, it won't help with the bad guys: they simply ignore it.
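For completeness, a robots.txt entry asking a well-behaved bot to stay away looks like this (the bot name "twenga" is just an example; the file lives at the root of your site):

```apache
# ask a specific bot (here "twenga", as an example) to stay away
User-agent: twenga
Disallow: /
```

Again, only bots that choose to honour robots.txt will respect this.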
.HTACCESS RULES – USER AGENT
Banning a bot via .htaccess rule based on their user agent name is a safe way to ban bots, as long as their user agent name doesn’t keep changing.
To block a bot: [replace “twenga” with the bot name of your choice]
# let's block twenga
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^twenga
RewriteRule ^.*$ - [F]
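One thing to watch: the match is a regular expression, so bot names containing regex metacharacters need escaping. As a sketch, blocking a hypothetical bot called "bad.bot" would be written:

```apache
# block a bot whose name contains a dot (regex metacharacter)
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^bad\.bot
RewriteRule ^.*$ - [F]
```

Without the backslash, the dot would match any character.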
To block multiple bots, add an [OR] flag at the end of each RewriteCond line except the last: [replace the names as necessary]
# let's block bad bots
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^twenga [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.*$ - [F]
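User agent matching is case-sensitive by default, so a bot that varies its capitalisation would slip through. Adding the [NC] (no case) flag covers this; a sketch using the same example names:

```apache
# block bad bots, ignoring case in the user agent name
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^twenga [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus [NC]
RewriteRule ^.*$ - [F]
```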
To send a bot to a specific page instead: [replace "twenga" with the bot name of your choice, and the target path with a page of your own]
# let's redirect twenga
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^twenga
RewriteRule ^(.*)$ /bot-page.html [L]
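One caveat with redirecting a bot to a page on the same site: the bot's request for that page will itself match the rule, causing a loop. A sketch that avoids this by excluding the destination first (the path /bot-page.html is a placeholder for your own page):

```apache
# send twenga to a holding page, without looping on the page itself
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^twenga
RewriteCond %{REQUEST_URI} !^/bot-page\.html$
RewriteRule ^.*$ /bot-page.html [L]
```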
Or for the ultimate bad bot blocking script, head over to the AskApache website: http://www.askapache.com/htaccess/blocking-bad-bots-and-scrapers-with-htaccess.html.