Block Bots by User Agent String

From Brian Nelson Ramblings

How to block a bot by User Agent String

Do you get hit by bandwidth-hogging bots as much as Phil and I do? Did you know you can block them in your .htaccess file?

Block the BOT

Let's block the most annoying bot on the internet, the Baidu spider. Open your .htaccess file:

vim .htaccess

Now add the following rules to block it:

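# Turn on mod_rewrite for this directory
RewriteEngine On
# Match "Baiduspider" or "Baidu" anywhere in the User-Agent, ignoring case
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Baidu [NC]
# Answer every matching request with 403 Forbidden
RewriteRule ^.*$ - [F,L]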

That's it. Save the file, and you are now blocking the Baidu spider. Apache rereads .htaccess on every request, so no restart is needed.
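
If you want a rough idea of which User-Agent strings those two conditions will catch before trying them on a live site, you can approximate the match locally. This is only a sketch: would_block is a made-up helper name, and it assumes that a case-insensitive grep behaves closely enough to Apache's [NC] matching for simple patterns like these. It never talks to Apache.

```shell
#!/usr/bin/env bash
# would_block is a hypothetical helper, not part of Apache or any package.
# It approximates the two RewriteCond lines with grep and returns
# success (exit status 0) when the User-Agent string would be blocked.
would_block() {
    # -E : extended regex, so | means "either pattern" (like the [OR] flag)
    # -i : ignore case (like the [NC] flag)
    # -q : print nothing, just set the exit status
    printf '%s\n' "$1" | grep -Eqi 'Baiduspider|Baidu'
}
```

For example, would_block "BAIDU-crawler" && echo "would be blocked" prints the message, while a regular Firefox or Chrome User-Agent string prints nothing.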

Testing to see if it's blocked

One way to do this is to use curl, sending the Baidu spider's User-Agent string with the -A option:

curl -I http://www.briansnelson.com -A "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"

You should now get a 403 Forbidden response:

HTTP/1.1 403 Forbidden
Date: Mon, 06 Jan 2014 19:18:11 GMT
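
If you would rather not eyeball the headers every time, the same curl test can be wrapped in a small script that also confirms regular visitors are not locked out. This is a sketch under assumptions: status_for_agent and check_block are made-up helper names, and the browser User-Agent is just an example string.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; neither name comes from curl or Apache.

# Print only the HTTP status code the site returns for a given User-Agent.
status_for_agent() {
    local url="$1" agent="$2"
    # -s silences the progress meter, -I sends a HEAD request,
    # -o /dev/null discards the headers, -w prints just the status code.
    curl -s -I -o /dev/null -w '%{http_code}' -A "$agent" "$url"
}

# Succeed only if the bot gets a 403 and a normal browser does not.
check_block() {
    local url="$1"
    local bot='Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'
    # Example browser string (an assumption); any non-Baidu agent will do.
    local browser='Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0'
    local bot_status browser_status
    bot_status="$(status_for_agent "$url" "$bot")"
    browser_status="$(status_for_agent "$url" "$browser")"
    echo "bot=$bot_status browser=$browser_status"
    [ "$bot_status" = "403" ] && [ "$browser_status" != "403" ]
}
```

Run it as check_block http://www.briansnelson.com && echo "rule works". Checking the browser status as well guards against a rule that accidentally blocks everyone.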