Block Bots by User Agent String
From Brian Nelson Ramblings
Latest revision as of 01:26, 27 October 2020
How to block a bot by User Agent String
Are bandwidth-hogging bots hitting your site as hard as they hit Phil's and mine? Did you know you can block them in your .htaccess file?
Block the BOT
Let's block the most annoying bot on the internet, the Baidu spider:
vim .htaccess
Now add the following to block bad bots.
Via mod_rewrite
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Baidu [NC]
RewriteRule ^.*$ - [F,L]
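If it helps to see the matching logic outside Apache, here is a small Python sketch (my illustration, not Apache code) of what the two RewriteCond lines do: an unanchored, case-insensitive regex search against the User-Agent header, with [OR] meaning either match triggers the block.

```python
import re

# Patterns copied from the RewriteCond lines above; the [NC] flag makes the
# match case-insensitive, and mod_rewrite searches anywhere in the header.
BLOCKED = [re.compile(p, re.IGNORECASE) for p in ("Baiduspider", "Baidu")]

def would_block(user_agent):
    # [OR] joins the two conditions, so any single match fires the RewriteRule.
    return any(p.search(user_agent) for p in BLOCKED)

print(would_block("Mozilla/5.0 (compatible; Baiduspider/2.0; "
                  "+http://www.baidu.com/search/spider.html)"))  # True
print(would_block("Mozilla/5.0 (Windows NT 10.0; rv:109.0) Firefox/118.0"))  # False
```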
Via BrowserMatch
BrowserMatchNoCase "Baiduspider" bots
BrowserMatchNoCase "HTTrack" bots
BrowserMatchNoCase "Yandex" bots
BrowserMatchNoCase "AhrefsBot" bots
BrowserMatchNoCase "Pinterestbot" bots
BrowserMatchNoCase "YandexImages" bots
BrowserMatchNoCase "YandexBot" bots
BrowserMatchNoCase "Facebot" bots
BrowserMatchNoCase "DotBot" bots
BrowserMatchNoCase "PetalBot" bots

Order Allow,Deny
Allow from all
Deny from env=bots
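As a sanity check on this second approach, here is a hedged Python sketch (again my illustration, not Apache internals) of the BrowserMatch mechanism: each BrowserMatchNoCase directive sets the environment variable "bots" on a case-insensitive match, and "Deny from env=bots" then refuses any request carrying that variable.

```python
import re

# The substring patterns from the BrowserMatchNoCase lines above.
BOT_PATTERNS = ["Baiduspider", "HTTrack", "Yandex", "AhrefsBot", "Pinterestbot",
                "YandexImages", "YandexBot", "Facebot", "DotBot", "PetalBot"]

def request_env(user_agent):
    # BrowserMatchNoCase sets the named env var when its pattern matches the UA.
    env = {}
    for pattern in BOT_PATTERNS:
        if re.search(pattern, user_agent, re.IGNORECASE):
            env["bots"] = "1"
    return env

def status_for(user_agent):
    # "Deny from env=bots" answers 403 when the variable was set;
    # "Allow from all" lets every other request through.
    return 403 if "bots" in request_env(user_agent) else 200

print(status_for("Mozilla/5.0 (compatible; YandexBot/3.0)"))  # 403
print(status_for("Mozilla/5.0 (X11; Linux x86_64) Firefox/118.0"))  # 200
```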
That's it. Save the file, and you are now blocking the Baidu spider.
Testing to see if it's blocked
One way to do this is to use curl with a spoofed user agent:
curl -I http://www.briansnelson.com -A "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
You should now get a 403 Forbidden response:
HTTP/1.1 403 Forbidden
Date: Mon, 06 Jan 2014 19:18:11 GMT