The Archive of Official vBulletin Modifications Site. It is not a VB3 engine, just a parsed copy!
#21
A lot of bots don't recognise robots.txt, but you can follow some of the advice here: http://antezeta.com/news/avoid-search-engine-indexing. For .htaccess use this:
Code:
    RewriteEngine On
    # Return 403 Forbidden to any user agent matching one of these terms (case-insensitive).
    # Caution: a bare "like" also matches the "like Gecko" token found in most browser user agents.
    RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider|yandex|anywordyoulike|like|bing) [NC]
    RewriteRule .* - [R=403,L]
Your .htaccess file must reside in the forum root for this (you might have to set your control panel to show hidden files if you don't see it).
#22
1. I don't want to block all user agents, especially not Google.
I only need to block them from forumdisplay for forum 1101, which I've done with this robots.txt line (I hope it is correct):
Disallow: /*forumdisplay.php?f=1101&order=desc&page=*
But this is an emergency solution, because it's not a real fix for the bad performance of very big forumdisplay pages, and as far as I can see no one can really help here.
I already use your ban spider addon, thanks for it, but we sometimes have a problem with it. If the server crashes, the cloudflare.com 502 error page is shown; after the user refreshes the page, the user gets blocked/redirected by this addon. Do you know why?
#23
It will be to do with Cloudflare caching, I would imagine, as my mod doesn't store any user/visitor details. As for your search engines, try this: put this in your header template (or even forumdisplay, but I think it must always use the header template), replacing X,Y,Z with the forum IDs.
HTML Code:
    <if condition="in_array($forumid, array(X,Y,Z)) AND $show['search_engine']">
    <meta HTTP-EQUIV="REFRESH" content="0; url=http://www.mysite.com">
    </if>
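An .htaccess alternative to the template redirect, for comparison: the sketch below refuses crawler requests for a single forum's forumdisplay pages only and leaves the rest of the site crawlable. It assumes mod_rewrite is enabled, that the .htaccess file sits in the forum root, and that forum 1101 (from the earlier post) is the one to protect. The user-agent list is illustrative, not a vetted bot list.

```apache
RewriteEngine On
# Illustrative list: replace with the crawlers you actually want to keep out
RewriteCond %{HTTP_USER_AGENT} (Baiduspider|yandex|bingbot|googlebot) [NC]
# Only requests whose query string carries f=1101 (assumed forum ID)
RewriteCond %{QUERY_STRING} (^|&)f=1101(&|$)
# Answer 403 Forbidden for forumdisplay.php; every other URL is untouched
RewriteRule ^forumdisplay\.php$ - [F]
```

Because the rule keys on the query string, any rewritten "friendly" forum URLs, if in use, would need their own pattern.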
#24
Just tested the .htaccess code on the WordPress site on the same server. Connections/sec dropped from 200 to 10
after blocking these bastards: Baiduspider|yandex|anywordyoulike|like. So you're saying my robots.txt code won't always work? It seems to. But I will test your code tomorrow and check the queries. Thanks.
#25
I put anywordyoulike|like in to show that you can include anything in the string; they're not actual bots.
#26
So Baidu ignores robots.txt?
#27
It seems that way, as do many others like AhrefsBot, sosospider, Aboundex and even Bing, to name but a few!
--------------- Added 2014-06-28 at 01:15 UTC ---------------
For a more complete .htaccess block, look here: http://wpsecure.net/bad-bot-list/
#28
I tested it. Is SetEnvIfNoCase or rewrite rules better?
Neither code from the link works, I think, because I can see in Cloudflare that Baidu is still crawling. The first rewrite gives an error. Can I use the bots listed there with the rule you wrote here? Or is there a better list? In your plugin I have a very big list, but I think it will not work with that list because the entries are not complete spider names.
    RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider|yandex|bing) [NC]
    RewriteRule .* - [R=403,L]
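For comparison with the rewrite rule above, here is a minimal sketch of the SetEnvIfNoCase form asked about in this post. It assumes mod_setenvif is loaded and that the host allows these directives in .htaccess; `bad_bot` is a hypothetical variable name, and the two bots are just examples taken from this thread. The access-control part differs between Apache 2.4 and 2.2, so both are shown behind IfModule checks.

```apache
# Tag matching user agents (case-insensitive) with an environment variable
SetEnvIfNoCase User-Agent "Baiduspider" bad_bot
SetEnvIfNoCase User-Agent "yandex" bad_bot

# Apache 2.4 (mod_authz_core): allow everyone except tagged requests
<IfModule mod_authz_core.c>
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
</IfModule>

# Apache 2.2: same effect with the older Order/Deny directives
<IfModule !mod_authz_core.c>
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
</IfModule>
```

Both approaches match on the same User-Agent header, so which one is "better" likely comes down to which modules the host has enabled.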