vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 4.x Add-ons (https://vborg.vbsupport.ru/forumdisplay.php?f=245)
-   -   Miscellaneous Hacks - Ban Spiders by User Agent (https://vborg.vbsupport.ru/showthread.php?t=268208)

Simon Lloyd 09-22-2011 04:56 PM

Quote:

Originally Posted by smirkley (Post 2248977)
Still testing but I can say so far,... NICE !!

Thank you.


I am only banning 4 user agents at the moment, but is there a condensed 'must ban' version of that list, as compared to the whole list? I don't want to go crazy and ban too much, especially if it hurts my membership or AdSense revenue.


So far I ban:

Baidu
Yeti
Twiceler
Yandex

99% of the Chinese bots will bring no traffic, so banning them won't hurt your AdSense revenue. On my other sites I ban ALL Chinese bots as they index far too aggressively. These are the ones I ban at my other sites:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Hope that helps you, but of course it's a personal thing ;)
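A list like the one above is usually applied as a case-insensitive substring match against the User-Agent header. The following is only a minimal Python sketch of that idea, not the mod's actual code (which is not shown in this thread); the names `BANNED_UA_FRAGMENTS` and `find_banned_fragment` are made up for illustration.

```python
# Ban list copied from the post above. How the mod itself matches these
# entries is not shown in the thread; substring matching is an assumption.
BANNED_UA_FRAGMENTS = [
    "Yandex", "Yeti", "Baidu", "soso", "sogou", "ichiro", "speedy",
    "spinn3r", "mlbot", "psbot", "SBIder", "Ezooms", "snap shots",
    "metauri", "YoudaoBot", "youdao",
]


def find_banned_fragment(user_agent, fragments=BANNED_UA_FRAGMENTS):
    """Return the first banned fragment found in user_agent, or None."""
    ua = user_agent.lower()
    for fragment in fragments:
        # Compare in lower case so "baidu" also matches "Baiduspider".
        if fragment.lower() in ua:
            return fragment
    return None


if __name__ == "__main__":
    spider = ("Mozilla/5.0 (compatible; Baiduspider/2.0; "
              "+http://www.baidu.com/search/spider.html)")
    browser = "Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20100101 Firefox/6.0"
    print(find_banned_fragment(spider))   # matched entry for the spider
    print(find_banned_fragment(browser))  # None for an ordinary browser
```

Short entries such as "speedy" or "soso" can in principle match unrelated user agents, which is one reason to keep the list small and check it against your own logs before banning.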

BadgerDog 09-22-2011 05:07 PM

1 Attachment(s)
Quote:

Originally Posted by Simon Lloyd (Post 2248993)
If you want to PM me admin access details and the URL, I'll take a look :)

Well, there's nothing really to look at except your settings ... (see pic)...

Are they correct?

Regards,
Doug

Simon Lloyd 09-22-2011 05:14 PM

That looks OK. Next, check your session timeout setting, as nothing drops off WOL (Who's Online) until that has expired. If the timeout has passed and the bots are still listed, click WOL to view everyone online, select Yes for user agent in the dropdown, and copy the UA. Then try it at http://www.botsvsbrowsers.com/SimulateUserAgent.asp and see what results you get. The UA will look something like this:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

In fact, you can try that one at the link I gave you; make sure you set it to look at your site :)
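The same check can be done from a script instead of the simulator page: send one request with the spider's User-Agent and look at the status code without following the redirect. This is a rough Python sketch using only the standard library; `split_target`, `probe` and `describe` are illustrative names, and the forum URL in the usage comment is a placeholder for your own site.

```python
import http.client
from urllib.parse import urlsplit

# User agent string quoted in the post above.
BAIDU_UA = ("Mozilla/5.0 (compatible; Baiduspider/2.0; "
            "+http://www.baidu.com/search/spider.html)")


def split_target(url):
    """Split a URL into (scheme, host[:port], path including query)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.netloc, path


def probe(url, user_agent=BAIDU_UA, timeout=10):
    """Send one GET with the given User-Agent and do not follow redirects.

    Returns (status_code, Location header or None).
    """
    scheme, host, path = split_target(url)
    if scheme == "https":
        conn = http.client.HTTPSConnection(host, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", path, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe(status, location):
    """Turn a probe() result into a short human-readable verdict."""
    if status in (301, 308):
        return "permanent redirect to %s" % location
    if status in (302, 303, 307):
        return "temporary redirect to %s" % location
    return "no redirect (status %d)" % status


# Usage (placeholder URL, replace with a thread on your own forum):
#   status, location = probe("http://www.example.com/forums/showthread.php?t=1")
#   print(describe(status, location))
```

If the ban is active you would expect a redirect for the spider UA and a normal page when you pass an ordinary browser UA as `user_agent`.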

smirkley 09-22-2011 05:39 PM

Quote:

Originally Posted by Simon Lloyd (Post 2248995)
99% of the Chinese bots will bring no traffic, so banning them won't hurt your AdSense revenue. On my other sites I ban ALL Chinese bots as they index far too aggressively. These are the ones I ban at my other sites:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Hope that helps you, but of course it's a personal thing ;)

Thank you. Helps.
After checking my session expiration setting, I just watched the lil' critters disappear!

Will watch for the upcoming fix, and if all works after testing, will most certainly vote MOTM!

BadgerDog 09-22-2011 05:45 PM

Quote:

Originally Posted by Simon Lloyd (Post 2249002)
That looks OK. Next, check your session timeout setting, as nothing drops off WOL (Who's Online) until that has expired. If the timeout has passed and the bots are still listed, click WOL to view everyone online, select Yes for user agent in the dropdown, and copy the UA. Then try it at http://www.botsvsbrowsers.com/SimulateUserAgent.asp and see what results you get. The UA will look something like this:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

In fact, you can try that one at the link I gave you; make sure you set it to look at your site :)

It's set for the default 20 minutes, but PaulM's guest mod is showing dozens of accesses (logins) from those bots in the last 24 hours, so am I misunderstanding what this mod is supposed to do?

Shouldn't there be NO logins by Baidu and Yandex spiders in at least the last 23 hours, since this mod has been running with your corrected settings for days?

Thanks .. :)

Regards,
Doug

Simon Lloyd 09-22-2011 05:56 PM

What you forget is that they have to attempt to access your site to get banned (redirected with a 301), so that's why Paul's mod is showing them to you. Also, bots don't load the homepage, then select a forum, then select a thread; they go straight for a thread (or post), so as soon as that happens Paul's mod will log them. But if you look at WOL, are they there now?

I doubt it :) Paul's mod is doing the job it set out to do, and mine should be doing its job too. Did you test the UA I gave above at the link I gave? If so, what were the results?

Simon Lloyd 09-22-2011 06:01 PM

Quote:

Originally Posted by smirkley (Post 2249012)
Thank you. Helps.
After checking my session expiration setting, I just watched the lil' critters disappear!

Will watch for the upcoming fix, and if all works after testing, will most certainly vote MOTM!

I'm close to a fix for this, but it will probably mean an additional PHP file to upload. It seems the notification can't work comfortably with bots being redirected the moment they call the forum to load, as that leaves nothing for the notification to notify. All the others work comfortably together, i.e. Output.txt logging, email and create thread. When you ban a bot, you either ban it late, which means it will always be seen in WOL, or ban it early, so it's very rarely seen there. It's the early bit that's causing the issue!
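The early versus late trade-off described above can be pictured with a toy request pipeline. This is a simplified Python model written for illustration only: it does not use vBulletin's hooks or the mod's code, and `handle_request`, `online_list` and `notifications` are invented names standing in for the ban check, the WOL session list and the notification step.

```python
BANNED_FRAGMENT = "baiduspider"  # single made-up ban entry for the demo


def handle_request(user_agent, ban_early, online_list, notifications):
    """Toy request pipeline; returns the HTTP status that would be sent."""
    banned = BANNED_FRAGMENT in user_agent.lower()

    if banned and ban_early:
        # Early ban: redirect before a session exists. The bot never shows
        # in WOL, but the later notification step is never reached either.
        return 301

    # A session is created here, so the visitor becomes visible in WOL.
    online_list.append(user_agent)

    if banned:
        # Late ban: the notification has something to report, but the bot
        # stays in WOL until its session times out.
        notifications.append("banned: " + user_agent)
        return 301
    return 200


if __name__ == "__main__":
    for early in (True, False):
        wol, notes = [], []
        status = handle_request("Baiduspider/2.0", early, wol, notes)
        print("ban_early=%s status=%d wol=%r notifications=%r"
              % (early, status, wol, notes))
```

In this model the only way to get both an empty WOL and a notification is to move the notification ahead of the redirect, which is one way to read the plan for a separate PHP file.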

smirkley 09-22-2011 07:23 PM

Quote:

Originally Posted by Simon Lloyd (Post 2249002)
That looks OK. Next, check your session timeout setting, as nothing drops off WOL (Who's Online) until that has expired. If the timeout has passed and the bots are still listed, click WOL to view everyone online, select Yes for user agent in the dropdown, and copy the UA. Then try it at http://www.botsvsbrowsers.com/SimulateUserAgent.asp and see what results you get. The UA will look something like this:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

In fact, you can try that one at the link I gave you; make sure you set it to look at your site :)

Using this site and user agent string to test, I get varying results.

1 - If I use just my home page (CMS), it doesn't seem to be working. Not sure if this is even an issue really, as my Baidu bot count is nil now with this mod; maybe it just doesn't work with the CMS.

2 - When I add the necessary /forums/ to my URL on the test page, it seems to be working, but it redirects to google.com.hk (is that normal?)

Simon Lloyd 09-22-2011 07:29 PM

Right, it won't work with the CMS as that's outside of the /forum folder, and yes, they are getting redirected to a Chinese Google :):)

smirkley 09-22-2011 07:44 PM

Quote:

Originally Posted by Simon Lloyd (Post 2249031)
Right, it won't work with the CMS as that's outside of the /forum folder, and yes, they are getting redirected to a Chinese Google :):)

Ahh, OK, that explains it then.

1 - Are there plans to make this work with the whole vB suite (i.e. CMS/forum/blog/groups, etc.)?

2 - When you are able, can you make it so the admin can set where the redirect goes? (I would rather redirect to Baidu themselves. I don't want to play mean with Google, as they can get real pissy if they don't like it and track back the redirects. Don't want to be on Google's bad side, ya know.)

3 - (and last, I promise) Are the 'redirects' true permanent 301s by definition?

