Ban Spiders by User Agent
Version: 3.1.2, by Simon Lloyd
Developer Last Online: May 2023

Category: Miscellaneous Hacks - Version: 4.x.x
Released: 08-08-2011 Last Update: 12-17-2014 Installs: 491
Uses Plugins
 
No support by the author.

What this mod does
With this mod you can enter user agents to watch or ban. You can also receive emails or have an Output.txt file created and updated with the time and date of visits. It doesn't have to be just spiders: you can watch, log, or ban any user agent!

How to install
Simply import the product ban_spider. The mod is active by default, but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost....&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost....&postcount=137

How does it work?
https://vborg.vbsupport.ru/showpost....&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do I ban a bot?
https://vborg.vbsupport.ru/showpost....&postcount=318
https://vborg.vbsupport.ru/showpost....7&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost....&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost....&postcount=259
https://vborg.vbsupport.ru/showpost....&postcount=224
https://vborg.vbsupport.ru/showpost....&postcount=281

Tested on vB3.7.x, vB3.8.x, and vB4.x.x, but should work on any version.

__________________________________________________ __________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Original xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility: you can now enter something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded, hook changed; extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded, prevented spiders being counted when mod turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product
Please prune out any that you want to be able to see your site (I suggest you definitely prune out "DA" and "Custo").
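
As a rough illustration of the match facility mentioned in the history above, here is a minimal Python sketch of case-insensitive substring matching against a ban list. The mod itself is a vBulletin (PHP) product and its real matching code is not shown in this thread, so the function name `find_banned_token` and the exact matching rule are assumptions made for the example only.

```python
# Hypothetical helper: assumes a ban-list entry matches when it appears
# anywhere in the user agent string, ignoring case. The mod's actual
# PHP logic may differ.
def find_banned_token(user_agent, banned_tokens):
    """Return the first banned token found in user_agent, or None."""
    ua = user_agent.casefold()
    for token in banned_tokens:
        token = token.strip()
        if token and token.casefold() in ua:
            return token
    return None


banned = ["Yandex", "Baidu"]
ua = "MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)"
print(find_banned_token(ua, banned))
print(find_banned_token("Mozilla/5.0 (Windows NT 6.1) Firefox/6.0", banned))
```

Under this assumed rule, a very short entry such as "DA" would also match an unrelated user agent like "Dalvik/1.4.0 (Linux; U; Android 2.3.6)", which may be one reason to prune short entries from the list.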

Support will now only be given to those who have this mod marked as INSTALLED

Download Now

File Type: xml product-ban_spider4x.xml (30.8 KB, 469 views)


  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
#82
09-22-2011, 04:56 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

Quote:
Originally Posted by smirkley
Still testing but I can say so far,... NICE !!

Thank you.


I am only banning 4 user agents at the moment, but I wish to ask: is there a condensed 'must ban' version of the user agent list here, as compared to the whole list? I don't want to go crazy and ban too much, especially if it hurts my membership or AdSense revenue.


So far I ban:

Baidu
Yeti
Twiceler
Yandex
99% of the Chinese bots will bring no traffic, so banning them won't hurt your AdSense revenue. On my other sites I ban ALL Chinese bots as they index far too aggressively. These are the ones I ban at my other sites:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Hope that helps you, but of course it's a personal thing.
#83
09-22-2011, 05:07 PM
BadgerDog
Join Date: Oct 2006
Location: Toronto
Posts: 1,789

Quote:
Originally Posted by Simon Lloyd
If you want to PM me admin access details and URL, I'll take a look.
Well, there's nothing really to look at except your settings... (see pic).

Are they correct?

Regards,
Doug
Attached Images
File Type: jpg Screen Shot 2011-09-22 at 2.03.38 PM.jpg (97.9 KB, 0 views)
#84
09-22-2011, 05:14 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

That looks OK. Next, check your session timeout setting, as nothing drops off the WOL (Who's Online list) until that has expired. If the timeout has passed and the bots are still listed, click WOL to view everyone online, select Yes for user agent from the dropdown, and copy the UA. Then try it at http://www.botsvsbrowsers.com/SimulateUserAgent.asp and see what results you get. The UA will look something like this:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

In fact, you can try that one at the link I gave you; make sure you set it to look at your site.
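
The same kind of check can be run locally. The following is a minimal Python sketch, not part of the mod, that sends a request with a spoofed User-Agent header and reports the raw status and any Location header instead of following the redirect. The names `build_request`, `probe`, and `NoRedirect`, and the URL in the usage comment, are illustrative assumptions.

```python
import urllib.error
import urllib.request

BAIDU_UA = ("Mozilla/5.0 (compatible; Baiduspider/2.0; "
            "+http://www.baidu.com/search/spider.html)")


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx reply stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def build_request(url, user_agent):
    """Build a GET request that presents the given User-Agent string."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe(url, user_agent, timeout=10):
    """Return (status code, Location header or None) for a spoofed request."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


# Example call (placeholder URL; needs network access, so left commented out):
# print(probe("http://example.com/forum/", BAIDU_UA))
```

If the ban is active for that user agent, the probe would be expected to report a redirect status with the configured target in Location; with an ordinary browser user agent it should report a normal page response.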
#85
09-22-2011, 05:39 PM
smirkley
Join Date: Apr 2008
Posts: 627

Quote:
Originally Posted by Simon Lloyd
99% of the Chinese bots will bring no traffic, so banning them won't hurt your AdSense revenue. On my other sites I ban ALL Chinese bots as they index far too aggressively. These are the ones I ban at my other sites:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Hope that helps you, but of course it's a personal thing.
Thank you. Helps.
After checking my session expiration setting, I just watched the lil' critters disappear!

Will watch for the upcoming fix, and if all works after testing, will most certainly vote MOTM!
#86
09-22-2011, 05:45 PM
BadgerDog
Join Date: Oct 2006
Location: Toronto
Posts: 1,789

Quote:
Originally Posted by Simon Lloyd
That looks OK. Next, check your session timeout setting, as nothing drops off the WOL until that has expired. If the timeout has passed and the bots are still listed, click WOL to view everyone online, select Yes for user agent from the dropdown, and copy the UA. Then try it at http://www.botsvsbrowsers.com/SimulateUserAgent.asp and see what results you get. The UA will look something like this:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

In fact, you can try that one at the link I gave you; make sure you set it to look at your site.
It's set to the default 20 minutes, but PaulM's guest mod is showing dozens of accesses (logins) from those bots in the last 24 hours. Am I misunderstanding what this mod is supposed to do?

Shouldn't there be NO logins by the Baidu and Yandex spiders for at least the last 23 hours, since this mod has been running with your corrected settings for days?

Thanks ..

Regards,
Doug
#87
09-22-2011, 05:56 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

What you're forgetting is that they have to attempt to access your site to get banned (redirected with a 301), which is why Paul's mod is showing them to you. Also, bots don't access the homepage, then select a forum, then select a thread; they go straight for a thread (or post), so as soon as that happens Paul's mod will log them. But if you look at WOL, are they there now?

I doubt it. Paul's mod is doing the job it's set out to do, and mine should be doing its job too. Did you test the UA I gave above at the link I gave? If so, what were the results?
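
The point above, that a banned bot still has to make a request before it can be turned away, can be sketched in Python as follows. This is an illustration only, not the mod's code: `handle_request`, the `visit_log` list standing in for a separate visitor-logging mod, the thread path, and the redirect URL are all hypothetical.

```python
from http import HTTPStatus


def handle_request(user_agent, path, banned_tokens, redirect_url, visit_log):
    """Record the attempt, then answer banned agents with a 301 redirect."""
    # A separate visitor log (assumed here) sees every attempt, banned or not.
    visit_log.append((path, user_agent))
    ua = user_agent.casefold()
    if any(token.casefold() in ua for token in banned_tokens):
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": redirect_url}
    return HTTPStatus.OK, {}


log = []
status, headers = handle_request(
    "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)",
    "/forum/showthread.php?t=123",   # hypothetical thread URL
    ["Baidu", "Yandex"],
    "http://example.invalid/",       # placeholder redirect target
    log,
)
print(int(status), headers, len(log))
```

In this sketch the bot is redirected on its first hit, yet the visitor log still holds one entry for it, which is consistent with another mod reporting accesses from bots that never get to load a page.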
#88
09-22-2011, 06:01 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

Quote:
Originally Posted by smirkley
Thank you. Helps.
After checking my session expiration setting, I just watched the lil' critters disappear!

Will watch for the upcoming fix, and if all works after testing, will most certainly vote MOTM!
I'm close to a fix for this, but it will probably mean an additional PHP file to upload. It can't work comfortably with the bots being redirected the moment they call the forum to load, as that leaves nothing for the notification to report. All the other options work comfortably together, i.e. Output.txt logging, email, and create thread. It's just that when you ban the bot you either ban it late, which means it will always be seen in WOL, or ban it early, so it's very rarely seen there. It's the early bit that's causing the issue!
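
The early-versus-late trade-off described above can be modelled with a small Python sketch. It is only a model under stated assumptions: `process_visit`, the `ban_early` flag, the `notifiers` callbacks, and the `online_list` standing in for WOL are hypothetical names, and the real mod works through vBulletin hooks that are not shown in this thread.

```python
def process_visit(user_agent, banned_tokens, ban_early, notifiers, online_list):
    """Model one visit; return 'redirected' or 'served'."""
    ua = user_agent.casefold()
    banned = any(token.casefold() in ua for token in banned_tokens)
    if banned and ban_early:
        # Early ban: the redirect fires before anything else has run,
        # so no session is registered and no notifier is called.
        return "redirected"
    online_list.append(user_agent)   # session registered: visible in WOL
    for notify in notifiers:         # e.g. file logging, email, create thread
        notify(user_agent)
    return "redirected" if banned else "served"


events, wol = [], []
bot = "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
early = process_visit(bot, ["Yandex"], True, [events.append], wol)
print(early, len(events), len(wol))
late = process_visit(bot, ["Yandex"], False, [events.append], wol)
print(late, len(events), len(wol))
```

With the early ban the bot is redirected but nothing is logged or shown online; with the late ban the notifications fire, at the cost of the bot appearing in the online list until its session expires.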
#89
09-22-2011, 07:23 PM
smirkley
Join Date: Apr 2008
Posts: 627

Quote:
Originally Posted by Simon Lloyd
That looks OK. Next, check your session timeout setting, as nothing drops off the WOL until that has expired. If the timeout has passed and the bots are still listed, click WOL to view everyone online, select Yes for user agent from the dropdown, and copy the UA. Then try it at http://www.botsvsbrowsers.com/SimulateUserAgent.asp and see what results you get. The UA will look something like this:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

In fact, you can try that one at the link I gave you; make sure you set it to look at your site.
Using this site and user agent string to test, I get varying results.

1 - If I use just my home page (CMS), it doesn't seem to be working. Not sure if this is even an issue really, as my Baidu bot count is nil now with this mod; maybe it just doesn't work with the CMS.

2 - When I add the necessary /forums/ to my URL on the test page, it seems to be working, but it redirects to google.com.hk (is that normal?)
#90
09-22-2011, 07:29 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

Right, it won't work with the CMS as that's outside of the /forum folder, and yes, they are getting redirected to a Chinese Google.
#91
09-22-2011, 07:44 PM
smirkley
Join Date: Apr 2008
Posts: 627

Quote:
Originally Posted by Simon Lloyd
Right, it won't work with the CMS as that's outside of the /forum folder, and yes, they are getting redirected to a Chinese Google.
Ahh, OK, that explains it then.

1 - Are there plans to make this work with the vB suite (i.e. CMS, forum, blog, groups, etc.)?

2 - When you are able, can you make it so the admin can set where the redirect goes? (I would rather redirect to Baidu themselves. I don't want to play mean with Google, as they can get real pissy if they were to not like it and track back the redirects. Don't want to be on Google's bad side, ya know.)

3 - (and last, I promise) Are the 'redirects' true permanent 301s by definition?
All times are GMT.

