Ban Spiders by User Agent
Version: 3.1.2, by Simon Lloyd
Developer Last Online: May 2023

Category: Miscellaneous Hacks - Version: 4.x.x
Released: 08-08-2011 Last Update: 12-17-2014 Installs: 491
Uses Plugins
 
No support by the author.

What this mod does
With this mod you can enter User Agents to watch or ban. You can also receive emails or have an Output.txt file created and updated with the time and date of visits. It doesn't have to be just spiders: you can watch, log or ban any user agent!

How to install
Simply import the product ban_spider. The mod is active by default, but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost....&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost....&postcount=137

How does it work?
https://vborg.vbsupport.ru/showpost....&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do I ban a bot?
https://vborg.vbsupport.ru/showpost....&postcount=318
https://vborg.vbsupport.ru/showpost....7&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost....&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost....&postcount=259
https://vborg.vbsupport.ru/showpost....&postcount=224
https://vborg.vbsupport.ru/showpost....&postcount=281

Tested on vB3.7.x, vB3.8.x and vB4.x.x, but should work on any version.

__________________________________________________ __________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Original xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility: you can now enter something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded. Hook changed; extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded, prevented spiders being counted when the mod is turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product.
Please prune out all those that you wish to be able to see your site (I suggest you definitely prune out "DA" and "Custo").

Support will now only be given to those who have this mod marked as INSTALLED

Download Now

File Type: xml product-ban_spider4x.xml (30.8 KB, 469 views)

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #652  
12-11-2014, 05:12 PM
Gadget_Guy
Join Date: Jun 2010
Posts: 271

That is what the spiders.txt file I attached is.

D.
  #653  
12-11-2014, 05:17 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

Quote:
Originally Posted by CAG CheechDogg
This is a snapshot of the spiders that are showing up in Who's Online:

What exactly do you need a snapshot of in the settings, Simon?

This is my list of spiders I have banned with your mod:

..........................................
There are one or two duplicates there, but that doesn't matter. However, rather than ban baiduspider, just ban baidu; you'll have more luck with that, as not all baiduspiders have the entire name in the UA. That goes for most of the bots. Let's say there's a bot called Simon Lloyd Crawl Everything Everywherespider; then any of the following will ban it:
Simon or Lloyd or Crawl or Spider.....etc
The same goes for:
Lloyd Crawl or Everything or Simon Lloyd....etc (case isn't important)

What the mod does is look for the string you entered, so if you want to ban the spider I mentioned above, just Simon will do it. However, let's say you have a friendly bot called Simon Lloyd Crawled Everything Everywherespider. To ban the first bot and allow the other, you'd need to enter a string that is unique to the first one, so in this case it could be:
Simon Lloyd Crawl Everything
This way it won't pick up the "Crawl" in the friendly bot's name, as it's looking for the exact string you entered.
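
The matching described above can be illustrated with a short sketch. This is Python written for illustration only, not the mod's actual PHP code; the function name `is_banned` and the sample strings are assumptions made for the example. It treats each ban-list entry as a case-insensitive substring of the user agent.

```python
from typing import Iterable


def is_banned(user_agent: str, banned_strings: Iterable[str]) -> bool:
    """Return True if any non-empty banned string occurs anywhere in the
    user agent, ignoring case (a plain substring test, not a word match)."""
    ua = user_agent.lower()
    return any(entry and entry.lower() in ua for entry in banned_strings)


# Hypothetical bot names taken from the explanation above.
bad_bot = "Simon Lloyd Crawl Everything Everywherespider"
friendly_bot = "Simon Lloyd Crawled Everything Everywherespider"

# A short entry such as "simon" or "Crawl" catches both bots.
print(is_banned(bad_bot, ["simon"]), is_banned(friendly_bot, ["Crawl"]))

# A longer string unique to the first bot bans it and lets the friendly one through.
unique_entry = ["Simon Lloyd Crawl Everything"]
print(is_banned(bad_bot, unique_entry), is_banned(friendly_bot, unique_entry))
```

The same test explains why a short entry like baidu is more reliable than baiduspider: the shorter string is contained in more variants of the user agent.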

Hope that helps.
  #654  
12-11-2014, 05:22 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

Quote:
Originally Posted by Gadget_Guy
That is what the spiders.txt file I attached is.

D.
I need it as asked for. The reason for this is to check for machine characters and/or leading/trailing spaces, hard returns.....etc
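
The kind of check mentioned here (stray machine characters, leading/trailing spaces, hard returns) can be sketched as a small clean-up pass over a pasted list. This is a Python illustration under assumptions, not part of the mod: the function name `clean_entries` and the one-entry-per-line layout are hypothetical.

```python
import unicodedata
from typing import List


def clean_entries(raw_text: str) -> List[str]:
    """Split a pasted ban list into entries, one per line, removing
    invisible characters, surrounding whitespace and empty lines."""
    cleaned = []
    for line in raw_text.splitlines():  # handles \n, \r\n and bare \r
        # Drop control and format characters (Unicode categories starting
        # with "C"), e.g. tabs, a byte order mark or zero-width spaces.
        visible = "".join(ch for ch in line if unicodedata.category(ch)[0] != "C")
        entry = visible.strip()
        if entry:
            cleaned.append(entry)
    return cleaned


# A sample paste with a byte order mark, a trailing space, a blank line,
# leading spaces, a tab and a zero-width space.
pasted = "\ufeffbaidu \r\n\r\n  Yandex\t\nAhrefs\u200bBot\n"
print(clean_entries(pasted))
```

An entry with a hidden character or a trailing space looks correct on screen but will never match a user agent as a substring, which is why the raw text is worth inspecting.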
  #655  
12-11-2014, 05:25 PM
Gadget_Guy
Join Date: Jun 2010
Posts: 271

Quote:
Originally Posted by Simon Lloyd
What the mod does is look for the string you entered, so if you want to ban the spider I mentioned above, just Simon will do it. However, let's say you have a friendly bot called Simon Lloyd Crawled Everything Everywherespider. To ban the first bot and allow the other, you'd need to enter a string that is unique to the first one, so in this case it could be:
Simon Lloyd Crawl Everything
This way it won't pick up the "Crawl" in the friendly bot's name, as it's looking for the exact string you entered.

Hope that helps.

This COULD be the issue I have with mine then. When you look at my txt file you will see that I tried putting in multiple variations. That could be negating the effectiveness.

Maybe part of the mod could be an updated list that we can copy/paste, so that people like me who are clueless don't do the wrong thing.

Keep in mind that we would want the "good" spiders to get through, like Google, Bing, and the legit ones that are important to SEO, AdSense, and other things like that.

I hope you and Ozzy are discussing the hook thing as well... he seemed to think that may be important with my 4.2.2 site.

I will say that when I had this mod in place for my 3.8.x site it worked perfectly and I didn't get hit hard till I upgraded to 4.2.2

I saw my server loads go way up....

D.
  #656  
12-11-2014, 05:32 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

There is already a list included, and many more throughout this thread; I've also explained the above before. I wouldn't update the list of spiders to ban: as I've said probably over a dozen times, what or who you ban is a personal thing.

If you want to PM me access, as I've said before, I'll take a look.
  #657  
12-11-2014, 05:40 PM
Gadget_Guy
Join Date: Jun 2010
Posts: 271

Here you go.
Attached Files
File Type: zip tscspider.zip (1.9 KB, 5 views)
  #658  
12-11-2014, 06:11 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

Right, I've been through your list. I won't comment on the bots you are banning, as that's your preference. What I have done is order the list, check for anything that shouldn't be there, and remove some bots, as they will be taken care of by other entries you have.

What I will say is: if you are NOT using Paul M's "Who has visited" mod and you are still seeing any of the bots on your list appear in WOL, then you need to check that spider's UserAgent to see if the name or text you have in your list actually appears in the UA.
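
The tidy-up described above (ordering the list and dropping entries already covered by others) can be sketched as follows. This is a hypothetical Python illustration, not the procedure actually used on the attached list; the function name `prune_redundant` is an assumption. It relies on the substring matching explained earlier in the thread: if baidu is on the list, baiduspider adds nothing.

```python
from typing import Dict, List


def prune_redundant(entries: List[str]) -> List[str]:
    """Remove case-insensitive duplicates and entries that contain a
    shorter entry, then return the survivors in alphabetical order."""
    # Deduplicate case-insensitively, keeping the first spelling seen.
    unique: Dict[str, str] = {}
    for entry in entries:
        key = entry.strip().lower()
        if key and key not in unique:
            unique[key] = entry.strip()

    # Visit shorter entries first: any longer entry that contains an
    # already kept one would be matched by it anyway, so it is dropped.
    kept: List[str] = []
    for key in sorted(unique, key=len):
        if not any(shorter in key for shorter in kept):
            kept.append(key)

    return sorted((unique[key] for key in kept), key=str.lower)


print(prune_redundant(["Baiduspider", "baidu", "Yandex", "BAIDU", "AhrefsBot"]))
```

Pruning this way does not change which user agents get banned; it only makes the list shorter and easier to review.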
Attached Files
File Type: txt gadget_guy_bot_list.txt (2.6 KB, 6 views)
  #659  
12-11-2014, 06:48 PM
CAG CheechDogg
Join Date: Feb 2012
Location: Riverside, California USA
Posts: 1,080

Quote:
Originally Posted by Simon Lloyd
There are one or two duplicates there, but that doesn't matter. However, rather than ban baiduspider, just ban baidu; you'll have more luck with that, as not all baiduspiders have the entire name in the UA. That goes for most of the bots. Let's say there's a bot called Simon Lloyd Crawl Everything Everywherespider; then any of the following will ban it:
Simon or Lloyd or Crawl or Spider.....etc
The same goes for:
Lloyd Crawl or Everything or Simon Lloyd....etc (case isn't important)

What the mod does is look for the string you entered, so if you want to ban the spider I mentioned above, just Simon will do it. However, let's say you have a friendly bot called Simon Lloyd Crawled Everything Everywherespider. To ban the first bot and allow the other, you'd need to enter a string that is unique to the first one, so in this case it could be:
Simon Lloyd Crawl Everything
This way it won't pick up the "Crawl" in the friendly bot's name, as it's looking for the exact string you entered.

Hope that helps.
No no ...I am fine Simon, I have "ZERO" traces of baidu ... I think when you asked for the settings and a shot you were asking Gadget Guy and not me ... but I am fine, I have no problems with the mod not blocking any of the bots at all ... Thank you !!!!
  #660  
12-11-2014, 07:35 PM
Gadget_Guy
Join Date: Jun 2010
Posts: 271

Do you mean this one:

https://vborg.vbsupport.ru/showthread.php?t=232636


Then, yes... I am using it.

D.
  #661  
12-11-2014, 07:43 PM
Gadget_Guy
Join Date: Jun 2010
Posts: 271

Quote:
Originally Posted by Simon Lloyd
Right, I've been through your list. I won't comment on the bots you are banning, as that's your preference. What I have done is order the list, check for anything that shouldn't be there, and remove some bots, as they will be taken care of by other entries you have.

What I will say is: if you are NOT using Paul M's "Who has visited" mod and you are still seeing any of the bots on your list appear in WOL, then you need to check that spider's UserAgent to see if the name or text you have in your list actually appears in the UA.
I have implemented your list.

In regards to your comment about "my list".... I have no idea to be honest.

Those I put in there based on what I was seeing in my WOL and putting things in there to try and block them.

I am sure I was way off base and incorrect in doing so.

So.... in light of this... if you want to provide a "proper" list... I am happy to take your guidance.

I don't know the first thing about any of this stuff and am looking to experts like yourself to assist.

D.