Ban Spiders by User Agent
Version: 3.1.2, by Simon Lloyd
Developer Last Online: May 2023

Category: Miscellaneous Hacks - Version: 4.x.x
Released: 08-08-2011 Last Update: 12-17-2014 Installs: 491
Uses Plugins
 
No support by the author.

What this mod does
With this mod you can enter user agents to watch or ban. You can also receive emails or have an Output.txt file created and updated with the time and date of visits. It doesn't have to be just spiders: you can watch, log or ban any user agent!

How to install
Simply import the product ban_spider. The mod is active by default, but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost....&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost....&postcount=137

How does it work?
https://vborg.vbsupport.ru/showpost....&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do I ban a bot?
https://vborg.vbsupport.ru/showpost....&postcount=318
https://vborg.vbsupport.ru/showpost....7&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost....&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost....&postcount=259
https://vborg.vbsupport.ru/showpost....&postcount=224
https://vborg.vbsupport.ru/showpost....&postcount=281

Tested on vB3.7.x, vB3.8.x and vB4.x.x, but should work on any version.

____________________________________________________________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Original xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility: you can now enter something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to the spider's/bot's IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded, hook changed; extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded, prevented spiders being counted when the mod is turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product
Please prune out all those that you wish to be able to see your site (I suggest you definitely prune out "DA" and "Custo").

Support will now only be given to those who have this mod marked as INSTALLED

Download Now

File Type: xml product-ban_spider4x.xml (30.8 KB, 469 views)

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #302  
03-21-2012, 01:26 AM
Max Taxable
Join Date: Feb 2011
Posts: 3,134

That dude has a lot of garbage on his computer.
  #303  
03-21-2012, 07:56 AM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

Use this tool, which will break down the UA into its component parts: http://user-agent-string.info/parse
  #304  
03-21-2012, 09:08 AM
BadgerDog
Join Date: Oct 2006
Location: Toronto
Posts: 1,789

Quote:
Originally Posted by Simon Lloyd View Post
Use this tool, which will break down the UA into its component parts: http://user-agent-string.info/parse
Thank you Simon...

Very useful ... :up:

Regards,
Doug
  #305  
03-21-2012, 02:59 PM
Max Taxable
Join Date: Feb 2011
Posts: 3,134

Quote:
Originally Posted by Simon Lloyd View Post
Use this tool, which will break down the UA into its component parts: http://user-agent-string.info/parse
U da Man Simon.
  #306  
03-21-2012, 04:18 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

Lol, thanks! I try.
  #307  
03-22-2012, 01:27 AM
baileyjojoms
Join Date: Mar 2011
Posts: 29

Quote:
Originally Posted by BadgerDog View Post
Do you have a link?

I can't seem to find the right page....

Thanks ..

Regards,
Doug
Yes, I ensured that the following was in my robots.txt file:

User-agent: Baiduspider
Disallow: /
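
Editor's note: a minimal sketch of what the two robots.txt lines above ask for, using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the path `/showthread.php` and the second token `Googlebot` are only illustrative assumptions. The check shows what a crawler that honours robots.txt should do; it cannot force compliance, which is why the poster also had to email Baidu.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt lines quoted in the post.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /
"""


def is_allowed(robots_txt, agent_token, path):
    """Return True if a robots.txt-compliant crawler with this token may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, path)


if __name__ == "__main__":
    # "Googlebot" and the path are hypothetical, used only for comparison.
    for token in ("Baiduspider", "Googlebot"):
        print(token, is_allowed(ROBOTS_TXT, token, "/showthread.php"))
```

Only the `Baiduspider` token is disallowed by this rule; agents with no matching entry remain allowed.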


Then I sent an email to: spiderhelp@baidu.com

Here is the message and reply I received:


Quote:
Dear,

Thank you for your email.
We have updated our DNS record to make our spider behave the way requested in your robots file.
Should you need further assistance, please do not hesitate to contact us.

Best Regards,
Stephy Wu
Baidu Spider Team

________________________________________
re: Continuous Crawling of my site

To whom it may concern;

I have been trying for a month now to halt all crawling of my site by Baidu. I have added the following code to my robots.txt file:
User-agent: Baiduspider
Disallow: /

This was done 3 weeks ago. However I am being crawled daily.

Baidu is daily eating up a ton of Server Resources, and costing me slow load times. I also employed a spider ban modification, and have banned more than 28,000 Baidu spider entries in 3 weeks.

This is ridiculous. I am asking you to immediately halt all crawling of my site by Baidu.
I have not seen hide nor hair of Baidu since this was done, nearly a month ago.

To find the email address, I went to their website, translated the page into English, and then searched for Baidu Spider. That took me to a search results page, which led me to this page:
http://www.baidu.com/search/spider.html

I simply translated it to English and found the info I was looking for.

Baidu was the ONLY spider that was causing major issues. Now I am able to use this add-on for other spiders, but Baidu was using massive amounts of resources.

Hope this helps.
  #308  
03-22-2012, 10:42 AM
BadgerDog
Join Date: Oct 2006
Location: Toronto
Posts: 1,789

Quote:
Originally Posted by baileyjojoms View Post

Hope this helps.
Yes, thank you very much ...

Regards,
Doug
  #309  
03-28-2012, 04:40 PM
Alan_SP
Join Date: Nov 2009
Posts: 1,122

I have problems with Majestics MJ12bot. I tried redirecting bad spiders to their own IP, or to the HTML address given in the mod. AFAIK all spiders other than Majestics MJ12bot are gone (new ones appear here and there, but I remove them).

Do you know why this spider is successful in avoiding this mod?
  #310  
03-28-2012, 05:08 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481

When the spider next appears, click Who's Online, choose to show the user agent at the bottom, and copy its entire UA string to the list.
Thanked by:
Alan_SP
  #311  
03-28-2012, 06:07 PM
Alan_SP
Join Date: Nov 2009
Posts: 1,122

Thanks for the info. I hadn't noticed this before. I'll wait until it shows up again (hopefully never).

I also installed vB Bad Behavior: https://vborg.vbsupport.ru/showthread.php?t=261498

EDIT: I found this info about it:

Mozilla/5.0 (compatible; MJ12bot/v1.4.2; http://www.majestic12.co.uk/bot.php?+)

I now use only this string in the spider list settings: MJ12bot.

I hope this will stop it; maybe "Majestics MJ12bot" was too much.
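
Editor's note: a rough sketch of why the shorter entry is likely to work, assuming the mod's "match facility" (see the History notes in the first post) is a case-insensitive substring test of each list entry against the visitor's user agent. This is not the mod's code, which is a vBulletin product and is not shown in this thread; the function name `matching_entries` is hypothetical.

```python
def matching_entries(user_agent, ban_list):
    """Return the ban-list entries found anywhere in the UA, ignoring case."""
    ua = user_agent.upper()
    hits = []
    for entry in ban_list:
        needle = entry.strip().upper()
        # Skip blank entries so an empty line never matches everything.
        if needle and needle in ua:
            hits.append(entry)
    return hits


# UA strings as quoted in this thread.
MJ12_UA = "Mozilla/5.0 (compatible; MJ12bot/v1.4.2; http://www.majestic12.co.uk/bot.php?+)"
YANDEX_UA = "MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)"

if __name__ == "__main__":
    print(matching_entries(MJ12_UA, ["Majestics MJ12bot"]))  # phrase never occurs in the UA
    print(matching_entries(MJ12_UA, ["MJ12bot"]))
    print(matching_entries(YANDEX_UA, ["Yandex", "MJ12bot"]))
```

Under that assumption, "Majestics MJ12bot" can never match because the phrase does not appear in the UA string, while the bare token MJ12bot does.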