Go Back   vb.org Archive > vBulletin Modifications > vBulletin 4.x Modifications > vBulletin 4.x Add-ons
FAQ Community Calendar Today's Posts Search

Reply
 
Thread Tools
Ban Spiders by User Agent Details »»
Ban Spiders by User Agent
Version: 3.1.2, by Simon Lloyd Simon Lloyd is offline
Developer Last Online: May 2023 Show Printable Version Email this Page

Category: Miscellaneous Hacks - Version: 4.x.x Rating:
Released: 08-08-2011 Last Update: 12-17-2014 Installs: 491
Uses Plugins
 
No support by the author.

What this mod does
With this mod you can enter User Agents to watch or ban, you can also recieve emails or have an Output.txt created and updated with time and date of visits. It doesn't just have to be spiders, you can watch, log or ban any useragent!

How to install
Simply import the product ban_spider, the mod is active by default but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost....&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost....&postcount=137

How does it work?
https://vborg.vbsupport.ru/showpost....&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do i ban a bot?
https://vborg.vbsupport.ru/showpost....&postcount=318
https://vborg.vbsupport.ru/showpost....7&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost....&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost....&postcount=259
https://vborg.vbsupport.ru/showpost....&postcount=224
https://vborg.vbsupport.ru/showpost....&postcount=281

Tested on vb3.7.x, vB3.8.x , vB4.x.x but should work on any version.

__________________________________________________ __________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Orginal xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility you can now use something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded, Hook changed extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded, prevented spiders being counted when mod turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product
Please prune out all those that you wish to be able to see your site (i suggest you definately prune out "DA" and "Custo" :

Support will now only be given to those who have this mod marked as INSTALLED

Download Now

File Type: xml product-ban_spider4x.xml (30.8 KB, 469 views)

Supporters / CoAuthors

Show Your Support

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #312  
Old 03-28-2012, 07:50 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Alan_SP View Post
Thanks for the info. Before I didn't noticed this. I'll wait till it shows again (hopefully never again).

I also installed vB Bad Behavior: https://vborg.vbsupport.ru/showthread.php?t=261498

EDIT: I found this info about it:

Mozilla/5.0 (compatible; MJ12bot/v1.4.2; http://www.majestic12.co.uk/bot.php?+)

I now use only this string in spider list settings: MJ12bot.

I hope it will stop it, maybe Majestics MJ12bot was too much
On separate lines.

But I am curious - why do you want to ban this bot? It's one of the better behaved ones out there and it doesn't flood. It obeys robots.txt as well.

How can I block MJ12bot?

MJ12bot adheres to the robots.txt standard. If you want the bot to prevent website from being crawled then add the following text to your robots.txt:

User-agent: MJ12bot
Disallow: /
Reply With Quote
  #313  
Old 03-28-2012, 09:49 PM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Alan_SP View Post
I hope it will stop it, maybe Majestics MJ12bot was too much
Yes it was because it looks for the entire string you entered and as that doesn't appear in the UA it doesn't get banned

BTW, Max is dead right
Reply With Quote
  #314  
Old 03-29-2012, 06:01 PM
Alan_SP's Avatar
Alan_SP Alan_SP is offline
 
Join Date: Nov 2009
Posts: 1,122
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Because I don't have use of spiders that no one is using. At least I don't know anyone that uses Majestics.

People usually use Google, Bing, Facebook... Other search engines or spiders don't interest me, as they don't interest my users. Same goes for Baidu or similar search engines.

Does people use Majestics? And for what purposes?
Reply With Quote
  #315  
Old 03-29-2012, 09:04 PM
Baf_Jams Baf_Jams is offline
 
Join Date: Mar 2008
Location: Derby UK
Posts: 106
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Installed Thanks
Reply With Quote
  #316  
Old 03-29-2012, 09:36 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Alan_SP View Post
Because I don't have use of spiders that no one is using. At least I don't know anyone that uses Majestics.

People usually use Google, Bing, Facebook... Other search engines or spiders don't interest me, as they don't interest my users. Same goes for Baidu or similar search engines.

Does people use Majestics? And for what purposes?
The link you posted earlier nicely explains it.

It is a friendly, well behaved bot that helps your presence on the web. Baidu is none of these, it is a aggressive, leeching, unfriendly bot attached to a Chinese search engine. There's no comparison between the two.

Ban whatever bots you like, no one's telling you not to. But MJ12 isn't hurting you, it's helping you and it is friendly.
Reply With Quote
  #317  
Old 03-30-2012, 04:43 PM
Alan_SP's Avatar
Alan_SP Alan_SP is offline
 
Join Date: Nov 2009
Posts: 1,122
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Max Taxable View Post
But MJ12 isn't hurting you, it's helping you and it is friendly.
Thanks for your explanation. It's helpful, not with just this post.
Reply With Quote
  #318  
Old 04-14-2012, 10:33 AM
S_E_A S_E_A is offline
 
Join Date: Nov 2010
Posts: 37
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Thank you for a great mod, Simon.

I want to block Deepnet Explorer Spiders. To do this I enter 'Deepnet Explorer' into the spide list? It's okay to block Deepnet Explorer?
Reply With Quote
  #319  
Old 04-14-2012, 10:43 AM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Blocking spiders is all about personal choice, do a little research and find out whether you want to cater for that country and whether they add value to your site!, when Deepnet Explorer are visiting go to who's online and at the bottom there's a dropdown box for "Show Useragent?" select Yes, then check out their useragent, you can enter any or all of the UA string, so if they actually do have Deepnet in the UA then you just enter that on its own line in the list
Reply With Quote
2 благодарности(ей) от:
Alan_SP, Max Taxable
  #320  
Old 04-14-2012, 01:48 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

My current (updated) list of banned user agents entered into this Mod:

baiduspider
beta.statsit.com
statsit
SiteIntel
Yandex
GomezAgent
FunWebProducts
MSIE 1
MSIE 2
MSIE 3
MSIE 4
MSIE 5
MSIE 6
Nesotebot
DCPbot
Opera/1
Opera/2
Opera/3
Opera/4
Opera/5
Opera/6
Opera/7
Opera/8
AOL Advertising R&D
DataCha0s
aiHitBot
Apache-HttpClient
Zend_Http_Client
ReverseGet
Reply With Quote
Благодарность от:
Alan_SP
  #321  
Old 04-18-2012, 06:32 PM
ForceHSS ForceHSS is offline
 
Join Date: Apr 2008
Posts: 6,357
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

xpymep.exe
start.exe
I seen these two as hosts so I added them to the list look strange
Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT. The time now is 07:49 PM.


Powered by vBulletin® Version 3.8.12 by vBS
Copyright ©2000 - 2024, vBulletin Solutions Inc.
X vBulletin 3.8.12 by vBS Debug Information
  • Page Generation 0.07711 seconds
  • Memory Usage 2,368KB
  • Queries Executed 29 (?)
More Information
Template Usage:
  • (1)SHOWTHREAD
  • (1)ad_footer_end
  • (1)ad_footer_start
  • (1)ad_header_end
  • (1)ad_header_logo
  • (1)ad_navbar_below
  • (1)ad_showthread_beforeqr
  • (4)bbcode_quote
  • (1)footer
  • (1)forumjump
  • (1)forumrules
  • (1)gobutton
  • (1)header
  • (1)headinclude
  • (1)modsystem_post
  • (1)navbar
  • (4)navbar_link
  • (120)option
  • (1)pagenav
  • (1)pagenav_curpage
  • (4)pagenav_pagelink
  • (3)pagenav_pagelinkrel
  • (11)post_thanks_box
  • (19)post_thanks_box_bit
  • (11)post_thanks_button
  • (1)post_thanks_javascript
  • (1)post_thanks_navbar_search
  • (3)post_thanks_postbit
  • (11)post_thanks_postbit_info
  • (10)postbit
  • (1)postbit_attachment
  • (11)postbit_onlinestatus
  • (11)postbit_wrapper
  • (1)spacer_close
  • (1)spacer_open
  • (1)tagbit_wrapper 

Phrase Groups Available:
  • global
  • inlinemod
  • postbit
  • posting
  • reputationlevel
  • showthread
Included Files:
  • ./showthread.php
  • ./global.php
  • ./includes/init.php
  • ./includes/class_core.php
  • ./includes/config.php
  • ./includes/functions.php
  • ./includes/class_hook.php
  • ./includes/modsystem_functions.php
  • ./includes/functions_bigthree.php
  • ./includes/class_postbit.php
  • ./includes/class_bbcode.php
  • ./includes/functions_reputation.php
  • ./includes/functions_post_thanks.php 

Hooks Called:
  • init_startup
  • init_startup_session_setup_start
  • init_startup_session_setup_complete
  • cache_permissions
  • fetch_postinfo_query
  • fetch_postinfo
  • fetch_threadinfo_query
  • fetch_threadinfo
  • fetch_foruminfo
  • style_fetch
  • cache_templates
  • global_start
  • parse_templates
  • global_setup_complete
  • showthread_start
  • showthread_getinfo
  • forumjump
  • showthread_post_start
  • showthread_query_postids
  • showthread_query
  • bbcode_fetch_tags
  • bbcode_create
  • showthread_postbit_create
  • postbit_factory
  • postbit_display_start
  • post_thanks_function_post_thanks_off_start
  • post_thanks_function_post_thanks_off_end
  • post_thanks_function_fetch_thanks_start
  • fetch_musername
  • post_thanks_function_fetch_thanks_end
  • post_thanks_function_thanked_already_start
  • post_thanks_function_thanked_already_end
  • post_thanks_function_fetch_thanks_bit_start
  • post_thanks_function_show_thanks_date_start
  • post_thanks_function_show_thanks_date_end
  • post_thanks_function_fetch_thanks_bit_end
  • post_thanks_function_fetch_post_thanks_template_start
  • post_thanks_function_fetch_post_thanks_template_end
  • postbit_imicons
  • bbcode_parse_start
  • bbcode_parse_complete_precache
  • bbcode_parse_complete
  • postbit_attachment
  • postbit_display_complete
  • post_thanks_function_can_thank_this_post_start
  • pagenav_page
  • pagenav_complete
  • tag_fetchbit_complete
  • forumrules
  • navbits
  • navbits_complete
  • showthread_complete