Go Back   vb.org Archive > vBulletin Modifications > vBulletin 4.x Modifications > vBulletin 4.x Add-ons
FAQ Community Calendar Today's Posts Search

Reply
 
Thread Tools
Ban Spiders by User Agent Details »»
Ban Spiders by User Agent
Version: 3.1.2, by Simon Lloyd Simon Lloyd is offline
Developer Last Online: May 2023 Show Printable Version Email this Page

Category: Miscellaneous Hacks - Version: 4.x.x Rating:
Released: 08-08-2011 Last Update: 12-17-2014 Installs: 491
Uses Plugins
 
No support by the author.

What this mod does
With this mod you can enter User Agents to watch or ban, you can also recieve emails or have an Output.txt created and updated with time and date of visits. It doesn't just have to be spiders, you can watch, log or ban any useragent!

How to install
Simply import the product ban_spider, the mod is active by default but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost....&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost....&postcount=137

How does it work?
https://vborg.vbsupport.ru/showpost....&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do i ban a bot?
https://vborg.vbsupport.ru/showpost....&postcount=318
https://vborg.vbsupport.ru/showpost....7&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost....&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost....&postcount=259
https://vborg.vbsupport.ru/showpost....&postcount=224
https://vborg.vbsupport.ru/showpost....&postcount=281

Tested on vb3.7.x, vB3.8.x , vB4.x.x but should work on any version.

__________________________________________________ __________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Orginal xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility you can now use something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded, Hook changed extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded, prevented spiders being counted when mod turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product
Please prune out all those that you wish to be able to see your site (i suggest you definately prune out "DA" and "Custo" :

Support will now only be given to those who have this mod marked as INSTALLED

Download Now

File Type: xml product-ban_spider4x.xml (30.8 KB, 469 views)

Supporters / CoAuthors

Show Your Support

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #522  
Old 11-09-2013, 07:55 PM
Toxic2 Toxic2 is offline
 
Join Date: Jul 2009
Posts: 82
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Simon can you come up with a Viable List of Bad UA's more appropriate for 2013? MSIE1 blocked one of my members, so i deleted MSIE1 from the List.
Reply With Quote
  #523  
Old 11-09-2013, 08:25 PM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Hi Toxic2, banning UAs is a personal thing, you have to choose which to ban or not, throughout this thread (and the other vb versions of this mod) there are lists, there's a maintained list of spiders a vb.com in a thread by Mosh so pretty much all of the crawlers will be listed there (a download from his site) and you just choose which you want to ban by unveiling their UA when they visit your site.
Reply With Quote
Благодарность от:
Toxic2
  #524  
Old 11-09-2013, 09:55 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

This Mod's default list is a pretty dog gone good list, but like Simon says it might not meet your needs. It depends on your target audience countries, your preferences, all that.
Reply With Quote
Благодарность от:
Simon Lloyd
  #525  
Old 11-09-2013, 10:09 PM
bzcomputers's Avatar
bzcomputers bzcomputers is offline
 
Join Date: Apr 2012
Location: TX
Posts: 503
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Toxic2 View Post
Simon can you come up with a Viable List of Bad UA's more appropriate for 2013? MSIE1 blocked one of my members, so i deleted MSIE1 from the List.
Removing MSIE1 is about the only thing that would definitely be recommended to remove from the old list for this mod. MSIE1 will block users on IE10 and IE11 (which means A LOT of users and growing daily). I haven't seen anything prior to IE5 try to access my sites in a long time, so removing MSIE1 should not negatively impact your site in any way either (but blocking IE10 & 11 will!).

As was mentioned above it is a personal choice on what you block. There are thousands of sites out there that don't block any and are doing just fine. You need to figure out what's important for your site. Being hit hundreds of times a day from bots from regions of the world you probably don't even cater too like China or countries of the former Soviet Union? or Blocking those bots and in turn reducing server load and load times for those users you do want to cater too?

For instance if you block IE6 (MSIE6) you will currently block over 20% of Chinese internet users (average everywhere else in the World is less than 1% using IE6). When you consider China has 1.3 billion people that is a lot of potential website traffic. At the same time China also accounted for nearly 40% of the world's computer hacking attack traffic during 2012. If you have content that is viable for Chinese users which can possibly turn into potential profit (sales / ad clicks) for you then blocking MSIE6 and chinese specific bots like baiduspider, iaskspider, sogou spider, Sosospider, YoudaoBot and others is probably not recommended.

There are plenty of good resources of information in this thread and other places on the net on useragents. Just remember copying someones elses useragent blocking list directly and not specifically taking a little time and tweaking it for your own needs is almost guaranteed to negatively impact your site traffic.

Here are a couple of additional resources:
http://www.useragentstring.com/pages/Crawlerlist/
http://www.user-agents.org/
Reply With Quote
3 благодарности(ей) от:
Max Taxable, Simon Lloyd, Toxic2
  #526  
Old 11-09-2013, 10:18 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by bzcomputers View Post
Removing MSIE1 is about the only thing that would definitely be recommended to remove from the old list for this mod. MSIE1 will block users on IE10 and IE11 (which means A LOT of users and growing daily).
I definitely agree with this.
Reply With Quote
Благодарность от:
Toxic2
  #527  
Old 11-10-2013, 07:29 PM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

If you really want to block MSIE1 then enter it exactly as that plus a leading space, if I remember rightly I've allowed it to accept spaces rather than ignore them, so doing it like that will not (or shouldn't) ban MSIE10.
Reply With Quote
  #528  
Old 11-10-2013, 10:04 PM
Alan_SP's Avatar
Alan_SP Alan_SP is offline
 
Join Date: Nov 2009
Posts: 1,122
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

I have similar problem, it might be solved now.

But anyway, just to ask it, what other than blocking MSIE 1 could block legitimate users in this list? If anyone knows:

Quote:
AhrefsBot
www.archive.org
gooblogsearch
seoprofiler.com
WinHttp
exabot
commoncrawl
80legs
TITAN
Charlotte
http://labs.topsy.com/butterfly/
MSRBOT
Indy Library
Plukkie
Goo Blog Search
SeznamBot
T312461
PostRank
Tweetmeme.com
Twitterbot
WordPress
Windows Live SOEE
MLBot
PycURL
ScoutJet
Ezooms
Deepnet Explorer
Mail.Ru
Majestics MJ12bot
almaden
Anarchie
Artabus
ASPSeek
attach
autoemailspider
BackWeb
Baidu
Bandit
BatchFTP
BlackWidow
Bot\mailto:craftbot@yahoo.com
Buddy
bumblebee
CherryPicker
ChinaClaw
CICC
Collector
Copier
Copyscape
Crescent
DIIbot
DISCo
DISCo\Pump
dotbot
Download\Demon
Download\Wonder
Downloader
Drip
DSurf15a
eCatch
EasyDL/2.99
EirGrabber
email
EmailCollector
EmailSiphon
EmailWolf
Express\WebPictures
ExtractorPro
EyeNetIE
FileHound
FlashGet
FrontPage
GetRight
GetSmart
GetWeb!
gigabaz
GNIP
Go\!Zilla
Go!Zilla
Go-Ahead-Got-It
gotit
Grabber
GrabNet
Grafula
grub-client
HMView
HTTrack
httpdown
.*httrack.*
ia_archiver
Ichiro
Image\Stripper
Image\Sucker
Indy*Library
Indy\Library
InterGET
InternetLinkagent
Internet\Ninja
InternetSeer.com
Iria
JBH*agent
JetCar
JOC\Web\Spider
JustView
larbin
LeechFTP
LexiBot
lftp
Link*Sleuth
likse
//Link
LinkWalker
MSIE 2
MSIE 3
MSIE 4
MSIE 5
Mag-Net
Magnet
Magpie
Mass\Downloader
Memo
Microsoft.URL
MIDown\tool
Mirror
Mister\PiX
Mozilla.*Indy
Mozilla.*NEWT
Mozilla*MSIECrawler
MS\FrontPage*
MSFrontPage
MSIECrawler
MSProxy
Navroad
NearSite
NetAnts
NetMechanic
NetSpider
Net\Vampire
NetZIP
NICErsPRO
Ninja
Nutch
Octopus
Offline\Explorer
Offline\Navigator
omgili
Openfind
PageGrabber
Papa\Foto
pavuk
pcBrowser
Ping
PingALink
Pockey
psbot
Pump
QRVA
RealDownload
Reaper
Recorder
ReGet
Scooter
Seeker
Siphon
sitecheck.internetseer.com
SiteSnagger
SlySearch
SmartDownload
Snake
sogou
Soso
SpaceBison
speedy
Spinn3r
sproose
Stripper
Sucker
SuperBot
SuperHTTP
Surfbot
Szukacz
tAkeOut
Teleport\Pro
URLSpiderPro
Vacuum
VoidEYE
Web\Image\Collector
Web\Sucker
WebAuto
[Ww]eb[Bb]andit
webcollage
WebCopier
Web\Downloader
WebEMailExtrac.*
WebFetch
WebGo\IS
WebHook
WebLeacher
WebMiner
WebMirror
WebReaper
WebSauger
Website
Website\eXtractor
Website\Quester
Webster
WebStripper
WebWhacker
WebZIP
Wget
Whacker
Widow
WWWOFFLE
x-Tractor
Xaldon\WebSpider
Xenu
Yandex
Yeti
YOUDAOBOT
Zeus.*Webster
Zeus
Reply With Quote
  #529  
Old 11-10-2013, 10:27 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Alan_SP View Post
what other than blocking MSIE 1 could block legitimate users in this list? If anyone knows:
Blocking IE 1-6 won't block any legitimate humans, it will block 90% of the botnet zombie computers out there.
Reply With Quote
  #530  
Old 11-11-2013, 11:40 PM
Alan_SP's Avatar
Alan_SP Alan_SP is offline
 
Join Date: Nov 2009
Posts: 1,122
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Actually, if you use string "MSIE 1" it blocks users of MSIE 10 and 11, as product looks for string anywhere, not as a whole word. And, if I'm not mistaken, to block only MSIE 1, you should use string MSIE 1,0.

I had many regular users who couldn't reach my site, probably all of them use MSIE 10 or 11.
Reply With Quote
  #531  
Old 11-12-2013, 12:11 AM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Alan_SP View Post
Actually, if you use string "MSIE 1" it blocks users of MSIE 10 and 11, as product looks for string anywhere, not as a whole word. And, if I'm not mistaken, to block only MSIE 1, you should use string MSIE 1,0.

I had many regular users who couldn't reach my site, probably all of them use MSIE 10 or 11.
The developer covered that here.

But I agree with you, there's really no reason to have IE1 on the ban list, there very likely isn't a functioning computer anywhere that is also online, that runs it.

IE6 is the biggest botnet zombie browser though, and there are MILLIONS of them still online.
Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT. The time now is 02:04 PM.


Powered by vBulletin® Version 3.8.12 by vBS
Copyright ©2000 - 2024, vBulletin Solutions Inc.
X vBulletin 3.8.12 by vBS Debug Information
  • Page Generation 0.10972 seconds
  • Memory Usage 2,380KB
  • Queries Executed 27 (?)
More Information
Template Usage:
  • (1)SHOWTHREAD
  • (1)ad_footer_end
  • (1)ad_footer_start
  • (1)ad_header_end
  • (1)ad_header_logo
  • (1)ad_navbar_below
  • (1)ad_showthread_beforeqr
  • (5)bbcode_quote
  • (1)footer
  • (1)forumjump
  • (1)forumrules
  • (1)gobutton
  • (1)header
  • (1)headinclude
  • (1)modsystem_post
  • (1)navbar
  • (4)navbar_link
  • (120)option
  • (1)pagenav
  • (1)pagenav_curpage
  • (4)pagenav_pagelink
  • (3)pagenav_pagelinkrel
  • (11)post_thanks_box
  • (22)post_thanks_box_bit
  • (11)post_thanks_button
  • (1)post_thanks_javascript
  • (1)post_thanks_navbar_search
  • (5)post_thanks_postbit
  • (11)post_thanks_postbit_info
  • (10)postbit
  • (1)postbit_attachment
  • (11)postbit_onlinestatus
  • (11)postbit_wrapper
  • (1)spacer_close
  • (1)spacer_open
  • (1)tagbit_wrapper 

Phrase Groups Available:
  • global
  • inlinemod
  • postbit
  • posting
  • reputationlevel
  • showthread
Included Files:
  • ./showthread.php
  • ./global.php
  • ./includes/init.php
  • ./includes/class_core.php
  • ./includes/config.php
  • ./includes/functions.php
  • ./includes/class_hook.php
  • ./includes/modsystem_functions.php
  • ./includes/functions_bigthree.php
  • ./includes/class_postbit.php
  • ./includes/class_bbcode.php
  • ./includes/functions_reputation.php
  • ./includes/functions_post_thanks.php 

Hooks Called:
  • init_startup
  • init_startup_session_setup_start
  • init_startup_session_setup_complete
  • cache_permissions
  • fetch_threadinfo_query
  • fetch_threadinfo
  • fetch_foruminfo
  • style_fetch
  • cache_templates
  • global_start
  • parse_templates
  • global_setup_complete
  • showthread_start
  • showthread_getinfo
  • forumjump
  • showthread_post_start
  • showthread_query_postids
  • showthread_query
  • bbcode_fetch_tags
  • bbcode_create
  • showthread_postbit_create
  • postbit_factory
  • postbit_display_start
  • post_thanks_function_post_thanks_off_start
  • post_thanks_function_post_thanks_off_end
  • post_thanks_function_fetch_thanks_start
  • fetch_musername
  • post_thanks_function_fetch_thanks_end
  • post_thanks_function_thanked_already_start
  • post_thanks_function_thanked_already_end
  • post_thanks_function_fetch_thanks_bit_start
  • post_thanks_function_show_thanks_date_start
  • post_thanks_function_show_thanks_date_end
  • post_thanks_function_fetch_thanks_bit_end
  • post_thanks_function_fetch_post_thanks_template_start
  • post_thanks_function_fetch_post_thanks_template_end
  • postbit_imicons
  • bbcode_parse_start
  • bbcode_parse_complete_precache
  • bbcode_parse_complete
  • postbit_attachment
  • postbit_display_complete
  • post_thanks_function_can_thank_this_post_start
  • pagenav_page
  • pagenav_complete
  • tag_fetchbit_complete
  • forumrules
  • navbits
  • navbits_complete
  • showthread_complete