Ban Spiders by User Agent
Version: 3.1.2, by Simon Lloyd
Developer Last Online: May 2023

Category: Miscellaneous Hacks - Version: 4.x.x
Released: 08-08-2011 Last Update: 12-17-2014 Installs: 491
Uses Plugins

No support by the author.

What this mod does
With this mod you can enter User Agents to watch or ban. You can also receive emails or have an Output.txt file created and updated with the time and date of visits. It doesn't just have to be spiders: you can watch, log or ban any user agent!
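
To give an idea of what time-and-date logging could look like, here is a minimal Python sketch. The mod itself is a vBulletin product and the actual layout of Output.txt is not described here, so the line format and the names `format_log_line` and `log_visit` are assumptions for illustration only:

```python
from datetime import datetime


def format_log_line(user_agent, when):
    """Build one log line: timestamp, a tab, then the raw user agent.

    Assumption: this layout is invented for the example; the mod's real
    Output.txt format may differ.
    """
    return f"{when:%Y-%m-%d %H:%M:%S}\t{user_agent}"


def log_visit(path, user_agent, when=None):
    """Append one visit to the log file and return the line written."""
    line = format_log_line(user_agent, when or datetime.now())
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line


# Example: format a line without touching the disk.
print(format_log_line("Baiduspider/2.0", datetime(2014, 12, 11, 5, 34, 0)))
```

Opening the file in append mode means each visit adds a line, so the file is updated rather than overwritten.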

How to install
Simply import the product ban_spider. The mod is active by default, but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost....&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost....&postcount=137

How does it work?
https://vborg.vbsupport.ru/showpost....&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do I ban a bot?
https://vborg.vbsupport.ru/showpost....&postcount=318
https://vborg.vbsupport.ru/showpost....7&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost....&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost....&postcount=259
https://vborg.vbsupport.ru/showpost....&postcount=224
https://vborg.vbsupport.ru/showpost....&postcount=281

Tested on vB3.7.x, vB3.8.x and vB4.x.x, but should work on any version.

____________________________________________________________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Original xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility: you can now enter something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
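
The match facility in item 1 can be pictured as a case-insensitive substring test. Below is a minimal Python sketch, assuming the mod uppercases both sides before comparing (which the uppercase example above hints at, but which is not confirmed here). `is_banned` and the sample list are hypothetical names used only for illustration:

```python
def is_banned(user_agent, ban_list):
    """Return the first ban-list entry found inside the user agent, or None.

    Assumption: matching is a plain case-insensitive substring test.
    """
    ua = user_agent.upper()
    for entry in ban_list:
        needle = entry.strip().upper()
        # Skip blank lines so an empty entry never matches everything.
        if needle and needle in ua:
            return entry
    return None


ua = "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
print(is_banned(ua, ["Baidu", "Yandex"]))  # Yandex
```

With this kind of matching a short, distinctive fragment of the user agent is enough, so there is no need to paste the whole string into the list.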
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded. Hook changed; extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded, prevented spiders being counted when mod turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product.
Please prune out all those that you wish to be able to see your site (I suggest you definitely prune out "DA" and "Custo").
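
Short entries deserve particular care. Assuming the mod matches entries as case-insensitive substrings (as the Yandex example in the history suggests), an entry like "DA" would also hit any user agent that merely contains those two letters. Here is a rough Python sketch for checking a list against user agents you want to keep. The function name and the sample user agent strings are made up for illustration:

```python
def entries_blocking(ban_list, good_agents):
    """Map each ban-list entry to the known-good user agents it would block.

    Assumption: entries are matched as case-insensitive substrings.
    """
    hits = {}
    for entry in ban_list:
        needle = entry.strip().upper()
        if not needle:
            continue
        blocked = [ua for ua in good_agents if needle in ua.upper()]
        if blocked:
            hits[entry] = blocked
    return hits


# Made-up user agent strings, not taken from real traffic.
good_agents = [
    "ExampleBrowser/1.0 (Android; Dalvik)",
    "ExampleBrowser/2.0 (Custom build)",
]
print(entries_blocking(["DA", "Custo", "Baidu"], good_agents))
```

Any entry that shows up in the result is worth pruning or replacing with a longer, more specific fragment.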

Support will now only be given to those who have this mod marked as INSTALLED

Download Now

File Type: xml product-ban_spider4x.xml (30.8 KB, 469 views)

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #642  
12-10-2014, 10:29 AM
Black Snow

Join Date: Jul 2012
Location: Scotland
Posts: 471
Thanks given: 0
Thanked 0 times in 0 posts

Quote:
Originally Posted by Max Taxable
No sir, we have been trying to solve the mystery of why Baidu gets through on some v4 installations but not all, and never on a v3. My hook conflict idea opened a new can of worms for investigation, and Ozz found something very interesting.
I never looked before, but it is also getting through on my site.
Thanks from:
Max Taxable
  #643  
12-10-2014, 10:54 AM
CAG CheechDogg

Join Date: Feb 2012
Location: Riverside, California USA
Posts: 1,080
Thanks given: 0
Thanked 0 times in 0 posts

Baidu has kissed my "gluteus maximus" for almost 2 years and change, if not more ... So have Yandex and a handful of others ... I must have a "magic" forum
  #644  
12-10-2014, 11:04 AM
ozzy47

Join Date: Jul 2009
Location: USA
Posts: 10,929
Thanks given: 0
Thanked 0 times in 0 posts

I am reworking a couple of things and then need to test further. I can then share my findings with Simon.
2 thanks from:
Black Snow, Max Taxable
  #645  
12-10-2014, 01:14 PM
Gadget_Guy

Join Date: Jun 2010
Posts: 271
Thanks given: 0
Thanked 0 times in 0 posts

I would be happy to test on my site if it helps the community.

D.
  #646  
12-10-2014, 03:55 PM
Max Taxable

Join Date: Feb 2011
Posts: 3,134
Thanks given: 0
Thanked 0 times in 0 posts

Quote:
Originally Posted by ForceHSS
What interesting thing was found?
I don't wanna flap it, already flapped too much. But I am pretty sure the problem is solved.
  #647  
12-11-2014, 02:27 AM
Gadget_Guy

Join Date: Jun 2010
Posts: 271
Thanks given: 0
Thanked 0 times in 0 posts

If this helps anyone, this is a list of the spiders I am seeing on my site with this mod installed.

Bing Spiders (6),
Google Favicon Spiders (9),
Proximic Spiders (135),
Baidu Spiders (175),
WinHTTP Spiders (12),
Facebook Spiders (20),
Google AdSense Spiders (7),
Magpie Spiders (9),
linkdexbot/2.0 Spiders (7),
AhrefsBot Spiders (14),
Coccoc Spiders (2),
Google AppEngine Spiders (6),
Google Spiders (40),
Sucuri Spiders (3),
Twitterbot Spiders (4),
Google FeedFetcher Spiders (3),
Apple RSS Spiders (1),
WordPress.com mShots Spiders (1),
Google Web Preview Spiders (3),
Grapeshot Spiders (2),
James BOT WebCrawler Spiders (5),
Netseer crawler/2.0 Spiders (2),
Google Images Spiders (3),
Galaxy Spiders (2),
Feedly Spiders (2),
DotBot Spiders (1),
Yahoo! Slurp Spiders (1),
360Spider Spiders (4),
Netcraft Web Server Survey Spiders (1),
NerdyBot Spiders (2),
Exabot Spiders (1),
Integrity Bot Spiders (1),
ContextAd Bot Spiders (2),
Twitturls.com (Python-urllib) Spiders (1)

I am happy to supply any information that you may find useful to assist in the work you are doing.

D.
  #648  
12-11-2014, 05:34 AM
Simon Lloyd

Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Thanks given: 0
Thanked 0 times in 0 posts

I need a snapshot of your settings for the mod, as there is no way all of those would get past the mod if they were entered in it!
  #649  
12-11-2014, 08:09 AM
CAG CheechDogg

Join Date: Feb 2012
Location: Riverside, California USA
Posts: 1,080
Thanks given: 0
Thanked 0 times in 0 posts

This is a snapshot of the spiders that are showing up in Who's Online:

What exactly do you need a snapshot of in the settings, Simon?

This is my list of spiders I have banned with your mod:

almaden
Anarchie
Artabus
ASPSeek
attach
autoemailspider
BackWeb
Baidu
Bandit
BatchFTP
BlackWidow
Bot\mailto:craftbot@yahoo.com
Buddy
bumblebee
CherryPicker
ChinaClaw
CICC
Collector
Copier
Copyscape
Crescent
DIIbot
DISCo
DISCo\Pump
dotbot
Download\Demon
Download\Wonder
Downloader
Drip
DSurf15a
eCatch
EasyDL/2.99
EirGrabber
email
EmailCollector
EmailSiphon
EmailWolf
Express\WebPictures
ExtractorPro
EyeNetIE
FileHound
FlashGet
FrontPage
GetRight
GetSmart
GetWeb!
gigabaz
GNIP
Go\!Zilla
Go!Zilla
Go-Ahead-Got-It
gotit
Grabber
GrabNet
Grafula
grub-client
HMView
HTTrack
httpdown
.*httrack.*
ia_archiver
Ichiro
Image\Stripper
Image\Sucker
Indy*Library
Indy\Library
InterGET
InternetLinkagent
Internet\Ninja
InternetSeer.com
Iria
JBH*agent
JetCar
JOC\Web\Spider
JustView
larbin
LeechFTP
LexiBot
lftp
Link*Sleuth
likse
//Link
LinkWalker
Mag-Net
Magnet
Magpie
magpie
Mass\Downloader
Memo
Microsoft.URL
MIDown\tool
Mirror
Mister\PiX
Mozilla.*Indy
Mozilla.*NEWT
Mozilla*MSIECrawler
MS\FrontPage*
MSFrontPage
MSIECrawler
MSProxy
Navroad
NearSite
NetAnts
NetMechanic
NetSpider
Net\Vampire
NetZIP
NICErsPRO
Ninja
Nutch
Octopus
Offline\Explorer
Offline\Navigator
omgili
Openfind
PageGrabber
Papa\Foto
PaperLiBot
pavuk
pcBrowser
Ping
PingALink
Pockey
psbot
Pump
QRVA
RealDownload
Reaper
Recorder
ReGet
Scooter
Seeker
Siphon
sitecheck.internetseer.com
SiteSnagger
SlySearch
SmartDownload
Snake
sogou
Soso
SpaceBison
speedy
Spinn3r
sproose
Stripper
Sucker
SuperBot
SuperHTTP
Surfbot
Szukacz
tAkeOut
Teleport\Pro
URLSpiderPro
Vacuum
VoidEYE
Web\Image\Collector
Web\Sucker
WebAuto
[Ww]eb[Bb]andit
webcollage
WebCopier
Web\Downloader
WebEMailExtrac.*
WebFetch
WebGo\IS
WebHook
WebLeacher
WebMiner
WebMirror
WebReaper
WebSauger
Website
Website\eXtractor
Website\Quester
Webster
WebStripper
WebWhacker
WebZIP
Wget
Whacker
Widow
WWWOFFLE
x-Tractor
Xaldon\WebSpider
Xenu
Yandex
Yeti
YOUDAOBOT
Zeus.*Webster
Zeus
baiduspider
beta.statsit.com
statsit
SiteIntel
Yandex
GomezAgent
FunWebProducts
Nesotebot
DCPbot
AOL Advertising R&D
DataCha0s
aiHitBot
Apache-HttpClient
Zend_Http_Client
ReverseGet
XXX bot Content
vBSEO
spbot
OffByOne
thyroidbuzz
AcoonBot
coccoc
xpymep
proxyproxy2884
AppEngine
start.exe
Semiocast HTTP client
Firefox/3.6.23
TurnitinBot
curl
SwpLc/1.6
GrepNetstat.com
news bot
AskTbPTV
checks
panopta
App3le
PhantomJS
AlwaysOnline
SISTRIX
proximic
CRAWL-E/0.6.4
WebMoney
Maxthon
HTMLParser
oBot
UnisterBot
ERACrawler
Butterfly
Topsy
Butterfly Topsy Crawler
Ezooms
Deepnet
Alexa
Bitlybot
Seznam
Fulltext
Facebook
Sunrise Communications AG
crawl
Crawl
MJ12bot
Bimbot
Snapbot
thunderstone
Thunderstone
grub-client
Bing
MSN
OOZBOT
Wayback Machine
Crowsnest Spider
FlipboardProxy
Feedly
Attached Files
File Type: txt Cheech-banned-spiders-list.txt (2.9 KB, 4 views)
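
The list above has entries that appear twice (Yandex, grub-client) or differ only in case (Magpie/magpie, crawl/Crawl, thunderstone/Thunderstone). If matching is case-insensitive, which is an assumption here based on the Yandex example in the mod's history, those repeats are redundant. Here is a small Python sketch for tidying a pasted list; `dedupe_entries` is a hypothetical helper, not part of the mod:

```python
def dedupe_entries(lines):
    """Drop blank lines and case-insensitive duplicates, keeping first spellings."""
    seen = set()
    unique = []
    for line in lines:
        entry = line.strip()
        key = entry.lower()
        if entry and key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


sample = ["Magpie", "magpie", "Yandex", "crawl", "Crawl",
          "Yandex", "grub-client", "grub-client"]
print(dedupe_entries(sample))  # ['Magpie', 'Yandex', 'crawl', 'grub-client']
```

The original order is preserved, so the cleaned list can be pasted straight back into the textbox.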
  #650  
12-11-2014, 10:48 AM
Gadget_Guy

Join Date: Jun 2010
Posts: 271
Thanks given: 0
Thanked 0 times in 0 posts

Here is my stuff:
Attached Images
File Type: jpg snap1.jpg (122.7 KB, 0 views)
Attached Files
File Type: txt spider list.txt (3.2 KB, 8 views)
  #651  
12-11-2014, 05:06 PM
Simon Lloyd

Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Thanks given: 0
Thanked 0 times in 0 posts

Hi Gadget Guy, remove the second picture as it has your email address in it. I see the settings are OK. Now can you just copy the list as you have it (copy it straight out of the textbox in the mod), stick it in a WordPad document, zip it and attach it here so I can check it, please?

