Go Back   vb.org Archive > vBulletin Modifications > vBulletin 3.8 Modifications > vBulletin 3.8 Add-ons

Reply
 
Thread Tools
Ban Spiders by User Agent Details »»
Ban Spiders by User Agent
Version: 3.1.3, by Simon Lloyd Simon Lloyd is offline
Developer Last Online: May 2023 Show Printable Version Email this Page

Category: Miscellaneous Hacks - Version: 3.8.x Rating:
Released: 06-08-2011 Last Update: 12-17-2014 Installs: 137
Supported Uses Plugins
 

What this mod does
With this mod you can enter User Agents to watch or ban, you can also recieve emails or have an Output.txt created and updated with time and date of visits. It doesn't just have to be spiders, you can watch, log or ban any useragent!

How to install
Simply import the product ban_spider, the mod is active by default but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost....&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost....&postcount=137

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do i ban a bot?
https://vborg.vbsupport.ru/showpost....&postcount=318
https://vborg.vbsupport.ru/showpost....7&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost....&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost....&postcount=259
https://vborg.vbsupport.ru/showpost....&postcount=224
https://vborg.vbsupport.ru/showpost....&postcount=281

VB4.x Version of Ban Spiders By User Agent

Tested on vb3.7.x, vB3.8.x but should work on any version.

__________________________________________________ __________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Orginal xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility you can now use something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded, Extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.2 uploaded, due to rogue process from other mod
18th December 2014 Version 3.1.3 uploaded, due to previous one being VB4 mistakingly uploaded

The Bad Bots list is now included in the product
Please prune out all those that you wish to be able to see your site (i suggest you definately prune out "DA" and "Custo" :

Support will now only be given to those who have this mod marked as INSTALLED

Download Now

File Type: xml product-ban_spider.xml (30.5 KB, 146 views)

Supporters / CoAuthors

Show Your Support

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #72  
Old 02-23-2012, 10:56 AM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

You're welcome
Reply With Quote
  #73  
Old 02-23-2012, 12:14 PM
Lee G Lee G is offline
 
Join Date: Jun 2006
Location: Costa Blanca
Posts: 143
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Dont know if its listed in your bad user agent list, one of the worst I have found is OMGILI
User agent
omgilibot/0.3 +http://www.omgili.com/Crawler.html

If you right click on any page on the search engine site, you will see that all external links are no follow

They were hitting me for over 200 pages a day until I put an ip ban on them
Reply With Quote
  #74  
Old 02-23-2012, 02:41 PM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Lee G View Post
Dont know if its listed in your bad user agent list, one of the worst I have found is OMGILI
User agent
omgilibot/0.3 +http://www.omgili.com/Crawler.html

If you right click on any page on the search engine site, you will see that all external links are no follow

They were hitting me for over 200 pages a day until I put an ip ban on them
Hmm, it's not added there but folk can tailor their own, i only gave a starting base, i'm sure many have now customised it

I can bet that omgili aren't as bad as Baidu
Reply With Quote
  #75  
Old 02-23-2012, 03:01 PM
Lee G Lee G is offline
 
Join Date: Jun 2006
Location: Costa Blanca
Posts: 143
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

For some reason I get zero baidu hits
Either cloudflare block them or they take note of robots txt
Putting ip bans into the cf control panel saves killing my own server with htaccess bans

There seems to be a mass of junk with no user agent hitting from AmazonAWS at present
I need to find out if anything good comes from their servers, then consider a complete ban on all their ips. Any user agent being tracked is littered with AmazonAWS ips
Reply With Quote
  #76  
Old 03-14-2012, 03:43 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Lee G View Post
For some reason I get zero baidu hits
Either cloudflare block them or they take note of robots txt
Putting ip bans into the cf control panel saves killing my own server with htaccess bans

There seems to be a mass of junk with no user agent hitting from AmazonAWS at present
I need to find out if anything good comes from their servers, then consider a complete ban on all their ips. Any user agent being tracked is littered with AmazonAWS ips
Neither happens. I have CF and robots.txt.

AmazonAWS is a friendly spider, from what I've seen over the years.

And Simon - here I am, marked installed.
Reply With Quote
  #77  
Old 03-15-2012, 04:38 PM
Lee G Lee G is offline
 
Join Date: Jun 2006
Location: Costa Blanca
Posts: 143
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

AmazonAWS cloud hosting

Full range of AmazonAWS ip's
http://www.forumpostersunion.com/showthread.php?t=10490

Some of the bad bots being run out of there
http://www.webmasterworld.com/search...rs/3828718.htm

At least one bot owned by Google also operates out of the AmazonAWS ip ranges
Postrank were brought out by Google
http://www.postrank.com/
Reply With Quote
2 благодарности(ей) от:
Max Taxable, Simon Lloyd
  #78  
Old 04-06-2012, 02:54 PM
final kaoss final kaoss is offline
 
Join Date: Apr 2006
Posts: 1,314
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

I'll give it a shot to scare baidu away
Reply With Quote
Благодарность от:
Simon Lloyd
  #79  
Old 04-14-2012, 02:01 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

My current (updated) list of banned user agents entered into this Mod:

baiduspider
beta.statsit.com
statsit
SiteIntel
Yandex
GomezAgent
FunWebProducts
MSIE 1
MSIE 2
MSIE 3
MSIE 4
MSIE 5
MSIE 6
Nesotebot
DCPbot
Opera/1
Opera/2
Opera/3
Opera/4
Opera/5
Opera/6
Opera/7
Opera/8
AOL Advertising R&D
DataCha0s
aiHitBot
Apache-HttpClient
Zend_Http_Client
ReverseGet

Entering the older MSIEs and the older Operas has virtually eliminated bot registration attempts. It's down to just one or two a day, and the "IsBot" Mod is still catching those. It used to be, 40-50 a day that would be caught.
Reply With Quote
Благодарность от:
Simon Lloyd
  #80  
Old 06-22-2012, 11:59 AM
LouiseWilson LouiseWilson is offline
 
Join Date: Oct 2007
Posts: 154
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Installed and working with 4.2.0 P2

thank you for the modification
Reply With Quote
  #81  
Old 07-13-2012, 01:16 AM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Hey Simon.... When this Mod is set to create threads, could it be updated to also include the IP of the bot in the post?
Reply With Quote
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT. The time now is 09:05 PM.


Powered by vBulletin® Version 3.8.12 by vBS
Copyright ©2000 - 2025, vBulletin Solutions Inc.
X vBulletin 3.8.12 by vBS Debug Information
  • Page Generation 0.07012 seconds
  • Memory Usage 2,348KB
  • Queries Executed 29 (?)
More Information
Template Usage:
  • (1)SHOWTHREAD
  • (1)ad_footer_end
  • (1)ad_footer_start
  • (1)ad_header_end
  • (1)ad_header_logo
  • (1)ad_navbar_below
  • (1)ad_showthread_beforeqr
  • (2)bbcode_quote
  • (1)footer
  • (1)forumjump
  • (1)forumrules
  • (1)gobutton
  • (1)header
  • (1)headinclude
  • (1)modsystem_post
  • (1)navbar
  • (4)navbar_link
  • (120)option
  • (1)pagenav
  • (1)pagenav_curpage
  • (4)pagenav_pagelink
  • (1)pagenav_pagelinkrel
  • (11)post_thanks_box
  • (4)post_thanks_box_bit
  • (11)post_thanks_button
  • (1)post_thanks_javascript
  • (1)post_thanks_navbar_search
  • (3)post_thanks_postbit
  • (11)post_thanks_postbit_info
  • (10)postbit
  • (1)postbit_attachment
  • (11)postbit_onlinestatus
  • (11)postbit_wrapper
  • (1)spacer_close
  • (1)spacer_open
  • (1)tagbit_wrapper 

Phrase Groups Available:
  • global
  • inlinemod
  • postbit
  • posting
  • reputationlevel
  • showthread
Included Files:
  • ./showthread.php
  • ./global.php
  • ./includes/init.php
  • ./includes/class_core.php
  • ./includes/config.php
  • ./includes/functions.php
  • ./includes/class_hook.php
  • ./includes/modsystem_functions.php
  • ./includes/functions_bigthree.php
  • ./includes/class_postbit.php
  • ./includes/class_bbcode.php
  • ./includes/functions_reputation.php
  • ./includes/functions_post_thanks.php 

Hooks Called:
  • init_startup
  • init_startup_session_setup_start
  • init_startup_session_setup_complete
  • cache_permissions
  • fetch_postinfo_query
  • fetch_postinfo
  • fetch_threadinfo_query
  • fetch_threadinfo
  • fetch_foruminfo
  • style_fetch
  • cache_templates
  • global_start
  • parse_templates
  • global_setup_complete
  • showthread_start
  • showthread_getinfo
  • forumjump
  • showthread_post_start
  • showthread_query_postids
  • showthread_query
  • bbcode_fetch_tags
  • bbcode_create
  • showthread_postbit_create
  • postbit_factory
  • postbit_display_start
  • post_thanks_function_post_thanks_off_start
  • post_thanks_function_post_thanks_off_end
  • post_thanks_function_fetch_thanks_start
  • fetch_musername
  • post_thanks_function_fetch_thanks_end
  • post_thanks_function_thanked_already_start
  • post_thanks_function_thanked_already_end
  • postbit_imicons
  • bbcode_parse_start
  • bbcode_parse_complete_precache
  • bbcode_parse_complete
  • postbit_attachment
  • postbit_display_complete
  • post_thanks_function_can_thank_this_post_start
  • post_thanks_function_fetch_thanks_bit_start
  • post_thanks_function_show_thanks_date_start
  • post_thanks_function_show_thanks_date_end
  • post_thanks_function_fetch_thanks_bit_end
  • post_thanks_function_fetch_post_thanks_template_start
  • post_thanks_function_fetch_post_thanks_template_end
  • pagenav_page
  • pagenav_complete
  • tag_fetchbit_complete
  • forumrules
  • navbits
  • navbits_complete
  • showthread_complete