Ban Spiders by User Agent
Version: 3.1.2, by Simon Lloyd
Developer Last Online: May 2023

Category: Miscellaneous Hacks - Version: 4.x.x
Released: 08-08-2011 Last Update: 12-17-2014 Installs: 491
Uses Plugins
 
No support by the author.

What this mod does
With this mod you can enter user agents to watch or ban. You can also receive emails or have an Output.txt file created and updated with the time and date of visits. It doesn't have to be just spiders: you can watch, log or ban any user agent!

How to install
Simply import the product ban_spider. The mod is active by default, but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost....&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost....&postcount=137

How does it work?
https://vborg.vbsupport.ru/showpost....&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do I ban a bot?
https://vborg.vbsupport.ru/showpost....&postcount=318
https://vborg.vbsupport.ru/showpost....7&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost....&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost....&postcount=259
https://vborg.vbsupport.ru/showpost....&postcount=224
https://vborg.vbsupport.ru/showpost....&postcount=281

Tested on vB3.7.x, vB3.8.x and vB4.x.x, but should work on any version.

____________________________________________________________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Original xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility: you can now use a fragment such as Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded, Hook changed extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded, prevented spiders being counted when mod turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product
Please prune out all those that you wish to be able to see your site (I suggest you definitely prune out "DA" and "Custo").

Support will now only be given to those who have this mod marked as INSTALLED

File Type: xml product-ban_spider4x.xml (30.8 KB, 469 views)

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
#442
11-12-2012, 01:28 PM
vb50kgpoo
Join Date: Aug 2011
Posts: 121
Thanks: 0
Thanked 0 Times in 0 Posts

Quote:
Originally Posted by ForceHSS
What is their full host name?

That is it!
They go by those generic names only, nothing else.

#443
11-12-2012, 04:31 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Thanks: 0
Thanked 0 Times in 0 Posts

Quote:
Originally Posted by vb50kgpoo
Hi Simon
Yours is a great product. I made the mistake of uninstalling it in order to use AbyssGuard, which is plagued with problems. I have now reinstalled Ban Spiders By User Agent. One question: are there any ramifications in banning \wbot[\/\-] with your mod? I ask as putting \wbot[\/\-] directly into my htaccess banning mechanism causes issues.
Regards / RSVP

In my mod you are banning any useragent that has any occurrence of one of the strings in your list. I very much doubt that \wbot[\/\-] is found in any useragent, as that looks like a regular expression; in my mod simply wbot will do if that's in their useragent.
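
To make the difference between a plain string and a regular expression concrete, here is a minimal Python sketch. It is not the mod's code (the mod is a vBulletin plugin); the function names are made up for illustration, and the case-insensitive substring test is only an assumption based on the matching described in this thread. One thing worth noting: in a regex, \w means "any word character", so the plain string wbot only hits agents that literally contain those four letters, while something like bot/ is closer to what the pattern targets.

```python
import re

# A typical crawler user agent, used here purely as sample input.
UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# The pattern from the question, written as a raw string.
PATTERN = r"\wbot[\/\-]"


def contains_literal(user_agent: str, needle: str) -> bool:
    """Plain substring test, ignoring case (assumed to resemble the mod's matching)."""
    return needle.lower() in user_agent.lower()


def matches_regex(user_agent: str, pattern: str) -> bool:
    """Regular-expression test, roughly how a regex-based server rule would read it."""
    return re.search(pattern, user_agent, re.IGNORECASE) is not None


if __name__ == "__main__":
    # Taken literally, the backslashes and brackets never appear in the agent.
    print("literal pattern:", contains_literal(UA, PATTERN))
    # Read as a regex, "ebot/" inside "Googlebot/2.1" satisfies it.
    print("regex pattern:  ", matches_regex(UA, PATTERN))
    # Plain fragments: "wbot" is not in this agent, "bot/" is.
    print("literal 'wbot': ", contains_literal(UA, "wbot"))
    print("literal 'bot/': ", contains_literal(UA, "bot/"))
```

So a pattern copied from an htaccess rule is unlikely to behave the same way in a plain-string list; a short fragment taken from the real user agent is the safer entry.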

Quote:
Originally Posted by vb50kgpoo
Also.......

Does anyone know what these bots are?

Robot ID   Hits   Bandwidth   Last visit       Hits on robots.txt
robot      772    8576221     20121111093454   0
crawl      699    9556953     20121108085243   0
spider     5      114750      20121106065956   0

Bad bots using generic names?

Those are just stats from cPanel AWStats and mean nothing other than that spiders were identified with those words in their identifier.

Quote:
Originally Posted by vb50kgpoo
That is it!
They go by those generic names only, nothing else.

They are not their useragents. Read some of the links that I took the time and trouble to post in the mod description on how to find the useragent.

#444
11-20-2012, 03:09 PM
bzcomputers
Join Date: Apr 2012
Location: TX
Posts: 503
Thanks: 0
Thanked 0 Times in 0 Posts

Quote:
Originally Posted by TheSupportForum
Won't MSIE 1 block MSIE Beta 10?

Which means MSIE 8, 9 only visitors

It won't block MSIE Beta 10, but it will block MSIE 10. Best to remove MSIE 1 from your list now. I haven't seen any references in the output.txt file to anything prior to MSIE 5, so removing MSIE 1 probably won't cause any problems.

Edit: If you were wondering what the IP address 108.2.106.107 is, it is Verizon's search engine (www.verizon.net), definitely not something you would want to block.
Thanks from: Max Taxable

#445
11-20-2012, 03:35 PM
Max Taxable
Join Date: Feb 2011
Posts: 3,134
Thanks: 0
Thanked 0 Times in 0 Posts

Quote:
Originally Posted by bzcomputers
It won't block MSIE Beta 10, but it will block MSIE 10. Best to remove MSIE 1 from your list now. I haven't seen any references in the output.txt file of anything prior to MSIE 5 so removing MSIE 1 probably won't cause any problems

Using the time based Mod in my signature, I often see user agents with early IE, such as 3, 4 and 5, but you're right - I haven't seen any IE 1 or 2 in so long, there's no doubt it would be good to remove MSIE 1 from the ban list.

Thanks for the information!

#446
11-20-2012, 05:57 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Thanks: 0
Thanked 0 Times in 0 Posts

Quote:
Originally Posted by bzcomputers
It won't block MSIE Beta 10, but it will block MSIE 10. Best to remove MSIE 1 from your list now. I haven't seen any references in the output.txt file of anything prior to MSIE 5 so removing MSIE 1 probably won't cause any problems.

Edit: If you were wondering what the IP Address 108.2.106.107 is, it is Verizon's Search Engine (www.verizon.net) definitely not something you would want to block.

It will block MSIE Beta 10 if it appears like that in the useragent. This mod will block ALL instances where the exact string in your list is found in the UA.
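
For anyone wanting to check their own list for entries that catch more than intended, something like the following Python sketch may help. It is only an illustration under the assumption that matching is a case-insensitive "string found anywhere in the UA" test; matching_entries is a made-up helper, not part of the mod, and the sample user agents are ordinary IE strings used as test input.

```python
def matching_entries(user_agent: str, ban_list: list[str]) -> list[str]:
    """Return every ban-list entry that occurs somewhere in the user agent.

    Assumption: case-insensitive substring matching, as described in this thread.
    """
    ua = user_agent.lower()
    return [entry for entry in ban_list if entry.lower() in ua]


# Hypothetical ban list containing the entry discussed above.
BAN_LIST = ["MSIE 1", "Baidu"]

SAMPLES = {
    "IE 10": "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)",
    "IE 8": "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)",
}

if __name__ == "__main__":
    for label, ua in SAMPLES.items():
        hits = matching_entries(ua, BAN_LIST)
        # "MSIE 1" is a prefix of "MSIE 10.0", so the IE 10 agent is caught too.
        print(f"{label}: {hits if hits else 'not banned'}")
```

Running a few real agents from output.txt through a check like this before saving the list should show which short entries are too greedy.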

#447
12-15-2012, 09:45 PM
My Hattiesburg
Join Date: Jun 2012
Posts: 128
Thanks: 0
Thanked 0 Times in 0 Posts

Okay, I've installed this and it seems to be working fine, but I have some questions.

At first the Baidu spider was hammering us, hitting the site about every 8 seconds, but now it seems to have more or less given up on us. Yandex, on the other hand, seems to have intensified its attempted crawls. At first it was hitting the site about every 25 seconds, but over the course of this install it has ranged everywhere from every 1 second to where it's at now, about every 1.5 minutes. The 1.5 minute attempts have only occurred in the last couple of days.

A couple of days ago when Yandex was hitting us every 1 second, we had some server load issues. I don't know if this is related but it seems it might be and I'm wondering if logging the blocks to a text file might be counterproductive in that area, writing to the file every second or so.

Also, Yandex is using the same IP address every time, so I thought it might be best to just block it using the .htaccess file, but that doesn't seem to have had any effect. Is this mod redirecting Yandex before it has a chance to read the .htaccess file or is Yandex simply ignoring it?

#448
12-15-2012, 10:24 PM
smirkley
Join Date: Apr 2008
Posts: 627
Thanks: 0
Thanked 0 Times in 0 Posts

Your server would be the one to decide if Yandex is blocked by the .htaccess file or not.
If the IP is denied in htaccess, your server will block it before any vB module loads, or any other page for that matter.

#449
12-16-2012, 05:29 AM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Thanks: 0
Thanked 0 Times in 0 Posts

Smirkley is right, .htaccess is loaded before anything else. As for my mod, using Yandex as a blocking string will block it. If it is still showing, that's because either they don't actually have yandex in the user agent or you've entered more than just Yandex as the string. Remember, my mod bans anything that contains exactly the string you enter (including spaces), so if your strings in my mod look like this:
Baidu
Yandex
SoSo
.....etc
then yandex will be blocked. However, if it looks like this:
Baidu
Yandex123
SoSo
....etc
then any bot with just yandex in its string, or any other kind of yandex like YandexWorld, will not be blocked, but anything containing yandex123 will be blocked.
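
The two lists above can be tried out with a small Python sketch. This is not the mod's code: parse_ban_list and is_blocked are hypothetical names, the one-entry-per-line format follows the lists shown here, and ignoring case is an assumption. Spaces inside an entry are deliberately kept, since they count as part of the string to look for.

```python
def parse_ban_list(setting_text: str) -> list[str]:
    """Split a one-entry-per-line setting into entries.

    Blank lines are dropped; spaces inside an entry are kept because
    they are part of the string that must be found.
    """
    return [line for line in setting_text.splitlines() if line.strip()]


def is_blocked(user_agent: str, entries: list[str]) -> bool:
    """True if any entry occurs in the user agent (case ignored, by assumption)."""
    ua = user_agent.lower()
    return any(entry.lower() in ua for entry in entries)


# Sample agent in the shape Yandex announces itself.
YANDEX_UA = "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"

if __name__ == "__main__":
    plain = parse_ban_list("Baidu\nYandex\nSoSo")
    too_specific = parse_ban_list("Baidu\nYandex123\nSoSo")

    print("list with Yandex:    ", is_blocked(YANDEX_UA, plain))
    print("list with Yandex123: ", is_blocked(YANDEX_UA, too_specific))
    # A stray trailing space also changes the string that has to be found.
    print("entry 'Yandex ':     ", is_blocked(YANDEX_UA, ["Yandex "]))
```

The shorter the entry, the more agents it catches, so the shortest fragment that is unique to the bot is usually the one to enter.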

As for writing to a file, just turn that bit off; it's there simply for test purposes, troubleshooting or checking on individual user agents.
2 thanks from: Max Taxable, smirkley

#450
12-16-2012, 08:20 PM
My Hattiesburg
Join Date: Jun 2012
Posts: 128
Thanks: 0
Thanked 0 Times in 0 Posts

Your mod is blocking Yandex, but I just figured since it was using the same IP address every time it might be better to just go ahead and block that IP.

I guess I didn't do something right in the .htaccess file because Yandex was getting past it and was getting blocked by the mod.

#451
12-17-2012, 03:52 PM
Simon Lloyd
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Thanks: 0
Thanked 0 Times in 0 Posts

Yandex doesn't always use the same IP. Somewhere earlier in the pages for this mod I think I posted how to do it in .htaccess, if you wish.
All times are GMT.

