Go Back   vb.org Archive > vBulletin Modifications > vBulletin 4.x Modifications > vBulletin 4.x Add-ons
FAQ Community Calendar Today's Posts Search

Reply
 
Thread Tools
Ban Spiders by User Agent Details »»
Ban Spiders by User Agent
Version: 3.1.2, by Simon Lloyd Simon Lloyd is offline
Developer Last Online: May 2023 Show Printable Version Email this Page

Category: Miscellaneous Hacks - Version: 4.x.x Rating:
Released: 08-08-2011 Last Update: 12-17-2014 Installs: 491
Uses Plugins
 
No support by the author.

What this mod does
With this mod you can enter User Agents to watch or ban, you can also recieve emails or have an Output.txt created and updated with time and date of visits. It doesn't just have to be spiders, you can watch, log or ban any useragent!

How to install
Simply import the product ban_spider, the mod is active by default but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost....&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost....&postcount=137

How does it work?
https://vborg.vbsupport.ru/showpost....&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do i ban a bot?
https://vborg.vbsupport.ru/showpost....&postcount=318
https://vborg.vbsupport.ru/showpost....7&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost....&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost....&postcount=259
https://vborg.vbsupport.ru/showpost....&postcount=224
https://vborg.vbsupport.ru/showpost....&postcount=281

Tested on vb3.7.x, vB3.8.x , vB4.x.x but should work on any version.

__________________________________________________ __________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers

ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Orginal xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity
  1. Added match facility you can now use something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
  2. Added clickable link to visited thread
22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded, Hook changed extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded, prevented spiders being counted when mod turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product
Please prune out all those that you wish to be able to see your site (i suggest you definately prune out "DA" and "Custo" :

Support will now only be given to those who have this mod marked as INSTALLED

Download Now

File Type: xml product-ban_spider4x.xml (30.8 KB, 469 views)

Supporters / CoAuthors

Show Your Support

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #52  
Old 09-12-2011, 04:24 PM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

as a side note you don't need the full useragent string anymore to ban them, you can now enter any part of the string:
e.g
bai will result in baidu being banned just as will any string containing "bai"
Entering Mozilla will result in every useragent string containing that to be banned.

So, entering the full bot name but not useragent string will do, enter Baidu for that spider, dont enter Ya as something to ban as Yahoo will be banned just as Yandex will.
Reply With Quote
  #53  
Old 09-12-2011, 06:54 PM
ForceHSS ForceHSS is offline
 
Join Date: Apr 2008
Posts: 6,357
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

ok did not know as it did not say this in your first post
Reply With Quote
  #54  
Old 09-13-2011, 04:23 PM
TheWhite TheWhite is offline
 
Join Date: Nov 2006
Posts: 147
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

A little word of advice, don't get carried away with the bot banning because it might affect your Google revenue in a negative way, start with these for a while and control your traffic wisely.

Baidu
Yeti
Twiceler


Regards
Reply With Quote
  #55  
Old 09-13-2011, 04:45 PM
Boofo's Avatar
Boofo Boofo is offline
 
Join Date: Mar 2002
Location: Des Moines, IA (USA)
Posts: 15,776
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by TheWhite View Post
A little word of advice, don't get carried away with the bot banning because it might affect your Google revenue in a negative way, start with these for a while and control your traffic wisely.

Baidu
Yeti
Twiceler


Regards
Don't forget Yandex.
Reply With Quote
  #56  
Old 09-13-2011, 07:36 PM
niteflyer32 niteflyer32 is offline
 
Join Date: Nov 2008
Posts: 18
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Works great except one thing with the Baidu spider, it knocked them off the forum for about 3 hours. Now I see the Baidu spider back again on the online list.????

Update: Looks like the mod knocked Baidu off again, haven't seen them for 1 hour. There was 4 Baidu spiders all with different IPs that showed up and then nothing. Is this the mod working or should we not even see the bots that are on the blacklist? I don't see other spiders we've blacklisted showing back up yet.
Reply With Quote
  #57  
Old 09-13-2011, 10:55 PM
ForceHSS ForceHSS is offline
 
Join Date: Apr 2008
Posts: 6,357
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

I have it like this in the list Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
and they never enter
Reply With Quote
  #58  
Old 09-14-2011, 05:59 AM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Don't worry if you see them in the WOL, what you have to remember is whilst they are redirected the moment they arrive vbulletin may have already registered that they have arrived and for a split second show them in WOL, they only go missing from WOL when your WOL timeout has expired, i have mine set to 900 seconds (15 minutes) so i have to wait beyond that time before i see them disappear.

The fix im working on for "Create thread" to work with "Ban spiders in list" at this moment uses a hook thats compiled later so the bots always show in WOL (until they get the message for the 301) as the minute they are redirected another of their bots try to crawl your site but are instantly redirected, i could release that version but it will amount to loads of messages here saying "I can still see the spiders" so i wont release it until i can make the bots disappear after the WOL timeout.
Reply With Quote
  #59  
Old 09-17-2011, 01:07 PM
niteflyer32 niteflyer32 is offline
 
Join Date: Nov 2008
Posts: 18
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

The Baidu bots do seem to disappear once they are found. We used to have scores of them on the WOL and now just one or two show up and they are gone shortly.

One issue we have with a forum member who is getting blocked and the Google re-direct, he is using IE 8 and on Cox.net

Here is our block list, I'm not sure which one is blocking him. Any ideas? Thanks.

==================================
Attached Files
File Type: txt bot_block_list.txt (3.2 KB, 32 views)
Reply With Quote
  #60  
Old 09-17-2011, 03:38 PM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

What you'll need to get him to do is visit here http://whatsmyuseragent.com/ and get him to post you the entire contents of the box that says "Your UserAgent" and then see if any part of that string matches any of your blocked bots, post back and i'll take a look

P.S can you edit your last post and remove all those bots and add an attachment of a text file? that way folk don't have to scroll for ages to read the thread
Reply With Quote
  #61  
Old 09-18-2011, 06:28 AM
niteflyer32 niteflyer32 is offline
 
Join Date: Nov 2008
Posts: 18
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Sorry about the long post, got it changed to a text file.

Here is the forum member's response from the "Your UserAgent" link. I'm not sure which one from our list is blocking him, possibly one of the 3 Mozillas on the list?

Thanks for the help.

Your User Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; Creative AutoUpdate v1.40.01)
Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT. The time now is 07:29 PM.


Powered by vBulletin® Version 3.8.12 by vBS
Copyright ©2000 - 2024, vBulletin Solutions Inc.
X vBulletin 3.8.12 by vBS Debug Information
  • Page Generation 0.23188 seconds
  • Memory Usage 2,357KB
  • Queries Executed 27 (?)
More Information
Template Usage:
  • (1)SHOWTHREAD
  • (1)ad_footer_end
  • (1)ad_footer_start
  • (1)ad_header_end
  • (1)ad_header_logo
  • (1)ad_navbar_below
  • (1)ad_showthread_beforeqr
  • (1)bbcode_quote
  • (1)footer
  • (1)forumjump
  • (1)forumrules
  • (1)gobutton
  • (1)header
  • (1)headinclude
  • (1)modsystem_post
  • (1)navbar
  • (4)navbar_link
  • (120)option
  • (1)pagenav
  • (1)pagenav_curpage
  • (4)pagenav_pagelink
  • (2)pagenav_pagelinkrel
  • (11)post_thanks_box
  • (16)post_thanks_box_bit
  • (11)post_thanks_button
  • (1)post_thanks_javascript
  • (1)post_thanks_navbar_search
  • (1)post_thanks_postbit
  • (11)post_thanks_postbit_info
  • (10)postbit
  • (2)postbit_attachment
  • (11)postbit_onlinestatus
  • (11)postbit_wrapper
  • (1)spacer_close
  • (1)spacer_open
  • (1)tagbit_wrapper 

Phrase Groups Available:
  • global
  • inlinemod
  • postbit
  • posting
  • reputationlevel
  • showthread
Included Files:
  • ./showthread.php
  • ./global.php
  • ./includes/init.php
  • ./includes/class_core.php
  • ./includes/config.php
  • ./includes/functions.php
  • ./includes/class_hook.php
  • ./includes/modsystem_functions.php
  • ./includes/functions_bigthree.php
  • ./includes/class_postbit.php
  • ./includes/class_bbcode.php
  • ./includes/functions_reputation.php
  • ./includes/functions_post_thanks.php 

Hooks Called:
  • init_startup
  • init_startup_session_setup_start
  • init_startup_session_setup_complete
  • cache_permissions
  • fetch_threadinfo_query
  • fetch_threadinfo
  • fetch_foruminfo
  • style_fetch
  • cache_templates
  • global_start
  • parse_templates
  • global_setup_complete
  • showthread_start
  • showthread_getinfo
  • forumjump
  • showthread_post_start
  • showthread_query_postids
  • showthread_query
  • bbcode_fetch_tags
  • bbcode_create
  • showthread_postbit_create
  • postbit_factory
  • postbit_display_start
  • post_thanks_function_post_thanks_off_start
  • post_thanks_function_post_thanks_off_end
  • post_thanks_function_fetch_thanks_start
  • fetch_musername
  • post_thanks_function_fetch_thanks_end
  • post_thanks_function_thanked_already_start
  • post_thanks_function_thanked_already_end
  • post_thanks_function_fetch_thanks_bit_start
  • post_thanks_function_show_thanks_date_start
  • post_thanks_function_show_thanks_date_end
  • post_thanks_function_fetch_thanks_bit_end
  • post_thanks_function_fetch_post_thanks_template_start
  • post_thanks_function_fetch_post_thanks_template_end
  • postbit_imicons
  • bbcode_parse_start
  • bbcode_parse_complete_precache
  • bbcode_parse_complete
  • postbit_attachment
  • postbit_display_complete
  • post_thanks_function_can_thank_this_post_start
  • pagenav_page
  • pagenav_complete
  • tag_fetchbit_complete
  • forumrules
  • navbits
  • navbits_complete
  • showthread_complete