vb.org Archive

Miscellaneous Hacks - Ban Spiders by User Agent (https://vborg.vbsupport.ru/showthread.php?t=264932)

accludetuner 12-12-2011 08:53 AM

Trying the ban IP mod now.

On a side note, I help several people run other VB sites and I've been installing this spider ban mod as a default thing on all of their sites and putting in my list of user agents to ban/block. I recently installed it on a 3.6 site and it doesn't seem to be working. Is there a different hook I need to use for it to work with 3.6? Thought you might know.

EDIT: Maybe it is working and I spoke too soon. I have that pesky Baidu spider banned due to the amount of indexing it does without any regard to server resources. There were about 50 Baidu bots crawling the site but the number seems to be gradually going down (at about 20 right now). I'll keep checking it to see if Baidu disappears for good and then I'll know for sure whether it's working on 3.6 or not.

lgpaul 01-10-2012 12:18 PM

It works! Thanks!

pitzerwm 01-13-2012 05:27 PM

Thank you for creating a great mod, and thanks to Max for turning me on to it.

A question: in Who's Online, in the IP column, I currently have 2 guests whose IP is totally blank. I suppose there is an explanation for this, but is there anything you can do about it?

Midohash 01-17-2012 12:53 AM

Fantastic ... marked as Installed ... many thanks
Just a little query! ... If I need to block an unwanted visitor or member, should I put his/her IP in the spider list? ... And what is the difference between that and denying the IP address in cPanel or .htaccess? ... Also, can I block a proxy server through this mod?

Simon Lloyd 01-17-2012 07:04 AM

For IPs you should use my other mod https://vborg.vbsupport.ru/showthread.php?t=268146, as this mod processes the user agent string and not the IP address. If you deny in cPanel or .htaccess, the user of that IP will never get to see your site or files; my mod redirects them before your site loads properly. Also, with cPanel and .htaccess you of course have to log in and edit them each time you want to add or remove a ban; mine's a little more user friendly :)

Simon Lloyd 01-17-2012 07:07 AM

Quote:

Originally Posted by pitzerwm (Post 2287361)
Thank you for creating a great mod, and thanks to Max for turning me on to it.

A question: in Who's Online, in the IP column, I currently have 2 guests whose IP is totally blank. I suppose there is an explanation for this, but is there anything you can do about it?

Sorry, again you need this https://vborg.vbsupport.ru/showthread.php?t=268146 for IPs; this mod deals with the user agent string, not IPs. If their IP is blank then they are cloaking it somehow. If you choose to see the User Agent (via the dropdown at the bottom of the Who's Online window), you can copy their user agent string, pop it in the list to ban and then watch them disappear :)

Midohash 01-17-2012 09:02 AM

Quote:

Originally Posted by Simon Lloyd (Post 2288886)
For IPs you should use my other mod https://vborg.vbsupport.ru/showthread.php?t=268146, as this mod processes the user agent string and not the IP address. If you deny in cPanel or .htaccess, the user of that IP will never get to see your site or files; my mod redirects them before your site loads properly. Also, with cPanel and .htaccess you of course have to log in and edit them each time you want to add or remove a ban; mine's a little more user friendly :)

Thanks Simon :up:, regarding your other mod, could it be modified in the future to block users coming from anonymous proxies? :confused: ... That would be a great add-on :cool: ... However, I note a significant drop in the number of my site visitors after installing the mod. Is that attributable to blocking bad bots only, or could some normal visitors now be unable to access my forum? ... Also, would the drop in visitor numbers affect my overall Alexa rank? ... Regards,

pitzerwm 01-17-2012 06:50 PM

I installed this and with Ban IPs apparently have removed a large percentage of my issues.

I was wondering if I get this as the "guest" dhcp-0-24-b2-58-41-4a.cpe.eaglecable.net

Can I put this total "agent" in, and it will ban this person?

Thanks

photonetau 01-17-2012 09:16 PM

I installed this about 9 hours ago and Baidu is still hitting at the rate of 200 hits an hour. Is that usual? How long before they give up?
BTW they are redirected to the default site.

Simon Lloyd 01-18-2012 09:15 AM

Quote:

Originally Posted by Midohash (Post 2288919)
Thanks Simon :up:, regarding your other mod, could it be modified in the future to block users coming from anonymous proxies? :confused: ... That would be a great add-on :cool: ... However, I note a significant drop in the number of my site visitors after installing the mod. Is that attributable to blocking bad bots only, or could some normal visitors now be unable to access my forum? ... Also, would the drop in visitor numbers affect my overall Alexa rank? ... Regards,

It could be modified to ban those without IP and this one could be modified to ban those without user agent but would need a lot of testing!

Yes, your visitor drop is due to the banning. You'll find (especially if you installed the spiders on forumhome mod) that most of the visitors were bots in the list like Baiduspider; they're insatiable. Bots don't make up part of your Alexa ranking. Believe it or not, Alexa makes its assumptions from the number of visitors to your site that have the Alexa toolbar installed :)

Quote:

Originally Posted by pitzerwm (Post 2289106)
I installed this and with Ban IPs apparently have removed a large percentage of my issues.

I was wondering if I get this as the "guest" dhcp-0-24-b2-58-41-4a.cpe.eaglecable.net

Can I put this total "agent" in, and it will ban this person?

Thanks

Yes of course, the exact agent string will help zero in on that person/bot etc. I added partial matching because of bots changing their UAs, but this mod was always built on using the entire string :)
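
For anyone wondering how a partial entry differs from a full-string entry, here is a rough sketch in Python. It is not the mod's own code: the function names `matches_partial` and `matches_exact` are made up for illustration, and case-insensitive comparison is an assumption.

```python
def matches_partial(user_agent: str, entry: str) -> bool:
    """True if a non-empty list entry appears anywhere in the user agent."""
    entry = entry.strip().lower()
    return bool(entry) and entry in user_agent.lower()


def matches_exact(user_agent: str, entry: str) -> bool:
    """True only if the whole user agent equals the list entry."""
    return entry.strip().lower() == user_agent.strip().lower()


baidu = "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"

print(matches_partial(baidu, "Baidu"))  # True: the entry is a fragment of the UA
print(matches_exact(baidu, "Baidu"))    # False: the UA is longer than the entry
print(matches_exact(baidu, baidu))      # True: the entire string was entered
```

A partial entry keeps matching when a bot bumps its version number, while the full string only matches that one exact agent.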

Quote:

Originally Posted by photonetau (Post 2289142)
I installed this about 9 hours ago and Baidu is still hitting at the rate of 200 hits an hour. Is that usual? How long before they give up?
BTW they are redirected to the default site.

I guess you are still seeing Baidu because you are using Paul M's who visited mod. Both his mod and mine are doing their job: his picks them up because they make a direct call to a thread or post etc. and get logged, but as soon as they attempt to access the thread or post they are redirected to wherever you choose with a 301 redirect header :)

If you are seeing Baiduspider and not using Paul's mod then pm me your site url and access details and we'll get you sorted!
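
To make the 301 point concrete, here is a small Python sketch of what a permanent redirect response consists of. It only illustrates the HTTP side and is not the mod's code; `redirect_response` and the example.com target are hypothetical.

```python
from http import HTTPStatus


def redirect_response(target_url):
    """Build the status line and headers for a permanent (301) redirect."""
    code = HTTPStatus.MOVED_PERMANENTLY
    status = f"{code.value} {code.phrase}"
    headers = [("Location", target_url), ("Content-Length", "0")]
    return status, headers


status, headers = redirect_response("http://example.com/")
print(status)   # 301 Moved Permanently
print(headers)  # the Location header tells the bot where to go instead
```

The request still reaches the server before the 301 goes out, which is why another mod or a log can record the visit even though the bot never gets the page.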

Midohash 01-18-2012 10:50 AM

Thanks a lot Simon :up: ... I still have the same problem with Baidu :( ... I checked my raw access log today and I still have loads of Baidu spiders on my site 48 hours after installing the mod :eek: ... this is how they appear in my log file:

Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

Any recommendations?

Regarding Alexa, I have certified my site and have pasted their javascript code in the header of my forum as they recommend so they can analyse my website traffic directly regardless of visitors with/without alexa toolbar ... I don't know if my alexa rank will drop due to drop in the overall visitor numbers after banning spiders?

Regards

Simon Lloyd 01-19-2012 10:47 AM

Of course your raw logs from cPanel, Plesk or whatever will show them visiting your site. Remember, this mod redirects them when they try to load your site, and it does that regardless of whether they are trying to load the homepage or a thread. Your Alexa rank shouldn't drop on account of redirecting spiders; in fact I've been redirecting these spiders for ages on one of my sites that was Page Rank 2 and it's now PR3, so if it's not hurting Google rank it has to be good :)

Can you give me a screenshot of your settings for this mod?
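
On the raw log point: since redirected bots still show up in the access log, the status code on each line is what tells you whether the redirect happened. Below is a rough Python sketch that counts status codes for one user agent. It assumes the common Apache "combined" log format; the sample lines, their IPs and the `status_counts` helper are invented for illustration.

```python
import re
from collections import Counter

# Tail of a "combined" log line: "request" status bytes "referer" "user agent"
LOG_TAIL = re.compile(r'"[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"$')


def status_counts(lines, needle):
    """Count HTTP status codes for log lines whose user agent contains needle."""
    counts = Counter()
    for line in lines:
        m = LOG_TAIL.search(line.strip())
        if m and needle.lower() in m.group(2).lower():
            counts[m.group(1)] += 1
    return counts


# Made-up sample lines using documentation IP ranges.
sample = [
    '203.0.113.7 - - [18/Jan/2012:10:50:00 +0000] "GET /showthread.php?t=1 HTTP/1.1" 301 0 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"',
    '203.0.113.8 - - [18/Jan/2012:10:50:02 +0000] "GET /index.php HTTP/1.1" 301 0 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"',
    '198.51.100.4 - - [18/Jan/2012:10:50:05 +0000] "GET /index.php HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1; rv:9.0) Gecko/20100101 Firefox/9.0"',
]

print(status_counts(sample, "Baiduspider"))  # Counter({'301': 2})
```

If the Baiduspider lines in a real log show 301, the bots are being turned away; lines with 200 would suggest the redirect is not firing for those requests.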

Midohash 01-19-2012 10:58 PM

Thanks Simon, enclosed is a screenshot of my settings for the mod. My Alexa rank dropped from 83000 to 84000, but I'm not sure if that is related to the drop in the visitor count after applying the mod or is just incidental :confused:

http://img29.imageshack.us/img29/8616/spidersq.jpg

Simon Lloyd 01-20-2012 02:28 PM

Midohash, if you want to pm me admin access i'll take a look.

Doug Nelson 01-27-2012 08:29 AM

Excellent mod, thank you.

What is the advantage/disadvantage to redirecting spiders back to their own IP?

Simon Lloyd 01-27-2012 02:18 PM

Quote:

Originally Posted by Doug Nelson (Post 2292982)
Excellent mod, thank you.

What is the advantage/disadvantage to redirecting spiders back to their own IP?

You have to send them somewhere. The example I gave you is a fun place to send them, and they expect a certain amount of bots, but there's no guarantee that it will always be there. If you redirect to their own IP, however, they will always have somewhere to be redirected to :)
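
As an illustration of the "send them back to their own IP" idea, here is a minimal Python sketch that builds such a redirect target from the visitor's address. How the mod actually builds the URL is not stated in this thread, so the `http://<ip>/` form and the name `self_redirect_url` are assumptions.

```python
import ipaddress


def self_redirect_url(remote_addr: str) -> str:
    """Return a URL that points back at the requesting address."""
    ip = ipaddress.ip_address(remote_addr)  # raises ValueError on invalid input
    # IPv6 literals must be wrapped in brackets inside a URL.
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"http://{host}/"


print(self_redirect_url("203.0.113.7"))  # http://203.0.113.7/
print(self_redirect_url("2001:db8::1"))  # http://[2001:db8::1]/
```

The target always exists from the bot's point of view because it is derived from the request itself, unlike a fixed third-party URL that may disappear.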

gsmlover4u 01-30-2012 07:56 AM

Is blocking spiders beneficial or not?

Simon Lloyd 01-30-2012 12:40 PM

Blocking spiders will save you tons on bandwidth and server load. Take Baiduspider: they send hundreds upon hundreds of bots to index your site and they don't obey robots.txt.

gsmlover4u 01-31-2012 05:14 AM

working on vbulletin 4.1.10

steviewonder44 02-07-2012 06:12 PM

The IPs for Baidu I also ban: 123, 124, 125 and 58

Simon Lloyd 02-07-2012 06:32 PM

As you've downloaded this, could you mark it as installed?
This particular mod is for banning user agents; I do have another that's specifically for IPs: https://vborg.vbsupport.ru/showthread.php?t=268146

Keep an eye out for my upcoming geo-banning of users :)

Midohash 02-07-2012 08:39 PM

Quote:

Originally Posted by Simon Lloyd (Post 2297226)
Keep an eye out for my upcoming geo-banning of users :)

That will be another great mod Simon :up:, can I add this phrase to the list of user agents: Baiduspider/2.0 or just Baidu or Baiduspider?

Simon Lloyd 02-07-2012 08:46 PM

simply Baidu will do it :)

I'm currently working on geo-banning and geo-targeting for ads (so you can maximise your revenue).

Midohash 02-07-2012 11:08 PM

Quote:

Originally Posted by Simon Lloyd (Post 2297285)
simply Baidu will do it :)

I'm currently working on geo-banning and geo-targeting for ads (so you can maximise your revenue).

Superb :up: ... does that mean I can allow traffic from some countries to google adsense ads and block traffic to ads from other countries with low click revenue? :confused:

Simon Lloyd 02-08-2012 07:37 AM

When I release it you will be able to target one country with ad_navbar_below and ad_footer; everyone else will see what they have always seen. That's the lite version. The full version will cater for multiple countries, where you can designate ads to be shown in any current vBulletin ad location :)

lebowski99 02-21-2012 06:40 PM

Simon, I have a dumb question for you. I installed your mod but when I go into the "Who's Online" page I can still see bots hammering away at restricted areas in my site. How do I go about adding the user agent to the spider list on your add-on control panel? I figured out how to display the full user agent string. For example, I have one right now that reports:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; .NET CLR 1.1.4322; PeoplePal


Looking at this string, what do I enter into the Spider List in the Ban Spider settings?

Simon Lloyd 02-21-2012 07:10 PM

1 Attachment(s)
Just enter funwebproducts :)
You can use the entire string if you wish, or even peoplepal. You may see the bots that are in the banned list for anything up to an hour after install, and on the odd occasion you may still see them in the list, but rest assured that they are being redirected. If you still notice them there all the time, then PM me admin access to your site with permissions and I'll sort that for you :)

See the pic for adding to the list :)

Attachment 136597

lebowski99 02-21-2012 07:24 PM

Quote:

Originally Posted by Simon Lloyd (Post 2302054)
Just enter funwebproducts :)
You can use the entire string if you wish, or even peoplepal. You may see the bots that are in the banned list for anything up to an hour after install, and on the odd occasion you may still see them in the list, but rest assured that they are being redirected. If you still notice them there all the time, then PM me admin access to your site with permissions and I'll sort that for you :)

See the pic for adding to the list :)

Attachment 136597

Sounds easy enough. But here is one from russia and the user agent is listed simply as:

Opera/9.00 (Windows NT 5.1; U; en)

What do I enter for this one? Seems like if I enter the entire string, it would potentially block a legit user.

Simon Lloyd 02-21-2012 07:42 PM

Nope :) enter the entire string!, if by some very remote chance you find you've blocked a legit user then let me know and we'll work something out.

trotskid 02-23-2012 08:14 AM

Installed, thanks!

Simon Lloyd 02-23-2012 10:56 AM

You're welcome :)

Lee G 02-23-2012 12:14 PM

Don't know if it's listed in your bad user agent list, but one of the worst I have found is OMGILI
User agent
omgilibot/0.3 +http://www.omgili.com/Crawler.html

If you right click on any page on the search engine site, you will see that all external links are no follow

They were hitting me for over 200 pages a day until I put an ip ban on them

Simon Lloyd 02-23-2012 02:41 PM

Quote:

Originally Posted by Lee G (Post 2302627)
Don't know if it's listed in your bad user agent list, but one of the worst I have found is OMGILI
User agent
omgilibot/0.3 +http://www.omgili.com/Crawler.html

If you right click on any page on the search engine site, you will see that all external links are no follow

They were hitting me for over 200 pages a day until I put an ip ban on them

Hmm, it's not added there but folk can tailor their own, i only gave a starting base, i'm sure many have now customised it ;)

I can bet that omgili aren't as bad as Baidu :)

Lee G 02-23-2012 03:01 PM

For some reason I get zero baidu hits :D
Either Cloudflare blocks them or they take note of robots.txt
Putting ip bans into the cf control panel saves killing my own server with htaccess bans

There seems to be a mass of junk with no user agent hitting from AmazonAWS at present
I need to find out if anything good comes from their servers, then consider a complete ban on all their ips. Any user agent being tracked is littered with AmazonAWS ips
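
For a blanket ban on a hosting provider's addresses, the check comes down to testing each visitor IP against a list of ranges. A rough Python sketch follows; the ranges are placeholders from the reserved documentation blocks, not real AmazonAWS ranges, and `is_blocked` is a made-up helper name.

```python
import ipaddress

# Placeholder CIDR ranges (documentation blocks), NOT actual AmazonAWS ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network(cidr) for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_blocked(remote_addr: str) -> bool:
    """True if the address falls inside any blocked range."""
    ip = ipaddress.ip_address(remote_addr)
    return any(ip in network for network in BLOCKED_RANGES)


print(is_blocked("192.0.2.55"))   # True: inside 192.0.2.0/24
print(is_blocked("203.0.113.9"))  # False: not in any listed range
```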

Max Taxable 03-14-2012 03:43 PM

Quote:

Originally Posted by Lee G (Post 2302686)
For some reason I get zero baidu hits :D
Either Cloudflare blocks them or they take note of robots.txt
Putting ip bans into the cf control panel saves killing my own server with htaccess bans

There seems to be a mass of junk with no user agent hitting from AmazonAWS at present
I need to find out if anything good comes from their servers, then consider a complete ban on all their ips. Any user agent being tracked is littered with AmazonAWS ips

Neither happens. I have CF and robots.txt.

AmazonAWS is a friendly spider, from what I've seen over the years.

And Simon - here I am, marked installed.

Lee G 03-15-2012 04:38 PM

AmazonAWS cloud hosting

Full range of AmazonAWS ip's
http://www.forumpostersunion.com/showthread.php?t=10490

Some of the bad bots being run out of there
http://www.webmasterworld.com/search...rs/3828718.htm

At least one bot owned by Google also operates out of the AmazonAWS ip ranges
PostRank were bought out by Google
http://www.postrank.com/

final kaoss 04-06-2012 02:54 PM

I'll give it a shot to scare baidu away

Max Taxable 04-14-2012 02:01 PM

My current (updated) list of banned user agents entered into this Mod:

baiduspider
beta.statsit.com
statsit
SiteIntel
Yandex
GomezAgent
FunWebProducts
MSIE 1
MSIE 2
MSIE 3
MSIE 4
MSIE 5
MSIE 6
Nesotebot
DCPbot
Opera/1
Opera/2
Opera/3
Opera/4
Opera/5
Opera/6
Opera/7
Opera/8
AOL Advertising R&D
DataCha0s
aiHitBot
Apache-HttpClient
Zend_Http_Client
ReverseGet

Entering the older MSIEs and the older Operas has virtually eliminated bot registration attempts. It's down to just one or two a day, and the "IsBot" Mod is still catching those. It used to be, 40-50 a day that would be caught.
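
Here is a quick Python sketch for checking which entries from a list like the one above would catch a given user agent. It assumes plain case-insensitive substring matching, which may not be exactly how the mod compares strings; `matching_entries` is a made-up helper, only a few entries from the list are included, and the sample browser strings are typical examples rather than anything taken from this thread.

```python
# A subset of the entries from the list above.
BANNED = ["baiduspider", "FunWebProducts", "MSIE 1", "MSIE 6", "Opera/8"]


def matching_entries(user_agent, banned=BANNED):
    """Return every banned entry found inside the user agent string."""
    ua = user_agent.lower()
    return [entry for entry in banned if entry.lower() in ua]


ie6 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
ie9 = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"
ie10 = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"

print(matching_entries(ie6))   # ['MSIE 6']
print(matching_entries(ie9))   # []
print(matching_entries(ie10))  # ['MSIE 1']
```

Under that substring assumption, a short entry such as MSIE 1 would also catch Internet Explorer 10, so running a few real browser strings through a check like this before saving the list can show up unintended matches.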

LouiseWilson 06-22-2012 11:59 AM

Installed and working with 4.2.0 P2

thank you for the modification

Max Taxable 07-13-2012 01:16 AM

Hey Simon.... When this Mod is set to create threads, could it be updated to also include the IP of the bot in the post?

