vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 4.x Add-ons (https://vborg.vbsupport.ru/forumdisplay.php?f=245)
-   -   Miscellaneous Hacks - Ban Spiders by User Agent (https://vborg.vbsupport.ru/showthread.php?t=268208)

Gadget_Guy 12-11-2014 05:12 PM

That is what the spiders.txt file I attached is.

D.

Simon Lloyd 12-11-2014 05:17 PM

Quote:

Originally Posted by CAG CheechDogg (Post 2526712)
This is a snapshot of the spiders that are showing up in the whos online:

https://vborg.vbsupport.ru/external/2014/12/30.jpg

What exactly do you need a snapshot of in the settings, Simon?

This is my list of spiders I have banned with your mod:

..........................................

There are one or two duplicates there, but that doesn't matter. However, rather than banning baiduspider, just ban baidu. You'll have more luck with that, as not all Baidu spiders have the entire name in the UA. That goes for most of the bots. Let's say there's a bot called Simon Lloyd Crawl Everything Everywherespider; then any of the following will ban it:
Simon or Lloyd or Crawl or Spider, etc.
The same goes for:
Lloyd Crawl or Everything or Simon Lloyd, etc. (case isn't important)

What the mod does is look for the string you entered, so if you want to ban the spider I mentioned above, just Simon will do it. However, let's say you have a friendly bot called Simon Lloyd Crawled Everything Everywherespider. To ban the first bot and allow the other, you'd need to enter a string that is unique to the first one, so in this case it could be:
Simon Lloyd Crawl Everything
This way it won't pick up the "Crawl" in the friendly bot's name, as it's looking for the exact string you entered.
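
A rough Python sketch of the matching described above, for illustration only. The mod itself is a vBulletin plugin and its real code is not shown in this thread, so the function name `is_banned` and the list-of-strings input are assumptions; only the behaviour (case-insensitive substring match) comes from the description.

```python
from typing import Iterable


def is_banned(user_agent: str, ban_list: Iterable[str]) -> bool:
    """Return True if any ban string occurs anywhere in the UA, ignoring case."""
    ua = user_agent.lower()
    # Empty entries are skipped, since an empty string matches every UA.
    return any(entry.lower() in ua for entry in ban_list if entry)


# The two example bot names used in the post.
bad_bot = "Simon Lloyd Crawl Everything Everywherespider"
friendly_bot = "Simon Lloyd Crawled Everything Everywherespider"

# A short entry such as "simon" matches both bots.
print(is_banned(bad_bot, ["simon"]), is_banned(friendly_bot, ["simon"]))

# The longer entry is unique to the first bot, so only that one is banned.
unique_entry = ["Simon Lloyd Crawl Everything"]
print(is_banned(bad_bot, unique_entry), is_banned(friendly_bot, unique_entry))
```

The longer entry works because "Crawl Everything" (with the space) does not occur in "Crawled Everything".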

Hope that helps.

Simon Lloyd 12-11-2014 05:22 PM

Quote:

Originally Posted by Gadget_Guy (Post 2526788)
That is what the spiders.txt file I attached is.

D.

I need it as asked for. The reason is to check for machine characters and/or leading/trailing spaces, hard returns, etc.
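
For anyone who wants to run that kind of check on their own list first, here is a small Python sketch. It assumes one entry per line, which may not be how the mod stores its list, and `find_suspect_entries` is a made-up helper name, not part of the mod.

```python
def find_suspect_entries(raw_text):
    """Return (line_number, repr) pairs for entries with stray whitespace or hidden characters."""
    suspects = []
    # Split on "\n" only, so a leftover "\r" from a Windows line ending stays visible.
    for number, line in enumerate(raw_text.split("\n"), start=1):
        has_edge_space = line != line.strip()
        has_hidden = any(not ch.isprintable() for ch in line)
        if has_edge_space or has_hidden:
            # repr() makes spaces, "\r" and control characters easy to see.
            suspects.append((number, repr(line)))
    return suspects


# Illustrative list: line 2 has edge spaces, line 3 a hard return, line 4 a NUL byte.
sample = "baidu\n Simon \nLloyd\r\nCra\x00wl"
for number, shown in find_suspect_entries(sample):
    print(number, shown)
```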

Gadget_Guy 12-11-2014 05:25 PM

Quote:

Originally Posted by Simon Lloyd (Post 2526789)
What the mod does is look for the string you entered, so if you want to ban the spider I mentioned above, just Simon will do it. However, let's say you have a friendly bot called Simon Lloyd Crawled Everything Everywherespider. To ban the first bot and allow the other, you'd need to enter a string that is unique to the first one, so in this case it could be:
Simon Lloyd Crawl Everything
This way it won't pick up the "Crawl" in the friendly bot's name, as it's looking for the exact string you entered.

Hope that helps.


This COULD be the issue I have with mine then. When you look at my txt file you will see that I tried putting in multiple variations. That could be negating the effectiveness.

Maybe the mod could include an updated list that we can copy and paste, so that people like me who are clueless don't do the wrong thing.

Keep in mind that we would want the "good" spiders to get through, like Google, Bing, and the legit ones that are important to SEO, AdSense, and other things like that.

I hope you and Ozzy are discussing the hook thing as well... he seemed to think that may be important with my 4.2.2 site.

I will say that when I had this mod in place on my 3.8.x site it worked perfectly, and I didn't get hit hard until I upgraded to 4.2.2.

I saw my server loads go way up....

D.

Simon Lloyd 12-11-2014 05:32 PM

There is already a list included, and there are many more throughout this thread. I've also explained the above before. I wouldn't update the list of spiders to ban because, as I've said probably over a dozen times, what or who you ban is a personal thing.

If you want to PM me access, as I've said before, I'll take a look.

Gadget_Guy 12-11-2014 05:40 PM

1 Attachment(s)
Here you go.

Simon Lloyd 12-11-2014 06:11 PM

1 Attachment(s)
Right, I've been through your list. I won't comment on the bots you are banning, as that's your preference. What I have done is order the list, check for anything that shouldn't be there, and remove some bots, as they will be taken care of by other entries you have.

What I will say is that if you are NOT using Paul M's "Who has visited" mod and you are still seeing any of the bots on your list appear in WOL, then you need to check that spider's User-Agent to see if the name or text you have in your list actually appears in the UA.
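
The tidy-up described in this post (ordering the list and dropping entries that other entries already cover) could be sketched in Python roughly like this. It is only an illustration that assumes the case-insensitive substring matching described earlier in the thread; `prune_redundant` is a hypothetical helper, not something shipped with the mod.

```python
def prune_redundant(entries):
    """Sort entries, dropping duplicates and any entry that contains a shorter kept entry."""
    # Case isn't important to the matching, so normalise to lower case.
    unique = {e.strip().lower() for e in entries if e.strip()}
    kept = []
    # Visit shortest first: a shorter string already bans every UA a longer one containing it would.
    for entry in sorted(unique, key=len):
        if not any(short in entry for short in kept):
            kept.append(entry)
    return sorted(kept)


# "baiduspider" is dropped because "baidu" already covers it.
print(prune_redundant(["baiduspider", "Baidu", "baidu", "Simon Lloyd Crawl Everything"]))
```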

CAG CheechDogg 12-11-2014 06:48 PM

Quote:

Originally Posted by Simon Lloyd (Post 2526789)
There are one or two duplicates there, but that doesn't matter. However, rather than banning baiduspider, just ban baidu. You'll have more luck with that, as not all Baidu spiders have the entire name in the UA. That goes for most of the bots. Let's say there's a bot called Simon Lloyd Crawl Everything Everywherespider; then any of the following will ban it:
Simon or Lloyd or Crawl or Spider, etc.
The same goes for:
Lloyd Crawl or Everything or Simon Lloyd, etc. (case isn't important)

What the mod does is look for the string you entered, so if you want to ban the spider I mentioned above, just Simon will do it. However, let's say you have a friendly bot called Simon Lloyd Crawled Everything Everywherespider. To ban the first bot and allow the other, you'd need to enter a string that is unique to the first one, so in this case it could be:
Simon Lloyd Crawl Everything
This way it won't pick up the "Crawl" in the friendly bot's name, as it's looking for the exact string you entered.

Hope that helps.

No no ...I am fine Simon, I have "ZERO" traces of baidu ... I think when you asked for the settings and a shot you were asking Gadget Guy and not me ... but I am fine, I have no problems with the mod not blocking any of the bots at all ... Thank you !!!!

Gadget_Guy 12-11-2014 07:35 PM

Do you mean this one:

https://vborg.vbsupport.ru/showthread.php?t=232636


Then, yes... I am using it.

D.

Gadget_Guy 12-11-2014 07:43 PM

Quote:

Originally Posted by Simon Lloyd (Post 2526796)
Right, I've been through your list. I won't comment on the bots you are banning, as that's your preference. What I have done is order the list, check for anything that shouldn't be there, and remove some bots, as they will be taken care of by other entries you have.

What I will say is that if you are NOT using Paul M's "Who has visited" mod and you are still seeing any of the bots on your list appear in WOL, then you need to check that spider's User-Agent to see if the name or text you have in your list actually appears in the UA.

I have implemented your list.

In regards to your comment about "my list".... I have no idea to be honest.

Those I put in there based on what I was seeing in my WOL and putting things in there to try and block them.

I am sure I was way off base and incorrect in doing so.

So.... in light of this... if you want to provide a "proper" list... I am happy to take your guidance.

I don't know the first thing about any of this stuff and am looking to experts like yourself to assist.

D.

