vb.org Archive

vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 4.x Add-ons (https://vborg.vbsupport.ru/forumdisplay.php?f=245)
-   -   Miscellaneous Hacks - Ban Spiders by User Agent (https://vborg.vbsupport.ru/showthread.php?t=268208)

CAG CheechDogg 12-11-2014 07:51 PM

1 Attachment(s)
Quote:

Originally Posted by Gadget_Guy (Post 2526807)
I have implemented your list.

In regards to your comment about "my list".... I have no idea to be honest.

Those I put in there based on what I was seeing in my WOL and putting things in there to try and block them.

I am sure I was way off base and incorrect in doing so.

So.... in light of this... if you want to provide a "proper" list... I am happy to take your guidance.

I don't know the first thing about any of this stuff and am looking to experts like yourself to assist.

D.

Gadge, my man... what you need to do is ask yourself a few questions: how much do rankings mean to you, do you want every single search engine to crawl your site, are you on a shared or dedicated server, and are you already having problems with bots, especially bad bots, hitting your site too much?

I for one don't care for any search engine except Google and Yahoo ... I have completely blocked most of the foreign search engines, and Facebook as well ... as you can see from the screenshot, only about 10 different spiders/bots even crawl my site, and that is just how I want it ...

So put together a list of the engines and spiders that you know are giving you hell, add those to the list that I have (or the ones Simon and Ozzy have), then give it one last look and remove the ones you still want crawling your site ....

I added my list in a txt file if you want to try mine out and see how it works for you ...

Here it is as well ...

Simon Lloyd 12-11-2014 07:53 PM

As I've said many times in this thread, the fact that they are showing up in Paul M's mod doesn't mean they are getting through. His mod logs them as they visit, mine redirects them at the same time. Both mods are working fine!

Just copy CAG CheechDogg's list and prune as needed. There is no "proper" list; it's all a personal choice!
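
For anyone who would rather prune with a script than by hand, here is a minimal Python sketch of the idea. It is not part of either mod: the function name `prune_block_list` and the sample entries in the usage note are made up for illustration, and it assumes the list holds one user-agent keyword per entry, compared case-insensitively.

```python
def prune_block_list(block_list, keep_crawling):
    """Return block_list without blanks, duplicates, or crawlers we want to keep."""
    # Normalise the "keep" names once so the comparison ignores case and spaces.
    keep = {name.strip().lower() for name in keep_crawling}
    seen = set()
    pruned = []
    for entry in block_list:
        cleaned = entry.strip()
        key = cleaned.lower()
        # Skip empty lines, crawlers we still want, and repeated entries.
        if not key or key in keep or key in seen:
            continue
        seen.add(key)
        pruned.append(cleaned)
    return pruned
```

Calling `prune_block_list(["Baiduspider", "proximic", "Googlebot"], ["googlebot"])` would drop the Googlebot entry and leave the other two on the block list. Which crawlers go in the keep list is the personal choice described above.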

CAG CheechDogg 12-11-2014 07:57 PM

Quote:

Originally Posted by Simon Lloyd (Post 2526813)
As I've said many times in this thread, the fact that they are showing up in Paul M's mod doesn't mean they are getting through. His mod logs them as they visit, mine redirects them at the same time. Both mods are working fine!

Just copy CAG CheechDogg's list and prune as needed. There is no "proper" list; it's all a personal choice!

Right Simon ... it logs the actual visit (detection) ... I check to see what is actually getting through by going to the online.php page and selecting Display: Search Bots from the drop down and then you can see what is actually crawling the site ...
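
To picture why a detection is not the same as getting through, here is a rough Python sketch. It only assumes what is said in this thread: one mod records the visit and the other redirects it. The names `log_spider`, `block_spider` and `handle_request` are hypothetical, the keyword tuple is an example, and none of this is the actual code of either mod.

```python
detections = []  # stands in for the detection log of the logging mod


def log_spider(user_agent):
    """Record the visit; this says nothing about whether it was served."""
    detections.append(user_agent)


def block_spider(user_agent, blocked=("baiduspider",)):
    """Decide what happens to the request after it has been logged."""
    ua = user_agent.lower()
    return "redirect" if any(token in ua for token in blocked) else "serve"


def handle_request(user_agent):
    log_spider(user_agent)           # the detection is recorded first
    return block_spider(user_agent)  # then the visit is redirected or served
```

A blocked spider ends up in `detections` and still gets "redirect" back, which matches seeing it in the log while it never shows up as actually crawling the site.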

CAG CheechDogg 12-11-2014 08:00 PM

I actually use this mod here by the great Boobo (RIP) to also display the spiders in the Who's Online list, and it is accurate as hell!

https://vborg.vbsupport.ru/showthread.php?t=243460

Gadget_Guy 12-11-2014 08:20 PM

Guys... I can't thank you enough for all the help and advice.

I feel like a huge blindfold was lifted from my eyes with the last couple posts.

I tried reading all 45 pages of this mod's thread to really understand it... but I missed the point about the detections being just that.

I was scratching my head when WOL didn't really match up with the online.php list.

I want to be crawled... but only by the right spiders: the ones that really count for a North American audience and people who use the "traditional" engines like Yahoo, Google, Bing, etc.

I certainly don't want to jeopardize my Google AdSense ads and things like that.

I want my site to be found.... and be a source when people search for information pertaining to what we do.

D.

ozzy47 12-11-2014 08:26 PM

I took a quick look at your list. It looks like you are only blocking bad bots, so it should be OK. :)

Max Taxable 12-11-2014 10:19 PM

On my own vB 4, the only time I ever see Baidu in either Paul's mod or in WoL is when I turn off this mod.

Paul's mod fires before this one, that is true. But that is not the reason some people get Baidu there. Baidu is also in WoL, if you look, on boards that are showing Baidu in Paul's mod.

ozzy47 12-11-2014 10:21 PM

Well I think I might have taken care of the issues, as well as added a few more things to the mod, but I need to test it a bit more to be totally sure.

Gadget_Guy 12-11-2014 10:23 PM

WOL seems to be pretty clean now....

What is this spider? I see a lot of entries for it on WOL

Proximic Spider
54.175.33.76
Mozilla/5.0 (compatible; proximic; +http://www.proximic.com/info/spider.php)

Edit...

This one as well:

Magpie Spider
94.228.34.203
magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)


D.

Max Taxable 12-11-2014 10:28 PM

Harmless crawlers, but if you want them blocked you can put them on the list.
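
As an illustration of what putting them on the list amounts to, here is a small Python sketch. It assumes the list is matched as case-insensitive substrings of the user agent; the thread does not show how the mod actually matches, so treat that as an assumption. The keywords `proximic` and `magpie-crawler` come from the two user agents quoted above, and the names `BLOCKED_AGENTS` and `is_blocked` are hypothetical.

```python
# Hypothetical block list built from the two user agents quoted above.
BLOCKED_AGENTS = ["proximic", "magpie-crawler"]


def is_blocked(user_agent, blocked=BLOCKED_AGENTS):
    """Return True if any blocked keyword appears in the user agent string."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in blocked)
```

With this list, both quoted user agents would match, while something like Googlebot's user agent would not. Short, distinctive keywords are safer than generic ones, since a keyword such as "spider" would also catch crawlers you may want to keep.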

