vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 4.x Add-ons (https://vborg.vbsupport.ru/forumdisplay.php?f=245)
-   -   Miscellaneous Hacks - Ban Spiders by User Agent (https://vborg.vbsupport.ru/showthread.php?t=268208)

Simon Lloyd 12-06-2014 09:56 AM

Quote:

Originally Posted by CAG CheechDogg (Post 2525962)
The Baidu spider can take up to a couple if not few months to completely disappear and actually obey the no crawl rule when adding this mod or even blocking it through your robots.txt ... I have it blocked everywhere and it took maybe 3 days before I didn't see it again ... the best thing to do for me was also add a huge IP block to my htaccess file that completely blocks all of China and a couple other Asian countries from accessing my site ...

Believe it or not you can actually go to their site and ask them not to crawl your site :)

tanzeelniazi 12-06-2014 10:38 AM

Quote:

Originally Posted by Simon Lloyd (Post 2525990)
That appears correct.

You say I am correct, but I still see these spiders in Who's Online :( Why? I also see a link showing like this:
showthread?=1' ???
I mean one spider is shown viewing this link: showthread?=1' ???

Simon Lloyd 12-06-2014 01:32 PM

Where are you seeing that? Make sure that your list of bad bots has no leading or trailing spaces on each name. If you are still having trouble you can PM me temporary admin login details with rights to administer plugins and I'll take a look :)
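
A minimal sketch of why stray spaces matter, written in Python rather than the mod's own PHP. The mod's actual matching code is not shown in this thread, so this assumes a simple case-insensitive substring match; the function names and the sample list are hypothetical.

```python
def parse_bot_list(raw: str) -> list[str]:
    """Split a newline-separated bot list, trimming spaces and dropping blank lines."""
    return [line.strip().lower() for line in raw.splitlines() if line.strip()]


def matches_bad_bot(user_agent: str, bots: list[str]) -> bool:
    """Return True if any bot name occurs anywhere in the user agent string."""
    ua = user_agent.lower()
    return any(bot in ua for bot in bots)


# Hypothetical list as it might be pasted in, with stray spaces around the names.
raw_list = "Baiduspider \n AhrefsBot\n\n"

# Sample user agent string used for illustration only.
ua = "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"

# Without trimming, the entry is "baiduspider " (trailing space), which never
# appears in the user agent, so the bot slips through.
untrimmed = [line.lower() for line in raw_list.splitlines() if line]
trimmed = parse_bot_list(raw_list)

print(matches_bad_bot(ua, untrimmed))
print(matches_bad_bot(ua, trimmed))
```

Under that assumption, the untrimmed list misses the spider and the trimmed list catches it, which is one way a bot can be "on the list" and still show up in Who's Online.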

Max Taxable 12-06-2014 06:20 PM

Quote:

Originally Posted by CAG CheechDogg (Post 2525962)
The Baidu spider can take up to a couple if not few months to completely disappear and actually obey the no crawl rule when adding this mod or even blocking it through your robots.txt ... I have it blocked everywhere and it took maybe 3 days before I didn't see it again ... the best thing to do for me was also add a huge IP block to my htaccess file that completely blocks all of China and a couple other Asian countries from accessing my site ...

It's not a no-crawl rule; with this mod it is an outright block. There's no obeying or disobeying. It's not robots.txt. When you install this mod with Baidu on the list, it should be blocked. Gone.
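
To illustrate the difference, here is a rough Python sketch of a server-side block. Unlike robots.txt, which a crawler is free to ignore, the decision here is made by the server on every request. The 403 status and the names `BLOCKED_AGENTS` and `response_status` are assumptions for illustration; the thread does not say what the mod actually returns.

```python
# Hypothetical block list; the mod keeps its own list of bad bots.
BLOCKED_AGENTS = ["baiduspider"]


def response_status(user_agent: str) -> int:
    """Return 403 for a listed user agent and 200 for everyone else.

    The crawler gets no say in this: the check runs on the server
    before any page content is produced.
    """
    ua = user_agent.lower()
    if any(name in ua for name in BLOCKED_AGENTS):
        return 403  # assumed status code for a refused request
    return 200


print(response_status("Mozilla/5.0 (compatible; Baiduspider/2.0)"))
print(response_status("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```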

I believe the incidents of it still appearing in v4s after this mod is installed must have something to do with interference from some other mod. Otherwise, how to explain my v4 getting NO appearances by Baidu after I installed this?

Max Taxable 12-06-2014 06:55 PM

I'll add to the above: I suspect a hook conflict, actually. One that only happens when some other mod calls the same hook, and not all the time, because from what we saw at OzzModz, Baidu was greatly reduced in appearances by this mod, just not totally gone. Occasionally one or two of them slip through during the time of the hook conflict.

It doesn't happen in v3 at all, for the same reason I believe.

Gadget_Guy 12-06-2014 07:34 PM

Anything in particular I can look for in terms of a hook that might conflict?

Would a list of my mods help?

D.

Max Taxable 12-06-2014 08:06 PM

In v3 and v4 this mod calls "style_fetch."

CAG CheechDogg 12-06-2014 08:20 PM

Well I haven't had Baidu appear on my site in over 2 years ... way before I even installed this mod which has helped me a lot regardless of how it does it lol ... my point is that "I" haven't had Baidu for a very long time ...

Max Taxable 12-06-2014 08:31 PM

Well sure if you block China and other countries via .htaccess you probably won't see Baidu.

I used to have a massive .htaccess with country blocks.

Simon Lloyd 12-07-2014 04:50 AM

Banning by .htaccess is fine if you only have a few things in it, because it is read with every single server request. So if you have 10 blocks in your .htaccess and, let's say, a web page with 30 elements (icons, CSS, containers, includes, etc.), then every visitor who loads that page triggers 30 requests, and each of those requests is checked against your blocks.

Now consider your own landing page and check how many things load to make that page up, and you'll soon see why having a lot of bans in your .htaccess can be detrimental, particularly if you are on shared hosting or a limited VPS.
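
As a back-of-the-envelope check on those numbers, here is a small Python sketch. It assumes the worst case, where every page element is a separate request to the same server and every rule is evaluated for each request; the function name is made up for illustration.

```python
def rule_checks_per_page_load(rules: int, elements: int) -> int:
    """Worst-case number of rule evaluations for one page load.

    Assumes each element is its own request and each request is
    compared against every block rule in .htaccess.
    """
    return rules * elements


# The example from the post: 10 blocks, 30 page elements.
print(rule_checks_per_page_load(10, 30))  # 300
```

With a country-level block list running to hundreds of rules, the same multiplication applies to every single page view.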

@Gadget_Guy & Max Taxable
The hook is style_fetch. You can try changing the hook to one of the others that loads before all the others, but you may not see the result you're looking for. Doesn't hurt to try :)


All times are GMT.
