Thread: Miscellaneous Hacks - Ban Spiders by User Agent
  #307  
03-22-2012, 01:27 AM
baileyjojoms
 
Join Date: Mar 2011
Posts: 29

Quote:
Originally Posted by BadgerDog
Do you have a link?

I can't seem to find the right page....

Thanks ..

Regards,
Doug

Yes, I ensured that the following was in my robots.txt file:

User-agent: Baiduspider
Disallow: /
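
If you want to double-check that those two lines really do shut out Baidu before emailing anyone, here is a rough sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the example.com URL are placeholders of mine, not anything from the add-on or from Baidu. Note that this only checks what the file says; it cannot tell you whether a spider actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# The same two lines as in the robots.txt above.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com stands in for your own forum URL.
    print(is_allowed(ROBOTS_TXT, "Baiduspider", "http://example.com/forum/"))  # False
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/forum/"))    # True
```

Pass the short spider name ("Baiduspider") rather than the full User-Agent header, since the parser matches on the product name only.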


Then I sent an email to: spiderhelp@baidu.com

Here is the reply I received, followed by the message I sent:


Quote:
Dear,

Thank you for your email.
We have updated our DNS record to make our spider behave the way requested in your robots file.
Should you need further assistance, please do not hesitate to contact us.

Best Regards,
Stephy Wu
Baidu Spider Team

________________________________________
re: Continuous Crawling of my site

To whom it may concern;

I have been trying for a month now to halt all crawling of my site by Baidu. I have added the following code to my robots.txt file:
User-agent: Baiduspider
Disallow: /

This was done 3 weeks ago. However I am being crawled daily.

Baidu is daily eating up a ton of Server Resources, and costing me slow load times. I also employed a spider ban modification, and have banned more than 28,000 Baidu spider entries in 3 weeks.

This is ridiculous. I am asking you to immediately halt all crawling of my site by Baidu.

I have not seen hide nor hair of Baidu since they made that change, nearly a month ago.

To find the email address, I went to their website, translated the page into English, and then searched for "Baidu Spider". That took me to a search results page, which led me to this page:
http://www.baidu.com/search/spider.html

I simply translated it to English and found the info I was looking for.

Baidu was the ONLY spider causing major issues, and it was using massive amounts of resources. Now that it is gone, I am able to use this add-on for other spiders.

Hope this helps.