  #1  
03-23-2008, 11:49 AM
MetalMilitia
Join Date: Mar 2004
Location: Minnesota
Posts: 58
Thanks given: 0 times
Thanked: 0 times in 0 posts
Yahoo Slurp Spiders

http://www.armageddononline.org
http://forums.armageddononline.org

I run a fairly "mid-level" site: ~3000 unique hits / day.

What is the logic behind Yahoo having 500 or so spiders on the forums at one time? Google seems to get the job done better with 5-10 / day.

I know there are other threads, but if someone could lay it all out here I would much appreciate it.

-Matt
  #2  
03-23-2008, 12:03 PM
snakes1100
Join Date: Dec 2001
Location: Michigan
Posts: 3,733
Thanks given: 0 times
Thanked: 0 times in 0 posts

You would have to ask Yahoo why they can't figure out how to send an appropriate number of bots to a new site they are indexing.

You can limit the bots by using a robots.txt file and setting limits for the Yahoo bots.
  #3  
03-23-2008, 12:06 PM
MetalMilitia
Join Date: Mar 2004
Location: Minnesota
Posts: 58
Thanks given: 0 times
Thanked: 0 times in 0 posts

I know you can change the times / limits... but I mean, good lord.

500+ on Easter Sunday morning? lols.

I'm sure it's a healthy testament to the server, but why the hell do they need 200x more crawlers than Google, which gives me 90% of the traffic anyway?

It's like Yahoo hits a dead end somewhere (say, a closed thread or an error) and then proceeds to call in 5 more spiders to check out why. Is it not somewhat ridiculous?

How bad do you guys with bigger sites get hit?

-MM-
  #4  
03-23-2008, 12:10 PM
snakes1100
Join Date: Dec 2001
Location: Michigan
Posts: 3,733
Thanks given: 0 times
Thanked: 0 times in 0 posts

It's really hard to say why Yahoo sends in 5 platoons of troops to crawl a site; it's still a question they would have to answer.

I've encountered that on a few sites I admin. Amending the robots.txt file will stop them, but if I recall, it's a 30-day waiting period before Yahoo checks the file again. An easy and fast way to get rid of them is to ban all but one of the Yahoo bot IPs; I think they use about 30 IPs too.
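
Before banning anything, it helps to see how many distinct addresses are really involved. Here is a rough sketch, assuming an Apache-style combined access log where the client IP is the first field and the user agent is the last quoted field. The sample lines, the IPs (taken from the reserved documentation ranges) and the helper names `log_line` and `hits_per_ip` are all made up for illustration; point it at your own log instead.

```python
from collections import Counter

# User-agent strings used only to build the made-up sample below.
SLURP_UA = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
GOOGLE_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def log_line(ip, path, agent):
    """Build one hypothetical combined-format access-log line."""
    return (f'{ip} - - [23/Mar/2008:09:00:00 +0000] '
            f'"GET {path} HTTP/1.1" 200 5120 "-" "{agent}"')


# Hypothetical sample data; the IPs are documentation addresses, not real crawlers.
SAMPLE_LOG = [
    log_line("192.0.2.10", "/showthread.php?t=1", SLURP_UA),
    log_line("192.0.2.10", "/showthread.php?t=2", SLURP_UA),
    log_line("192.0.2.11", "/forumdisplay.php?f=3", SLURP_UA),
    log_line("198.51.100.7", "/index.php", GOOGLE_UA),
]


def hits_per_ip(lines, agent_token="Slurp"):
    """Count requests per client IP whose user agent contains agent_token."""
    counts = Counter()
    for line in lines:
        parts = line.rsplit('"', 2)  # [..., user agent, trailing text]
        if len(parts) < 3:
            continue  # line is not in the assumed format
        user_agent = parts[1]
        if agent_token.lower() in user_agent.lower():
            client_ip = line.split(" ", 1)[0]
            counts[client_ip] += 1
    return counts


slurp_hits = hits_per_ip(SAMPLE_LOG)
for ip, hits in slurp_hits.most_common():
    print(ip, hits)
```

Keep in mind a user agent string can be spoofed, so treat the list as a starting point rather than proof that every address belongs to Yahoo.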
  #5  
03-23-2008, 12:15 PM
MetalMilitia
Join Date: Mar 2004
Location: Minnesota
Posts: 58
Thanks given: 0 times
Thanked: 0 times in 0 posts

Yup.

As much as it bothers me watching them do 200x the work Google does, I don't exactly want to trash a search engine hit of any kind.

... but therein lies the question that no one seems to know the answer to. Why DOES Yahoo do that? As stated, keyword hits to the boards come primarily (75%+) from Google... same with the main site and articles / news.

If I only get ~3000 unique hits / day for everywhere on the site, how bad do bigger sites get hit?
  #6  
03-23-2008, 12:19 PM
snakes1100
Join Date: Dec 2001
Location: Michigan
Posts: 3,733
Thanks given: 0 times
Thanked: 0 times in 0 posts

I doubt you're going to get a real answer as to why Yahoo bots do that; you could search Google for an answer, though.

Most big sites are going to implement a robots.txt file and not worry about it any further. Once Yahoo reads the file, the bots will comply.
  #7  
03-23-2008, 12:23 PM
MetalMilitia
Join Date: Mar 2004
Location: Minnesota
Posts: 58
Thanks given: 0 times
Thanked: 0 times in 0 posts

I still leave the question open for those that want to answer, though...

What is the most ridiculous number of bots / spiders you have seen on your boards? Include the size of the boards too, please.

-MM
  #8  
03-23-2008, 12:39 PM
Guest210212002
Guest
Posts: n/a

To stop Slurp's, er, slurping, toss this in your robots.txt:

Code:
User-agent: Slurp
Crawl-delay: 60
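
If you want to check what the file actually declares before waiting on Slurp to re-read it, a small sketch with Python's standard `urllib.robotparser` is below. The `ROBOTS_TXT` string just repeats the two lines above, `declared_crawl_delay` is a name made up for this example, and "Googlebot" is only there as a contrast case. Crawl-delay is a non-standard directive, so this shows what the file says, not what any given crawler will honor.

```python
from urllib.robotparser import RobotFileParser

# The same two lines as in the robots.txt snippet above.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 60
"""


def declared_crawl_delay(robots_txt, agent):
    """Return the Crawl-delay (in seconds) declared for agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


slurp_delay = declared_crawl_delay(ROBOTS_TXT, "Slurp")
google_delay = declared_crawl_delay(ROBOTS_TXT, "Googlebot")
print("Slurp:", slurp_delay)
print("Googlebot:", google_delay)
```

Only the Slurp entry carries a delay here; agents with no matching entry get nothing back, which is the point of scoping the rule to one bot.
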

My site today, and it's only 9:40 AM.

Quote:
Guest Visits Today: 4,667
Visitors (3813), Yahoo! Slurp Spiders (813), Google AdSense Spiders (8), Google Spiders (23), AskJeeves Spiders (1), MSNBot Spiders (9)
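
To put those numbers in proportion, here is a quick back-of-the-envelope sketch. The counts are copied straight from the quote above; the variable names are just for this example.

```python
# Counts copied from the "Guest Visits Today" quote above.
visits = {
    "Visitors": 3813,
    "Yahoo! Slurp Spiders": 813,
    "Google AdSense Spiders": 8,
    "Google Spiders": 23,
    "AskJeeves Spiders": 1,
    "MSNBot Spiders": 9,
}

total = sum(visits.values())  # should match the reported 4,667
shares = {name: 100 * count / total for name, count in visits.items()}

# Print each group's share of guest visits, largest first.
for name, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {pct:.1f}%")

# How many Slurp visits there are per Google spider visit.
slurp_per_google = visits["Yahoo! Slurp Spiders"] / visits["Google Spiders"]
print(f"Slurp vs Google spiders: {slurp_per_google:.1f}x")
```

On these figures Slurp accounts for a bit over 17% of guest visits, roughly 35 times the Google spider count.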
All times are GMT.

