vb.org Archive

vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vB3 General Discussions (https://vborg.vbsupport.ru/forumdisplay.php?f=111)
-   -   Yahoo! Slurp Spider, Sucking Us Dry (https://vborg.vbsupport.ru/showthread.php?t=102566)

Retell 12-09-2005 01:27 AM

Yahoo! Slurp Spider, Sucking Us Dry
 
I know that spiders are generally good, and I know about robots.txt, but I don't want to stop spiders entirely, just have fewer of them. There is never a time when we have fewer than 15 Yahoo! Slurp spiders on our site. Is there somewhere I can report this? I can understand having 1-3 on every now and then, but these guys have been here for more than 2 years!!!


Code:

Yahoo! Slurp Spider  Viewing Thread
Going to be at Quakecon!!!  68.142.250.109 
10:10 PM Yahoo! Slurp Spider  Viewing Thread
Web to Be Major Player at Olympic Games and Euro 2004 (Reuters)  68.142.250.11 
10:23 PM Yahoo! Slurp Spider  Viewing Thread
Thief: Deadly Shadows bug reported  68.142.250.167 
10:18 PM Yahoo! Slurp Spider  Viewing Thread
Cracked Side Panel, what can be done?  68.142.251.154 
10:11 PM Yahoo! Slurp Spider    Viewing User Profile
imbusion  68.142.251.101 
10:24 PM Yahoo! Slurp Spider    Viewing Thread
pics of nzxt's next case  68.142.249.144 
10:13 PM Yahoo! Slurp Spider  Viewing Thread
HDD fan and SYS fan control speed (Nemesis elite)  68.142.251.82 
10:24 PM Yahoo! Slurp Spider  Viewing Thread
Free ipods?  68.142.250.99 
10:23 PM Yahoo! Slurp Spider  Viewing Thread
Future Comp Specs... Good?  68.142.251.163 
10:17 PM Yahoo! Slurp Spider  Viewing Thread
Microsoft to Let Partners Own CE Changes (Reuters)  68.142.249.190 
10:10 PM Yahoo! Slurp Spider    Replying to Thread
Battery dead agian  68.142.249.135 
10:18 PM Yahoo! Slurp Spider  Viewing Thread
Nemesis Elite problems: I love it, but Grrr!  68.142.250.151 
10:12 PM Yahoo! Slurp Spider  Viewing Thread
Star Wraith: Shadows of Orion v1.7 Demo  68.142.249.121 
10:21 PM Yahoo! Slurp Spider  Viewing Thread
Nokia's New N-Gage May Sell Cheaper Than Announced (Reuters)  68.142.251.49 
10:20 PM Yahoo! Slurp Spider    Viewing Thread  68.142.251.43 
10:17 PM Yahoo! Slurp Spider  Viewing Thread
Intel Launches Advanced Notebook PC Processor (Reuters)  68.142.250.74 
10:11 PM Yahoo! Slurp Spider  Viewing Printable Version
Other products  68.142.250.137 
10:24 PM Google Spider  Viewing Thread
My lighted Trinity!  66.249.66.197 
10:19 PM Yahoo! Slurp Spider  Viewing Thread
Guardian Reviews  68.142.250.89 
10:21 PM Yahoo! Slurp Spider  Viewing Thread
my take on the nemesis elite  68.142.251.31


Please Help

Retell :pirate:

Paul M 12-09-2005 01:30 AM

Why don't you want them?

Most people would kill to have so much search engine attention...

Guest190829 12-09-2005 01:37 AM

Yes, spiders are a good thing. I would let them "slurp" away. :)

Dan 12-09-2005 02:32 AM

Yahoo hasn't stopped spidering my main site since I reopened it a year ago.

Retell 12-09-2005 08:02 AM

Well, about once a month we somehow end up with 95 Slurps... I think that is definitely pushing it... Hehe, sorry for stealing all the Slurps from you guys :D

Andreas 12-09-2005 08:13 AM

At the moment:
1x Adsense
1x MSN
1x Gigablast (? Not familiar with that one)
8x Google
23x Yahoo!

... less than average.

Guest210212002 11-05-2006 06:12 PM

Pardon the old bump, but I'm getting slurped out myself. I had 81 of them on my site this morning.

ForumDog 11-06-2006 12:27 PM

Yahoo! Slurp (and a few others) generally makes a mass invasion around once a month, so don't take that to be the norm unless it really is. Otherwise, it's a common complaint that Yahoo! Slurp drains more resources than most spiders, and there's really nothing you can do aside from blocking it entirely.
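For anyone who does go the blocking route, a minimal robots.txt sketch (this shuts Yahoo! Slurp out of the entire site, so only use it if you really want Yahoo gone from your index):

Code:

User-agent: Slurp
Disallow: /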

Paul M 11-06-2006 02:09 PM

Check our WOL - atm we have around 750 Yahoo spiders online!

4yBak 11-06-2006 04:38 PM

I think that Yahoo! Slurp is a stupid crawler, because it doesn't follow the rules in robots.txt. For example, I restricted access for all crawlers to a few files:
Code:

Disallow: /attachment.php
Disallow: /cron.php
Disallow: /newreply.php
Disallow: /newthread.php

and all the other crawlers follow these instructions and don't request these files, but Yahoo! Slurp still requests them.

Also, a few days ago there were ~125 Yahoo! Slurp crawlers online at my board (a new one arriving every 2 seconds) - I think it was a temporary bug with the crawler :)

Tiberonmagnus 11-06-2006 08:31 PM

I'm trying to get some slurps, come to papa!! Added my forums to a bunch of search engines... just wondering how long it will take. :D

Nomadite 02-05-2007 10:41 PM

Quote:

Originally Posted by Tiberonmagnus (Post 1111985)
I'm trying to get some slurps, come to papa!! Added my forums to a bunch of search engines... just wondering how long it will take. :D

On one site that I administer, we get an average of 50-90 Slurps at any given time, along with the usual fanfare of other search bots. But it's all good because each is spidering a different section.

On my own forum (a different forum), which is about 5 months old, I've noticed that each time I add new content (cellphone wallpapers) I get hit with anywhere from 2-5 of them within a few minutes of the new upload, almost without fail, and they go directly to the section of the forum where I added the new content.

If I get some spare Slurps I'll send 'em your way :).

da420 02-05-2007 11:09 PM

Send some of those guys my way :cool:

Artificial_Alex 02-05-2007 11:13 PM

Gaminggutter.com has at least 40-50 spiders on at once. :P

Andem 05-14-2007 10:33 PM

You think 15 spiders online is bad? Slurp has had about 500 spiders on my site since yesterday; they're still there... around double the normal amount!

Guest210212002 05-15-2007 02:07 AM

Code:

User-agent: Slurp
Crawl-delay: 60

Has solved all of my woes. :) I'm not sure what's been up in the last week, but with any other crawl-delay I've had hundreds of them online. I've no idea WHY that works, or if it will work for someone else, but it's working for me. (knock on wood)

Edit: It does still ignore my disallows though. :\

On the upside, I had 13,947 indexed pages in Yahoo two weeks ago, and currently I have 48,000, albeit after a LOT of SEO work, but it does seem like they're updating their content a bit more rigorously.

Sownman 05-20-2007 02:12 AM

I agree that spiders are a problem. I am having trouble with my host over using too much resource because of too many persistent DB connections. I always have lots of spiders.
I don't want to disallow them in robots.txt since they are good, but I must cut back on DB connections, and I don't want 30 Yahoo spiders on all at once while members can't get on.

IS THERE ANYTHING that will limit Slurps to ten at a time, anyway??

Thanks

Steve

Lynne 05-20-2007 04:23 PM

Quote:

Originally Posted by Sownman (Post 1251169)
IS THERE ANYTHING that will limit Slurps to ten at a time, anyway??

Thanks

Steve

Yes, put in the crawl delay as suggested above. A spider counts as "online" for as long as its session lasts, so if you have your cookie timeout set to 900 seconds and you only want 10 of those spiders on at a time, set the crawl delay to 900/10 = 90.
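For example, a minimal robots.txt entry for that calculation (the 900-second timeout and the resulting 90 are just the numbers from the example above - adjust them to your own settings):

Code:

User-agent: Slurp
Crawl-delay: 90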

Sownman 05-22-2007 08:18 PM

Quote:

Originally Posted by Lynne (Post 1251511)
Yes, put in the crawl delay as suggested above. If you have your cookie timeout set to 900 seconds and you only want 10 of those spiders on at a time, then set the crawl delay to 900/10=90.


I'll try that, thanks. A higher number should equal fewer spiders?

Steve

ZomgStuff 05-22-2007 09:57 PM

I once had about 800 Yahoo Slurp spiders at one time. I was panicking, thought it was a DDoS.

chriscoleman 05-25-2007 12:48 PM

Another way of shutting out Slurp is by using the noindex meta tag. Yahoo! Slurp obeys this directive in the document's head; the code inserted between the head tags of your document is

<META NAME="robots" CONTENT="noindex">

This snippet ensures that Yahoo! Slurp does not index the document in the search engine database. Another useful directive is the nofollow meta tag. The code inserted is

<META NAME="robots" CONTENT="nofollow">

This snippet ensures that the links on the page are not followed.

I found this on an SEO site.
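Both values can also be combined into a single tag if you want neither indexing nor link-following - this is standard robots meta syntax, not something from that SEO site:

<META NAME="robots" CONTENT="noindex,nofollow">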

Chachacha 08-10-2008 04:46 PM

Quote:

Originally Posted by Chris-777 (Post 1247788)
User-agent: Slurp
Crawl-delay: 60

Has solved all of my woes. :) I'm not sure what's been up in the last week, but with any other crawl-delay I've had hundreds of them online. I've no idea WHY that works, or if it will work for someone else, but it's working for me. (knock on wood)

Edit: It does still ignore my disallows though. :\

On the upside, I had 13,947 indexed pages in Yahoo two weeks ago, and currently I have 48,000, albeit after a LOT of SEO work, but it does seem like they're updating their content a bit more rigorously.

Where do I put that? In my .htaccess?

Lynne 08-10-2008 04:51 PM

Quote:

Originally Posted by Chachacha (Post 1595592)
Where do I put that? In my htaccess?

No, you put it in your robots.txt file. You might want to search the site for more info on this.
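For reference, robots.txt is a plain text file placed in the web root, so it ends up reachable at http://www.example.com/robots.txt (example.com standing in for your own domain). A minimal file with just the rule quoted above would be:

Code:

User-agent: Slurp
Crawl-delay: 60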

Chachacha 08-10-2008 04:54 PM

Quote:

Originally Posted by Lynne (Post 1595596)
No, you put it in your robots.txt file. You might want to search the site for more info on this.

Ohh ok, thanks!

Alfa1 08-10-2008 08:17 PM

I have 2,500 Slurp spiders online at any given time. The count got much higher than it used to be after I increased the crawl delay to 10. This may be a coincidence.

Yahoo doesn't use more bandwidth than Google, though; Yahoo just uses more spiders/IPs. It's very annoying when half of your online users are bots.

Tharos 08-14-2008 03:16 AM

I think they are really good for our website's rank; I don't understand why you would want fewer of them :s

TerryMason 08-19-2008 03:21 PM

From my forum:
Quote:

Currently Active Users: 1168 (3 members, 29 guests and 1136 spiders)

View Who's Online
Most users ever online was 2,369, 08-14-2008 at 10:55 PM.
I'd love to just have the text changed. It makes it look like I've overinflated my stats.

Tharos 09-16-2008 09:16 PM

Could you tell us how you got so many spiders on your site? :p


All times are GMT. The time now is 10:29 PM.

Powered by vBulletin® Version 3.8.12 by vBS
Copyright ©2000 - 2025, vBulletin Solutions Inc.
