549 spiders on my site


dtv100
01-19-2009, 03:29 PM
549 spiders at the same time: is that a good thing or a bad thing?

My server load is 1.46 because of this.

snakes1100
01-19-2009, 03:42 PM
Use a robots.txt file to limit the bots. I won't hazard a guess, but I would say Yahoo is visiting you today.

Typically I block all but one of Yahoo's bot server IPs until they start following the robots.txt file, which you most likely didn't upload to your site.
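
As a rough illustration of limiting a bot through robots.txt instead of blocking IPs: Yahoo's Slurp honours a Crawl-delay line (Googlebot ignores it). The sketch below uses Python's standard urllib.robotparser to read back a small made-up file. The 10-second delay and the choice of rules are assumptions picked for the example, not values anyone in this thread is using.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask Slurp to slow down, keep other bots out of search.
ROBOTS_TXT = [
    "User-agent: Slurp",
    "Crawl-delay: 10",
    "",
    "User-agent: *",
    "Disallow: /forums/search.php",
]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT)

# Delay (in seconds) requested from Slurp, and from a bot with no delay set.
slurp_delay = parser.crawl_delay("Slurp")
other_delay = parser.crawl_delay("Googlebot")

# Whether a bot covered by the "*" record may fetch the search page.
search_allowed = parser.can_fetch("Googlebot", "/forums/search.php")

print(slurp_delay, other_delay, search_allowed)
```

Note that a bot with its own record (Slurp here) follows only that record, so any Disallow lines meant for it have to be repeated there.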

dtv100
01-19-2009, 07:35 PM
Most of them are gone now. Right now I have my usual average of 10 to 30 spiders, which is much better.

edermix
01-19-2009, 08:59 PM
How did you get so many spiders? What does your robots.txt file look like? Do you have a mod installed that brings in so many spider visits?

I would very much like all these spider visits on my forum. Help me.

glennybee
01-19-2009, 11:53 PM
Why would you want to restrict spiders? It's spiders that get you organic traffic.

nexialys
01-20-2009, 12:07 AM
Two different spiders, two different presences on your site:

1- Googlebot: a single bot which visits each link of your site page by page
2- Yahoobot: a pyramidal bot which calls in one friend per new link found on a page

Google will appear once, over a long period, because it visits the site on its own... if you see multiple Google bots, they are usually driven by references (too many pages to see, so a single bot would be too slow).

The Yahoo bot will have so many links to visit that your sessions are overloaded as if there really were millions of visitors, but it's just a cascade of different IPs used by a single engine.
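
One way to check whether a spike like this is many separate visitors or one engine spread over many IPs is to count distinct addresses per user agent. A minimal sketch, assuming the hits have already been pulled out of the access log as (IP, user agent) pairs; the sample data below is made up and uses documentation-only IP ranges, not real crawler addresses.

```python
from collections import defaultdict

# Made-up (ip, user-agent) pairs; a real check would read these
# from the server's access log.
SAMPLE_HITS = [
    ("192.0.2.10", "Googlebot"),
    ("192.0.2.10", "Googlebot"),
    ("198.51.100.1", "Yahoo! Slurp"),
    ("198.51.100.2", "Yahoo! Slurp"),
    ("198.51.100.3", "Yahoo! Slurp"),
    ("203.0.113.7", "Mozilla/5.0"),
]


def unique_ips_per_agent(hits):
    """Count how many distinct IP addresses each user agent used."""
    seen = defaultdict(set)
    for ip, agent in hits:
        seen[agent].add(ip)
    return {agent: len(ips) for agent, ips in seen.items()}


counts = unique_ips_per_agent(SAMPLE_HITS)
for agent, n in sorted(counts.items()):
    print(f"{agent}: {n} distinct IPs")
```

A single user agent showing up from hundreds of addresses points to one crawler, not hundreds of visitors.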

dtv100
01-20-2009, 01:23 AM
How did you get so many spiders? What does your robots.txt file look like? Do you have a mod installed that brings in so many spider visits?

I would very much like all these spider visits on my forum. Help me.

I'm not using any hack for this.
Usually I just get 10 to 30 spiders an hour; I don't know what happened today.

edermix
01-22-2009, 09:50 PM
My robots.txt. See:

http://www.gcert.com.br/robots.txt

3 spiders. :(

edermix
02-08-2009, 11:55 AM
Can someone help me??

kNeeLy
02-08-2009, 10:43 PM
Yeah.. if someone could explain the whole spider thing... lol... I know my server will make me a robots.txt file or whatever it is.. but I have no idea how to use it... can anyone help? :-D

dtv100
02-09-2009, 05:15 AM
Here is a copy of my robots.txt:


# Keep the User-agent lines directly above their rules: a blank line ends the record.
User-agent: *
User-agent: Slurp
User-agent: Mediapartners-Google*
Disallow: /forums/referrers.php
Disallow: /forums/arcade.php
Disallow: /forums/ajax_cron.php
Disallow: /forums/ajax.php
Disallow: /forums/attachment.php
Disallow: /forums/calendar.php
Disallow: /forums/cron.php
Disallow: /forums/editpost.php
Disallow: /forums/global.php
Disallow: /forums/image.php
Disallow: /forums/inlinemod.php
Disallow: /forums/joinrequests.php
Disallow: /forums/login.php
Disallow: /forums/member.php
Disallow: /forums/memberlist.php
Disallow: /forums/misc.php
Disallow: /forums/moderator.php
Disallow: /forums/newattachment.php
Disallow: /forums/newreply.php
Disallow: /forums/newthread.php
Disallow: /forums/online.php
Disallow: /forums/poll.php
Disallow: /forums/postings.php
Disallow: /forums/printthread.php
Disallow: /forums/private.php
Disallow: /forums/post_thanks.php
Disallow: /forums/payment_gateway.php
Disallow: /forums/profile.php
Disallow: /forums/register.php
Disallow: /forums/report.php
Disallow: /forums/reputation.php
Disallow: /forums/rules.php
Disallow: /forums/search.php
Disallow: /forums/sendmessage.php
Disallow: /forums/showgroups.php
Disallow: /forums/subscription.php
Disallow: /forums/threadrate.php
Disallow: /forums/usercp.php
Disallow: /forums/admin/
Disallow: /forums/infernoshout.php
Disallow: /forums/redbar
Disallow: /forums/chat
Disallow: /forums/members/
Disallow: /forums/member-*

Sitemap: http://www.domain.com/forums/sitemap_index.xml.gz
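
One thing worth checking in a file like this: strict parsers treat a blank line as the end of a record, so the User-agent lines need to sit directly above their Disallow lines. A minimal sketch using Python's standard urllib.robotparser, with a two-line stand-in for the full rule list; real crawlers may be more forgiving than this parser, so treat it as a sanity check, not proof of how any particular bot behaves.

```python
from urllib.robotparser import RobotFileParser


def blocks_login(lines):
    """Return True if Slurp is kept out of /forums/login.php."""
    parser = RobotFileParser()
    parser.parse(lines)
    return not parser.can_fetch("Slurp", "/forums/login.php")


# Blank line between the User-agent lines and the rules.
with_gap = [
    "User-agent: *",
    "User-agent: Slurp",
    "",
    "Disallow: /forums/login.php",
]

# Same record with the blank line removed.
without_gap = [
    "User-agent: *",
    "User-agent: Slurp",
    "Disallow: /forums/login.php",
]

print("with blank line:   ", blocks_login(with_gap))
print("without blank line:", blocks_login(without_gap))
```

With this parser, the version with the gap ends up with no rules at all, while the version without it blocks the page as intended.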

kNeeLy
02-09-2009, 06:22 PM
so..Is it better to disallow a few of your links?

Brandon Sheley
02-09-2009, 06:41 PM
so..Is it better to disallow a few of your links?

You don't need the spiders checking out pages or content that aren't directly related to your site ;)

The robots.txt file posted above is good.. Everyone should have a copy of that in their domain root.. yoursite.com/robots.txt

Replace the domain in the Sitemap line with your own ;)

Here is the one in my root:
http://www.vbulletinsetup.com/robots.txt
and since my forum is a subdomain, I have one for it as well
http://forum.vbulletinsetup.com/robots.txt

ilrglen
08-17-2009, 01:32 PM
This may be a stupid question, but what is the .gz for in these links to your sitemap?

# BEGIN XML-SITEMAP-PLUGIN
Sitemap: http://www.vbulletinsetup.com/sitemap.xml.gz
Sitemap: http://forum.vbulletinsetup.com/sitemap_index.xml.gz
# END XML-SITEMAP-PLUGIN

User-agent: Mediapartners-Google*
Disallow:

User-agent: *
Disallow: /pics
Disallow: /wp-admin
Disallow: /wp-login
Disallow: /wp-register
Disallow: /2006/
Disallow: /2007/
Disallow: /2008/
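
On the .gz question: the extension marks a gzip-compressed file, and the sitemap protocol allows sitemaps to be served that way to save bandwidth. A minimal sketch with Python's standard gzip module, using a made-up one-URL sitemap (the example.com address is a placeholder) to show that the compressed file unpacks back to plain XML.

```python
import gzip

# Hypothetical one-entry sitemap; a real one lists the forum's own URLs.
SITEMAP_XML = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    b"  <url><loc>http://www.example.com/forums/</loc></url>\n"
    b"</urlset>\n"
)

# What a sitemap generator writes out as sitemap.xml.gz.
compressed = gzip.compress(SITEMAP_XML)

# What a search engine does after downloading it.
restored = gzip.decompress(compressed)

print(restored.decode("utf-8"))
```

On a real sitemap with thousands of URLs the compressed file is much smaller than the plain XML, which is the reason plugins generate it that way.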