View Full Version : robots.txt - can't stop slurp!


memobug
11-28-2005, 08:09 PM
I am running my vB forum in a subdomain. I added a robots.txt to both my root and subdomain folders, but I can't seem to stop Slurp (Inktomi) from trying to access all the wrong pages.

In my subdomain (http://forum.mydomain.com/robots.txt), I didn't know whether I needed the leading /, so I have:

User-agent: *
Disallow: /attachment.php
Disallow: /newattachment.php
Disallow: /avatar.php
Disallow: /editpost.php
Disallow: /login.php
Disallow: /member.php
Disallow: /member2.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newreply.php
Disallow: /newthread.php
etc.


Do I have the path wrong? Should I take out the leading slash (/)? The spider seems to be ignoring my disallows, and it's trying to edit posts, visit the print views, and reach all the other banned stuff. The path to my forum is like http://forum.mydomain.com/index.php


In my root (http://www.mydomain.com/robots.txt), just in case, I also have:
User-agent: *
Disallow: /forums/attachment.php
Disallow: /forums/newattachment.php
Disallow: /forums/avatar.php
Disallow: /forums/editpost.php
Disallow: /forums/login.php
Disallow: /forums/member.php
Disallow: /forums/member2.php
Disallow: /forums/misc.php
etc.
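
A quick local check of the rules above can be sketched with Python's standard urllib.robotparser. This is only an illustration, assuming the subdomain file contains the rules as posted; the lists below are shortened copies, and the parser's matching may differ from how Slurp itself interprets the file. Crawlers request robots.txt separately for each host, so the file at www.mydomain.com is not consulted for forum.mydomain.com, and its /forums/ paths would not match the subdomain's URLs in any case.

```python
from urllib.robotparser import RobotFileParser


def build_parser(lines):
    """Return a parser loaded with robots.txt rules given as a list of lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


# Shortened copy of the subdomain file (forum.mydomain.com/robots.txt).
subdomain_rules = [
    "User-agent: *",
    "Disallow: /editpost.php",
    "Disallow: /newreply.php",
]

# Shortened copy of the root file (www.mydomain.com/robots.txt).
root_rules = [
    "User-agent: *",
    "Disallow: /forums/editpost.php",
]

sub = build_parser(subdomain_rules)
root = build_parser(root_rules)

edit_url = "http://forum.mydomain.com/editpost.php"
index_url = "http://forum.mydomain.com/index.php"

# With the leading slash, the subdomain rules block editpost.php for any agent.
print("subdomain rules, editpost:", sub.can_fetch("Slurp", edit_url))
# Pages that are not listed stay crawlable.
print("subdomain rules, index:", sub.can_fetch("Slurp", index_url))
# The /forums/ prefix does not match URLs served from the subdomain.
print("root rules, editpost:", root.can_fetch("Slurp", edit_url))
```

If the first check reports that editpost.php is not fetchable, the leading-slash form is being read as intended, which suggests the path syntax itself is not the problem.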

memobug
12-14-2005, 09:03 PM
I could sure use some help with this. My ISP is going to force me to change service contracts over this issue. I have like 80 spider users showing at any given time and 60 of them are *.inktomisearch.com
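
To put numbers on the crawler load before talking to the ISP, the access log can be tallied by user agent. The sketch below is a rough illustration, not a tested tool: it assumes the server writes Apache's combined log format (user agent as the last quoted field), and the sample lines, the token list, and the function name are made up for the example.

```python
import re
from collections import Counter

# In the combined log format the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_crawlers(lines, tokens=("Slurp", "Googlebot", "msnbot")):
    """Count requests per crawler token; everything else is tallied as 'other'."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not end with a quoted field
        agent = match.group(1).lower()
        for token in tokens:
            if token.lower() in agent:
                counts[token] += 1
                break
        else:
            counts["other"] += 1
    return counts


# Hypothetical sample lines, not taken from any real log.
sample_log = [
    '10.0.0.1 - - [14/Dec/2005:21:03:00 -0500] "GET /editpost.php HTTP/1.0" 200 512 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '10.0.0.2 - - [14/Dec/2005:21:03:01 -0500] "GET /printthread.php HTTP/1.0" 200 900 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '10.0.0.3 - - [14/Dec/2005:21:03:02 -0500] "GET /index.php HTTP/1.1" 200 2048 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '10.0.0.4 - - [14/Dec/2005:21:03:03 -0500] "GET /index.php HTTP/1.1" 200 2048 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

crawler_counts = count_crawlers(sample_log)
for name, hits in crawler_counts.most_common():
    print(name, hits)
```

Run against the real log (for example by passing an open file object instead of sample_log), this would show which disallowed pages and which crawlers account for most of the traffic.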

samsons
12-16-2005, 03:26 AM
Hi,
I'm not sure if this helps; I only know German pages.

Here (http://www.searchengineworld.com/cgi-bin/robotcheck.cgi) you can check your robots.txt and find some useful information.

And there is a very useful page about robots and what to do: click (http://www.robotstxt.org/wc/robots.html)

It's not a lot, but maybe it helps.

MRGTB
12-16-2005, 09:28 PM
If I'm right it can take a month before your robots.txt file will have effect in stopping them (read that somewhere).

I think the best option is to use a .htaccess file to deny them access to the server.

Paul M
12-16-2005, 10:03 PM
memobug wrote: "I could sure use some help with this. My ISP is going to force me to change service contracts over this issue. I have like 80 spider users showing at any given time and 60 of them are *.inktomisearch.com"

Huh? Your ISP is complaining that search engines are spidering you? Are they completely dumb? It's a normal part of the web; we rarely have fewer than 150 at any one time. I think you need to look for a new ISP.

MRGTB
12-16-2005, 11:08 PM
Paul M wrote: "Huh? Your ISP is complaining that search engines are spidering you? Are they completely dumb? It's a normal part of the web; we rarely have fewer than 150 at any one time. I think you need to look for a new ISP."

Are you on a dedicated server though?

Zia
12-17-2005, 05:13 AM
Can anyone help me generate a nice robots.txt that stops known bad robots and image robots?

We are mostly getting Google, Yahoo! Slurp, and MsnBot.

But I really don't know a lot about robots, or which are bad and which are good.

Can anyone help me?