There are several ways to keep the spiders out if you want to.
You can use a meta tag on any page to tell robots not to index that page or follow any of the links they find on it.
<meta name="robots" content="noindex, nofollow">
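For instance, the tag just goes in the page's <head> - a minimal sketch, with a made-up page title standing in for your own:

<html>
<head>
<title>Stuff I'd rather not see in a search engine</title>
<meta name="robots" content="noindex, nofollow">
</head>
<body>
<!-- page content -->
</body>
</html>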
You can place a text file called "robots.txt" in the web root (and only in the web root - that's the only place crawlers look for it) that says

User-agent: Slurp
Disallow: /

The User-agent line has to match the name the crawler announces in its User-Agent header - Inktomi's spider calls itself Slurp, not inktomisearch.com, which is just the domain its requests come from.
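If you want to shut out every well-behaved robot, or only fence off part of the site, robots.txt handles that too - a sketch, with made-up directory names:

# keep every robot out of the whole site
User-agent: *
Disallow: /

# or keep Inktomi's spider out of a couple of directories only
User-agent: Slurp
Disallow: /images/
Disallow: /private/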
Or you can use a .htaccess file in any directory you want to keep them out of - like this:
Order allow,deny
allow from all
deny from inktomisearch.com
Notice there is no space after the comma in allow,deny - that's important; Apache won't accept the directive with one. These are Apache directives, so you can also put them straight into httpd.conf instead of a .htaccess file.
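In httpd.conf the same three lines just sit inside a <Directory> block - a sketch, with a placeholder path standing in for whatever directory you're keeping them out of:

<Directory "/var/www/html">
    Order allow,deny
    allow from all
    deny from inktomisearch.com
</Directory>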
But deny from a hostname forces Apache to do a double-reverse DNS lookup on every request to that directory, so if you'd rather not pay for that on every hit, you can deny by IP instead -
Order allow,deny
allow from all
deny from 209.131.63
but you'd substitute the subnet the spiders are coming from.
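For what it's worth, deny from will also take several addresses on one line, and network/CIDR notation if a bare prefix is too blunt - the second range here is just a placeholder out of the documentation address space, not a real spider network:

Order allow,deny
allow from all
deny from 209.131.63 192.0.2.0/24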