Quote:
Originally Posted by Southernphuk
Any clues or ideas for the clueless here? I've got a robots.txt file, but that is to keep them out of certain areas that they really have no business bothering with (profiles, PMs, etc). I have shied away from outright cutting them off from the site, as I figure letting them in is a good way to have a presence in the search engines, which of course everyone wants funneling traffic their way.
Anyhow, extremely curious about this and wouldn't mind some thoughts and insight on this.
Most of the evil ones won't respect robots.txt anyway. You should care about them though: the nastier ones just steal your content and bring you nothing in return. At worst, a poorly written one can bring your site to a standstill (this has happened to me on a few occasions). I block a big list of the bad ones at my webserver, and my forum response time improved immediately. I still allow all search engines through unless they are specifically blocked (I think I block around 200).
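To give a rough idea of what blocking by user agent looks like, here is a minimal Python sketch. It is not my actual webserver config, just an illustration of the matching logic. The names `BLOCKED_AGENT_FRAGMENTS`, `is_blocked` and `status_for` are made up for the example, and the two blocklist entries are placeholders, so substitute the agents you actually see misbehaving in your own logs.

```python
# Hypothetical blocklist: placeholder fragments, not real bot names.
# Replace with user-agent fragments taken from your own access logs.
BLOCKED_AGENT_FRAGMENTS = ("badbot", "contentscraper")


def is_blocked(user_agent):
    """Return True if the User-Agent string contains a blocked fragment."""
    ua = (user_agent or "").lower()  # a missing header is treated as empty
    return any(fragment in ua for fragment in BLOCKED_AGENT_FRAGMENTS)


def status_for(user_agent):
    """Pick an HTTP status: 403 for blocked agents, 200 for everyone else."""
    return 403 if is_blocked(user_agent) else 200
```

Anything not on the list gets through, which matches the "allow unless specifically blocked" approach. Keep in mind a bot can fake its user agent, so this only stops the ones that identify themselves.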
I also wrote a small mod to remove spiders from the statistics on your forum. This doesn't block anything, but it is useful for getting rid of the hundreds of sessions started by spiders like Yahoo's Slurp.
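The idea behind filtering spiders out of the stats can be sketched in a few lines of Python. This is not the mod itself, only an illustration under assumptions: sessions are taken to be dicts with a `user_agent` key, and `SPIDER_FRAGMENTS` and `human_sessions` are hypothetical names with a deliberately short example list.

```python
# Illustrative fragments only; a real list would be much longer.
SPIDER_FRAGMENTS = ("slurp", "googlebot")


def human_sessions(sessions):
    """Return only the sessions whose user agent does not look like a spider.

    Assumes each session is a dict with a "user_agent" string.
    Nothing is blocked here; spider sessions are just left out of the count.
    """
    return [
        s for s in sessions
        if not any(f in s["user_agent"].lower() for f in SPIDER_FRAGMENTS)
    ]
```

The online-users figure would then be computed from `len(human_sessions(all_sessions))` instead of the raw session count.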