The Archive of Official vBulletin Modifications Site. It is not a VB3 engine, just a parsed copy!
#1
Prevent folders being crawled
I have a test site in a folder within my forum root, but I notice that there are 5 Google spiders crawling certain bits of it.
How does one prevent a folder being crawled by any spiders? Ta..
#2
Do you have a robots.txt file in your forum root?
If not, add the following to it:

Code:
User-agent: *
Disallow: /forums/MY FOLDER NAME/

Change MY FOLDER NAME to the name of the folder. If you already have a robots.txt file, just add this line to it:

Code:
Disallow: /forums/MY FOLDER NAME/
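If you want to sanity-check a rule like the one above before relying on it, Python's standard library ships a robots.txt parser. A minimal sketch, assuming a folder named `testsite` (a placeholder, not from the thread):

```python
# Verify that a robots.txt rule blocks a folder for compliant crawlers,
# using Python's standard-library robots.txt parser.
from urllib.robotparser import RobotFileParser

# The same rule suggested above, with "testsite" as a placeholder folder name.
rules = """\
User-agent: *
Disallow: /forums/testsite/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A compliant crawler may not fetch anything under the disallowed folder...
print(rp.can_fetch("Googlebot", "/forums/testsite/index.php"))  # False
# ...but the rest of the forum stays crawlable.
print(rp.can_fetch("Googlebot", "/forums/showthread.php"))  # True
```

Note this only tells you what a *well-behaved* crawler will do; as pointed out later in the thread, bad bots ignore robots.txt entirely.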
#3
robots.txt?
Bloody 'ell, something else to know more about. Just Googled it, just done it. Thanks.. I came across this robots.txt a while ago and completely forgot about it. Ta..
#4
Not a problem, glad to help. Also, since you did not have one in place, you may want to add this to it as well, to prevent the bots going to these pages/folders:
Code:
Disallow: /cgi-bin/
Disallow: /activity.php
Disallow: /admincp/
Disallow: /announcement.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /misc.php
Disallow: /modcp/
Disallow: /moderator.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /register.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /showpost.php
Disallow: /subscription.php
Disallow: /subscriptions.php
Disallow: /threadrate.php
Disallow: /usercp.php
#5
I ran an online robots.txt-generating script, and it came up with this:
Code:
Sitemap: https://www.xxxxxxxxxxx.com/xxxxxxxx/vbulletin_sitemap_blog_0.xml.gz

User-agent: Baiduspider
Disallow: /

User-agent: *

It now looks like this:

Code:
Sitemap: https://www.xxxxxxxxxxx.com/xxxxxxxx/vbulletin_sitemap_blog_0.xml.gz

User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /activity.php
Disallow: /admincp/
Disallow: /announcement.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /misc.php
Disallow: /modcp/
Disallow: /moderator.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /register.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /showpost.php
Disallow: /subscription.php
Disallow: /subscriptions.php
Disallow: /threadrate.php
Disallow: /usercp.php

Cheers for that.
#6
Keep in mind, robots.txt is a lot like gun control laws: only the law-abiding pay any attention to it. Bad spiders such as Baidu and hundreds of others completely ignore robots.txt. It's not a blocker, it's a list.
To block bad bots, get the "Ban Spiders by User Agent" mod; it is linked in my sig.
#7
Cheers for that..
But in my entire duration, I have had no spam whatsoever. None have ever registered. Must be going on for two years now. Nothing. I am slightly worried.
#8
The Ban Spiders mod isn't really an anti-spam mod per se; it just blocks bad spiders, plus anything else you put on the list. That makes it useful as part of an overall anti-spam defence.
#9
If I ever get spam, I will consider it. Thanks.
I think my main problem now is ranking, or more precisely, the lack of it. This is my next adventure into the wilderness. |
#10
Test sites should be password protected. If they are, then they won't get indexed.
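On an Apache host, password-protecting a test folder is typically done with an `.htaccess` file inside it. A sketch, assuming Apache with basic authentication enabled; the file path and realm name are placeholders, not from the thread:

Code:
# /forums/testsite/.htaccess -- placeholder path
AuthType Basic
AuthName "Test Site"
AuthUserFile /home/example/.htpasswd
Require valid-user

The password file itself is created once on the server with `htpasswd -c /home/example/.htpasswd yourname`. A side benefit: crawlers that ignore robots.txt still get a 401 and cannot index anything behind it.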
2 thanks from: Max Taxable, ozzy47
vBulletin 3.8.12 by vBS