Yet another ROBOTS.txt file question (sorry)
invitezone
07-09-2012, 06:42 PM
Can someone give me some advice on this please.
I have a public section and a private section of my forum.
I want to allow bots from search engines to crawl and index my public forum in order to get some traffic through it.
But I also want to absolutely deny any bots at all from access and crawling anything else.
Can I specify a robots.txt that says allow 1 forum category and disallow everywhere else?
I have searched but can't find any information on these specifics. Obviously I found lots of robots.txt stuff, but nothing that can be of specific help.
Thanks for your help.
Zachery
07-09-2012, 06:55 PM
Nope, that's not what robots.txt is for anyway.
invitezone
07-09-2012, 08:08 PM
ok thanks, so how can I achieve what I described?
any help on that?
Thanks
nhawk
07-09-2012, 08:22 PM
ACP->Forums & Moderators->Forum Permissions
Change the 'Unregistered/Not Logged In' permissions for forums you don't want viewed to all 'No'.
With that setting, the forums won't even exist to robots or unregistered users.
invitezone
07-09-2012, 09:25 PM
Oh, as simple as that. OK, I didn't realise; I'm still new to most of this. So what is the point of robots.txt other than to deny all bots? Thanks
nhawk
07-10-2012, 05:34 PM
Robots.txt is good for asking robots not to access things like register.php, search.php, etc.
Note I said 'asking'. Not all robots obey robots.txt.
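A minimal sketch of what "asking" means in practice, using Python's standard urllib.robotparser to answer the way a well-behaved robot would. The rules, the agent name "ExampleBot" and the helper is_allowed are made up for illustration; a robot that ignores robots.txt never runs a check like this at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical two-rule robots.txt covering pages named in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /register.php
Disallow: /search.php
"""


def is_allowed(path, agent="ExampleBot"):
    """Return True if a robot that obeys robots.txt may fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


print(is_allowed("/register.php"))    # False: the robot is asked to stay away
print(is_allowed("/showthread.php"))  # True: no rule covers this page
```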
invitezone
07-10-2012, 09:59 PM
for what reason would you do that? BTW thanks again for your patience and help
nhawk
07-11-2012, 09:21 AM
Many pages serve no purpose to a robot, and crawling them is just a waste of bandwidth.
Register.php, login.php, search.php, subscription.php, profile.php are just a few of those types of pages.
Macsee
07-11-2012, 12:24 PM
There are threads around with lists of the pages you might want to include in your robots.txt block.
Note that the blocking suggested above blocks not just robots but also unregistered visitors. If that is your intention, fine. If you want to allow everyone to see a particular forum and just don't want that forum's threads appearing in search engines then I don't believe you can do that with vB.
nhawk
07-11-2012, 01:53 PM
Note that the blocking suggested above blocks not just robots but also unregistered visitors. If that is your intention, fine. If you want to allow everyone to see a particular forum and just don't want that forum's threads appearing in search engines then I don't believe you can do that with vB.
I said that. ;)
Macsee
07-11-2012, 02:02 PM
You said what, that it can't be done with vB? ;)
In fact, I made that post hoping someone would tell me I was wrong and point to a way one could allow visitors into a forum but prevent that forum's threads from showing up in the SEs.
ForceHSS
07-11-2012, 02:08 PM
If the forum is set to private, spiders will not be able to access it anyway.
Macsee
07-11-2012, 02:28 PM
Setting it to private blocks unregistered visitors.
I know the logic may sound screwed up, but there is sense in such a setup. For example, where you want to allow links in the sub-forum but want to dissuade people from posting links in there for the SEO benefit.
nhawk
07-11-2012, 02:30 PM
You said what, that it can't be done with vB? ;)
In fact, I made that post hoping someone would tell me I was wrong and point to a way one could allow visitors into a forum but prevent that forum's threads from showing up in the SEs.
I said this part of your first post...
.....
Note that the blocking suggested above blocks not just robots but also unregistered visitors. If that is your intention, fine....
invitezone
07-11-2012, 04:21 PM
ok all good info thanks for your time everyone.
I actually want a small section of my forum to be PUBLIC and I want search engines to index it.
I just don't want to have 50 bots killing my bandwidth.
any answer to that?
Thanks a million for your help.
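One possible approach, sketched under assumptions: an allow-list robots.txt that lets a couple of named search engines into one path and asks every other robot to stay out. The path /public/ is hypothetical and only helps if the public section's URLs really share a prefix like that, which a default forum install may not give you. The Allow line is placed before Disallow so that both first-match parsers (like Python's urllib.robotparser, used here to check the rules) and longest-match crawlers read it the same way. Badly behaved bots will ignore all of it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allow-list: two named crawlers may read /public/ only,
# every other robot is asked to stay out of the whole site.
ALLOW_LIST_ROBOTS = """\
User-agent: Googlebot
User-agent: Bingbot
Allow: /public/
Disallow: /

User-agent: *
Disallow: /
"""

allow_list = RobotFileParser()
allow_list.parse(ALLOW_LIST_ROBOTS.splitlines())

# A named crawler is let into the public path but nowhere else.
google_public = allow_list.can_fetch("Googlebot", "/public/thread-1.html")
google_private = allow_list.can_fetch("Googlebot", "/private/thread-2.html")
# Any robot without its own block falls back to the "*" rules.
unknown_public = allow_list.can_fetch("SomeOtherBot", "/public/thread-1.html")

print(google_public, google_private, unknown_public)  # True False False
```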
Nichtofen
08-04-2012, 06:47 PM
bump...
1. Create a robots.txt to eliminate pages like profiles, search, etc from spiders (This will also help eliminate profiles and other pages from coming up in searches by potential traffic.)
2. Create permissions per forum for unregistered visitors like 'nhawk' described and be careful as visitors will not be able to view these forums either depending on configurations.
3. If you really wish to control which robots/spiders are accessing your forums and using bandwidth, consider this add-on. I do, however, recommend you read all 350 posts in order to guarantee your success. There is a lot to read on user agents, even beyond the resources available within the thread, in order to use this program properly and ensure you are not shutting out spiders that could help your traffic and cause.
Ban Spiders by User Agent (https://vborg.vbsupport.ru/showthread.php?t=268208&page=7) by Simon Lloyd
zascok
08-05-2012, 03:31 PM
here is the start for you, then get robots you don't like covered by
User-agent: Name of Robot
Disallow: /
or ban them ^ just like said above
robots.txt
User-agent: *
Crawl-delay: 10
Disallow: /*.js
Disallow: /clientscript/
Disallow: /customgroupicons/
Disallow: /packages/
Disallow: /signaturepics/
Disallow: /customprofilepics/
Disallow: /store_sitemap/
Disallow: /vb/
Disallow: /cpstyles/
Disallow: /cron.php
Disallow: /customavatars/
Disallow: /includes/
Disallow: /images/
Disallow: /ajax.php
Disallow: /album.php
Disallow: /announcement.php
Disallow: /api.php
Disallow: /apichain.php
Disallow: /asset.php
Disallow: /assetmanage.php
Disallow: /attachment.php
Disallow: /attachment_inlinemod.php
Disallow: /blog_attachment.php
Disallow: /calendar.php
Disallow: /ckeditor.php
Disallow: /clear.gif
Disallow: /converse.php
Disallow: /css.php
Disallow: /editor.php
Disallow: /editpost.php
Disallow: /entry.php
Disallow: /external.php
Disallow: /faq.php
Disallow: /favicon.ico
Disallow: /global.php
Disallow: /group.php
Disallow: /group_inlinemod.php
Disallow: /groupsubscription.php
Disallow: /image.php
Disallow: /infraction.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /LICENSE
Disallow: /list.php
Disallow: /login.php
Disallow: /member.php
Disallow: /member_inlinemod.php
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /mobile.php
Disallow: /moderation.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /payment_gateway.php
Disallow: /payments.php
Disallow: /picture.php
Disallow: /picture_inlinemod.php
Disallow: /picturecomment.php
Disallow: /poll.php
Disallow: /posthistory.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /subscription.php
Disallow: /threadrate.php
Disallow: /threadtag.php
Disallow: /uploadprogress.gif
Disallow: /usercp.php
Disallow: /usernote.php
Disallow: /visitormessage.php
Disallow: /widget.php
Disallow: /xmlsitemap.php
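A rough way to sanity-check rules like these before uploading, assuming Python 3.6 or later (for crawl_delay). The rule set below is a shortened, hypothetical version of the list above, and "BadBot" is a made-up stand-in for "Name of Robot". Python's urllib.robotparser does not understand wildcard patterns such as /*.js, so that line is left out here; some crawlers ignore Crawl-delay too.

```python
from urllib.robotparser import RobotFileParser

# Shortened, hypothetical version of the rules posted above.
# "BadBot" is a made-up name standing in for "Name of Robot".
RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /clientscript/
Disallow: /search.php
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# The named robot is asked to stay out of everything.
banned_bot_allowed = parser.can_fetch("BadBot", "/showthread.php")
# Every other robot falls under the "*" block.
other_bot_thread = parser.can_fetch("OtherBot", "/showthread.php")
other_bot_search = parser.can_fetch("OtherBot", "/search.php")
delay = parser.crawl_delay("OtherBot")

print(banned_bot_allowed, other_bot_thread, other_bot_search, delay)
# False True False 10
```

Note that a robot with its own User-agent block follows only that block, so the Crawl-delay under "*" does not apply to it.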
invitezone
08-05-2012, 06:09 PM
thanks a lot for this zascok.
If I don't want to ban any spiders, can I just copy and paste this into a txt file and leave it at that?
zascok
08-05-2012, 07:12 PM
yup and nope you can't leave it at that you gotta up it into the root of your forum :)
Nichtofen
08-05-2012, 07:58 PM
Indeed. The file should be located at www.yourdomain.com/robots.txt . Right in your root, assuming that you have that access. That is where Google, as well as other well-behaved bots, will look for it. It gets more complicated if you are on a shared server through a provider that gives you a default address such as www.sharedservercompany.com/yourusername or something similar. In that situation, you would need your host to assist, if they are able.
Curiously,
Within your code you do not have a "forums/" prefix on your items. That would be required if your forum was located within a forums folder in the root, correct?
zascok
08-05-2012, 08:41 PM
Curiously,
Within your code you do not have a "forums/" prefix on your items. That would be required if your forum was located within a forums folder in the root, correct?
All the same, with /forums in front of each line for the forum itself; the rest is up to what you have in the root. I just don't have anything else but forum :) so it's right in the top.
Disallow: /forums/*.js
...
...
Disallow: /forums/xmlsitemap.php
Nichtofen
08-05-2012, 08:50 PM
To clarify:
Disallow: /forums/*.js
...
...
Disallow: /forums/xmlsitemap.php
Does this have the same exact effect as:
Disallow: /*.js
...
...
Disallow: /xmlsitemap.php
Just wanted to make sure I understood correctly. I will not bother with changing unless it is necessary. If it automatically seeks a sub-folder on the server anywhere with that name then I will just leave it.
Thanks in advance zascok!
I just don't have anything else but forum :) so it's right in the top.
Gotcha, thanks!
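For anyone wondering the same thing: robots.txt paths are matched from the site root as plain prefixes, so the /forums prefix does matter when the forum lives in a sub-folder. A small sketch with Python's standard urllib.robotparser shows the difference; the helper name is_blocked and the agent "ExampleBot" are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Two hypothetical one-rule files: one written for a forum in the site
# root, one for a forum installed under /forums/.
ROOT_RULES = "User-agent: *\nDisallow: /xmlsitemap.php\n"
PREFIXED_RULES = "User-agent: *\nDisallow: /forums/xmlsitemap.php\n"


def is_blocked(rules, path, agent="ExampleBot"):
    """Return True if the rules ask the robot not to fetch the path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch(agent, path)


# The un-prefixed rule does not reach into the /forums/ sub-folder.
print(is_blocked(ROOT_RULES, "/forums/xmlsitemap.php"))      # False
print(is_blocked(PREFIXED_RULES, "/forums/xmlsitemap.php"))  # True
print(is_blocked(ROOT_RULES, "/xmlsitemap.php"))             # True
```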
invitezone
08-06-2012, 01:08 PM
yup and nope you can't leave it at that you gotta up it into the root of your forum :)
hehehe, erm yeah I understand that much :p
I meant if I don't want to ban any bots or spiders from my site, I just want to limit them to the usual stuff, I can just leave your example file as it is, unedited, and upload that to forum root right?
Thanks
Nichtofen
08-06-2012, 09:08 PM
That is correct. Put robots.txt into your root with the contents that zascok graciously provided. :)