  #1  
04-23-2009, 02:37 AM
vbplusme
Join Date: Sep 2008
Location: CyberSpace
Posts: 332
Thanks given: 0
Thanked 0 times in 0 posts
robots.txt for 3.8.2 - Any Ideas?

Hello and Greetings,

I have just noticed that Google Webmaster Tools is complaining about a LOT of URLs being restricted by my robots.txt file.

Is anyone else having this problem? If not, can I get an example of a robots.txt written for 3.8.2?

I tweaked mine thinking that I was fixing a duplicate content problem, but I apparently crossed the line on it.

Any ideas, suggestions greatly appreciated.

TIA

  #2  
04-23-2009, 05:23 AM
Dismounted
Join Date: Jun 2005
Location: Melbourne, Australia
Posts: 15,047
Thanks given: 0
Thanked 0 times in 0 posts

What is it currently?

  #3  
04-23-2009, 07:20 AM
vbplusme
Join Date: Sep 2008
Location: CyberSpace
Posts: 332
Thanks given: 0
Thanked 0 times in 0 posts

Sorry, should have thought to post it:

Quote:

User-agent: *

#Crawl-Delay: 10

Disallow: /admincp/
Disallow: /ajax.php
Disallow: /announcement.php
Disallow: /archive/
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cgi-bin/
Disallow: /chat/
Disallow: /clientscript/
Disallow: /converse.php
Disallow: /cpstyles/
Disallow: /cron.php
Disallow: /customavatars/
Disallow: /customgroupicons/
Disallow: /customprofilepics/
Disallow: /editpost.php
Disallow: /faq.php
Disallow: /forumdisplay.php?daysprune
Disallow: /forumdisplay.php?do
Disallow: /forumdisplay.php?order
Disallow: /forumdisplay.php?page
Disallow: /forumdisplay.php?pp
Disallow: /forumdisplay.php?sort
Disallow: /gallery/
Disallow: /global.php
Disallow: /group_inlinemod.php
Disallow: /groupsubscription.php
Disallow: /images/
Disallow: /includes/
Disallow: /infraction.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php
Disallow: /member_inlinemod.php
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /modcp/
Disallow: /moderation.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /payment_gateway.php
Disallow: /payments.php
Disallow: /personal/
Disallow: /printthread.php
Disallow: /profile.php?do
Disallow: /register.php
Disallow: /report.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showpost.php
Disallow: /showthread.php?goto
Disallow: /showthread.php?mode
Disallow: /showthread.php?p
Disallow: /showthread.php?page
Disallow: /showthread.php?post
Disallow: /showthread.php?pp
Disallow: /signaturepics/
Disallow: /subscription.php

User-Agent: msnbot
Crawl-Delay: 10

User-Agent: Slurp
Crawl-Delay: 10
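
One way to see which URLs a rule set like this restricts is to run sample paths through a robots.txt parser. The following is a minimal sketch using Python's standard urllib.robotparser with a subset of the rules above. The sample paths and the "Googlebot" agent string are assumptions chosen for illustration, and the standard-library parser does plain prefix matching, which may differ from how Google evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules posted above. The blank lines between the
# User-agent line and its Disallow lines are left out here, because
# some parsers treat a blank line as the end of a record.
RULES = """\
User-agent: *
Disallow: /archive/
Disallow: /member.php
Disallow: /forumdisplay.php?order
Disallow: /showthread.php?p
Disallow: /showthread.php?page
"""

# Hypothetical paths, chosen only to exercise the rules.
SAMPLE_PATHS = [
    "/showthread.php?t=12345",
    "/showthread.php?p=67890",
    "/showthread.php?page=2&t=12345",
    "/showthread.php?t=12345&page=2",
    "/member.php?u=42",
    "/archive/index.php/t-12345.html",
    "/forumdisplay.php?f=7&order=desc",
]


def blocked_paths(rules, paths, agent="Googlebot"):
    """Return the paths that the rules disallow for the given agent."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [path for path in paths if not parser.can_fetch(agent, path)]


if __name__ == "__main__":
    for path in blocked_paths(RULES, SAMPLE_PATHS):
        print("blocked:", path)
```

Because the match is a prefix match, a rule such as "Disallow: /showthread.php?page" only catches URLs where "page" is the first query parameter, while "Disallow: /member.php" and "Disallow: /archive/" block every URL under those paths. A large count of restricted URLs in Webmaster Tools is therefore what a list this broad would be expected to produce.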

  #4  
04-23-2009, 08:25 AM
veenuisthebest
Join Date: Mar 2008
Location: India
Posts: 1,416
Thanks given: 0
Thanked 0 times in 0 posts

Two points in addition to the above robots.txt:

1. We should not list our admincp directory in robots.txt, as it reveals the location to the world. What is the use of the admincp renaming feature then?

2. Also, it's good to give a reference to our sitemap at the end of robots.txt:

Sitemap: http://site.com/sitemap_index.xml.gz
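
To confirm that a parser picks up the Sitemap line, here is a minimal sketch using Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later). The short rule set is a stand-in for illustration, not the full file from this thread.

```python
from urllib.robotparser import RobotFileParser

# Stand-in robots.txt: one rule plus the Sitemap line suggested above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search.php

Sitemap: http://site.com/sitemap_index.xml.gz
"""


def sitemap_urls(robots_txt):
    """Return the Sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemap_urls(ROBOTS_TXT))
```

Under the sitemaps.org protocol the Sitemap line is independent of the User-agent groups, so putting it at the end of the file is a convention rather than a requirement.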

  #5  
04-23-2009, 08:54 AM
vbplusme
Join Date: Sep 2008
Location: CyberSpace
Posts: 332
Thanks given: 0
Thanked 0 times in 0 posts

Thanks for the comments. I had not thought about the sitemap reference, so thanks for that. I double password protect my admincp folder, though I could easily take it out of the list altogether, as the bots cannot access it anyway. Thanks for that comment as well.

  #6  
04-24-2009, 06:36 PM
vbplusme
Join Date: Sep 2008
Location: CyberSpace
Posts: 332
Thanks given: 0
Thanked 0 times in 0 posts

Does anyone see any problem with the content of this robots.txt, or have any idea how to fix Google complaining about the restrictions? TIA

  #7  
04-24-2009, 10:33 PM
hambil
Join Date: Jun 2004
Location: Seattle
Posts: 1,719
Thanks given: 0
Thanked 0 times in 0 posts

Quote:
Originally Posted by vbplusme
Thanks for the comments. I had not thought about the sitemap reference, so thanks for that. I double password protect my admincp folder, though I could easily take it out of the list altogether, as the bots cannot access it anyway. Thanks for that comment as well.

It's not an issue of whether they access it or not; it is whether they try. Everything they try and fail at is wasted bandwidth and resources. If you password protect your admincp and modcp directories, there is no reason to leave them out of robots.txt.

I pretty much followed the advice in this article, and have had no complaints from Google: http://www.theadminzone.com/forums/s...ad.php?t=19872

  #8  
04-25-2009, 01:16 AM
vbplusme
Join Date: Sep 2008
Location: CyberSpace
Posts: 332
Thanks given: 0
Thanked 0 times in 0 posts

As a matter of fact, I did use those guidelines to construct my robots.txt (and the follow-on suggestions in that thread). I forgot about that, thanks for reminding me.

  #9  
04-26-2009, 08:35 AM
vbplusme
Join Date: Sep 2008
Location: CyberSpace
Posts: 332
Thanks given: 0
Thanked 0 times in 0 posts

I do have a follow-on question about the robots.txt file that I am currently using. I have the vBulletin blog software installed on this site as well as WordPress. I have no disallows in this robots.txt for any blog files. I would not have thought anything about it, except that I just looked at my sitemap and see a huge number of URLs for blog stuff that doesn't really exist, like archives from 1983.

Anyone have a suggestion about a sensible robots.txt entry for both the vBulletin blog and WordPress?

TIA for any ideas.

  #10  
04-26-2009, 11:12 AM
hambil
Join Date: Jun 2004
Location: Seattle
Posts: 1,719
Thanks given: 0
Thanked 0 times in 0 posts

Sounds like a sitemap issue, not a robots.txt issue. I know that's not an answer per se, but I'd be looking at why your sitemap contains links that don't exist instead.
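
As a starting point for that kind of check, here is a rough sketch that scans a sitemap for URLs mentioning an implausibly early year. The URL shapes in the sample and the cutoff year of 2000 are assumptions for illustration only. The year pattern is a heuristic and would also flag any other four-digit number that happens to look like a year.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Matches a standalone four-digit year such as 1983 or 2009.
YEAR_RE = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")

# Hypothetical sitemap content; real blog URLs may look different.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://site.com/blog/1983/04/</loc></url>
  <url><loc>http://site.com/blog/2009/04/</loc></url>
  <url><loc>http://site.com/showthread.php?t=12345</loc></url>
</urlset>
"""


def urls_with_old_years(sitemap_xml, earliest_year):
    """Return <loc> URLs that mention a year before earliest_year."""
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        years = [int(y) for y in YEAR_RE.findall(url)]
        if any(year < earliest_year for year in years):
            flagged.append(url)
    return flagged


if __name__ == "__main__":
    for url in urls_with_old_years(SAMPLE_SITEMAP, 2000):
        print("suspicious:", url)
```

The flagged URLs show which part of the site the sitemap generator is crawling into, which is usually easier to fix at the generator than by adding more Disallow lines.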

All times are GMT.