  #1  
Old 09-23-2013, 04:16 PM
crazyboy1661 crazyboy1661 is offline
 
Join Date: Jan 2011
Location: India
Posts: 135
Thanks: 0 times
Thanked 0 times in 0 posts
Default robots.txt file

Hi guys, here is some robots.txt code for anyone who wants it.

PHP Code:
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /go/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /admincp/
Disallow: /modcp/
Disallow: /attachment.php
Disallow: /search.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /editpost.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /login.php
Disallow: /subscription.php
Disallow: /private.php
Disallow: /report.php
Disallow: /sendmessage.php
Disallow: /member.php
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /postings.php
Disallow: /sendtofriend.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /showgroups.php
Disallow: /announcement.php
Also refer to these links:
https://vborg.vbsupport.ru/showthrea...51#post2446551
https://vborg.vbsupport.ru/showthread.php?t=302483
  #2  
Old 09-24-2013, 12:06 AM
bzcomputers bzcomputers is offline
 
Join Date: Apr 2012
Location: TX
Posts: 503
Thanks: 0 times
Thanked 0 times in 0 posts
Default

DO NOT place your admincp and modcp in the robots.txt file! This will only alert hackers that they exist in that exact location. Leave them out completely; this also applies if you have renamed them.

Robots can ignore your robots.txt file, especially malware robots that scan the web for security vulnerabilities and the email address harvesters used by spammers. The robots.txt file is publicly available: anyone can see which sections of your site you don't want robots to go to. So don't broadcast directions to secure areas of your site that you don't want anyone visiting (with good intentions or bad).


The purpose of the robots.txt file is to inform the "good" robots of your site layout and of what you do and don't want indexed.
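
As a rough illustration of why the file gives away whatever you list in it, here is a minimal Python sketch. It is not part of vBulletin, and the function name disallowed_paths and the sample rules are made up for the example. It pulls every Disallow path out of a robots.txt body, which is all a curious visitor or a bad bot would need to do:

```python
def disallowed_paths(robots_txt: str) -> list:
    """Return every non-empty Disallow path found in a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value:  # an empty Disallow means "nothing is blocked"
                paths.append(value)
    return paths


# Hypothetical sample: entries a forum owner might be tempted to add.
sample = """\
User-agent: *
Disallow: /admincp/
Disallow: /modcp/
Disallow: /search.php
"""

print(disallowed_paths(sample))  # ['/admincp/', '/modcp/', '/search.php']
```

Anything that shows up in that list is effectively advertised, which is why the control panel directories are better left out.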

Each person's robots.txt file should be a little different, depending on how your forum was installed (in the root or not) and what add-ons and changes have been made since the initial install.

You should always include a reference to your sitemap. Some robots.txt "validators" will not validate the file unless a sitemap is included in it. Both Google and MSNbot (Bing / Yahoo) use validators and will look for a sitemap reference. The sitemap must be given as a full URL. Here is a sitemap reference example:

Code:
Sitemap: http://www.yoursite.com/sitemap.xml
Your sitemap name and location may be different. I recommend placing the sitemap in the first line of your robots.txt file.
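
If you want to check that the sitemap line is being picked up, one option is Python's standard urllib.robotparser module. This is only a sketch: the robots.txt lines are placeholders built around the example URL above, and site_maps() needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Placeholder robots.txt, with the Sitemap reference on the first line.
lines = [
    "Sitemap: http://www.yoursite.com/sitemap.xml",
    "",
    "User-agent: *",
    "Disallow: /search.php",
]

parser = RobotFileParser()
parser.parse(lines)

# site_maps() returns the listed sitemap URLs, or None when there are none.
sitemaps = parser.site_maps()
print(sitemaps)  # ['http://www.yoursite.com/sitemap.xml']
```

If this prints None, the Sitemap line is missing or misspelled.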




The robots.txt file below is a little more in-depth than the first post above and will yield you better results.


This example can be used for an initial forum install:
Code:
Sitemap: http://www.yoursite.com/sitemap_index.xml.gz


User-agent: Mediapartners-Google
Disallow:


User-agent: *
Disallow: /clientscript/
Disallow: /cpstyles/
Disallow: /customavatars/
Disallow: /customgroupicons/
Disallow: /customprofilepics/
Disallow: /customsignaturepics/
Disallow: /forumrunner/
Disallow: /images/
Disallow: /includes/
Disallow: /install/
Disallow: /members/
Disallow: /mobiquo/
Disallow: /sitemap/
Disallow: /ajax.php
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /global.php
Disallow: /image.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /subscription.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /usernote.php
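
Since the paths depend on where the forum is installed, the list above could also be generated rather than edited by hand. The following Python sketch is just one way to do it; build_rules and its parameters are hypothetical names, and only a few of the directories and scripts are shown:

```python
def build_rules(prefix: str, directories: list, scripts: list) -> str:
    """Build a 'User-agent: *' group whose paths start at the forum's URL prefix."""
    # Normalise the prefix: "" for a root install, "/forum" for a subdirectory.
    prefix = "/" + prefix.strip("/") if prefix.strip("/") else ""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}/{d.strip('/')}/" for d in directories]
    lines += [f"Disallow: {prefix}/{s.lstrip('/')}" for s in scripts]
    return "\n".join(lines) + "\n"


# A forum installed in /forum/ (an assumed location for the example).
print(build_rules("forum", ["clientscript", "images"], ["search.php", "login.php"]))
```

Passing an empty prefix gives the root-install form, where every path starts with a single slash.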



This is a modified example of the above:

This example shows how a robots.txt could look for a forum install outside the root with some added mods, some additional image directory changes and the complete blocking of user-agent Baiduspider.


If you don't cater to an audience in China, I recommend blocking all Baiduspiders; otherwise they will hammer your site hundreds of times a day. For additional details on Baiduspider: http://chineseseoshifu.com/blog/what...iduspider.html.


There are also mods for vB available for blocking user-agents. I recommend this one: https://vborg.vbsupport.ru/showthread.php?t=268208. In the meantime, Baiduspider is the only user-agent I block that I know does not ignore the robots.txt block.


Code:
Sitemap: http://www.yoursite.com/sitemap_index.xml.gz


User-agent: Mediapartners-Google
Disallow:


User-agent: Baiduspider
Disallow: /



User-agent: *
Disallow: /doubleclick/
Disallow: /eyeblaster/
Disallow: /forum/archive/
Disallow: /forum/clientscript/
Disallow: /forum/cpstyles/
Disallow: /forum/customavatars/
Disallow: /forum/customgroupicons/
Disallow: /forum/customprofilepics/
Disallow: /forum/customsignaturepics/
Disallow: /forum/dbtech/
Disallow: /forum/forumrunner/
Disallow: /forum/images/
Disallow: /forum/includes/
Disallow: /forum/install/
Disallow: /forum/members/
Disallow: /forum/mobiquo/
Disallow: /forum/sitemap/
Disallow: /forum/vbseo/
Disallow: /forum/ajax.php
Disallow: /forum/attachment.php
Disallow: /forum/calendar.php
Disallow: /forum/cron.php
Disallow: /forum/editpost.php
Disallow: /forum/global.php
Disallow: /forum/image.php
Disallow: /forum/inlinemod.php
Disallow: /forum/joinrequests.php
Disallow: /forum/login.php
Disallow: /forum/member.php
Disallow: /forum/memberlist.php
Disallow: /forum/misc.php
Disallow: /forum/moderator.php
Disallow: /forum/newattachment.php
Disallow: /forum/newreply.php
Disallow: /forum/newthread.php
Disallow: /forum/online.php
Disallow: /forum/poll.php
Disallow: /forum/postings.php
Disallow: /forum/printthread.php
Disallow: /forum/private.php
Disallow: /forum/profile.php
Disallow: /forum/register.php
Disallow: /forum/report.php
Disallow: /forum/reputation.php
Disallow: /forum/search.php
Disallow: /forum/sendmessage.php
Disallow: /forum/subscription.php
Disallow: /forum/threadrate.php
Disallow: /forum/usercp.php
Disallow: /forum/usernote.php
Disallow: /javascript/
Disallow: /misc/
Disallow: /styles/
Disallow: /xcache/
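
To see how a well-behaved robot would read the per-user-agent groups in a file like the one above, you can feed a shortened copy to Python's standard urllib.robotparser. This is a sketch under assumptions: only two Disallow lines from the "*" group are kept, www.yoursite.com is the placeholder domain, and the helper name allowed is made up. Real crawlers may interpret edge cases differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the example file, one group per user-agent.
rules = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow: /forum/search.php
Disallow: /forum/members/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(agent: str, path: str) -> bool:
    """True if the given user-agent may fetch the path under these rules."""
    return parser.can_fetch(agent, "http://www.yoursite.com" + path)


print(allowed("Baiduspider", "/forum/showthread.php"))       # False: blocked everywhere
print(allowed("Mediapartners-Google", "/forum/search.php"))  # True: empty Disallow allows all
print(allowed("Googlebot", "/forum/search.php"))             # False: falls under the * group
print(allowed("Googlebot", "/forum/showthread.php"))         # True: no rule matches
```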
2 thanks from:
crazyboy1661, synseal
  #3  
Old 09-28-2013, 06:42 AM
crazyboy1661 crazyboy1661 is offline
 
Join Date: Jan 2011
Location: India
Posts: 135
Thanks: 0 times
Thanked 0 times in 0 posts
Default

Hi bzcomputers, now I get it. You have shared very useful and important information; it is really worth reading and following.

Thanks a lot.

--------------- Added 09-28-2013 at 08:36 AM ---------------

As you specified in your second, modified example (Sitemap: http://www.yoursite.com/sitemap_index.xml.gz), I added that line to my robots.txt file. But when I opened that URL, it showed an error:
PHP Code:
Error loading stylesheet: An unknown error has occurred (805303f4)
http://telugudosti.com/vbseo_sitemap/sitemap.xsl
I didn't find any file with the .xml.gz extension in the root directory.

I found something like the one below in that path.

PHP Code:
store_sitemap/vbulletin_sitemap_thread_0.xml.gz 

So I have added that to my robots.txt. I don't know whether it is correct or not. I need your advice on this, please.


Thanks

--------------- Added 09-28-2013 at 08:45 AM ---------------

Sorry I am confused.
  #4  
Old 10-01-2013, 12:45 PM
bzcomputers bzcomputers is offline
 
Join Date: Apr 2012
Location: TX
Posts: 503
Thanks: 0 times
Thanked 0 times in 0 posts
Default

When using vbSEO Sitemap you must add a rewrite rule to your .htaccess file. This rule keeps your vbSEO sitemap creation files secure while still allowing public access to the sitemaps themselves.

If you already have an .htaccess file in your root you need to add this:
Code:
RewriteRule ^((urllist|sitemap_).*\.(xml|txt)(\.gz)?)$ vbseo_sitemap/vbseo_getsitemap.php?sitemap=$1 [L]
If you do not have an .htaccess file in your root:
Code:
RewriteEngine On

RewriteRule ^((urllist|sitemap_).*\.(xml|txt)(\.gz)?)$ vbseo_sitemap/vbseo_getsitemap.php?sitemap=$1 [L]
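
To get a feel for which requests that rule picks up, the same pattern can be tried in Python, whose re module treats this particular expression the same way. This is a sketch only; is_rewritten is a made-up helper, and the sample names are taken from this thread.

```python
import re

# Same pattern as in the RewriteRule above.
SITEMAP_RULE = re.compile(r"^((urllist|sitemap_).*\.(xml|txt)(\.gz)?)$")


def is_rewritten(path: str) -> bool:
    """True if the request path would be handed to vbseo_getsitemap.php."""
    return SITEMAP_RULE.match(path) is not None


for name in [
    "sitemap_index.xml.gz",
    "urllist.txt",
    "store_sitemap/vbulletin_sitemap_thread_0.xml.gz",
    "sitemap.xsl",
]:
    print(name, is_rewritten(name))
```

Only names starting with urllist or sitemap_ and ending in .xml or .txt (optionally followed by .gz) match, so sitemap_index.xml.gz is served through the vbSEO script even though no such file exists in the root directory.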
...I also notice that your site does not force or remove the URL prefix (the www.). There are benefits to forcing the prefix one way or the other, including a major SEO-related one: it stops search engines from splitting the weight of your URLs. Right now your site is being indexed as both http://telugudosti.com/ and http://www.telugudosti.com/, so every URL beneath it can also be indexed both with and without www. This can split the weight of your URLs between the two addresses, making their overall page ranking much lower. Forcing the prefix one way or the other eliminates this, and in a short time your duplicate listings will be merged within the search engines, creating a single listing for each page and, in turn, the much higher ranking they deserve.

There are many opinions on which way to go (with or without www.). It is really up to you, but you should definitely go one way or the other. To do this you will again be editing your .htaccess file, adding one of the following:

To force with www.:
Code:
RewriteCond %{HTTP_HOST} ^telugudosti\.com$ [NC]
RewriteRule (.*) http://www.telugudosti.com/$1 [R=301,L]

To force without www.:
Code:
RewriteCond %{HTTP_HOST} ^www\.telugudosti\.com$ [NC]
RewriteRule (.*) http://telugudosti.com/$1 [R=301,L]
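
If it helps to see the intent of those two rule pairs outside of Apache, here is a small Python sketch of the same decision. redirect_target is a hypothetical helper, and unlike the rewrite rules, which match one specific host name, it works on any host:

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url: str, force_www: bool = True) -> Optional[str]:
    """Return the 301 target for a non-canonical host, or None if none is needed."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    has_www = host.startswith("www.")
    if force_www and not has_www:
        host = "www." + host
    elif not force_www and has_www:
        host = host[len("www."):]
    else:
        return None  # already on the canonical host
    # Path and query string are carried over unchanged, like $1 in the rule.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(redirect_target("http://telugudosti.com/showthread.php?t=1"))
print(redirect_target("http://www.telugudosti.com/"))
print(redirect_target("http://www.telugudosti.com/forum/", force_www=False))
```

Whichever direction you pick, every page ends up with exactly one address, which is the point of the redirect.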



So if you didn't have an .htaccess file before, it would now look like this (forcing www.):
Code:
RewriteEngine On

# Redirect the bare domain to the www. host
RewriteCond %{HTTP_HOST} ^telugudosti\.com$ [NC]
RewriteRule (.*) http://www.telugudosti.com/$1 [R=301,L]

# Serve vbSEO sitemap files through the sitemap script
RewriteRule ^((urllist|sitemap_).*\.(xml|txt)(\.gz)?)$ vbseo_sitemap/vbseo_getsitemap.php?sitemap=$1 [L]


...
Thanks from:
CAG CheechDogg
  #5  
Old 10-02-2013, 06:19 AM
crazyboy1661 crazyboy1661 is offline
 
Join Date: Jan 2011
Location: India
Posts: 135
Thanks: 0 times
Thanked 0 times in 0 posts
Default

bzcomputers, thanks for the information. I just sent you a PM. Please see the details there.
All times are GMT.