Prevent folders being crawled

#1
03-08-2014, 11:20 AM
DemOnstar
Join Date: Dec 2012
Posts: 859

I have a test site in a folder within my forum root, but I notice that there are 5 Google spiders crawling certain bits of it.

How does one prevent a folder being crawled by any spiders?

Ta..

#2
03-08-2014, 11:48 AM
ozzy47
Join Date: Jul 2009
Location: USA
Posts: 10,929

Do you have a robots.txt file in your site root? (Crawlers only look for it at the top level of the domain, e.g. in public_html.)
If not, create one and add the following to it:

Code:
User-agent: *
Disallow: /forums/MY FOLDER NAME/
Remove "forums" and the slash after it if your forum is installed directly in the public_html folder, or change "forums" to the name of the folder your forum is in.

Change "MY FOLDER NAME" to the name of the test site's folder.

If you already have a robots.txt file, just add this line to it:
Code:
Disallow: /forums/MY FOLDER NAME/
Adjust the path as described above, of course.
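
One way to sanity-check a rule like this before uploading it is Python's standard urllib.robotparser module. The sketch below is only an illustration: the folder name "testsite" and the domain www.example.com are placeholders (not values from this thread), and it assumes the forum lives in /forums/.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "testsite" stands in for MY FOLDER NAME.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forums/testsite/
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "https://www.example.com"  # placeholder domain
    # A page inside the test folder should be disallowed.
    print(is_allowed(ROBOTS_TXT, "Googlebot", base + "/forums/testsite/index.php"))
    # A normal forum page should still be crawlable.
    print(is_allowed(ROBOTS_TXT, "Googlebot", base + "/forums/showthread.php"))
```

This only shows how a rule-following parser reads the file; it says nothing about whether a given crawler chooses to obey it.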

#3
03-08-2014, 12:36 PM
DemOnstar
Join Date: Dec 2012
Posts: 859

robots.txt?
Bloody ell, something else to know more about.
Just googled it, just done it. Thanks..

I came across this robots.txt a while ago and completely forgot about it.

Ta..

#4
03-08-2014, 12:38 PM
ozzy47
Join Date: Jul 2009
Location: USA
Posts: 10,929

Not a problem, glad to help. Since you did not have a robots.txt in place, you may also want to add these lines to it to keep the bots away from these pages/folders:

Code:
Disallow: /cgi-bin/
Disallow: /activity.php
Disallow: /admincp/
Disallow: /announcement.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /misc.php
Disallow: /modcp/
Disallow: /moderator.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /register.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /showpost.php
Disallow: /subscription.php
Disallow: /subscriptions.php
Disallow: /threadrate.php
Disallow: /usercp.php
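
Disallow lines like these only take effect when they sit inside a group that starts with a User-agent line. A minimal sketch with Python's standard urllib.robotparser illustrates the difference; the two-rule excerpt and the helper name `blocked` are chosen here for illustration, and other parsers may differ in details.

```python
from urllib.robotparser import RobotFileParser

# Two rules from the list above, with no User-agent line in front of them.
ORPHAN_RULES = """\
Disallow: /admincp/
Disallow: /search.php
"""

# The same rules placed under a group that applies to every crawler.
GROUPED_RULES = "User-agent: *\n" + ORPHAN_RULES


def blocked(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` is disallowed from fetching `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Without a User-agent line, this parser ignores the rules entirely.
    print(blocked(ORPHAN_RULES, "/admincp/index.php"))
    # Under "User-agent: *" the same path is disallowed.
    print(blocked(GROUPED_RULES, "/admincp/index.php"))
    # Paths that are not listed stay crawlable.
    print(blocked(GROUPED_RULES, "/showthread.php"))
```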

#5
03-08-2014, 01:20 PM
DemOnstar
Join Date: Dec 2012
Posts: 859

I used an online robots.txt generator script, and it came up with this:

Code:
Sitemap: https://www.xxxxxxxxxxx.com/xxxxxxxx/vbulletin_sitemap_blog_0.xml.gz


User-agent: Baiduspider
Disallow: /
User-agent: *
Somebody advised me to add Baiduspider, and the rest is pretty much the same as what you posted previously.

It now looks like this.

Code:
Sitemap: https://www.xxxxxxxxxxx.com/xxxxxxxx/vbulletin_sitemap_blog_0.xml.gz


User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /activity.php
Disallow: /admincp/
Disallow: /announcement.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /misc.php
Disallow: /modcp/
Disallow: /moderator.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /register.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /showpost.php
Disallow: /subscription.php
Disallow: /subscriptions.php
Disallow: /threadrate.php
Disallow: /usercp.php
Which now has me totally baffled.

Cheers for that.
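
For anyone equally baffled, a file like this reads as two groups: one that shuts Baiduspider out of everything, and one that keeps all other crawlers away from the listed paths. The sketch below checks that reading with Python's standard urllib.robotparser. The domain www.example.com and the folder /forum/ are placeholders for the masked values, and only two of the Disallow lines are included to keep it short.

```python
from urllib.robotparser import RobotFileParser

# Shortened, hypothetical version of the file; domain and folder are placeholders.
ROBOTS_TXT = """\
Sitemap: https://www.example.com/forum/vbulletin_sitemap_blog_0.xml.gz

User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow: /admincp/
Disallow: /search.php
"""


def allowed(agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Baiduspider matches its own group, which disallows the whole site.
    print(allowed("Baiduspider", "/showthread.php"))
    # Other crawlers fall through to the "*" group.
    print(allowed("Googlebot", "/showthread.php"))
    print(allowed("Googlebot", "/admincp/index.php"))
    # Rules are matched against the start of the URL path, so a forum
    # installed in a subfolder would need that folder in each Disallow line.
    print(allowed("Googlebot", "/forum/admincp/index.php"))
```

The last check matters if the forum is not at the top level of the domain: "Disallow: /admincp/" does not cover "/forum/admincp/".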

#6
03-08-2014, 02:15 PM
Max Taxable
Join Date: Feb 2011
Posts: 3,134

Keep in mind that robots.txt is a lot like gun control laws: only the law-abiding pay any attention to it. Bad spiders, such as Baidu and hundreds of others, completely ignore robots.txt. It's not a blocker; it is a list of requests.

To block bad bots, get the "Ban Spiders by User Agent" mod. It is linked in my sig.
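
The general idea of blocking by user agent can be sketched in a few lines of Python. This is not the mod's actual code, just an assumed illustration of the technique: the ban list, the function names, and the choice of a 403 response are all hypothetical.

```python
from typing import Optional, Sequence

# Hypothetical ban list: lowercase fragments to look for in the User-Agent header.
BANNED_AGENT_FRAGMENTS = ("baiduspider", "examplebadbot")


def is_banned(user_agent: Optional[str],
              banned: Sequence[str] = BANNED_AGENT_FRAGMENTS) -> bool:
    """Return True if the User-Agent header contains a banned fragment."""
    ua = (user_agent or "").lower()
    return any(fragment in ua for fragment in banned)


def status_for(user_agent: Optional[str]) -> int:
    """Pick an HTTP status: 403 Forbidden for banned agents, 200 otherwise."""
    return 403 if is_banned(user_agent) else 200


if __name__ == "__main__":
    print(status_for("Mozilla/5.0 (compatible; Baiduspider/2.0)"))
    print(status_for("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

Unlike robots.txt, a check like this is enforced by the server rather than left to the crawler, although a bot that sends a fake User-Agent string would still get through.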

#7
03-08-2014, 04:08 PM
DemOnstar
Join Date: Dec 2012
Posts: 859

Cheers for that..
But in all my time running the site, I have had no spam whatsoever. No spammers have ever registered.
It must be going on 2 years now. Nothing.

I am slightly worried.

#8
03-08-2014, 04:10 PM
Max Taxable
Join Date: Feb 2011
Posts: 3,134

Quote:
Originally Posted by DemOnstar
Cheers for that..
But in all my time running the site, I have had no spam whatsoever. No spammers have ever registered.
It must be going on 2 years now. Nothing.

I am slightly worried.

The Ban Spiders mod isn't really an anti-spam mod per se; it just blocks bad spiders and anything else you put on the list. That makes it useful as part of an overall anti-spam defense.

#9
03-08-2014, 04:35 PM
DemOnstar
Join Date: Dec 2012
Posts: 859

If I ever get spam, I will consider it. Thanks.

I think my main problem now is ranking, or more precisely, the lack of it.
This is my next adventure into the wilderness.

#10
03-08-2014, 06:10 PM
Lynne
Join Date: Sep 2004
Location: California/Idaho
Posts: 41,180

Test sites should be password protected. If they are, then they won't get indexed.
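
Password protection works where robots.txt does not because the server refuses to hand over the page at all: a crawler without credentials gets a 401 response and has nothing to index. On a real site this is normally switched on in the web server or hosting control panel; the Python sketch below only illustrates the HTTP Basic check involved, with a made-up username and password.

```python
import base64
import hmac
from typing import Optional


def check_basic_auth(header: Optional[str], user: str, password: str) -> bool:
    """Return True if an Authorization header carries the expected credentials."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[len("Basic "):], validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    expected = f"{user}:{password}"
    # Constant-time comparison to avoid leaking how much of the value matched.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


def status_for(header: Optional[str],
               user: str = "tester", password: str = "s3cret") -> int:
    """200 when the credentials match, 401 Unauthorized otherwise."""
    return 200 if check_basic_auth(header, user, password) else 401


if __name__ == "__main__":
    good = "Basic " + base64.b64encode(b"tester:s3cret").decode("ascii")
    print(status_for(good))  # request with the right credentials
    print(status_for(None))  # a crawler sends no Authorization header
```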
2 thanks from: Max Taxable, ozzy47