Go Back   vb.org Archive > vBulletin 4 Discussion > vB4 General Discussions
FAQ Community Calendar Today's Posts Search

Reply
 
Thread Tools Display Modes
  #1  
Old 09-03-2014, 02:03 PM
jimv8673 jimv8673 is offline
 
Join Date: Mar 2010
Posts: 19
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default Confused !

Google is pretty much non stop hitting my site, but this is what i see when its there

Google Spider 10:54 AM Viewing 'No Permission' Message /calendar.php?do=getinfo&day=2014-9-30&c=1 Viewing Event 66.249.69.169

How can i stop this ??
Reply With Quote
  #2  
Old 09-03-2014, 02:24 PM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Organise your robots.txt to block them
http://www.vbulletin.com/forum/forum...n-4-robots-txt
Reply With Quote
  #3  
Old 09-03-2014, 02:51 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Simon Lloyd View Post
Organise your robots.txt to block them
It won't block them, it just asks please don't visit these files/folders.

Google is usually friendly though, and usually obeys robots.txt.

To the OP: Keep in mind it may take a few days before you see the obedience.
Reply With Quote
  #4  
Old 09-03-2014, 04:25 PM
RichieBoy67's Avatar
RichieBoy67 RichieBoy67 is offline
 
Join Date: Apr 2004
Location: CT - Down in a hole..
Posts: 3,057
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Yep, as the above said..use the robots.txt and add files and directories you do not want crawled..

It is a good thing to see your site being crawled by Google.
Reply With Quote
  #5  
Old 09-03-2014, 04:44 PM
Simon Lloyd's Avatar
Simon Lloyd Simon Lloyd is offline
 
Join Date: Aug 2008
Location: Manchester
Posts: 3,481
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Max Taxable View Post
It won't block them, it just asks please don't visit these files/folders.

Google is usually friendly though, and usually obeys robots.txt.

To the OP: Keep in mind it may take a few days before you see the obedience.
It's just symantics , if you go to your Google Webmaster tools it doesn't say "Kindly asked not to look at these locations" it says "blocked by robots.txt".....i like blocked it sounds so much meaner
Reply With Quote
  #6  
Old 09-03-2014, 04:51 PM
RichieBoy67's Avatar
RichieBoy67 RichieBoy67 is offline
 
Join Date: Apr 2004
Location: CT - Down in a hole..
Posts: 3,057
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Simon Lloyd View Post
It's just symantics , if you go to your Google Webmaster tools it doesn't say "Kindly asked not to look at these locations" it says "blocked by robots.txt".....i like blocked it sounds so much meaner
True but many bots just ignore the robots.txt file completely.

Blocked does sound meaner!
Reply With Quote
  #7  
Old 09-03-2014, 07:46 PM
Max Taxable's Avatar
Max Taxable Max Taxable is offline
 
Join Date: Feb 2011
Posts: 3,134
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by Simon Lloyd View Post
It's just symantics , if you go to your Google Webmaster tools it doesn't say "Kindly asked not to look at these locations" it says "blocked by robots.txt".....i like blocked it sounds so much meaner
Oh I understand that, you do, most all of we more experienced webbers do - but these noobs don't. They see "block" they think it really means, "block" and will be back in a week complaining it didn't work!
Reply With Quote
  #8  
Old 09-03-2014, 07:55 PM
ozzy47's Avatar
ozzy47 ozzy47 is offline
 
Join Date: Jul 2009
Location: USA
Posts: 10,929
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

About /robots.txt

In a nutshell

Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.
It works likes this: a robot wants to vists a Web site URL, say http://www.example.com/welcome.html. Before it does so, it firsts checks for http://www.example.com/robots.txt, and finds:
Code:
User-agent: * Disallow: /
The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site.
There are two important considerations when using /robots.txt:
  • robots can ignore your /robots.txt. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers will pay no attention.
  • the /robots.txt file is a publicly available file. Anyone can see what sections of your server you don't want robots to use.
So don't try to use /robots.txt to hide information.



Why did this robot ignore my /robots.txt?

It could be that it was written by an inexperienced software writer. Occasionally schools set their students "write a web robot" assignments.
But, these days it's more likely that the robot is explicitly written to scan your site for information to abuse: it might be collecting email addresses to send email spam, look for forms to post links ("spamdexing"), or security holes to exploit.


Can I block just bad robots?

In theory yes, in practice, no. If the bad robot obeys /robots.txt, and you know the name it scans for in the User-Agent field. then you can create a section in your /robotst.txt to exclude it specifically. But almost all bad robots ignore /robots.txt, making that pointless.


If the bad robot operates from a single IP address, you can block its access to your web server through server configuration or with a network firewall.


If copies of the robot operate at lots of different IP addresses, such as hijacked PCs that are part of a large Botnet, then it becomes more difficult. The best option then is to use advanced firewall rules configuration that automatically block access to IP addresses that make many connections; but that can hit good robots as well your bad robots.
Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT. The time now is 12:42 PM.


Powered by vBulletin® Version 3.8.12 by vBS
Copyright ©2000 - 2024, vBulletin Solutions Inc.
X vBulletin 3.8.12 by vBS Debug Information
  • Page Generation 0.07934 seconds
  • Memory Usage 2,244KB
  • Queries Executed 13 (?)
More Information
Template Usage:
  • (1)SHOWTHREAD
  • (1)ad_footer_end
  • (1)ad_footer_start
  • (1)ad_header_end
  • (1)ad_header_logo
  • (1)ad_navbar_below
  • (1)ad_showthread_beforeqr
  • (1)ad_showthread_firstpost
  • (1)ad_showthread_firstpost_sig
  • (1)ad_showthread_firstpost_start
  • (1)bbcode_code
  • (4)bbcode_quote
  • (1)footer
  • (1)forumjump
  • (1)forumrules
  • (1)gobutton
  • (1)header
  • (1)headinclude
  • (1)navbar
  • (3)navbar_link
  • (120)option
  • (8)post_thanks_box
  • (8)post_thanks_button
  • (1)post_thanks_javascript
  • (1)post_thanks_navbar_search
  • (8)post_thanks_postbit_info
  • (8)postbit
  • (8)postbit_onlinestatus
  • (8)postbit_wrapper
  • (1)spacer_close
  • (1)spacer_open
  • (1)tagbit_wrapper 

Phrase Groups Available:
  • global
  • inlinemod
  • postbit
  • posting
  • reputationlevel
  • showthread
Included Files:
  • ./showthread.php
  • ./global.php
  • ./includes/init.php
  • ./includes/class_core.php
  • ./includes/config.php
  • ./includes/functions.php
  • ./includes/class_hook.php
  • ./includes/modsystem_functions.php
  • ./includes/functions_bigthree.php
  • ./includes/class_postbit.php
  • ./includes/class_bbcode.php
  • ./includes/functions_reputation.php
  • ./includes/functions_post_thanks.php 

Hooks Called:
  • init_startup
  • init_startup_session_setup_start
  • init_startup_session_setup_complete
  • cache_permissions
  • fetch_postinfo_query
  • fetch_postinfo
  • fetch_threadinfo_query
  • fetch_threadinfo
  • fetch_foruminfo
  • style_fetch
  • cache_templates
  • global_start
  • parse_templates
  • global_setup_complete
  • showthread_start
  • showthread_getinfo
  • forumjump
  • showthread_post_start
  • showthread_query_postids
  • showthread_query
  • bbcode_fetch_tags
  • bbcode_create
  • showthread_postbit_create
  • postbit_factory
  • postbit_display_start
  • post_thanks_function_post_thanks_off_start
  • post_thanks_function_post_thanks_off_end
  • post_thanks_function_fetch_thanks_start
  • post_thanks_function_fetch_thanks_end
  • post_thanks_function_thanked_already_start
  • post_thanks_function_thanked_already_end
  • fetch_musername
  • postbit_imicons
  • bbcode_parse_start
  • bbcode_parse_complete_precache
  • bbcode_parse_complete
  • postbit_display_complete
  • post_thanks_function_can_thank_this_post_start
  • tag_fetchbit_complete
  • forumrules
  • navbits
  • navbits_complete
  • showthread_complete