vb.org Archive


jimv8673 09-03-2014 02:03 PM

Confused !
 
Google is hitting my site pretty much nonstop, but this is what I see when it's there:

Google Spider 10:54 AM Viewing 'No Permission' Message /calendar.php?do=getinfo&day=2014-9-30&c=1 Viewing Event 66.249.69.169

How can I stop this?

Simon Lloyd 09-03-2014 02:24 PM

Organise your robots.txt to block them :)
http://www.vbulletin.com/forum/forum...n-4-robots-txt

Max Taxable 09-03-2014 02:51 PM

Quote:

Originally Posted by Simon Lloyd (Post 2513625)
Organise your robots.txt to block them

It won't block them, it just asks them not to visit these files/folders. :D

Google is usually friendly though, and usually obeys robots.txt.

To the OP: Keep in mind it may take a few days before you see the obedience.

RichieBoy67 09-03-2014 04:25 PM

Yep, as said above: use robots.txt and add the files and directories you do not want crawled.

It is a good thing to see your site being crawled by Google. :)

Simon Lloyd 09-03-2014 04:44 PM

Quote:

Originally Posted by Max Taxable (Post 2513629)
It won't block them, it just asks them not to visit these files/folders. :D

Google is usually friendly though, and usually obeys robots.txt.

To the OP: Keep in mind it may take a few days before you see the obedience.

It's just semantics :). If you go to your Google Webmaster Tools, it doesn't say "Kindly asked not to look at these locations", it says "blocked by robots.txt"... I like "blocked", it sounds so much meaner ;)

RichieBoy67 09-03-2014 04:51 PM

Quote:

Originally Posted by Simon Lloyd (Post 2513647)
It's just semantics :). If you go to your Google Webmaster Tools, it doesn't say "Kindly asked not to look at these locations", it says "blocked by robots.txt"... I like "blocked", it sounds so much meaner ;)

True, but many bots just ignore the robots.txt file completely.

"Blocked" does sound meaner! :D

Max Taxable 09-03-2014 07:46 PM

Quote:

Originally Posted by Simon Lloyd (Post 2513647)
It's just semantics :). If you go to your Google Webmaster Tools, it doesn't say "Kindly asked not to look at these locations", it says "blocked by robots.txt"... I like "blocked", it sounds so much meaner ;)

Oh, I understand that, you do, and most of us more experienced webbers do, but these noobs don't. They see "block", think it really means "block", and will be back in a week complaining it didn't work! :D

ozzy47 09-03-2014 07:55 PM

About /robots.txt

In a nutshell

Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.
It works like this: a robot wants to visit a Web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks for http://www.example.com/robots.txt, and finds:
Code:

User-agent: *
Disallow: /
The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site.
There are two important considerations when using /robots.txt:
  • robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities and email address harvesters used by spammers will pay no attention.
  • the /robots.txt file is a publicly available file. Anyone can see what sections of your server you don't want robots to use.
So don't try to use /robots.txt to hide information.
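The check a well-behaved robot makes before fetching a page can be sketched in Python with the standard library's urllib.robotparser. This is only an illustration, not how any particular crawler is written; the helper name is_allowed and the robot name ExampleBot are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper: a compliant robot runs a check like this
    voluntarily. Nothing on the server enforces the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "Disallow: /" under "User-agent: *" covers every page for every robot.
    print(is_allowed(ROBOTS_TXT, "ExampleBot",
                     "http://www.example.com/welcome.html"))
```

With a narrower rule such as "Disallow: /calendar.php", the same check would turn a compliant crawler away from the calendar URLs only and leave the rest of the site crawlable.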



Why did this robot ignore my /robots.txt?

It could be that it was written by an inexperienced software writer. Occasionally schools set their students "write a web robot" assignments.
But these days it's more likely that the robot is explicitly written to scan your site for information to abuse: it might be collecting email addresses to send spam, looking for forms to post links ("spamdexing"), or probing for security holes to exploit.


Can I block just bad robots?

In theory yes, in practice no. If the bad robot obeys /robots.txt, and you know the name it scans for in the User-Agent field, then you can create a section in your /robots.txt to exclude it specifically. But almost all bad robots ignore /robots.txt, making that pointless.
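As a rough sketch of what such a per-robot section looks like from the robot's side, the snippet below parses a robots.txt that excludes one named robot and allows everyone else. "BadBot" and "GoodBot" are placeholder names, and can_visit is a hypothetical helper; the point is that the exclusion only works if the robot actually performs this check.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: one section for a named robot, one for all others.
# An empty "Disallow:" means nothing is disallowed.
PER_ROBOT_RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""


def can_visit(user_agent: str, url: str) -> bool:
    """Hypothetical helper: ask the parsed rules whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(PER_ROBOT_RULES.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/welcome.html"
    for agent in ("BadBot", "GoodBot"):
        print(agent, can_visit(agent, page))
```

A robot that never reads /robots.txt never reaches this check, which is why the section is pointless against most bad robots.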


If the bad robot operates from a single IP address, you can block its access to your web server through server configuration or with a network firewall.


If copies of the robot operate at lots of different IP addresses, such as hijacked PCs that are part of a large botnet, then it becomes more difficult. The best option then is to use advanced firewall rules that automatically block access to IP addresses that make many connections; but that can hit good robots as well as your bad robots.
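The "many connections" idea can be illustrated with a small sliding-window counter. This is a sketch of the logic only, not a firewall configuration; the class name ConnectionRateTracker and the thresholds are invented for the example, and real firewalls implement this in their own rule syntax.

```python
from collections import defaultdict, deque


class ConnectionRateTracker:
    """Flag IP addresses that connect too often within a time window.

    Hypothetical sketch: max_connections and window_seconds are
    arbitrary values chosen by whoever configures the rule.
    """

    def __init__(self, max_connections: int, window_seconds: float) -> None:
        self.max_connections = max_connections
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # ip -> timestamps of recent connections

    def should_block(self, ip: str, now: float) -> bool:
        """Record a connection from ip at time now; True if it exceeds the limit."""
        timestamps = self._seen[ip]
        timestamps.append(now)
        # Forget connections that have fallen out of the window.
        while timestamps and now - timestamps[0] > self.window_seconds:
            timestamps.popleft()
        return len(timestamps) > self.max_connections


if __name__ == "__main__":
    tracker = ConnectionRateTracker(max_connections=3, window_seconds=10.0)
    # The rule only sees an address and a rate, not whether the robot is "good".
    for second in range(5):
        print(second, tracker.should_block("66.249.69.169", float(second)))
```

Because the counter looks only at the address and the rate, a fast but legitimate crawler trips the same limit as a bad one.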


All times are GMT.