The Archive of Official vBulletin Modifications Site. It is not a VB3 engine, just a parsed copy!
#1
Confused!
Google is pretty much non-stop hitting my site, but this is what I see when it's there:
Google Spider 10:54 AM Viewing 'No Permission' Message /calendar.php?do=getinfo&day=2014-9-30&c=1 Viewing Event 66.249.69.169
How can I stop this?
#2
Organise your robots.txt to block them:
http://www.vbulletin.com/forum/forum...n-4-robots-txt
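For the calendar URL in the first post, a rule along these lines might do it. This is only a sketch: the rule set is an assumption (it is not taken from the linked vBulletin thread), `www.example.com` stands in for the real site, and `is_allowed` is a hypothetical helper. It uses Python's standard `urllib.robotparser` to check how a well-behaved crawler would read the rules.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: ask every crawler to stay out of the calendar script.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar.php
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a crawler that obeys robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The calendar URL from the first post, on a placeholder host.
calendar_url = "http://www.example.com/calendar.php?do=getinfo&day=2014-9-30&c=1"
print(is_allowed("Googlebot", calendar_url))
print(is_allowed("Googlebot", "http://www.example.com/index.php"))
```

The check only tells you what a crawler that honours robots.txt should do; it does not stop a crawler that ignores the file.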
#3
It won't block them; it just asks, "please don't visit these files/folders."
Google is usually friendly, though, and usually obeys robots.txt. To the OP: keep in mind it may take a few days before you see the obedience.
#4
Yep, as the above said: use robots.txt and add the files and directories you do not want crawled.
It is a good thing to see your site being crawled by Google.
#5
It's just semantics. If you go to your Google Webmaster Tools it doesn't say "Kindly asked not to look at these locations", it says "blocked by robots.txt". I like "blocked"; it sounds so much meaner.
#6
Blocked does sound meaner!
#7
Oh, I understand that. You do, and most of us more experienced webbers do, but these noobs don't. They see "block", they think it really means "block", and they will be back in a week complaining it didn't work!
#8
About /robots.txt
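In a nutshell: web site owners use the /robots.txt file to give instructions about their site to web robots; this is called the Robots Exclusion Protocol. It works like this: a robot wants to visit a web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks for http://www.example.com/robots.txt, and finds:
Code:
User-agent: *
Disallow: /
The "User-agent: *" line means the section applies to all robots, and "Disallow: /" tells the robot not to visit any pages on the site.
There are two important considerations when using /robots.txt: robots can ignore it (malware robots that scan for security vulnerabilities and email address harvesters pay no attention to it), and the file is publicly available, so anyone can see which sections of your server you don't want robots to use.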
Why did this robot ignore my /robots.txt?
It could be that it was written by an inexperienced software writer. Occasionally schools set their students "write a web robot" assignments. But these days it's more likely that the robot is explicitly written to scan your site for information to abuse: it might be collecting email addresses to send email spam, looking for forms to post links ("spamdexing"), or looking for security holes to exploit.
Can I block just bad robots?
In theory yes, in practice no. If the bad robot obeys /robots.txt, and you know the name it scans for in the User-Agent field, then you can create a section in your /robots.txt to exclude it specifically. But almost all bad robots ignore /robots.txt, making that pointless.
If the bad robot operates from a single IP address, you can block its access to your web server through server configuration or with a network firewall.
If copies of the robot operate at lots of different IP addresses, such as hijacked PCs that are part of a large botnet, then it becomes more difficult. The best option then is to use advanced firewall rules that automatically block access to IP addresses that make many connections; but that can hit good robots as well as your bad robots.
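To make the last idea a bit more concrete, here is a toy sketch of "block IP addresses that make many connections" as a per-IP sliding-window counter. It is not firewall configuration and not taken from the quoted text: the class name `ConnectionLimiter`, its parameters, and the limit of 3 hits per 10 seconds are all made up for illustration, and the IP address is from a documentation-only range.

```python
from collections import defaultdict, deque


class ConnectionLimiter:
    """Toy per-IP sliding-window counter (all names are hypothetical)."""

    def __init__(self, max_hits: int, window_seconds: float) -> None:
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted hits

    def allow(self, ip: str, now: float) -> bool:
        """Return False once an IP has max_hits accepted hits in the window."""
        hits = self._hits[ip]
        # Forget hits that have fallen out of the time window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_hits:
            return False
        hits.append(now)
        return True


# Illustrative limit: at most 3 connections per 10 seconds from one address.
limiter = ConnectionLimiter(max_hits=3, window_seconds=10.0)
results = [limiter.allow("203.0.113.5", t) for t in (0.0, 1.0, 2.0, 3.0, 12.5)]
print(results)  # [True, True, True, False, True]
```

A counter like this cannot tell Googlebot from a spam harvester, which is the trade-off mentioned above: a limit tight enough to stop a bad robot can also hit a good one that crawls quickly.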