The Archive of Official vBulletin Modifications Site.

jimv8673 · #1 09-03-2014, 02:03 PM

Google is hitting my site pretty much nonstop, but this is what I see when it's there:

Google Spider 10:54 AM Viewing 'No Permission' Message /calendar.php?do=getinfo&day=2014-9-30&c=1 Viewing Event 66.249.69.169

How can I stop this?

Simon Lloyd · #2 09-03-2014, 02:24 PM

Organise your robots.txt to block them

http://www.vbulletin.com/forum/forum...n-4-robots-txt

Max Taxable · #3 09-03-2014, 02:51 PM

Quote:

Originally Posted by Simon Lloyd

Organise your robots.txt to block them

It won't block them, it just asks please don't visit these files/folders.

Google is usually friendly though, and usually obeys robots.txt.

To the OP: Keep in mind it may take a few days before you see the obedience.

RichieBoy67 · #4 09-03-2014, 04:25 PM

Yep, as the posts above said, use robots.txt and add the files and directories you do not want crawled.

It is a good thing to see your site being crawled by Google.
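As a rough illustration of the advice above, the sketch below checks a robots.txt rule against the calendar URL from the first post using Python's standard `urllib.robotparser`. The `Disallow: /calendar.php` rule, the `www.example.com` host, the `showthread.php` URL, and the `is_allowed` helper are assumptions made for the example, not something posted in the thread.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: ask all robots to stay out of calendar.php.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar.php
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a robot honouring robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical host; the path and query string come from the first post.
calendar_url = "http://www.example.com/calendar.php?do=getinfo&day=2014-9-30&c=1"
thread_url = "http://www.example.com/showthread.php?t=1"

print(is_allowed(ROBOTS_TXT, "Googlebot", calendar_url))  # False
print(is_allowed(ROBOTS_TXT, "Googlebot", thread_url))    # True
```

This only shows how a well-behaved crawler would read the rule; it does not stop a robot that never looks at robots.txt.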

Simon Lloyd · #5 09-03-2014, 04:44 PM

Quote:

Originally Posted by Max Taxable

It won't block them, it just asks please don't visit these files/folders.

Google is usually friendly though, and usually obeys robots.txt.

To the OP: Keep in mind it may take a few days before you see the obedience.

It's just semantics. If you go to your Google Webmaster Tools, it doesn't say "Kindly asked not to look at these locations", it says "blocked by robots.txt". I like "blocked", it sounds so much meaner.

RichieBoy67 · #6 09-03-2014, 04:51 PM

Quote:

Originally Posted by Simon Lloyd

It's just semantics. If you go to your Google Webmaster Tools, it doesn't say "Kindly asked not to look at these locations", it says "blocked by robots.txt". I like "blocked", it sounds so much meaner.

True but many bots just ignore the robots.txt file completely.

Blocked does sound meaner!

Max Taxable · #7 09-03-2014, 07:46 PM

Quote:

Originally Posted by Simon Lloyd

It's just semantics. If you go to your Google Webmaster Tools, it doesn't say "Kindly asked not to look at these locations", it says "blocked by robots.txt". I like "blocked", it sounds so much meaner.

Oh, I understand that, you do, and most of us more experienced webbers do, but these noobs don't. They see "block", think it really means "block", and will be back in a week complaining it didn't work!

ozzy47 · #8 09-03-2014, 07:55 PM

About /robots.txt

In a nutshell

Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.
It works like this: a robot wants to visit a Web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks for http://www.example.com/robots.txt, and finds:

Code:

User-agent: *
Disallow: /

The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site.
There are two important considerations when using /robots.txt:

Robots can ignore your /robots.txt. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers, will pay no attention.
The /robots.txt file is a publicly available file. Anyone can see what sections of your server you don't want robots to use.

So don't try to use /robots.txt to hide information.
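The check described above can be sketched from the robot's side with Python's standard `urllib.robotparser`. The robot name `ExampleBot` is hypothetical; the two rules and the URL are the ones quoted in the text.

```python
from urllib.robotparser import RobotFileParser

# The two lines the robot finds at http://www.example.com/robots.txt.
rules = ["User-agent: *", "Disallow: /"]

parser = RobotFileParser()
parser.parse(rules)

# A polite robot asks before fetching. Nothing forces it to ask,
# which is why robots.txt is a request rather than an access control.
welcome_allowed = parser.can_fetch(
    "ExampleBot", "http://www.example.com/welcome.html"
)
print(welcome_allowed)  # False
```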

Why did this robot ignore my /robots.txt?

It could be that it was written by an inexperienced software writer. Occasionally schools set their students "write a web robot" assignments.
But these days it's more likely that the robot is explicitly written to scan your site for information to abuse: it might be collecting email addresses to send email spam, looking for forms to post links ("spamdexing"), or probing for security holes to exploit.

Can I block just bad robots?

In theory yes, in practice, no. If the bad robot obeys /robots.txt, and you know the name it scans for in the User-Agent field, then you can create a section in your /robots.txt to exclude it specifically. But almost all bad robots ignore /robots.txt, making that pointless.
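For illustration, a per-robot section might look like the one below, checked here with Python's standard `urllib.robotparser`. The name `BadBot` is made up for the example, and the rule only has any effect on a robot that chooses to obey it.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: shut out one named robot, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "http://www.example.com/welcome.html"
bad_bot_allowed = parser.can_fetch("BadBot/1.0", url)
google_allowed = parser.can_fetch("Googlebot", url)

print(bad_bot_allowed)  # False
print(google_allowed)   # True
```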

If the bad robot operates from a single IP address, you can block its access to your web server through server configuration or with a network firewall.
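In practice this is done in the web server or firewall configuration, but the underlying check is simple. A minimal Python sketch, assuming a hypothetical deny list built from documentation-only address ranges:

```python
import ipaddress

# Hypothetical deny list: one single address and one whole range.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.7/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip):
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


print(is_blocked("203.0.113.7"))    # True
print(is_blocked("198.51.100.42"))  # True
print(is_blocked("192.0.2.10"))     # False
```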

If copies of the robot operate at lots of different IP addresses, such as hijacked PCs that are part of a large botnet, then it becomes more difficult. The best option then is to use advanced firewall rules that automatically block access to IP addresses that make many connections; but that can hit good robots as well as your bad robots.
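The idea of blocking addresses that make many connections can be sketched as a sliding-window counter. This is only an illustration of the logic, not a firewall configuration: the `ConnectionLimiter` class, the limit of 3 hits per 10 seconds, and the sample address are all assumptions.

```python
from collections import defaultdict, deque


class ConnectionLimiter:
    """Refuse an IP once it exceeds max_hits within window_seconds."""

    def __init__(self, max_hits, window_seconds):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted hits

    def allow(self, ip, now):
        recent = self.hits[ip]
        # Forget hits that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False
        recent.append(now)
        return True


limiter = ConnectionLimiter(max_hits=3, window_seconds=10.0)
results = [limiter.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 3.0, 12.5)]
print(results)  # [True, True, True, False, True]
```

The counter knows nothing about who is behind an address, so a fast but legitimate crawler is refused just like a bad one, which is the trade-off mentioned above.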
