vb.org Archive

vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 2.x Beta Releases (https://vborg.vbsupport.ru/forumdisplay.php?f=5)
-   -   robots.txt Manager (https://vborg.vbsupport.ru/showthread.php?t=48698)

MUG 02-07-2003 10:00 PM

robots.txt Manager
 
This script allows you to easily create a dynamically generated robots.txt file, based on specified rules.
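
For anyone who hasn't written one by hand, the generated file is just the standard robots.txt format. The rules below are only an illustration, not the hack's defaults:

Code:

User-agent: *
Disallow: /admin/
Disallow: /private/

User-agent: TurnitinBot
Disallow: /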

If you use this hack, please click 'Install' :)

Screenshots will be attached...

MUG 02-08-2003 07:00 PM

Control Panel

MUG 02-08-2003 07:01 PM

Editor (All Robots)

MUG 02-08-2003 07:01 PM

Editor (Specific Robot)

MUG 02-08-2003 07:01 PM

Generated File (:banana:)

Neo 02-08-2003 07:08 PM

Excellent G.

Dean C 02-08-2003 07:31 PM

Wow very nice!

wooolF[RM] 02-08-2003 08:04 PM

Very good job, though I can edit one txt file by hand :) No offence, again, very good job :)

May I ask if you have a list of the IPs of some major/all search engines? I kinda need it :) Thanx a lot, and again, a nice hack!

MUG 02-08-2003 08:07 PM

The point of the hack is to make administration easier. It keeps track of robots requesting the robots.txt file, allowing you to ban or restrict a bot without having to dig through the server logs. I wrote the hack today, so the only bots included in the database already are the ones that spidered my site during that time. The list is in the .sql file.
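
If you're curious, the logging part is tiny; roughly something like this (a simplified sketch, not the exact code in the zip, and the column names are just placeholders):

Code:

<?php
// Simplified sketch of the logging idea (placeholder table/column names):
// record the user-agent and IP of whatever requests robots.txt so it can
// be reviewed (and banned/restricted) from the control panel later.
// Assumes a database connection is already open.
$agent = addslashes($_SERVER['HTTP_USER_AGENT']);
$ip    = addslashes($_SERVER['REMOTE_ADDR']);
mysql_query("INSERT INTO robots_log (useragent, ipaddress, dateline)
             VALUES ('$agent', '$ip', " . time() . ")");
?>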

wooolF[RM] 02-08-2003 08:37 PM

Thanx for the answer :)

djr 02-08-2003 08:50 PM

Is it supposed to write a new robots.txt file every time, or do the bots see the robots.php file as robots.txt?

If your answer is that it's supposed to write a new robots.txt file, it isn't working for me :-(

And: do I still need a robots.txt file?

MUG 02-08-2003 09:06 PM

Quote:

Originally posted by djr
Is it supposed to write a new robots.txt file every time, or do the bots see the robots.php file as robots.txt?

If your answer is that it's supposed to write a new robots.txt file, it isn't working for me :-(

And: do I still need a robots.txt file?

It uses mod_rewrite to send requests to robots.php. You have to create an .htaccess file with the following:

Code:

RewriteEngine on
RewriteRule robots.txt /robots.php

(Note: this is for the old version; read the new install file. :))

Upload robots.php to the root web directory (usually public_html). Make sure you run robots.sql using phpMyAdmin. :)
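
If you ever find the loose pattern above matching more than you want, a slightly tighter variant (plain mod_rewrite, not taken from the install file) is:

Code:

RewriteEngine on
RewriteRule ^robots\.txt$ /robots.php [L]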

djr 02-08-2003 09:33 PM

I did that already ;)
So the robots are redirected to robots.php, which feeds them a perfectly rendered robots.txt file? Sorry 'bout asking, but I don't want to break my (high) ranking(s).

- djr

MUG 02-08-2003 10:10 PM

Quote:

Originally posted by djr
I did that already ;)
So the robots are redirected to robots.php, which feeds them a perfectly rendered robots.txt file? Sorry 'bout asking, but I don't want to break my (high) ranking(s).

- djr

Yup. The only difference is the X-Powered-By header generated automatically by PHP.
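
In other words, as long as robots.php sends its output as plain text, a crawler sees exactly what it would see from a static file. Roughly (a sketch, not the hack's actual source):

Code:

<?php
// Sketch only: serve the generated rules exactly like a static robots.txt.
// PHP adds the X-Powered-By header automatically unless expose_php is
// turned off in php.ini; the body itself is identical to a hand-written file.
header('Content-Type: text/plain');
echo "User-agent: *\n";
echo "Disallow: /admin/\n";
?>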

BigCheeze 02-08-2003 10:16 PM

Thanks! I just installed it. Let's see if I can control those bots a little more!!

SphereX 02-09-2003 02:09 AM

very nice!

***installs

djr 02-09-2003 09:51 AM

Hi MUG,

Can you add two more columns, 'Owner' and 'Origin' (or whatever you might want to call them), where we can add the owner and origin of the spider?

For example:
Code:

googlebot | Googlebot/2.1 (+http://www.googlebot.com/bot.html) | 216.239.46.19 | Google | http://www.google.com | 4 Edit - Delete |

Not every spider describes itself fully. For example, Mercator-2.0 is one of AltaVista's robots, but there's no link to AltaVista whatsoever.

Thanks,
- djr

djr 02-09-2003 09:54 AM

I found some good overviews of spiders here and here. If anyone has more of these lists, please add them to this thread.

Thanks,
- djr

MUG 02-09-2003 10:24 AM

Quote:

Originally posted by djr
Hi MUG,

Can you add two more columns, 'Owner' and 'Origin' (or whatever you might want to call them), where we can add the owner and origin of the spider?

For example:
Code:

googlebot | Googlebot/2.1 (+http://www.googlebot.com/bot.html) | 216.239.46.19 | Google | http://www.google.com | 4 Edit - Delete |

Not every spider describes itself fully. For example, Mercator-2.0 is one of AltaVista's robots, but there's no link to AltaVista whatsoever.

Thanks,
- djr

Ooh, thanks. I was wondering what Mercator-2.0 was. :paranoid:

I'll add a description field, but there's not enough room for it to show on the main page so you'll have to click edit to view it.

MUG 02-09-2003 10:40 AM

Version 1.0 final released. :pirate:

MUG 02-09-2003 11:26 AM

Can this thread be moved to the Full Releases forum?

Velocd 02-09-2003 04:31 PM

I have a slight problem with googlebots: they storm my forum in huge numbers. Currently, for example, I have 7 googlebots crawling my forum. That seems excessive to me, and I would like to somehow limit the number of googlebots to maybe 2.

What is the robots.txt directive to do this? Or maybe there is some alternative method.

Thanks ;)

MUG 02-09-2003 04:42 PM

Quote:

Originally posted by Velocd
I have a slight problem with googlebots: they storm my forum in huge numbers. Currently, for example, I have 7 googlebots crawling my forum. That seems excessive to me, and I would like to somehow limit the number of googlebots to maybe 2.

What is the robots.txt directive to do this? Or maybe there is some alternative method.

Thanks ;)

Honestly, I don't think that is possible with robots.txt. If you created something that would dynamically insert text into a robots.txt file based on the number of Googlebots spidering your site, Google might "take the hint" and never come back. :ermm:
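
For what it's worth, the closest robots.txt gets is the non-standard Crawl-delay line that a few crawlers (Inktomi's Slurp, for example) recognise, but it only spaces requests out rather than capping how many bots visit, and Googlebot doesn't honour it:

Code:

User-agent: Slurp
Crawl-delay: 10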

Velocd 02-10-2003 01:43 AM

Drat.. :ermm:

Wish it were possible somehow, oh well. My current bandwidth is being consumed quickly by these googlebots, so I guess I'll simply have to restrict them from the threads.

Automated 02-10-2003 11:40 AM

Quote:

Originally posted by Velocd
Drat.. :ermm:

Wish it were possible somehow, oh well. My current bandwidth is being consumed quickly by these googlebots, so I guess I'll simply have to restrict them from the threads.

Restricting them from the threads? :confused: What's the point of getting spidered then?

djr 02-11-2003 08:04 PM

We have two different domains, but only one MySQL database. Is it possible to place robots.php on both domains (and thus use the same tables)?

- djr

djr 02-13-2003 08:46 AM

Already found it. Just rename the robots_log table to robots_log_domain1, create another one with _domain2, and update robots.php accordingly.
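
Something along these lines inside robots.php does the trick (the host names are obviously placeholders for your own two domains):

Code:

<?php
// Sketch of the per-domain trick (placeholder domain names): both sites
// share one database but read/write their own robots_log_* table.
if ($_SERVER['HTTP_HOST'] == 'www.domain1.com') {
    $log_table = 'robots_log_domain1';
} else {
    $log_table = 'robots_log_domain2';
}
// ...then use $log_table wherever robots.php touches the log table.
?>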

- djr

mheinemann 02-16-2003 02:48 PM

Installed, works great!

MUG 02-16-2003 10:54 PM

Glad that you like it. :cool:

Any suggestions? :)

mheinemann 02-17-2003 12:19 PM

The only suggestion I can think of is being able to import your current robots.txt

I had disallowed "turnitin" and would like to be able to still block them.

MUG 02-17-2003 04:56 PM

Quote:

Originally posted by mheinemann
I had disallowed "turnitin" and would like to be able to still block them.
I thought that I already included TurnitinBot in the .sql file?
Quote:

Originally posted by mheinemann
The only suggestion I can think of is being able to import your current robots.txt
Good idea... it shouldn't be too hard to implement. :)
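
Something like this would probably cover the common case (just a sketch of the idea, not a promise of how the feature will end up looking):

Code:

<?php
// Rough sketch of an "import existing robots.txt" routine (not shipped code).
// Builds an array of user-agent => disallowed paths from a hand-written file.
$rules   = array();
$current = array();   // user-agents the rules being read currently apply to

foreach (file('robots.txt') as $line) {
    $line = trim(preg_replace('/#.*$/', '', $line));   // strip comments
    if ($line == '') {
        $current = array();                            // blank line ends a record
        continue;
    }
    if (strpos($line, ':') === false) {
        continue;                                      // skip malformed lines
    }
    list($field, $value) = explode(':', $line, 2);
    $field = strtolower(trim($field));
    $value = trim($value);
    if ($field == 'user-agent') {
        $current[] = $value;
    } else if ($field == 'disallow') {
        foreach ($current as $agent) {
            $rules[$agent][] = $value;
        }
    }
}
// $rules could then be written into the hack's tables.
?>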

mheinemann 03-01-2003 11:47 PM

And maybe being able to manually edit it as well.

stryka 03-09-2003 01:18 AM

My current robots.txt file isn't being overwritten when I click submit.

I made the changes to the .htaccess file... is there anything else I should look at??

Thanx

MUG 04-12-2003 01:05 PM

1.1 Beta released. It includes the following bug fixes / additions:

1. Strips comments from the generated file (although they are allowed by the robots.txt specification, some bots choke on them)
2. Repairs newlines in the generated file (the old version sometimes produced \r\r\n; a simplified version of the fix is sketched below)
3. Cleaner interface for the control panel
4. Several other things I can't remember :confused:
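
The newline repair in point 2 boils down to something like this (simplified):

Code:

<?php
// Simplified version of the newline cleanup: normalise \r\r\n, \r\n and
// stray \r in the generated output down to a single \n per line break.
$output = str_replace(array("\r\r\n", "\r\n", "\r"), "\n", $output);
?>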

Mickie D 05-29-2003 11:36 AM

Thanks for this hack, it's very useful and I think it should be a full release :)

I have one problem, and it has nothing to do with your script :)

Where can I find info on which bots I should ban? I never had TurnitinBot banned before... why is that bot bad???

PixelFx 05-29-2003 03:34 PM

Very cool, now I don't need to do this manually all the time ;) Thank you for sharing :)

stryka 07-30-2003 08:00 PM

I get an error after I updated to 1.1

Fatal error: db_connect(): Failed opening required '' (include_path='') in /home/name/public_html/robots.php on line 63

MUG 07-30-2003 08:38 PM

Did you change $vB_Config_Path to the correct path?
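
It needs to point at your board's config file; the exact location depends on your install, so the path below is only an example:

Code:

// In robots.php - adjust to wherever your vBulletin config.php actually lives
// (this path is only an example).
$vB_Config_Path = '/home/name/public_html/admin/config.php';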

daFish 10-07-2003 07:06 AM

Great addition.
Are there any plans for a new version?

-Fish

sabret00the 11-04-2003 07:19 PM

nice little hack this :)

