vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 3.5 Add-ons (https://vborg.vbsupport.ru/forumdisplay.php?f=113)
-   -   Google sitemap for the vB Archives. Redirect human and robots. (https://vborg.vbsupport.ru/showthread.php?t=93980)

Suiko Jin 10-16-2005 09:33 PM

Quote:

Originally Posted by Suiko Jin
Alright, I did everything and installed it. Then I submitted this link to the google sitemap.

http://guardiansanctuary.net/forums/g_sitemap.xml

So I wait a day and then it says that it is ok but when I check on the stats, it says that it has generated some HTTP Error.

Still can't figure this out...

exceem 10-17-2005 02:28 PM

all 3 steps installed :)

and all working fine (hopefully!!!)

Thanks for the hack :)

exceem 10-17-2005 07:13 PM

been getting a few php errors in my error log:

error is
PHP Fatal error: Class 'vBulletinHook' not found in /home/trevor/public_html/forums/includes/functions.php on line 4322

I think it's referring to this bit of code:
Code:

function exec_header_redirect($url)
{
        global $vbulletin;

        $url = create_full_url($url);

        if (class_exists('vBulletinHook'))
        {
                // this can be called when we don't have the hook class
                ($hook = vBulletinHook::fetch_hook('header_redirect')) ? eval($hook) : false;
        }

        $url = str_replace('&amp;', '&', $url); // prevent possible oddity

        if (SAPI_NAME == 'cgi' OR SAPI_NAME == 'cgi-fcgi')
        {
                header('Status: 301 Moved Permanently');
        }
        else
        {
                header('HTTP/1.1 301 Moved Permanently');
        }

        header("Location: $url");
        define('NOPMPOPUP', 1);
        if (defined('NOSHUTDOWNFUNC'))
        {
                exec_shut_down();
        }
        exit;
}

The redirect works for me: visiting an archived link takes me to the "proper" thread.

I'm checking my logs now to see whether it fails the other way around. Any ideas as to what's causing this error?

007 10-18-2005 02:20 PM

I got this error now:

A Sitemap Index may not directly or indirectly reference itself. Please fix your Sitemap Index before resubmitting.

How does this Google Sitemap hack compare to the other one on vB?

eoc_Jason 10-18-2005 02:53 PM

What is the other one? (URL?)

buro9 10-18-2005 03:16 PM

I got something similar but couldn't see anything wrong with the sitemap... so I posted on the Google Groups with it:
http://groups.google.com/group/googl...7e0aca343fb8ec

Similar error:
Code:

Recursive Index

A Sitemap Index may not directly or indirectly reference itself.
Please fix your Sitemap Index before resubmitting.


eoc_Jason 10-18-2005 03:44 PM

Just checked today and got the same error as you! Hmm...

xtreme-mobile 10-18-2005 05:44 PM

When running sitemap.php I get this:

Script can only be run by vB Scheduled Tasks. Set $run_by_vb_Scheduled_Task_only to 0 to call this script directly.

I don't understand this bit. How do I change it to what it's asking? :)

dutchbb 10-18-2005 06:33 PM

Quote:

Originally Posted by xtreme-mobile
When running sitemap.php I get this:

Script can only be run by vB Scheduled Tasks. Set $run_by_vb_Scheduled_Task_only to 0 to call this script directly.

I don't understand this bit. How do I change it to what it's asking? :)

Open the /archive/forums_sitemap.php file.

Change: $run_by_vb_Scheduled_Task_only = 1;

To: $run_by_vb_Scheduled_Task_only = 0;

Now you (and everyone else) can run the script directly.
You can also leave it at 1 and run the script from your scheduled tasks manager; that way, you are the only person who can run it.
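
Going by the error message quoted above, the flag acts as a simple guard at the top of the script. A minimal sketch of that idea follows. It is not the hack's actual code: the function name `may_run_sitemap_script` and the `$calledFromScheduledTask` argument are assumptions made for illustration.

```php
<?php
// Hypothetical guard, modelled only on the error message in this thread.
// $runByScheduledTaskOnly mirrors $run_by_vb_Scheduled_Task_only.
function may_run_sitemap_script(int $runByScheduledTaskOnly, bool $calledFromScheduledTask): bool
{
    if ($runByScheduledTaskOnly === 0) {
        // 0: the script may be requested directly, e.g. from a browser.
        return true;
    }

    // Any other value: only a run started by the scheduled task is allowed.
    return $calledFromScheduledTask;
}
```

With the flag at 0 anyone who knows the URL can trigger a rebuild, which is presumably why the script restricts direct calls by default.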

dutchbb 10-18-2005 06:38 PM

BTW the beta script is working fine AFAICS

Quote:

Originally Posted by lierduh
New beta script.

Threads containing newer posts will have higher sitemap priority with this new script.
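
The beta script's actual formula is not shown in this thread. As a rough illustration of "newer posts get higher priority", here is a minimal sketch that maps the age of a thread's last post onto the 0.0 to 1.0 priority range used by the Sitemap protocol. The function name `sitemap_priority`, the linear decay, the 0.1 floor, and the 365-day window are all assumptions.

```php
<?php
// Hypothetical mapping from last-post age to a sitemap <priority> value.
// Not the beta script's real logic; every constant here is illustrative.
function sitemap_priority(int $lastPostTimestamp, int $now, int $maxAgeDays = 365): float
{
    // Age of the newest post in days, never negative.
    $ageDays = max(0, ($now - $lastPostTimestamp) / 86400);

    // 1.0 for a brand-new post, falling linearly to 0.0 at $maxAgeDays.
    $freshness = 1.0 - min(1.0, $ageDays / $maxAgeDays);

    // Keep every thread at 0.1 or above so old threads are still listed.
    return round(0.1 + 0.9 * $freshness, 1);
}
```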


eoc_Jason 10-20-2005 07:13 PM

Where is the link to the beta script? Is it buried on one of the many pages?

Quote:

Originally Posted by Triple_T
BTW the beta script is working fine AFAICS

Also, I'm sure you've probably already noticed, but you can gzip the main XML file too. I modified my file to do that; it wasn't much effort.
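
The modification itself was not posted. A minimal sketch of gzipping a sitemap with PHP's zlib extension could look like this; the function name `write_gzipped_sitemap` is an assumption, and the sitemap XML is expected to be already built as a string.

```php
<?php
// Hypothetical helper: compress an already generated sitemap and save it.
// Requires PHP's zlib extension (gzencode).
function write_gzipped_sitemap(string $xml, string $path): int
{
    // Level 9 trades a little CPU time for the smallest file.
    $compressed = gzencode($xml, 9);
    if ($compressed === false) {
        throw new RuntimeException('Could not compress the sitemap.');
    }

    $bytes = file_put_contents($path, $compressed);
    if ($bytes === false) {
        throw new RuntimeException("Could not write $path.");
    }

    // Number of bytes written to disk.
    return $bytes;
}
```

The file would normally be saved with a .gz suffix (for example g_sitemap.xml.gz) and that name submitted instead of the plain XML file.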

dutchbb 10-20-2005 09:05 PM

Yeah, the beta script is a few pages back.

falter 10-20-2005 11:17 PM

Those wanting better bot detection may want to try the mod I recommend in this post:
http://www.vbulletin.com/forum/showt...396#post993396

This will make use of the spiders XML file that so many work so hard on.

dutchbb 10-20-2005 11:41 PM

I've written the instructions for step 3 in a step-by-step txt file.

This is an alternative to the coloured diff. It should only be used on untouched index.php and global.php files.

dutchbb 10-21-2005 01:54 AM

Lierduh,

Is it possible to exclude forums from the sitemap? I don't want my chat section listed too much on Google, for instance.

Thanks

PeterKjaer 10-21-2005 02:10 PM

Hi,

Does anybody know how to set the permissions on a Windows server running IIS?

It would of course be easy enough to give access to internet_user, but can you grant the permission only for this job?

/Peter

falter 10-21-2005 02:16 PM

Quote:

Originally Posted by PeterKjaer
Hi,

Does anybody know how to set the permissions on a Windows server running IIS?

It would of course be easy enough to give access to internet_user, but can you grant the permission only for this job?

/Peter

What kind of access do you have to the server?

PeterKjaer 10-22-2005 03:31 AM

Quote:

what kind of access do you have to the server?
It's my own, so I have admin rights to the server.

/Peter

c0d3x 10-22-2005 05:23 AM

Quote:

Originally Posted by c0d3x
Hi, I have some "hidden" forums that I want shown in the archive but not on the forums page. What can I do?

They're not hidden from members, but I don't want to show them on forumhome.

up please!

Mr Chad 10-22-2005 09:47 AM

OK, it's probably a stupid thing, but all the spiders view the archives like this:

/forums/archive/index.php/t-1192.html

when it should be

/forums/archive/index.php?t-1192.html

eoc_Jason 10-22-2005 05:23 PM

chatbum - actually the proper way is the first that you showed with slashes only. The reason is that it emulates directories and doesn't use a query string (making it look more static to spiders).

The reason your server might be using the query string method is because yours doesn't support the first method. (Check your /archive/global.php file where it checks for SLASH_METHOD.)

I *thought* he implemented the check and both options in the archive generation code, but maybe it might not be working properly. *shrug*
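
The two URL forms discussed here differ only in the separator that follows index.php. A minimal sketch of building either form is shown below. The helper name `archive_thread_url` and its boolean argument are assumptions; the real archive code makes this decision through SLASH_METHOD and may differ.

```php
<?php
// Hypothetical helper showing the two archive URL styles from this thread.
function archive_thread_url(string $archiveBase, int $threadId, bool $slashMethod): string
{
    // Slash method: index.php/t-1192.html (looks like a static path).
    // Query-string method: index.php?t-1192.html (fallback for servers
    // that do not support the slash form).
    $separator = $slashMethod ? '/' : '?';

    return rtrim($archiveBase, '/') . '/index.php' . $separator . 't-' . $threadId . '.html';
}
```

Calling it with `/forums/archive`, thread 1192 and `true` gives `/forums/archive/index.php/t-1192.html`; passing `false` gives `/forums/archive/index.php?t-1192.html`.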

Mr Chad 10-22-2005 11:02 PM

Quote:

Originally Posted by eoc_Jason
chatbum - actually the proper way is the first that you showed with slashes only. The reason is that it emulates directories and doesn't use a query string (making it look more static to spiders).

The reason your server might be using the query string method is because yours doesn't support the first method. (Check your /archive/global.php file where it checks for SLASH_METHOD.)

I *thought* he implemented the check and both options in the archive generation code, but maybe it might not be working properly. *shrug*

When I use the slashes it won't go to the thread; it just redirects to the index. When it uses the '?' it works.

lierduh 10-24-2005 08:17 AM

Quote:

Originally Posted by chatbum
OK, it's probably a stupid thing, but all the spiders view the archives like this:

/forums/archive/index.php/t-1192.html

when it should be

/forums/archive/index.php?t-1192.html

This is implemented in the beta version. I announced this to all the people who clicked "Install".

lierduh 10-24-2005 08:20 AM

Quote:

Originally Posted by Triple_T
Lierduh,

Is it possible to exclude forums from the sitemap? I don't want my chat section listed too much on Google, for instance.

Thanks

No option for this at the moment. I doubt it is a sought after feature. Google WILL index your chat section even if you don't `sitemap` it.

007 10-24-2005 09:52 PM

Any idea when this will get another update? It works for the most part, but that recursive index problem keeps happening.. :(

trackpads 10-24-2005 11:06 PM

Quote:

Originally Posted by 007
Any idea when this will get another update? It works for the most part, but that recursive index problem keeps happening.. :(

Me too. Thanks again, Jason.

lierduh 10-24-2005 11:28 PM

I don't know what the reason is. I do not have this problem so far. Two things to try:

1) Have you deleted the old sitemap entry from your Google Sitemaps account?
2) Try to change the sitemap index file name.
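
Before resubmitting, it may also be worth ruling out the obvious cause by checking that the index file's own URL is not among the sitemaps it lists. A minimal sketch, assuming the listed URLs have already been collected into an array; the function name `find_self_references` is hypothetical, and the check only catches a direct self-reference, not an indirect one through another index.

```php
<?php
// Hypothetical check: which entries of a sitemap index point back at the
// index itself? Only exact matches (after trimming whitespace) are found.
function find_self_references(string $indexUrl, array $listedUrls): array
{
    $self = trim($indexUrl);
    $hits = [];

    foreach ($listedUrls as $url) {
        if (trim($url) === $self) {
            $hits[] = $url;
        }
    }

    // An empty array means the index does not list itself directly.
    return $hits;
}
```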

dutchbb 10-26-2005 08:12 PM

Quote:

Originally Posted by lierduh
No option for this at the moment. I doubt it is a sought after feature. Google WILL index your chat section even if you don't `sitemap` it.

Yep, but a lot more if I send a sitemap. It's also a very big forum with many links, and I don't want it to get any bigger by sending it more traffic.

However, I understand that this is a very personal request, so please contact me if you are willing to do it as a paid service.

thank you

GT1 10-27-2005 12:45 AM

Well - I have everything done except step 3. I am completely lost editing my global.php and index.php files. Can anyone DO them for me or something? I would like to get this done ASAP if at all possible. It would be hugely appreciated.

dutchbb 10-27-2005 02:01 PM

Quote:

Originally Posted by GT1
Well - I have everything done except step 3. I am completely lost editing my global.php and index.php files. Can anyone DO them for me or something? I would like to get this done ASAP if at all possible. It would be hugely appreciated.

Do you have a problem with the coloured diff or with the text files in the original zip?

I posted step-by-step instructions here.

Still having problems? Send me your MSN and I'll help you out.

D|ver 10-27-2005 10:14 PM

I have a small question:
Is it possible to add additional pages to the sitemap?

buro9 10-28-2005 01:07 PM

I have a feature request... to dump a single text file, gzipped, of ALL of the urls that go into the various sitemaps.

Basically... when we're in the loop to create the various sitemaps, to additionally write a text file, with just one full URL per line.

This is because this would also be good for Yahoo and other spiders. Yahoo specifically asks for such a thing on their submit page:
http://submit.search.yahoo.com/free/request
Quote:

You can also provide the location of a text file containing a list of URLs, one URL per line, say urllist.txt. We also recognize compressed versions of the file, say urllist.gz.
And to me... it seems that the loop that creates the Google Sitemap is the perfect low-overhead place to also dump the Archive URLs into a text file for Yahoo and other spiders to feed from.
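
A minimal sketch of that request, assuming the sitemap loop can hand over each full URL as it goes. The function name `write_url_list_gz` is hypothetical and this is not part of the hack; it uses PHP's zlib functions to stream one URL per line into a gzipped text file.

```php
<?php
// Hypothetical helper: write one URL per line into a gzipped text file,
// e.g. urllist.gz, while iterating over the same URLs as the sitemap loop.
// Requires PHP's zlib extension (gzopen, gzwrite, gzclose).
function write_url_list_gz(iterable $urls, string $path): int
{
    $handle = gzopen($path, 'wb9');
    if ($handle === false) {
        throw new RuntimeException("Could not open $path for writing.");
    }

    $count = 0;
    foreach ($urls as $url) {
        // Plain text, one full URL per line.
        gzwrite($handle, $url . "\n");
        $count++;
    }
    gzclose($handle);

    // Number of URLs written.
    return $count;
}
```

Because the file is written line by line, memory use stays flat even on a large forum.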

:Judge: 10-28-2005 07:58 PM

I don't know, maybe this is over my head :p

I went through and made changes 1 and 2, and that is all. I have no idea how to check whether it is working.

I have no errors, so that is a plus.

Do I have to sign up for Google Sitemap? - Forget I said that.

dutchbb 10-29-2005 09:52 AM

Just a warning here... I was reading about SEO, and step 3 might not be such a good idea. This is called cloaking and it is a black hat technique. Google's quality guidelines clearly state that it is a forbidden SEO method.

Actually, I found that after installing this, my homepage dropped 4 pages in the rankings for a very important keyword. I didn't do anything else that could look suspicious, so I removed it!

The sitemap is good, but optimizing pages just for search engines and making them look different from what your human visitors see is NOT recommended, and you run a high risk of being penalized by Google or other search engines.

falter 10-29-2005 11:13 PM

Quote:

Originally Posted by Triple_T
Just a warning here... I was reading about SEO, and step 3 might not be such a good idea. This is called cloaking and it is a black hat technique. Google's quality guidelines clearly state that it is a forbidden SEO method.

Actually, I found that after installing this, my homepage dropped 4 pages in the rankings for a very important keyword. I didn't do anything else that could look suspicious, so I removed it!

The sitemap is good, but optimizing pages just for search engines and making them look different from what your human visitors see is NOT recommended, and you run a high risk of being penalized by Google or other search engines.

If you implemented the suggested robots.txt, that is most likely the cause of your PR drop, not the cloaking.

There are many, many things that you can do that are considered cloaking. I don't think google would flip out over this seeing as the content that is provided to the search engine spider is the same content that is provided to the human user. An example of an abuse of cloaking is where, say, a completely different set of content is given to the spider than that which is given to the human.

I'm going to pubcon (http://www.pubcon.com) in a couple weeks; I'll ask around to see what some SEO's think about what we're doing here. I can even ask guys at yahoo and google. Personally, I think it's fine, since the same core content is being given to the search engines and humans.

eoc_Jason 11-01-2005 04:23 PM

The problem is that what you think is the "same content" is different from what a spider thinks is the same. Yes, cloaking is a serious issue, and search engines do penalize sites for it. Some search engines (like Google) have spiders that look like a regular web browser so that they can compare its results with what the actual spider receives. If they don't match then, well, you get the idea.

I expanded my robots.txt file to exclude a lot of the links that are listed in the notes. And I use the generator script to make the XML files for Google, but that's it. I do not believe in trying to redirect bots or users to various pages; that will only end with bad things happening.

lierduh 11-04-2005 10:20 PM

To cloak or not is a long-debated topic. The general advice from the experts is not to cloak, because of the risk involved. I would say do not install step 3 if you are concerned about this.

However, nowadays many major sites use cloaking, including Amazon and Google itself. Believe it or not, vBulletin also uses cloaking!

eoc_Jason 11-07-2005 04:01 PM

Yes, but Amazon is a much more reputable site than say, joe bob's bait shack... Plus companies like that work directly with Google to enhance features for both sites.

falter 11-08-2005 09:27 PM

Well! I've backed-out the cloaking after the number of my indexed pages on google went from >40,000 to just over 800. I'm assuming that we got penalized in some form. My PR is still a 5, but that doesn't mean much of anything at all.

I can honestly say that my opinion is reversed on the cloaking side of things. I do not recommend implementing step 3.

Citizen 11-08-2005 11:22 PM

What exactly was step 3 of the hack? I looked over the installation and didn't see a "step 3"


All times are GMT.
