vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 3.5 Add-ons (https://vborg.vbsupport.ru/forumdisplay.php?f=113)
-   -   Google sitemap for the vB Archives. Redirect human and robots. (https://vborg.vbsupport.ru/showthread.php?t=93980)

David_R 09-03-2005 08:36 PM

Quote:

Originally Posted by Yorixz
It doesn't support vB 3.x, that's why it's in the vB 3.5 category ;)

hi,
Something similar exists in the vB 3.5 extensions, but it supports 3.x as well. Can you compare the features of both these hacks?

Yorixz 09-04-2005 08:46 AM

Quote:

Originally Posted by David_R
hi,
Something similar exists in the vB 3.5 extensions, but it supports 3.x as well. Can you compare the features of both these hacks?

I'm afraid I can't, since I can't get this hack working on my forum right now. Once it's running here, I'll post my experience for you.

vauge 09-07-2005 10:12 AM

I really, really like this idea, but is the below a concern? I do not want to do anything negative for my forums.

Quote:

Originally Posted by Google
Don't employ cloaking or sneaky redirects.

http://www.google.com/webmasters/guidelines.html

lierduh 09-07-2005 11:11 PM

Quote:

Originally Posted by vauge
I really, really like this idea, but is the below a concern? I do not want to do anything negative for my forums.

http://www.google.com/webmasters/guidelines.html

A 301 redirect is Google's preferred way. Google penalizes web sites with duplicated content, so if you have two URLs showing the same content, Google prefers that you redirect from one URL to the other. Having both will attract a penalty.

What we do here is not sneaky. We have the actual content; we just want Google to show one version of it. We do not want Google to give us a higher page rank than the pages are actually worth; we just want Google to index the actual content instead of looping through the endless internal links.

I moved my forums to a new domain a few weeks ago (just before I released this hack). So far Google has already indexed 150,000 pages. Without a sitemap, Yahoo has indexed only just over 20,000 pages.

lierduh 09-07-2005 11:14 PM

Quote:

Originally Posted by Yorixz
This mod looked very interesting, install went fine but now when I try to run it I get these errors:
Code:

Warning: fopen(/home/ftpusers/otfans/html/forums/g_sitemap.xml) [function.fopen]: failed to open stream: Permission denied in /archive/forums_sitemap.php on line 245
What am I doing wrong? I'll be very glad to hear it ;)

As the error suggests, permission is denied in the '/home/ftpusers/otfans/html/forums/' directory. That is the base vB directory, which I described very clearly as "the same directory where you find showthread.php etc.".

People keep asking for the name of the directory to change. I cannot know, because it can be anything. In your case it is 'forums'; for others it may be 'public_html'...

KarateKid 09-12-2005 04:49 AM

hm, with vb 3.5 rc3, I get the following errors when accessing the forums_sitemap.php:

PHP Code:

Warning: array_keys(): The first argument should be an array in /home/htdocs/web0/html/forum/includes/class_core.php on line 1438

Warning: Invalid argument supplied for foreach() in /home/htdocs/web0/html/forum/includes/class_core.php on line 1438

Warning: array_keys(): The first argument should be an array in /home/htdocs/web0/html/forum/includes/class_core.php on line 1453

Warning: Invalid argument supplied for foreach() in /home/htdocs/web0/html/forum/includes/class_core.php on line 1453

Unable to add cookies, header already sent.
File: /home/htdocs/web0/html/forum/includes/class_core.php
Line: 1438

Any ideas? :confused:

lierduh 09-13-2005 01:32 AM

No idea, I have upgraded to RC3. The hack works without any further modification.

If you have changed the PHP files, make sure your upgrade does not overwrite them. I did not copy new files to the 'archive' directory.

buro9 09-14-2005 02:50 PM

Hey lierduh,

Thanks for your hacks over the years, always fine jobs :)

I'm another one wanting instructions for the archive/index.php and archive/global.php

I've applied the rest of the instructions, and all appears to be working fine. And I do get the concept of removing the PDA crud, and redirecting humans out of the archive... but I'm a little lost with the diff output you've supplied... a more primitive ><+/- lines would've confused me less ;)

If you do ever find time to update the instructions, it will be very appreciated by quite a few of us. And I realise how much of a pain that is, as I'm supposed to be porting my own hacks and really don't like the idea much.

Cheers

DavidK

lierduh 09-14-2005 10:26 PM

Quote:

Originally Posted by buro9
Hey lierduh,

Thanks for your hacks over the years, always fine jobs :)

I'm another one wanting instructions for the archive/index.php and archive/global.php

I've applied the rest of the instructions, and all appears to be working fine. And I do get the concept of removing the PDA crud, and redirecting humans out of the archive... but I'm a little lost with the diff output you've supplied... a more primitive ><+/- lines would've confused me less ;)

If you do ever find time to update the instructions, it will be very appreciated by quite a few of us. And I realise how much of a pain that is, as I'm supposed to be porting my own hacks and really don't like the idea much.

Cheers

DavidK

Hello DavidK,

The reason it confused you is that my diffs were based on the RC2 and modified RC2 files. I presume you now have the RC3 files.

I have attached the diff between RC3 and modified RC2 files (this will confuse everyone else, but not you:)). Otherwise, if you still have the RC2 files, then the coloured diff will make a lot of sense.:)

buro9 09-15-2005 06:32 AM

Quote:

Originally Posted by lierduh
Hello DavidK,

The reason it confused you is that my diffs were based on the RC2 and modified RC2 files. I presume you now have the RC3 files.

I have attached the diff between RC3 and modified RC2 files (this will confuse everyone else, but not you:)). Otherwise, if you still have the RC2 files, then the coloured diff will make a lot of sense.:)

Much better :D Thanks :)

And yes... it will confuse everyone now ;)

Brandon Sheley 09-15-2005 08:37 AM

I can't seem to get the files to be created :(

From the instructions, which were just about too much info for me:
I upload forums_sitemap.php to archive, then chmod the archive folder to 775,
then make the scheduled task and run it, and the files should be made.
What part am I missing? Thank you.

at rc3 now.

Yorixz 09-18-2005 11:01 AM

Code:

Warning: array_keys() [function.array-keys]: The first argument should be an array in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1438

Warning: Invalid argument supplied for foreach() in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1438

Warning: array_keys() [function.array-keys]: The first argument should be an array in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1453

Warning: Invalid argument supplied for foreach() in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1453

I'm also still getting those errors, weird =/ lierduh, could it be that you've tested it with PHP4 rather than PHP5?

lierduh 09-18-2005 10:46 PM

Quote:

Originally Posted by Yorixz
Code:

Warning: array_keys() [function.array-keys]: The first argument should be an array in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1438

Warning: Invalid argument supplied for foreach() in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1438

Warning: array_keys() [function.array-keys]: The first argument should be an array in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1453

Warning: Invalid argument supplied for foreach() in /home/ftpusers/otfans/html/forums/includes/class_core.php on line 1453

I'm also still getting those errors, weird =/ lierduh, could it be that you've tested it with PHP4 rather than PHP5?

I do not have php5 to test.

Instead of calling the script directly, have you tried using the Scheduled Task's "Run Now" button?

thenetbox 09-18-2005 11:32 PM

Thank you very much! I had just started trying to do this myself but then found this :D yay!

Yorixz 09-20-2005 04:53 PM

Quote:

Originally Posted by lierduh
I do not have php5 to test.

Instead of calling the script directly, have you tried using the Scheduled Task's "Run Now" button?

Yes, that results in
Code:

Warning: gzopen(/home/ftpusers/otfans/html/forums/archive/sitemap_11.gz) [function.gzopen]: failed to open stream: Permission denied in /archive/forums_sitemap.php on line 132

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 72

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

Warning: gzwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 87

about a thousand times over.

The weird thing is that I'm 100% sure that I chmodded everything correctly. (It's on a Debian host, if that is relevant.)

lierduh 09-20-2005 11:10 PM

Quote:

Originally Posted by Yorixz
Yes, that results in
Code:

Warning: gzopen(/home/ftpusers/otfans/html/forums/archive/sitemap_11.gz) [function.gzopen]: failed to open stream: Permission denied in /archive/forums_sitemap.php on line 132
about a thousand times over.

The weird thing is that I'm 100% sure that I chmodded everything correctly. (It's on a Debian host, if that is relevant.)

That means the script can't write to the archive directory. What are the permissions on this directory?
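
One way to confirm this outside the hack is a small check like the one below. It is only a sketch in Python (the hack itself is PHP), and `can_write_to` is a made-up helper name. It has to be run as the same user the web server runs as, otherwise the answer says nothing about what the script can do.

```python
import os
import tempfile

def can_write_to(directory):
    """Return True if the current user can create a file in `directory`."""
    if not os.path.isdir(directory):
        return False
    try:
        # Creating (and auto-deleting) a real file is more reliable than
        # reading mode bits, because ownership and group membership matter too.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        return False
```

Calling `can_write_to("/home/ftpusers/otfans/html/forums/archive")` with the path from the warning above should return `False` for as long as the gzopen error persists.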

Yorixz 09-21-2005 05:04 AM

Quote:

Originally Posted by lierduh
That means the script can't write to the archive directory. What are the permissions on this directory?

Right now it's 0777 and it's working, I thought I already changed it some days ago, it was 0775 (which should be enough as far as I know)

Thanks for your support ;)

jribz 09-24-2005 08:34 PM

OK, I have this installed and it seems to be working as described. When viewing Who's Online I can see the search engines looking at threads with URLs similar to the following.

/archive/index.php/t-6044.html

When clicked by a human user they are redirected to

/showthread.php?t=6044

So I can only assume this works since all the spiders on the board are seeing the archived version, and when users click they are taken to the full version.
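
The mapping between the two URLs is mechanical. A rough Python sketch of it follows; this is not the hack's PHP code, and the function name and regular expression are illustrative only, based on nothing more than the two example URLs above.

```python
import re

# Matches archive thread URLs of the form /archive/index.php/t-<id>.html
ARCHIVE_THREAD = re.compile(r"^/archive/index\.php/t-(\d+)\.html$")

def full_thread_url(archive_path):
    """Map an archive thread path to the regular showthread URL, or None."""
    match = ARCHIVE_THREAD.match(archive_path)
    if match is None:
        return None  # not a thread page in the archive
    return "/showthread.php?t=" + match.group(1)
```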

I do have a couple of questions however, since I am not too familiar with Google Sitemaps. Does the script automatically upload the sitemap to Google without any further action aside from making the Scheduled Task? I have made the task in the manager and run it (every day at 1AM), and it has created the files ( [xml] in forum root and specific [gz] forums in archive folder).

[upon further thinking, would I be correct in saying I need to let google know about the xml file in the root of the site?]

What is the effect on other search engines? I see Yahoo, MSN, Ask, and others viewing similar archives, so I assume the effect is similar to what is happening with Google, but they are not getting a map.

Last question, does this basically mean that other SEO hacks are not required, since the spiders will never see the rewritten URLs anyhow?

A lot of assumptions up there. :ermm:

Oh and one last thing: I do use mod_rewrite on my server for many sites and have had no issues, but the rules you say to enter to resolve the index.php issue seem to bog down the server, so that any URLs that point directly to it, as in /index.php, do not load. I suppose this could be a conflict within my .htaccess file, but I'm not too certain where to start looking. (However, I did try it with only the rules you provided (and RewriteEngine on) and have the same problem.)

Thanks for your time and the hack.

lierduh 09-25-2005 12:16 AM

Each time the script is run:

1) It re-generates all the sitemaps. Makes sense because you have more threads/posts now.

2) It notifies Google about new sitemaps being available. You will notice Google fetches these files soon afterwards.

If you have the scheduled task logged, the end of the log is the response sent by Google. It should say:

======================
Sitemap Notification Received

Your Sitemap has been successfully added to our list of Sitemaps to crawl. If this is the first time you are notifying Google about this Sitemap, please add it via http://www.google.com/webmasters/sitemaps so you can track its status. Please note that we do not add all submitted URLs to our index, and we cannot make any predictions or guarantees about when or if they will appear.
======================

One thing to remember: in your Google Sitemaps account, 'last submitted' does not reflect the automatic ping/submit. It only logs the manual submits you do by pushing the button in your Google Sitemaps account.

Other search engines do not accept sitemaps as far as I know, at least not in Google's sitemap format. The redirects, however, work for all the major search engines, which I believe benefits the indexing.

I do not recommend using SEO hacks, at least for existing sites. Chances are Google has already indexed part of your forums using links like /showthread.php?t=12345. If you now rewrite all the URLs, Google will have two copies of the same content for that thread (one with the traditional URL, one with your new rewritten URL). This will lead to Google penalizing your site ranking. Some smarter SEO schemes redirect your old URLs to the new ones and do not suffer from this, but that becomes a very complicated add-on, and it may break every time a major vB version is released. I elect not to use such a scheme. For the record, I used URL rewrite SEO back in the vB2 era. In my .htaccess, I still need to redirect my old rewritten vB2 URLs for fear of Google penalizing my site. Basically the vB archive is very static; it was designed for SEO in the first place anyway. Think about how many clickable links a normal showthread page brings you; it becomes a mess for search engines no matter how smart your SEO is.

For index.php redirect, my working version is:

RewriteEngine on

#...

RewriteCond %{QUERY_STRING} ^$
RewriteRule ^index.php$ / [R=301,L]

If it does not work for you, I would check the HTTP logs. Failing that, log your rewrites! (You need to do this in your httpd.conf; consult the Apache manual for the log level etc. :))
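
For anyone unsure what the two lines do: the rule only fires when the request is for index.php with an empty query string, and it answers with a permanent redirect to /. Below is a small Python model of that match logic. It is a sketch only: `index_redirect` is a made-up name, the .htaccess is assumed to sit in the document root, and the code mimics the conditions rather than Apache's internal processing.

```python
from urllib.parse import urlsplit

def index_redirect(request_url):
    """Return (status, location) if the rule would redirect, else None."""
    parts = urlsplit(request_url)
    # RewriteCond %{QUERY_STRING} ^$   -> the query string must be empty
    # RewriteRule ^index.php$ / [...]  -> the requested file must be index.php
    if parts.query == "" and parts.path == "/index.php":
        return (301, "/")
    return None
```

So /index.php is sent to /, while /index.php?f=2 and every other script are left alone.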

jribz 09-25-2005 12:40 AM

Thanks for the reply, that clears up a lot... I had to verify site ownership via Google; the logs for the cron showed exactly that.

One thing I notice however, while looking now, is that the Google spider is viewing a few regular threads, while the Google AdSense spider is viewing the archive; also viewing the archive are msnbot, Yahoo Slurp and Ask Jeeves.

Wonder why google is seeing a regular thread now.

Going to look into the htaccess in a bit.

lierduh 09-25-2005 01:12 AM

Quote:

Originally Posted by jribz
Thanks for the reply, that clears up a lot... I had to verify site ownership via Google; the logs for the cron showed exactly that.

One thing I notice however, while looking now, is that the Google spider is viewing a few regular threads, while the Google AdSense spider is viewing the archive; also viewing the archive are msnbot, Yahoo Slurp and Ask Jeeves.

Wonder why google is seeing a regular thread now.

Going to look into the htaccess in a bit.

vB detects search engines by the following string:
google|msnbot|yahoo! slurp (in the init.php file)

Any user agent matching that will be detected as a search engine.
So check your HTTP access log and see which user agent 'Google' uses. It should say "Googlebot".
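
To try a user agent from your log against that string, something like the following Python sketch can help. It is not vB's own code; `looks_like_spider` is a made-up name, and case-insensitive matching is an assumption about how vB applies the pattern.

```python
import re

# Pattern as quoted from init.php above; IGNORECASE is an assumption.
SPIDER_PATTERN = re.compile(r"google|msnbot|yahoo! slurp", re.IGNORECASE)

def looks_like_spider(user_agent):
    """Return True if the user agent contains one of the spider strings."""
    return SPIDER_PATTERN.search(user_agent) is not None
```

Note that any agent containing "google" matches, not just "Googlebot".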

jribz 09-25-2005 01:38 AM

Quote:

Originally Posted by lierduh
vB detects search engines by the following string:
google|msnbot|yahoo! slurp (in the init.php file)

Any user agent matching that will be detected as a search engine.
So check your HTTP access log and see which user agent 'Google' uses. It should say "Googlebot".

Looking at the logs I have 2 versions of Google coming up...

Googlebot/2.1; +http://www.google.com/bot.html)"
Google/2.1"

The first one looks legit, but why is the second not sending an origin? They both have the same IP.

yessir 10-03-2005 04:11 PM

/me installed

Thank you!

thenetbox 10-03-2005 08:29 PM

I'm not very clear about how to complete step 3. It says:

Quote:

This step involves changing code. There are a number of places you need
to change. I have included the diff result between the final files and
original files. The two files involved are archive/global.php and archive/index.php
Where do I find the instructions to make the changes? :)

DefenceTalk 10-04-2005 03:48 PM

Does this work with the latest version? Anyone?

Thanks

thenetbox 10-04-2005 04:06 PM

Quote:

Originally Posted by DefenceTalk
Does this work with the latest version? Anyone?

Thanks

Steps 1 and 2 worked great for me :D . I don't know how to do step 3 (optional)

jribz 10-04-2005 07:02 PM

Quote:

Originally Posted by thenetbox
Steps 1 and 2 worked great for me :D . I don't know how to do step 3 (optional)

Step 3 is actually pretty simple, but not with the instructions in the download, look at the attachments in this post here.

The lines preceded by << mean remove, or comment out, and replace with the following lines preceded by >>, or add after.

VaaKo 10-05-2005 10:14 AM

will this hack bring spiders to my forum?

dutchbb 10-05-2005 12:23 PM

OK, I will give this a try. We don't have too many pages listed (642), and most of them are pages with no real content (memberlist and such).

So I will install this and let you know what the results are in about 3-4 months.

Ty.

VaaKo 10-05-2005 01:48 PM

This is a very confusing hack :confused:
It's not working, I guess. I ran the task and it did create the sitemap files from 36 to 81,
but after that it gave me loads of errors, saying it couldn't create g_sitemap.xml and something like that.
So I chmodded the /html/ folder and tried again; this time it gave me an error, I don't know what it is, so I refreshed.

Is it working or not?

VaaKo 10-05-2005 01:50 PM

this is the error:

Code:

Warning: fopen(http://www.google.com/webmasters/sitemaps/ping?sitemap=http%3A%2F%2Fwww.oneforum.org%2Fg_sitemap.xml): failed to open stream: Connection timed out in /archive/forums_sitemap.php on line 264

Warning: feof(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 268

Warning: fread(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 27

....

dutchbb 10-05-2005 02:55 PM

Hmz not easy....

Quote:

I normally assign the permission this way:

#chown apache.MYUSER_GROUP archive
#chmod 775 archive

MYUSER_GROUP is the user group my login belongs to. #ls -l will show that.
I understand chmod 755 (it was actually already set to 777), but what do you mean by '#chown apache.MYUSER_GROUP archive' and 'MYUSER_GROUP is the user group my login belongs to. #ls -l will show that'? That's Chinese to me!

Please explain, thank you

lierduh 10-05-2005 08:52 PM

Quote:

Originally Posted by Triple_T
Hmz not easy....


I understand chmod 755 (it was actually already set to 777), but what do you mean by '#chown apache.MYUSER_GROUP archive' and 'MYUSER_GROUP is the user group my login belongs to. #ls -l will show that'? That's Chinese to me!

Please explain, thank you

The problem is that no one wants to learn the very basics of directory/file permissions.

If you have 777, it means everyone can write to it, and you won't need to bother with 'chown'. The example I provided was for someone who knows a bit more and wants the more secure way. I will rewrite the instructions for the next release and include one set of simple instructions for newbies. In the meantime, if you are still interested in directory permissions, please read:

https://vborg.vbsupport.ru/showpost....6&postcount=27
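
As a quick illustration of the difference between 775 and 777, here is a hedged Python sketch that decodes who may write for a given mode. `who_can_write` is a made-up name; only the standard `stat` constants are real.

```python
import stat

def who_can_write(mode):
    """List which classes (owner, group, others) hold write permission."""
    classes = []
    if mode & stat.S_IWUSR:
        classes.append("owner")
    if mode & stat.S_IWGRP:
        classes.append("group")
    if mode & stat.S_IWOTH:
        classes.append("others")
    return classes
```

With 775 only the owner and the group can write, which is why the chown step matters: it makes the web server user the owner of the directory. With 777 everyone can write, so ownership no longer matters, at the cost of security.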

VaaKo 10-05-2005 09:14 PM

what about my error?

lierduh 10-05-2005 09:32 PM

Quote:

Originally Posted by Lebanese Forces
what about my error?

It means your web server could not connect to Google at port 80. It is possible Google was down at the time, or your web server's firewall prevents the script doing so. I just tried manually:

http://www.google.com/webmasters/sit...Fg_sitemap.xml

It returned:
=========
Sitemap Notification Received

Your Sitemap has been successfully added to our list of Sitemaps to crawl. If this is the first time you are notifying Google about this Sitemap, please add it via http://www.google.com/webmasters/sitemaps so you can track its status. Please note that we do not add all submitted URLs to our index, and we cannot make any predictions or guarantees about when or if they will appear.
===============

You can always comment out those lines in the script and manually submit the sitemaps.
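
For a manual submit, the ping URL is just the sitemap address URL-encoded into the `sitemap` parameter. A minimal Python sketch of building it, without making any network call: `build_ping_url` is a made-up name, and the endpoint is the one shown in the 2005 warning above, so it may no longer be served.

```python
from urllib.parse import quote

# Endpoint as it appears in the fopen() warning earlier in this thread.
PING_ENDPOINT = "http://www.google.com/webmasters/sitemaps/ping"

def build_ping_url(sitemap_url):
    """Return the ping URL for a sitemap, percent-encoding ':' and '/' too."""
    return PING_ENDPOINT + "?sitemap=" + quote(sitemap_url, safe="")
```

Passing `http://www.oneforum.org/g_sitemap.xml` reproduces the URL from the warning, which can then be pasted into a browser.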

buro9 10-06-2005 04:28 AM

To those unsure about this hack, I would say persevere.

My number of Google spiders hasn't increased dramatically, but they're being far more efficient.

A month ago the number of pages I had in Google was only 66,000, now I have over 846,000 pages indexed: http://www.google.co.uk/search?q=site%3Awww.bowlie.com

It really is worth it, although the code edits in archive/index.php can be a bugger to get your head around at first.

dutchbb 10-06-2005 10:33 AM

Quote:

Originally Posted by lierduh
The problem is that no one wants to learn the very basics of directory/file permissions.

I do. That's why I replied:
Quote:

Please explain, thank you
The more I can learn, the better :)

Quote:

If you have 777, it means everyone can write to it, and you won't need to bother with 'chown'. The example I provided was for someone who knows a bit more and wants the more secure way. I will rewrite the instructions for the next release and include one set of simple instructions for newbies. In the meantime, if you are still interested in directory permissions, please read:

https://vborg.vbsupport.ru/showpost....6&postcount=27
thank you, I will look into it and try to setup the hack

dutchbb 10-06-2005 11:51 AM

BTW I created a google sitemap account after the task. Is this needed, or do these need to be joined or something? And how does the vb task send the sitemaps if the google account wasn't even setup?

Also:
Quote:

Check your web log and see if the Search Engine visits are being redirected.
You should see 301 (Permanent Redirect) for the actual thread visit and
then 200 (ok) for the archive pages.
Which web log is there in vBulletin? If you mean the currently active users, I don't see it there.


A lot of newbie questions again :D

PixelFx 10-06-2005 01:44 PM

Ok I know chmod, and set my archive/internal files to 777 .. I still get

Quote:


Warning: fwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 247

Warning: fwrite(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 253

Warning: fclose(): supplied argument is not a valid stream resource in /archive/forums_sitemap.php on line 254

I also did my root as 750 ..

My directory setup is url/forum/index.php for forums. I changed your robots.txt to disallow: /forum/files/ etc.

and I still get error above, I also setup my .htaccess etc,

I'm running vbulletin v3.5.0 gold.

Any suggestions?

VaaKo 10-06-2005 02:33 PM

Quote:

Originally Posted by PixelFx
Ok I know chmod, and set my archive/internal files to 777 .. I still get



I also did my root as 750 ..

My directory setup is url/forum/index.php for forums. I changed your robots.txt to disallow: /forum/files/ etc.

and I still get error above, I also setup my .htaccess etc,

I'm running vbulletin v3.5.0 gold.

Any suggestions?

i'm getting the same error!


All times are GMT.
