Google sitemap for the vB Archives. Redirect human and robots.
Version: 1.2, by lierduh

Version: 3.5.1
Released: 08-09-2005 Last Update: 11-08-2005 Installs: 130
Uses Plugins
Code Changes Additional Files  
No support by the author.

Release V1.2 (9 Nov 2005)
* Threads with new posts are given a higher sitemap priority, so Google can index fresh threads first.

* The original optional STEP 3 hack is no longer recommended. To avoid a potential Google penalty, my advice is to remove it.
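
The V1.2 priority change could look roughly like the sketch below. The hack's actual formula is not given in this post, so the function name, the 30-day window and the 0.1 to 1.0 range are all assumptions chosen only for illustration.

```python
def sitemap_priority(last_post_ts, now_ts, window_days=30):
    """Map the age of a thread's last post to a sitemap <priority> value.

    Hypothetical rule (not taken from the hack): a thread posted in right
    now gets 1.0, and the value decays linearly to a floor of 0.1 once the
    last post is window_days old or older.
    """
    age_days = max(0.0, (now_ts - last_post_ts) / 86400.0)
    freshness = max(0.0, 1.0 - age_days / window_days)
    return round(0.1 + 0.9 * freshness, 1)
```

Timestamps are Unix seconds, so a thread with a post from a minute ago ends up near the top of the range and a thread untouched for months sits at the floor.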

Release V1.1a (12 Oct 2005)

* Bug fix only

Release V1.1 (9 Oct 2005)

* Can handle very large forums with more than 50,000 URLs per forum.
URLs are split across multiple sitemap files for each large forum.

* Created a function to detect search engine crawlers. The vB built-in
search engine detector can only identify about 3 or 4 search engines.
My function will detect over 20 search engine crawlers.

* Supports forums hosted on web servers that do not support 'fix_pathinfo'.
Instead of the usual 'archive/index.php/f-10.html' link, these
forums use links like 'archive/index.php?f-10.html'.

* Alerts about wrong directory permissions, to help newbies.

* Automatically writes the index file to the archive directory if the php
script cannot write into the base vB directory.

* Bug fixes.
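
As a rough idea of how crawler detection like the one in V1.1 can work, here is a minimal sketch that matches substrings of the User-Agent header. The hack's own list of 20+ crawlers is not reproduced in this post, so the signatures and the function name below are assumptions, not the hack's code.

```python
# Hypothetical signature list; the hack's real list is not shown in this thread.
CRAWLER_SIGNATURES = ("googlebot", "slurp", "msnbot", "ia_archiver", "baiduspider")


def is_search_engine_crawler(user_agent):
    """Return True if the User-Agent string contains a known crawler signature."""
    ua = (user_agent or "").lower()
    return any(signature in ua for signature in CRAWLER_SIGNATURES)
```

Substring matching is easy to extend with more signatures, but it trusts whatever the client sends, so it should only be used for redirect decisions, never for access control.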


Objectives
==============
  • Create Google sitemap files and a sitemap index file for the vB archives, and submit them to Google via the Scheduled Tasks.
  • Use the vB Archive as a mirror of the actual threads.
  • Google loves the nature of the archive pages, as they are static and do not contain repeated content.
  • Google ranks pages heavily based on external links. We need to redirect crawlers that follow these external thread links to the archive pages.
  • vBulletin archive pages often show up in the Google search results, so users land on the archive page instead of the actual thread. We need to automatically redirect human visitors to the actual threads. Otherwise the visitor either needs to click again for the Full Version or read the dull archive contents.

Q and A
==============
Q. Would the sitemap contain the links for hidden forums?
A. No, the forum permissions are consulted while generating the sitemap files.

Q. How often are the sitemap files generated?
A. You decide, and set it in the Scheduled Tasks. By default the script cannot be called by external users, to prevent bored people from killing your server.

Q. Is the sitemap file compressed?
A. Yes, the individual sitemap files are gzipped according to the Google sitemap standard to save bandwidth. The sitemap index file is not compressed; it is submitted as a normal xml file.

Q. Would the sitemaps include links for the normal threads? eg. showthread.php?t=1234...
A. No. It is unlikely Google will index your entire site if you feed it every combination of showthread links. It is better to let Google go through the more static archives. This way you have a better chance of getting more thread content indexed by Google.

Q. Why don't you go crazy with rewrite rules and do things like including the thread title in the url?
A. I won't deny having keywords in the url is a good SEO strategy, but Google also does not like "Over Search Engine Optimized" web sites. Google has recently penalized a huge number of such sites, sending them from a page rank of 5 or 6 to 0.

Q. Does sitemap really help?
A. Definitely. Google has visited over 60,000 pages since I submitted my sitemaps a few days ago. Yahoo bots were visiting more pages than Google before the sitemap. I expect the total Google visits for this month to exceed Yahoo's in the next day or two.
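
To make the answers about splitting and compression more concrete, here is a minimal sketch of the idea: URLs are cut into files of at most 50,000 entries, each file is gzipped, and a plain (uncompressed) index lists them. This is not the hack's php code. The file naming scheme is hypothetical, and the XML namespace is the current sitemaps.org one, which is an assumption about what a present-day version would emit.

```python
import gzip
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50000  # per-file URL limit mentioned in the V1.1 notes
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"  # assumed namespace


def build_sitemap_files(urls, base_name="sitemap_forum"):
    """Return {filename: gzipped XML bytes}, at most 50,000 URLs per file."""
    files = {}
    starts = range(0, len(urls), MAX_URLS_PER_FILE)
    for part, start in enumerate(starts, 1):
        chunk = urls[start:start + MAX_URLS_PER_FILE]
        body = "".join("  <url><loc>%s</loc></url>\n" % escape(u) for u in chunk)
        xml = (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="%s">\n' % SITEMAP_NS
            + body
            + "</urlset>\n"
        )
        # Hypothetical naming scheme: sitemap_forum_1.xml.gz, sitemap_forum_2.xml.gz, ...
        files["%s_%d.xml.gz" % (base_name, part)] = gzip.compress(xml.encode("utf-8"))
    return files


def build_sitemap_index(base_url, filenames):
    """Return the uncompressed sitemap index that lists every sitemap file."""
    entries = "".join(
        "  <sitemap><loc>%s</loc></sitemap>\n" % escape(base_url + name)
        for name in filenames
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="%s">\n' % SITEMAP_NS
        + entries
        + "</sitemapindex>\n"
    )
```

A forum with 100,001 archive URLs would produce three gzipped files under this scheme, plus one index file pointing at all three.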

What is involved?
==================
I have divided this hack into two steps. The first step involves uploading a php file. This enables the sitemap to be generated and submitted to Google.

The second step involves installing a Plugin using AdminCP. This sends all robots to the archive pages, preventing them from viewing the actual threads.

For example, when Google or another crawler follows an external link to visit:
http://forums.mysite/showthread.php?t=1234&page=2

it will be told this page has permanently moved to:
http://forums.mysite/archive/index.php/t-1234-p-2

This way you don't lose the page rank gained from external links.
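
The URL mapping in that example can be sketched as follows. This only reproduces the pattern shown above (t-ID, plus -p-N for later pages). The function name is hypothetical, and the real plugin may append a suffix such as .html or handle other parameters that this sketch ignores.

```python
from urllib.parse import parse_qs, urlsplit


def archive_url_for(thread_url):
    """Map a showthread.php URL to its archive URL, or return None if it does not apply."""
    parts = urlsplit(thread_url)
    if not parts.path.endswith("/showthread.php"):
        return None
    query = parse_qs(parts.query)
    thread_id = query.get("t", [""])[0]
    if not thread_id.isdigit():
        return None
    target = "t-%s" % thread_id
    page = query.get("page", ["1"])[0]
    if page.isdigit() and int(page) > 1:
        target += "-p-%d" % int(page)
    # Keep whatever directory the forum lives in (e.g. "/" or "/forums/").
    forum_base = parts.path[: -len("showthread.php")]
    return "%s://%s%sarchive/index.php/%s" % (parts.scheme, parts.netloc, forum_base, target)
```

A plugin would send this URL in a permanent redirect only when the visitor was identified as a crawler; human visitors keep seeing showthread.php.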

Install
=========
To install, follow the readme file.
To let me know you have installed this, and to let me send you update information, please click INSTALL.

Strategy
=========

It is unlikely Google or any other search engine will index your entire site, especially given the dynamic nature of vBulletin forums. An archive sitemap lets Google concentrate on the real content of your forums: the threads. If Google has to go through the endless member profile pages, it will get sick of it and just become tired (sorry, perhaps robots cannot become tired). What we can do is disallow the crawling of unnecessary pages. My robots.txt contains:

#ALL BOTS
User-agent: *
Disallow: /admincp/
Disallow: /ajax.php
Disallow: /attachments/
Disallow: /clientscript/
Disallow: /cpstyles/
Disallow: /images/
Disallow: /includes/
Disallow: /install/
Disallow: /modcp/
Disallow: /subscriptions/
Disallow: /customavatars/
Disallow: /customprofilepics/
Disallow: /announcement.php
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /external.php
Disallow: /faq.php
Disallow: /frm_attach
Disallow: /image.php
#Disallow: /index.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php?
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /payment_gateway.php
Disallow: /payments.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /showpost.php
Disallow: /subscription.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /usernote.php

You may have noticed index.php listed in there (the line is commented out in the copy above). Apparently Google regards http://forums.mysite/index.html as the same as http://forums.mysite/,
but http://forums.mysite/index.php as a different file. The default vB templates use index.php for internal links. That will split the page rank of your home page between two URLs! So it is better not to let Google see this file.
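
A robots.txt like the one above can be sanity-checked before it goes live. The sketch below uses Python's standard urllib.robotparser on a three-rule excerpt of the list; the helper name is hypothetical, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Short excerpt of the rules listed above, enough to demonstrate the check.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search.php
Disallow: /memberlist.php
Disallow: /printthread.php
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Archive URLs should come back as crawlable and the utility pages as blocked; if an archive URL is reported as blocked, a Disallow prefix is too broad.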

If you have mod_rewrite installed, you could perhaps add this to the .htaccess file:

RewriteCond %{QUERY_STRING} ^$
RewriteRule ^index\.php$ / [R=301,L]

(If your forums are under http://site/forums/, try: RewriteRule ^forums/index\.php$ /forums/ [R=301,L])

That will redirect /index.php to /, but only if no query string is present, i.e. /index.php?do=mymod will not be redirected.
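
The decision the two rewrite lines make can be written out as a small sketch, which may help when testing the rule by hand. It covers the root-level case only, and the function name is hypothetical.

```python
def index_redirect_target(path, query_string):
    """Return "/" when the rule above would issue a 301 redirect, else None."""
    # RewriteCond %{QUERY_STRING} ^$  -> only when the query string is empty
    # RewriteRule ^index\.php$ /      -> only for exactly /index.php
    if path == "/index.php" and query_string == "":
        return "/"
    return None
```

So a request for /index.php is redirected, while /index.php?do=mymod and every other script are left alone.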

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
#22
08-17-2005, 11:29 AM
KarateKid
Join Date: Oct 2001
Location: Sydney
Posts: 158

Quote:
Originally Posted by lierduh
Go to your AdminCP (Admin Control Panel), look down on the left hand side.

Is anyone already running this hack successfully with RC2?

#23
08-17-2005, 12:37 PM
Rich
Join Date: Mar 2004
Location: U.S.A
Posts: 921

Hello,

I am running on a Unix machine; my forums are in the root directory, not a subdirectory. I uploaded forums_sitemap.php to the archive folder and then chmodded it to 775 (meaning the archive directory).

I then chmodded my root directory (public_html) to 775.

Then I went in, and added the scheduled task.

Then I ran the scheduled task and got these errors:

Fatal error: Call to a member function on a non-object in /home/habitats/public_html/archive/forums_sitemap.php on line 98

which is: $forums = $vbulletin->db->query("

Fatal error: Call to a member function on a non-object in /home/habitats/public_html/includes/functions.php on line 4240

which is: $vbulletin->db->unlock_tables();

Did I miss a step? I believe I covered everything. (I have complete access to my db, including telnet authorization as well as the ability to run queries. All of my usernames and passwords are set correctly.)

#24
08-17-2005, 01:48 PM
lierduh
Join Date: Jan 2003
Location: Sydney, Australia
Posts: 459

See the third post within this thread.

Quote:
Originally Posted by iguanairs
Hello,

I am running on a Unix machine; my forums are in the root directory, not a subdirectory. I uploaded forums_sitemap.php to the archive folder and then chmodded it to 775 (meaning the archive directory).

I then chmodded my root directory (public_html) to 775.

Then I went in, and added the scheduled task.

Then I ran the scheduled task and got these errors:

Fatal error: Call to a member function on a non-object in /home/habitats/public_html/archive/forums_sitemap.php on line 98

which is: $forums = $vbulletin->db->query("

Fatal error: Call to a member function on a non-object in /home/habitats/public_html/includes/functions.php on line 4240

which is: $vbulletin->db->unlock_tables();

Did I miss a step? I believe I covered everything. (I have complete access to my db, including telnet authorization as well as the ability to run queries. All of my usernames and passwords are set correctly.)

#25
08-18-2005, 04:14 AM
flaregun
Join Date: Aug 2004
Posts: 35

If I call the script directly, I get these errors:

Warning: array_keys(): The first argument should be an array in /includes/class_core.php on line 1375

Warning: Invalid argument supplied for foreach() in /includes/class_core.php on line 1375

Warning: array_keys(): The first argument should be an array in /includes/class_core.php on line 1390

Warning: Invalid argument supplied for foreach() in /includes/class_core.php on line 1390

#26
08-18-2005, 04:52 AM
Brandon Sheley
Join Date: Mar 2005
Location: Google Kansas
Posts: 4,678

I don't understand this part of the install. I'm very new to the new setup (two nights now, lol).

Code:
You need to change the directory permissions so that the script can write
to the base and archive directories. If your server runs Apache, and Apache
runs under the apache user and apache group, I normally assign the
permissions this way:

#chown apache.MYUSER_GROUP archive
#chmod 775 archive

MYUSER_GROUP is the user group my login belongs to. #ls -l will show that.

775 will let apache (the script) and me (after I log in) add/change files.

Set the same permission to the base vB directory.

Is this a file I chmod? :ermm: I'm lost.

#27
08-19-2005, 12:31 AM
lierduh
Join Date: Jan 2003
Location: Sydney, Australia
Posts: 459

Which version of vB do you use? The line numbers do not match. Otherwise, have you modified the class_core.php file, perhaps by installing another hack?


Quote:
Originally Posted by flaregun
If I call the script directly, I get these errors:

Warning: array_keys(): The first argument should be an array in /includes/class_core.php on line 1375

Warning: Invalid argument supplied for foreach() in /includes/class_core.php on line 1375

Warning: array_keys(): The first argument should be an array in /includes/class_core.php on line 1390

Warning: Invalid argument supplied for foreach() in /includes/class_core.php on line 1390

#28
08-19-2005, 01:14 AM
lierduh
Join Date: Jan 2003
Location: Sydney, Australia
Posts: 459

Quote:
Originally Posted by Loco Macheen
I don't understand this part of the install. I'm very new to the new setup (two nights now, lol).

Code:
You need to change the directory permissions so that the script can write
to the base and archive directories. If your server runs Apache, and Apache
runs under the apache user and apache group, I normally assign the
permissions this way:

#chown apache.MYUSER_GROUP archive
#chmod 775 archive

MYUSER_GROUP is the user group my login belongs to. #ls -l will show that.

775 will let apache (the script) and me (after I log in) add/change files.

Set the same permission to the base vB directory.

Is this a file I chmod? :ermm: I'm lost.

chmod and chown are Unix commands.

I did not realise many admins are newbies when it comes to system administration.

Basically you need to change the permissions of the directories so that the php script can write files (sitemaps) to them. I understand some of your providers probably do not even provide shell access to the server, only some sort of user control panel for admin purposes. I am afraid I cannot explain how to use these control panels as I have not seen one. Users of such ISP control panels may be able to provide more information. People who do not know what to do should say what sort of control panel they use, what is in there, and what they have tried.

I will try to explain in general:

When I refer to the base directory, I refer to the forum base directory. Some of you might have set up the forums this way:

http://www.mysite.com/forums/

That means http://www.mysite.com/index.html will be in the root directory of the domain. The base directory for vB will be ./forums under the web root.

If your forums are set up as http://forums.mysite.com/
Then the base vB directory will be the root directory for the domain (forums.mysite.com)

This hack needs to write to
1) The base vB directory (where you find showthread.php file)
2) The archive directory (where you find archive.css file)

So you need to make these two directories writable for the php script, OR world writable.

A little info on the Unix chmod command.

1: executable
2: writable
4: readable

1+4 = 5 means readable and executable. Directories should be at least 1 (executable, which for a directory means it can be entered). 5 on a directory means visitors can also read (list) the directory contents, unless an index file is found.

php files only need to be readable: 4

If we need to write to the directory, then the permission needs to be:
1+4+2 = 7

There are three sets of permissions for each file/directory: 1) User, 2) Group, 3) World/anyone

User means the Unix logged in user, or the user the script runs as (typically apache or nobody is used by web servers).

Group means the user group the user belongs to. Typically the apache server runs the script under the 'apache' or 'nobody' group.

World means the permission for everyone. They can be any user who logs into the web server. Naturally it includes the user that the php script runs as.

If you do a
#chmod 777 a_directory
The first 7 means the user can read, execute, write to a_directory.
The second 7 means the user group can read, execute, write to a_directory.
The third 7 means anyone can do these tasks.

So a 777 permission will certainly let scripts write to the directory, but at the cost of security.
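
To double-check a mode before running chmod, the arithmetic above can be turned into a small sketch that spells out what each digit grants. The function name is made up for this example.

```python
def describe_mode(mode):
    """Decode an octal mode such as 0o775 into user/group/world permissions."""
    result = {}
    for position, who in enumerate(("user", "group", "world")):
        # Each class gets one octal digit: user is the highest, world the lowest.
        digit = (mode >> (3 * (2 - position))) & 0o7
        granted = []
        if digit & 4:
            granted.append("read")
        if digit & 2:
            granted.append("write")
        if digit & 1:
            granted.append("execute")
        result[who] = granted
    return result
```

For 775 this shows user and group with read, write and execute, and world with read and execute only, which is why 775 is the safer choice when the web server user owns the directory or belongs to its group.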

chown is another Unix command, used to change the ownership of a file/directory.

#chown myusername.mygroupname a_directory
will change the directory's owner to 'myusername', and make the directory belong to the 'mygroupname' group.

All of the above refers to Unix/Linux usage. Windows probably uses some mouse clicks, but the essence should be the same regarding users, groups and permissions.

Now that I have spent the time and effort to write this, I hope the people who ask questions can also take the time and effort to write their questions.

#29
08-19-2005, 03:26 AM
flaregun
Join Date: Aug 2004
Posts: 35

Quote:
Originally Posted by lierduh
Which version of vB do you use? The line numbers do not match. Otherwise, have you modified the class_core.php file, perhaps by installing another hack?
VB3.5 rc2, no mods to that file

#30
08-19-2005, 03:47 AM
Brandon Sheley
Join Date: Mar 2005
Location: Google Kansas
Posts: 4,678

Quote:
Originally Posted by lierduh
chmod and chown are Unix commands.
[...]

I wouldn't go as far as to say I'm a newbie...
I know how to chmod.
As you may notice, my question is where the file is that I'm chmodding...

You said something about shell access? What does that mean? I'm my own hosting reseller; I have full access to my host.
If you would be so kind as to tell me where this file is, that's all I ask.
BTW it's vB 3.5.0 RC2,
the newest one as of this post.

Thank you,
-LM

#31
08-19-2005, 05:02 AM
lierduh
Join Date: Jan 2003
Location: Sydney, Australia
Posts: 459

Quote:
Originally Posted by Loco Macheen
I wouldn't go as far as to say I'm a newbie...
I know how to chmod.
As you may notice, my question is where the file is that I'm chmodding...

You said something about shell access? What does that mean? I'm my own hosting reseller; I have full access to my host.
If you would be so kind as to tell me where this file is, that's all I ask.
BTW it's vB 3.5.0 RC2,
the newest one as of this post.

Thank you,
-LM

I spent all that time, and you probably didn't even read it?

Quote:
You need to change the directory permission so that the script can write
to the base and archive directory.