Google sitemap for the vB Archives. Redirect human and robots.
Version: 1.2, by lierduh
Developer Last Online: Nov 2023

vBulletin Version: 3.5.1
Released: 08-09-2005 Last Update: 11-08-2005 Installs: 130
Uses Plugins
Code Changes Additional Files  
No support by the author.

Release V1.2 (9 Nov 2005)
* A higher sitemap priority is given to threads with new posts, so Google can index fresh threads first.

* The original optional STEP 3 hack is no longer recommended. To avoid a potential Google penalty, my advice is to remove the STEP 3 hack.

Release V1.1a (12 Oct 2005)

* Bug fix only

Release V1.1 (9 Oct 2005)

* Can handle very large forums with more than 50,000 URLs per forum.
The URLs of each large forum are split across multiple files.

* Created a function to detect search engine crawlers. The vB built-in
search engine detector can only identify about 3 or 4 search engines.
My function will detect over 20 search engine crawlers.

* Supports forums hosted on web servers that do not support 'fix_pathinfo',
i.e. forums whose archive links look like 'archive/index.php?f-10.html'
instead of the usual 'archive/index.php/f-10.html'.

* Alert about wrong directory permissions to help newbies.

* Automatically writes the index file to the archive directory if the PHP
script cannot write into the base vB directory.

* Bug fixes.


Objectives
==============
  • Create Google sitemap files and a sitemap index file for the vB archives, and submit them to Google via the Scheduled Tasks.
  • Use the vB Archive as a mirror of the actual threads.
  • Google loves archive pages because they are static and do not contain repeated content.
  • Google ranks pages heavily based on external links, so crawlers that follow external thread links need to be redirected to the archive pages.
  • vBulletin archive pages often appear in Google search results, and users who click them land on the archive page instead of the actual thread. Visitors should be redirected automatically to the actual thread; otherwise they must either click again for the Full Version or read the dull archive content.

Q and A
==============
Q. Would the sitemap contain the links for hidden forums?
A. No, forum permissions are consulted while generating the sitemap files.

Q. How often are the sitemap files generated?
A. You decide, and set it in the Scheduled Tasks. By default the script cannot be called by external users, to prevent bored people from killing your server.

Q. Is the sitemap file compressed?
A. Yes, the individual sitemap files are gzipped, in line with the Google sitemap standard, to save bandwidth. The sitemap index file is not compressed; it is submitted as a normal XML file.
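
The splitting, priority and compression described above can be sketched as follows. This is only an illustration in Python, not the hack's actual PHP code. The function names, the priority values, the 7-day freshness window and the sitemaps.org namespace are all assumptions made for the example.

```python
import gzip
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file URL limit mentioned in the V1.1 notes
# Assumption: the current sitemaps.org namespace, not necessarily what the hack emits.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


def priority_for(days_since_last_post):
    """Hypothetical rule: threads with recent posts get a higher priority."""
    return 0.8 if days_since_last_post <= 7 else 0.5


def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_sitemap(entries):
    """Render one sitemap file from a list of (url, priority) pairs."""
    rows = "".join(
        f"  <url><loc>{escape(url)}</loc><priority>{priority:.1f}</priority></url>\n"
        for url, priority in entries
    )
    return f'{XML_HEADER}<urlset xmlns="{SITEMAP_NS}">\n{rows}</urlset>\n'


def build_index(sitemap_urls):
    """Render the uncompressed index file that lists every sitemap file."""
    rows = "".join(
        f"  <sitemap><loc>{escape(url)}</loc></sitemap>\n" for url in sitemap_urls
    )
    return f'{XML_HEADER}<sitemapindex xmlns="{SITEMAP_NS}">\n{rows}</sitemapindex>\n'


def compress(xml_text):
    """Gzip a sitemap file; the index file is left as plain XML."""
    return gzip.compress(xml_text.encode("utf-8"))
```

A forum with 120,000 archive URLs would go through chunk() and come out as three sitemap files (50,000 + 50,000 + 20,000), each gzipped, plus one plain XML index pointing at them.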

Q. Would the sitemaps include links for the normal threads? eg. showthread.php?t=1234...
A. No. Google is unlikely to index your entire site if you feed it every combination of showthread links. It is better to let Google go through the more static archives; this way you have a better chance of getting more thread content indexed.

Q. Why don't you go crazy with rewrite rules and do things like including the thread title in the URL?
A. I won't deny that having keywords in the URL is a good SEO strategy, but Google also does not like "over search engine optimized" web sites. Google has recently penalized a huge number of such sites, sending them from a page rank of 5 or 6 to 0.

Q. Does sitemap really help?
A. Definitely. Google has crawled over 60,000 pages since I submitted my sitemaps a few days ago. Before the sitemap, Yahoo bots were visiting more pages than Google. I expect total Google visits for this month to exceed Yahoo's in the next day or two.

What is involved?
==================
I have divided this hack into two steps. The first step involves uploading a PHP file. This enables the sitemap to be generated and submitted to Google.

The second step involves installing a plugin using the AdminCP. This sends all robots to the archive pages, preventing them from viewing the actual threads.

For example, Google or another crawler follows an external link to visit:
http://forums.mysite/showthread.php?t=1234&page=2

It will be told this page is permanently relocated to:
http://forums.mysite/archive/index.php/t-1234-p-2

This way you don't lose page rank gain from external links.
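
The redirect logic can be sketched roughly like this. It is a Python illustration only; the real plugin is PHP and its code is not shown in this post. The function names are hypothetical, the crawler token list is a tiny illustrative subset rather than the 20+ crawlers the hack detects, and the archive URL shape simply follows the example above.

```python
from urllib.parse import parse_qs, urlsplit

# Illustrative subset only; the hack's own list is longer and is not shown here.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")


def is_crawler(user_agent):
    """Return True if the User-Agent header contains a known crawler token."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def archive_url(thread_url, use_path_info=True):
    """Map a showthread.php URL to its archive counterpart, or None if it is not a thread URL.

    use_path_info=False yields the 'index.php?t-...' form for servers
    that do not support path info.
    """
    parts = urlsplit(thread_url)
    query = parse_qs(parts.query)
    if not parts.path.endswith("/showthread.php") or "t" not in query:
        return None
    slug = f"t-{query['t'][0]}"
    page = query.get("page", ["1"])[0]
    if page != "1":
        slug += f"-p-{page}"
    separator = "/" if use_path_info else "?"
    base = parts.path[: -len("showthread.php")]
    return f"{parts.scheme}://{parts.netloc}{base}archive/index.php{separator}{slug}"
```

A request handler would call is_crawler() on the User-Agent header and, when it returns True, answer with a 301 (permanent) redirect to archive_url() of the requested thread.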

Install
=========
To install, follow the readme file.
To let me know you have installed this and to receive update information from me, please click INSTALL.

Strategy
=========

It is unlikely that Google or any other search engine will index your entire site, especially given the dynamic nature of vBulletin forums. An archive sitemap lets Google concentrate on the real content of your forums: the threads. If Google has to go through endless member profile pages, it will get sick of it and become tired (sorry, perhaps robots cannot become tired). What we can do is disallow the crawling of unnecessary pages. My robots.txt contains:

#ALL BOTS
User-agent: *
Disallow: /admincp/
Disallow: /ajax.php
Disallow: /attachments/
Disallow: /clientscript/
Disallow: /cpstyles/
Disallow: /images/
Disallow: /includes/
Disallow: /install/
Disallow: /modcp/
Disallow: /subscriptions/
Disallow: /customavatars/
Disallow: /customprofilepics/
Disallow: /announcement.php
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /external.php
Disallow: /faq.php
Disallow: /frm_attach
Disallow: /image.php
#Disallow: /index.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php?
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /payment_gateway.php
Disallow: /payments.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /showpost.php
Disallow: /subscription.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /usernote.php

You may have noticed index.php in the list (commented out in the copy above). Apparently Google regards http://forums.mysite/index.html as the same as http://forums.mysite/
...but http://forums.mysite/index.php as a different file. The default vB templates use index.php for internal links. That will split the page rank of your home page! So it is better not to let Google see this file.
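
To sanity-check rules like these before deploying them, a short script using Python's standard urllib.robotparser can help. The sketch below uses only a few lines excerpted from the list above, and the test URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the rules above; the full file works the same way.
ROBOTS_TXT = """\
#ALL BOTS
User-agent: *
Disallow: /search.php
Disallow: /printthread.php
Disallow: /memberlist.php
#Disallow: /index.php
"""


def make_parser(robots_text):
    """Build a parser from robots.txt text instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed(parser, url, user_agent="Googlebot"):
    """Return True if the given crawler may fetch the URL under these rules."""
    return parser.can_fetch(user_agent, url)


parser = make_parser(ROBOTS_TXT)
for url in (
    "http://forums.mysite/search.php",
    "http://forums.mysite/archive/index.php/t-1234-p-2",
    "http://forums.mysite/index.php",
):
    print(url, allowed(parser, url))
```

Because the index.php line is commented out in this copy, the parser still reports index.php as crawlable; removing the leading # would block it.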

If you have mod_rewrite installed, you could add the following to the .htaccess file:

RewriteCond %{QUERY_STRING} ^$
RewriteRule ^index\.php$ / [R=301,L]

(If your forums are under http://site/forums/, try: RewriteRule ^forums/index\.php$ forums/ [R=301,L])

That will redirect /index.php to /, but only if no query string is present, i.e. /index.php?do=mymod will not be redirected.
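
The decision this rule makes can be mirrored in a few lines of Python, which may help when reasoning about edge cases. This is only a model of the matching logic under the assumption that the .htaccess file sits in the web root; it is not how mod_rewrite itself is implemented, and the function name is hypothetical.

```python
import re

# Same pattern as the RewriteRule, with the dot escaped so it matches a literal '.'.
INDEX_PATTERN = re.compile(r"^index\.php$")


def redirect_target(path, query_string):
    """Return '/' if the request should get a 301 to the home page, else None."""
    # In a per-directory (.htaccess) context the pattern is matched
    # against the path without its leading slash.
    relative = path.lstrip("/")
    if query_string == "" and INDEX_PATTERN.match(relative):
        return "/"
    return None
```

Here redirect_target("/index.php", "") yields "/", while redirect_target("/index.php", "do=mymod") yields None, matching the behaviour described above.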


  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
#212
12-15-2005, 02:03 PM
Totti
Join Date: Jul 2005
Location: Germany
Posts: 72

Hi guys, I am using vbSEO sitemap generation ... this plug-in for 3.5x.

Actually my question is: which product is better?
Or should we possibly merge them, using the sitemap from vbSEO and the redirection from this one?

Thanks for replies!

#213
01-07-2006, 09:05 PM
rlamego
Join Date: Nov 2004
Posts: 79

Is there any way to run this without having to chmod my root dir to the most insecure values on earth?

#214
01-11-2006, 05:46 PM
eoc_Jason
Join Date: Dec 2001
Location: Houston, TX
Posts: 493

Quote:
Originally Posted by rlamego
Is there any way to run this without having to chmod my root dir to the most insecure values on earth?

If you create the initial file with the proper permissions, then you shouldn't have to chmod the whole directory, as I believe it just overwrites the file...

#215
01-11-2006, 07:59 PM
samu2
Join Date: Nov 2005
Posts: 66

How do I install the plugin? What hook does it go under and what title does it have? Thanks.

EDIT: I know how to install a plugin, I just don't know what hook name to use or what I should call it.

#216
01-11-2006, 09:07 PM
samu2
Join Date: Nov 2005
Posts: 66

I need to learn to read the install files.

OK, so I submitted it. It said it couldn't make a sitemap from a certain file, then it attempted to submit it from elsewhere, and I assume it went through OK, as it said it had been submitted for crawling.

I installed the plugin. Since then the bots on my site have been stuck viewing one thread for an hour. I guess I will see what they do tomorrow. If I made an error, could it make the bots get stuck?

#217
01-12-2006, 09:28 PM
Eagle Creek
Join Date: Jan 2004
Location: Netherlands
Posts: 742

Looks nice!

Is there any chance I will lose or reduce my hits on Google?

#218
01-20-2006, 06:56 AM
geezmo
Join Date: Aug 2005
Posts: 15

Help, my plug-in is not sending the sitemap to Google! I've installed everything correctly, added the plug-in and set a scheduled task for it. In my /forum/archive folder, there are several .gz files so I know sitemaps are being generated by the hack. But the problem is, these are not being sent to Google.

In AdminCP > Scheduled Task Manager, I get a "Done" message every time I click "Run Now" to test the plugin. But if I check the Scheduled Task Log, there's no log entry for this plugin, not even one!

Can anyone tell me what's wrong?

This might help. While installing this hack, I chmodded the /forum and /archive folders to 777 as per the instructions. However, when I tried to call the forums_sitemap.php file, I always got an Error 500 (Internal Server Error). Also, the forum itself is down and inaccessible while the folders are chmodded 777. I checked my error log and it confirms that the error occurred because the "directory is writable by others."

The point is, when I call the forums_sitemap.php file, I get a server error, so I basically can't get it to run even for the first time. Because the forum goes down when the "directory is writable by others," I have to chmod the folders back to 755.

Anyone who knows how to make the hack send the sitemaps to Google?

PS. I'm using ver. 3.5.3

#219
01-24-2006, 03:14 AM
geezmo
Join Date: Aug 2005
Posts: 15

Any reply to my question above, anyone?

Anyway, I have another question. What I just did was manually submit to Google the sitemaps generated by this hack. As I said before, in the /forum/archive folder I have a lot of gz files, which I assume are sitemaps.

My question is, how do the gz files differ from one another? I have more than 50 sitemap files, from number 13 to number 98, but some numbers are missing. What I mean is I have sitemap_13.gz and sitemap_98.gz, but I don't have sitemap_21.gz, sitemap_22.gz and sitemap_23.gz, to name a few.

I only submitted sitemap_98.gz, and I'm wondering whether the other (or all?) sitemap files need to be submitted to Google.

#220
02-20-2006, 03:13 AM
Zia
Join Date: Dec 2005
Location: golpo.net
Posts: 931

Quote:
Originally Posted by Totti
Hi guys, I am using vbSEO sitemap generation ... this plug-in for 3.5x.
Actually my question is: which product is better?
Or should we possibly merge them, using the sitemap from vbSEO and the redirection from this one?
Thanks for replies!

I have the same question, as we are not experts.
We are using "vBSEO Google/Yahoo Sitemap Generator for vBulletin 3.5.x & vBulletin 3.0.x".

What is the difference between the two? If you explain, it will help us make up our minds.
We would also like your suggestion for a URL rewriter (not the original vBSEO one, which costs $150).

Does this sitemap cover RSS feeds too? (Maybe I can't ask the question properly.) I mean, with this sitemap, does Google discover the site's RSS?
By the way, why do we need to disallow external.php in robots.txt?

Please let us know the details.

Thanks in advance.

#221
04-20-2006, 07:01 PM
stinger2
Join Date: Jul 2005
Posts: 274

No one is supporting this?