The Archive of Official vBulletin Modifications Site. It is not a VB3 engine, just a parsed copy!
Google sitemap for the vB Archives. Redirect human and robots.
Developer Last Online: Nov 2023
Release V1.2 (9 Nov 2005)
* Higher sitemap priority is given to threads with new posts, so Google can index fresh threads first.
* The original optional STEP 3 hack is no longer recommended. To avoid a potential Google penalty, my advice is to remove the STEP 3 hack.

Release V1.1a (12 Oct 2005)
* Bug fix only.

Release V1.1 (9 Oct 2005)
* Can handle very large forums with more than 50,000 URLs per forum. The URLs are spread across multiple files for each large forum.
* Created a function to detect search engine crawlers. The vB built-in search engine detector can only identify about 3 or 4 search engines. My function detects over 20 search engine crawlers.
* Supports forums hosted on web servers that do not support 'fix_pathinfo'. Instead of the usual 'archive/index.php/f-10.html' link, these forums have links of the form 'archive/index.php?f-10.html'.
* Alerts about wrong directory permissions to help newbies.
* Automatically writes the index file to the archive directory if the PHP script cannot write into the base vB directory.
* Bug fixes.

Objectives
==============
Q and A
==============
Q. Would the sitemap contain the links for hidden forums?
A. No, the forum permissions are consulted while generating the sitemap files.

Q. How often are the sitemap files generated?
A. You decide, and set it in the Scheduled Tasks. By default the script cannot be called by an external user, which stops bored people from killing your server.

Q. Is the sitemap file compressed?
A. Yes, the multiple sitemap files are gzipped according to the Google sitemap standard to save bandwidth. The sitemap index file is not compressed; it is submitted as a normal XML file.

Q. Would the sitemaps include links for the normal threads, e.g. showthread.php?t=1234...?
A. No. It is unlikely Google will index your entire site if you feed it every combination of showthread links. It is better to let Google go through the more static archives. That gives you a better chance of having more thread content indexed by Google.

Q. Why don't you go crazy with rewrite rules and do things like including the thread title in the URL?
A. I won't deny that having keywords in the URL is a good SEO strategy, but Google also does not like "over search engine optimized" web sites. Google has recently penalized a huge number of such sites, sending them from a page rank of 5 or 6 to 0.

Q. Does the sitemap really help?
A. Definitely. Google has crawled over 60,000 pages since I submitted my sitemaps a few days ago. Yahoo bots were visiting more pages than Google before the sitemap. I expect the total Google visits for this month to exceed Yahoo's in the next one or two days.

What is involved?
==================
I have divided this hack into two steps.
The first step involves uploading a PHP file. This enables the sitemap to be generated and submitted to Google.
The second step involves installing a plugin using AdminCP. This sends all robots to the archive pages, preventing them from viewing the actual threads.
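The redirect in the second step could be sketched roughly as follows. This is not the plugin's actual code (the plugin is PHP and installed through AdminCP); it is a minimal Python illustration. The names `CRAWLER_TOKENS`, `is_crawler`, `archive_url` and `redirect_for` are hypothetical, the crawler list is a short made-up sample rather than the 20+ crawlers the hack detects, and the archive URL shape is taken from the example in this post.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical short sample; the hack's own detector covers over 20 crawlers.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot", "teoma")


def is_crawler(user_agent):
    """Very rough user-agent check (assumption: substring match is enough)."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def archive_url(thread_url):
    """Map a showthread.php URL to its archive equivalent, or None."""
    parts = urlsplit(thread_url)
    if not parts.path.endswith("/showthread.php"):
        return None
    query = parse_qs(parts.query)
    thread_id = query.get("t", [""])[0]
    if not thread_id.isdigit():
        return None
    # Keep whatever directory the forum lives in (e.g. "/" or "/forum/").
    base = parts.path[: -len("showthread.php")]
    target = f"{parts.scheme}://{parts.netloc}{base}archive/index.php/t-{thread_id}"
    page = query.get("page", [""])[0]
    if page.isdigit() and int(page) > 1:
        target += f"-p-{page}"
    return target


def redirect_for(user_agent, url):
    """Return (301, location) for crawlers on thread pages, else None."""
    if not is_crawler(user_agent):
        return None
    target = archive_url(url)
    return (301, target) if target else None
```

Human visitors get `None` back and see the normal thread; only requests that look like crawlers receive the permanent redirect.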
For example, when Google or another crawler follows an external link to visit:
http://forums.mysite/showthread.php?t=1234&page=2
it will be told that this page has permanently moved to:
http://forums.mysite/archive/index.php/t-1234-p-2
This way you don't lose the page rank gained from external links.

Install
=========
To install, follow the readme file.
To let me know you have installed this, and to let me send you update information, please click INSTALL.

Strategy
=========
It is unlikely that Google or any other search engine will index your entire site, especially given the dynamic nature of vBulletin forums. An archive sitemap lets Google concentrate on the real content of your forums: the threads. If Google has to go through the endless member profile pages, it will get sick of it and just become tired (sorry, perhaps robots cannot become tired).

What we can do is disallow the crawling of unnecessary pages. My robots.txt contains:

#ALL BOTS
User-agent: *
Disallow: /admincp/
Disallow: /ajax.php
Disallow: /attachments/
Disallow: /clientscript/
Disallow: /cpstyles/
Disallow: /images/
Disallow: /includes/
Disallow: /install/
Disallow: /modcp/
Disallow: /subscriptions/
Disallow: /customavatars/
Disallow: /customprofilepics/
Disallow: /announcement.php
Disallow: /attachment.php
Disallow: /calendar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /external.php
Disallow: /faq.php
Disallow: /frm_attach
Disallow: /image.php
#Disallow: /index.php
Disallow: /inlinemod.php
Disallow: /joinrequests.php
Disallow: /login.php
Disallow: /member.php?
Disallow: /memberlist.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newattachment.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /payment_gateway.php
Disallow: /payments.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /profile.php
Disallow: /register.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendmessage.php
Disallow: /showgroups.php
Disallow: /showpost.php
Disallow: /subscription.php
Disallow: /usercp.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /usernote.php

You may have noticed that I included index.php in there. Apparently Google regards http://forums.mysite/index.html as the same as http://forums.mysite/ but treats http://forums.mysite/index.php as a different file. The default vB templates use index.php for the internal link. That splits the page rank of your home page across two URLs! So it is better not to let Google see this file.

If you have rewrite installed, you could add this to the .htaccess file:

RewriteCond %{QUERY_STRING} ^$
RewriteRule ^index\.php$ / [R=301,L]

(If your forums are under http://site/forums/, try: RewriteRule ^forums/index\.php$ forums/ [R=301,L])

That will redirect /index.php to /, but only if no query string is present, i.e. /index.php?do=mymod will not be redirected.
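To make the intent of that rewrite rule concrete, here is a small Python sketch of the same decision. It does not replace the .htaccess rule; it only mirrors its logic. The function name `index_redirect` and the `forum_path` parameter are hypothetical names chosen for illustration.

```python
from urllib.parse import urlsplit


def index_redirect(url, forum_path="/"):
    """Return the 301 target for a bare index.php request, or None.

    forum_path is the directory the forum lives in, e.g. "/" or "/forums/".
    """
    parts = urlsplit(url)
    # Mirrors: RewriteCond %{QUERY_STRING} ^$
    if parts.query:
        return None
    # Mirrors: RewriteRule ^index\.php$ ...
    if parts.path != forum_path + "index.php":
        return None
    return f"{parts.scheme}://{parts.netloc}{forum_path}"


if __name__ == "__main__":
    print(index_redirect("http://forums.mysite/index.php"))
    print(index_redirect("http://forums.mysite/index.php?do=mymod"))
```

The first call yields the forum root as the redirect target, while the second yields `None` because a query string is present, so pages such as /index.php?do=mymod keep working.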
Comments
#82
I've tried to do the compare HTML, but it's much harder than just making a txt file saying remove here, add here, place under this, place over that, etc. As far as I know this hack is incomplete or just not designed for 3.5.0. I'm sure it's a good hack for RC3 users, but on Gold I can't get it to work.
#83
Quote:
Your permissions are wrong, and by the sound of it, you do not fully understand chmod. The good thing is that at least you explained what you have done instead of saying: "I have the errors, tell me how to fix".

Please list your directory permissions here (the output of two #ls -l commands). Let me guess: you changed the permissions for the files, not the DIRECTORIES as my instructions say.

You said you changed your root? I highly doubt it. Do you mean '/' or the base directory of your vB install? When you have 750, what is the ownership of the directory, both user ownership and group ownership? It is very unusual to set a web directory to 750; usually you need at least 751 or 755.

=============
Instructions I will include in the next readme file:

The script needs to write files to two directories: 1) the base vB directory and 2) the archive directory. You need to change the DIRECTORY permissions for these two directories.

Let's presume your directory structure is:
~/public_html ('~' means it is under your user home directory; the actual directory should be something like /home/your_isp_user_name/public_html)
~/public_html/showthread.php
...
~/public_html/archive/
~/public_html/archive/index.php
...

You need to:
1) Change the permission for the vB base directory:
#chmod 777 public_html
(or #chmod 777 ~/public_html if you are not already in your home directory)
2) Change the permission for the archive directory:
#chmod 777 public_html/archive
(or #chmod 777 ~/public_html/archive, or #chmod 777 archive if you use some kind of web-based control panel)
=============

I really wish someone could write better instructions for changing directory permissions for me. 50% of the problems with this hack are permission related.
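Since so many problems come down to directory permissions, a quick check before running the script can help. The following is only a sketch in Python, not part of the hack: the function name `check_writable_dirs` is hypothetical, and the two example paths are placeholders for your own base vB directory and archive directory. Note that it tests whether the user running the check can write, which may differ from the user your web server runs as.

```python
import os
import stat


def check_writable_dirs(paths):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for path in paths:
        if not os.path.isdir(path):
            problems.append(f"{path}: not a directory")
            continue
        # Permission bits as an octal number, e.g. 755 or 777.
        mode = stat.S_IMODE(os.stat(path).st_mode)
        # Creating files in a directory needs both write and execute access.
        if not os.access(path, os.W_OK | os.X_OK):
            problems.append(f"{path}: not writable by this user (mode {mode:o})")
    return problems


if __name__ == "__main__":
    # Placeholder paths; replace with your own directories.
    base = os.path.expanduser("~/public_html")
    for line in check_writable_dirs([base, os.path.join(base, "archive")]):
        print(line)
```

It reports the DIRECTORY permissions, not the file permissions, which is the distinction most people trip over.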
#84
Quote:
#85
I have a cPanel system and was using WS_FTP to chmod the directories, as per your instructions. By root I meant public_html; it's a cPanel permission thing, the base permission. Anyway, I did what you asked in your instructions (the readme, not what's above), but for some reason it was giving me grief. I've never had an issue with chmod until trying this script, so I figured it was an RC3 issue, not a vB 3.5.0 one. I'll give it another shot with your info about
.. in the instructions above. The only thing I missed was the permissions on /forum/, from how you explained it above. I'll try again and post results. My setup is public_html/forum/archive. You're saying that, in relation to my setup, I should set /forum/ = 777 and /archive/ = 777, which is what I'll try next. Currently I'm in sleep-typing mode.
#86
Quote:
It's been a few days now, and guess what: http://www.google.be/search?hl=nl&q=...e+zoeken&meta= shows 9.640 pages listed already, up from 642 just 2 days ago!!!!!! Man, this hack rocks. You deserve to win hack of the month and hack of the year, for all I care.

PS: I just need a better explanation for step 3; I couldn't do that one.
#87
I have a problem though... you're making a sitemap gz for each forum, well, some of my forums are big:
Quote:
So we'd start with:
http://www.bowlie.com/forum/archive/sitemap_4_1.gz
And when we passed an arbitrary value (make it a setting in the file in case Google changes it later) we would move on to:
http://www.bowlie.com/forum/archive/sitemap_4_2.gz
http://www.bowlie.com/forum/archive/sitemap_4_3.gz
through
http://www.bowlie.com/forum/archive/...p_4_9999999.gz
etc.

As it stands, Google is now refusing to pay attention to mine, as the one file that exceeds the limit basically causes the whole thing to error.
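The numbering scheme proposed here could be sketched like this in Python. It is an illustration only, not the hack's PHP code: the function names, the `sitemap_index.xml` file name and the `size` parameter are assumptions, and the XML namespace is the current sitemaps.org one, which may differ from what the 2005 script wrote. The 50,000 limit is kept as a setting, as suggested above.

```python
import gzip
import os
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50000  # kept as a setting in case the limit changes
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def write_forum_sitemaps(forum_id, urls, out_dir, base_url, size=MAX_URLS_PER_FILE):
    """Write sitemap_<forum>_<n>.gz files and return their public URLs."""
    written = []
    for number, part in enumerate(chunk(urls, size), start=1):
        name = f"sitemap_{forum_id}_{number}.gz"
        lines = ['<?xml version="1.0" encoding="UTF-8"?>',
                 f'<urlset xmlns="{SITEMAP_NS}">']
        lines += [f"  <url><loc>{escape(u)}</loc></url>" for u in part]
        lines.append("</urlset>")
        # Each per-forum sitemap file is gzipped to save bandwidth.
        with gzip.open(os.path.join(out_dir, name), "wt", encoding="utf-8") as fh:
            fh.write("\n".join(lines) + "\n")
        written.append(base_url.rstrip("/") + "/" + name)
    return written


def write_index(sitemap_urls, path):
    """Write the uncompressed sitemap index listing every sitemap file."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             f'<sitemapindex xmlns="{SITEMAP_NS}">']
    lines += [f"  <sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls]
    lines.append("</sitemapindex>")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")
```

With this layout, a forum with 120,000 thread URLs would produce three numbered files for that forum, and no single file would exceed the limit.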
#88
Quote:
#89
Does this work on 3.5.0 Gold?
#90
What are we looking for when this script runs? I see a lot of zip files or whatever in the archive section but nothing in the other sections. Is this right?
#91
Quote: