vB Global Translator - Multiply your indexed pages & put search traffic on autopilot (https://vborg.vbsupport.ru/showthread.php?t=217329)

imported_silkroad 08-04-2009 05:31 AM

After working with this mod for three weeks, I have dropped the Google Sitemap method suggested by the vBSEO team. That method, while well intended, was a quick "kludge" and not optimal for this type of application.

What I have done instead is easy, requires only a little manual labor, and goes something like this:
  1. Copy the *xml.gz files from ./vbseo_sitemap/data to another directory, for example FORUMROOT/es for Spanish, FORUMROOT/ja for Japanese, FORUMROOT/zh-CN for Chinese, etc.
  2. Unzip the files and use sed to add ?hl=ja (or whatever language flag you want) to each URL in the Sitemap. This takes about 10 seconds.
  3. Update sitemap_index.xml.gz the same way, or edit it by hand with vi, etc.
  4. Submit this Sitemap to Google.
  5. Copy the first one you did and repeat for as many languages as you wish.
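
For anyone who would rather script step 2 than run sed by hand, here is a minimal Python sketch of the same idea. It is not the exact command used above: the function names and the example URLs are made up for illustration, and it assumes each URL sits inside a <loc> element as in a standard Sitemap file. It also uses &hl= instead of ?hl= when a URL already carries a query string, and keeps the & escaped as &amp; so the XML stays valid.

```python
import gzip
import re
from xml.sax.saxutils import escape, unescape

# Matches the URL inside each <loc>...</loc> element of a Sitemap.
LOC_RE = re.compile(r"<loc>(.*?)</loc>", re.DOTALL)


def add_language_flag(xml_text: str, flag: str) -> str:
    """Append hl=<flag> to every <loc> URL in a Sitemap document."""

    def tag_url(match: re.Match) -> str:
        url = unescape(match.group(1).strip())       # &amp; -> &
        separator = "&" if "?" in url else "?"       # respect an existing query string
        return "<loc>" + escape(f"{url}{separator}hl={flag}") + "</loc>"

    return LOC_RE.sub(tag_url, xml_text)


def translate_sitemap(src_path: str, dst_path: str, flag: str) -> None:
    """Read a gzipped Sitemap, tag its URLs, and write a new gzipped copy."""
    with gzip.open(src_path, "rt", encoding="utf-8") as src:
        xml_text = src.read()
    with gzip.open(dst_path, "wt", encoding="utf-8") as dst:
        dst.write(add_language_flag(xml_text, flag))
```

Calling translate_sitemap once per file and per language (hypothetical paths, for example from ./vbseo_sitemap/data into FORUMROOT/ja) covers step 2 only; the index file in step 3 still has to be updated separately.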

This method has many advantages.

First of all, you have a completely separate Sitemap of your entire site for each language, so it is easy to submit to language-specific search engines. Also, you can easily track the indexing progress for each Sitemap. This is much easier to manage and much cleaner, IMHO.

Of course, this method takes a bit of work when you need to update your language Sitemaps, but if you have a large board, this will get you indexed nicely in a well-organized way. You can add the newer links after a high percentage of the legacy links are archived (in a few months).

We added the top 10 languages to Google Webmaster Tools, each with its own Sitemap, so where we originally had one big Sitemap with nearly 396K URLs, we now have a total of around 4,750K URLs in 11 Sitemaps. So far, Google is happy :-)

With this simple method, you can see the indexing progress for each language. You can submit your Sitemaps to language-specific search engines. You can manage the update frequency of the translated URLs differently from your main site. You can also avoid any potential problems with your main Sitemap.

(See attachment)

Enjoy and Good Luck!

imported_silkroad 08-04-2009 05:49 AM

I forgot to add, for the next "trick" I will write the URLs in the new Sitemaps as

Code:

FORUMROOT/flag/url.html
versus

Code:

FORUMROOT/url.html?hl=flag
etc....

and then add a very simple mod_rewrite rule to rewrite FORUMROOT/flag/blahblah... to FORUMROOT/blahblah?hl=flag

:-)
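
The Sitemap side of this "trick" can be sketched in Python as follows. This is only an illustration under stated assumptions: to_path_style and the forum_root parameter are hypothetical names, forum_root is assumed to end with a slash, and example.com stands in for a real board. The function moves the hl value out of the query string and into the path directly after the forum root, leaving any other query parameters in place.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_path_style(url: str, forum_root: str = "/") -> str:
    """Turn FORUMROOT/url.html?hl=flag into FORUMROOT/flag/url.html."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # First hl value found, or "" when the URL has no language flag.
    flag = next((value for key, value in query if key == "hl"), "")
    if not flag or not parts.path.startswith(forum_root):
        return url  # nothing to move, or the URL is outside the forum
    rest = parts.path[len(forum_root):]
    new_path = f"{forum_root}{flag}/{rest}"
    # Keep every query parameter except hl.
    remaining = urlencode([(k, v) for k, v in query if k != "hl"])
    return urlunsplit((parts.scheme, parts.netloc, new_path, remaining, parts.fragment))
```

The server still needs a matching rule that maps the new path back to the ?hl=flag form, otherwise the rewritten Sitemap URLs will not resolve.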

cyc 08-04-2009 05:57 AM

Quote:

Originally Posted by imported_silkroad (Post 1861045)
I forgot to add, for the next "trick" I will write the URLs in the new Sitemaps as

Code:

FORUMROOT/flag/url.html
versus

Code:

FORUMROOT/url.html?hl=flag
etc....

and then add a very simple mod_rewrite rule to rewrite FORUMROOT/flag/blahblah... to FORUMROOT/blahblah?hl=flag

:-)

Great idea! Keep in mind this will affect paths to CSS and images on some forums.
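
A quick way to see why: browsers resolve relative references against the page URL, so an extra /flag/ directory changes where a relative stylesheet or image path points. The Python sketch below mimics that resolution with urljoin; the page URL and asset paths are invented examples, not taken from any particular forum.

```python
from urllib.parse import urljoin


def resolve(page_url: str, asset_ref: str) -> str:
    """Resolve an asset reference the way a browser would for this page."""
    return urljoin(page_url, asset_ref)


page = "http://example.com/forum/ja/url.html"

# A relative reference now lands under /forum/ja/, where no such file exists.
relative = resolve(page, "clientscript/style.css")

# A root-relative reference is unaffected by the extra directory.
root_relative = resolve(page, "/forum/clientscript/style.css")
```

Here relative resolves to http://example.com/forum/ja/clientscript/style.css, while root_relative stays at http://example.com/forum/clientscript/style.css, so forums whose templates use root-relative or absolute asset paths should not be affected.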

imported_silkroad 08-04-2009 06:16 AM

Note:

We are definitely seeing a sort of "Google penalty" for using ?hl=flag (duplicate content, it seems -- at least in the Sitemaps)

As Google indexes the various language Sitemaps, it is subtracting indexed links from the main Sitemap ... so I will need to rewrite the URLs sooner rather than later!

NLP-er 08-04-2009 10:03 AM

Quote:

Originally Posted by imported_silkroad (Post 1861053)
Note:

We are definitely seeing a sort of "Google penalty" for using ?hl=flag (duplicate content, it seems -- at least in the Sitemaps)

As Google indexes the various language Sitemaps, it is subtracting indexed links from the main Sitemap ... so I will need to rewrite the URLs sooner rather than later!

I think it was a temporary Google problem. I changed nothing, and right now my Sitemap is clear of duplicate content errors :) Just checked.

imported_silkroad 08-04-2009 10:23 AM

EDIT: DOES NOT WORK YET. DO NOT USE THE 'BRAINSTORMING' IDEA HERE. - imported_silkroad

Cheers NLP-er and Thanks.

I think we will use a vBSEO rewrite, something like:

'^(.+?)\.html\?hl=(.+?)$' => '$2/$1.html'

Do you think this will work?

Or maybe just manually change the Sitemaps (as above #591) and try:

'^ja/(.+?)\.html$' => '$1.html?hl=ja'
'^ru/(.+?)\.html$' => '$1.html?hl=ru'
'^ko/(.+?)\.html$' => '$1.html?hl=ko'

yadda, yadda, yadda ..

Maybe?

swerdlow 08-04-2009 03:03 PM

just installed this...works great and easy setup too

thanks ;)

tpearl5 08-04-2009 04:08 PM

Quote:

Originally Posted by imported_silkroad (Post 1861143)
Cheers NLP-er and Thanks.

I think we will use a vBSEO rewrite, something like:

'^(.+?)\.html\?hl=(.+?)$' => '$2/$1.html'

Do you think this will work?

Or maybe just manually change the Sitemaps (as above #591) and try:

'^ja/(.+?)\.html$' => '$1.html?hl=ja'
'^ru/(.+?)\.html$' => '$1.html?hl=ru'
'^ko/(.+?)\.html$' => '$1.html?hl=ko'

yadda, yadda, yadda ..

Maybe?

Did you add this to your custom rewrite rules and custom redirects?

NLP-er 08-04-2009 11:53 PM

Quote:

Originally Posted by imported_silkroad (Post 1861143)
Cheers NLP-er and Thanks.

I think we will use a vBSEO rewrite, something like:

'^(.+?)\.html\?hl=(.+?)$' => '$2/$1.html'

Do you think this will work?

Or maybe just manually change the Sitemaps (as above #591) and try:

'^ja/(.+?)\.html$' => '$1.html?hl=ja'
'^ru/(.+?)\.html$' => '$1.html?hl=ru'
'^ko/(.+?)\.html$' => '$1.html?hl=ko'

yadda, yadda, yadda ..

Maybe?

I tried the first one, '^(.+?)\.html\?hl=(.+?)$' => '$2/$1.html', and it crashed my forum :D So I commented it out.

The best way would be to redirect URLs like country/rest internally to rest?hl=country.
By internally I mean without changing the URL in the browser (without sending a header).
And redirect URLs like rest?hl=country to country/rest with a 301 header.

It would be best because old, already indexed addresses would keep working. The redirect would make reindexing faster (I think :)) and you would avoid the possibility of a duplicate content penalty when the same content is available at both URLs. At the same time it would be good for this mod, because no changes to it would be required at all. Unfortunately I'm not an expert in .htaccess files or vBSEO custom rewrite rules, so I don't know whether it is possible. For sure it is possible to redirect one address to another, but can it be done internally?...
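
To make the proposed two-way scheme concrete, here is a small Python sketch of the decision logic only. It is not .htaccess or vBSEO syntax and has not been tested on a live forum; LANGS, route and the returned action names are all hypothetical. The key assumption is that the decision is made once, on the URI exactly as the client sent it, and the internally rewritten target is never fed back through the same rules. Without that, country/rest -> rest?hl=country -> country/rest would loop forever.

```python
import re
from typing import Tuple

# Assumed list of language flags; extend as needed.
LANGS = ("ja", "ru", "ko")
_FLAGS = "|".join(LANGS)

# /ja/rest          -> serve /rest?hl=ja internally (no header sent)
PATH_RE = re.compile(r"^/(%s)/(.+)$" % _FLAGS)
# /rest?hl=ja       -> 301 redirect to /ja/rest
QUERY_RE = re.compile(r"^/(.+?)\?hl=(%s)$" % _FLAGS)


def route(request_uri: str) -> Tuple[str, str]:
    """Classify the URI as the client sent it; call once per request."""
    match = PATH_RE.match(request_uri)
    if match:
        flag, rest = match.groups()
        separator = "&" if "?" in rest else "?"
        return ("rewrite", f"/{rest}{separator}hl={flag}")
    match = QUERY_RE.match(request_uri)
    if match:
        rest, flag = match.groups()
        return ("redirect", f"/{flag}/{rest}")
    return ("pass", request_uri)
```

With these rules, route("/ja/url.html") gives ("rewrite", "/url.html?hl=ja") and route("/url.html?hl=ja") gives ("redirect", "/ja/url.html"). The sketch only handles the simple ?hl=flag form discussed in this thread, not URLs where hl follows other query parameters. In Apache, matching against the original request line (for example via the THE_REQUEST variable in a RewriteCond) is the usual way to get the same "decide on the original request only" behaviour.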

imported_silkroad 08-05-2009 01:30 AM

Quote:

Originally Posted by NLP-er (Post 1861521)
I tried the first one, '^(.+?)\.html\?hl=(.+?)$' => '$2/$1.html', and it crashed my forum :D So I commented it out.

Sorry about that!

I tried something similar in .htaccess after posting the above and it crashed here too, driving the load average of the server to outer space :o

More later ..... :confused:


All times are GMT.
