I'm new to this forum; I was directed here by a member of VB's support team. I'll just paste the history so that you can understand the issue I am dealing with.
I hope someone can provide a solution.
Quote:
We plan to run forums in several languages on a single board, so the codepage to go for is UTF-8. The problem is that our old forum, http://www.dekart.com/forum/, uses a different codepage, ISO-8859-5, and when we import the threads they don't show up correctly on the new forum if it is set to use UTF-8. If we switch it to ISO-8859-5, the characters are rendered correctly.
The language that causes the trouble is Russian. But since there will be French and Spanish too, we must go for UTF-8.
I tried the following: I took a dump of the database, opened it in a text editor, and saved it as a Unicode file. Uploading it back did not work: the server itself did not accept the file, as it uses a different codepage.
|
answer:
1. Not sure I understand this one. However, with vB3 you can set the HTML Character Set for each specific language on your forums.
follow-up
Quote:
1. I am aware of the fact that VB can use different languages for different forum sections, but we chose the Unicode-way for the following reason: The main page will consist of several forums [for each language], each of them including several child-forums [a child-forum per program title].
So the main page will show multiple languages on the same screen. Displayed in Unicode, this works: I can read umlauts, Cyrillic characters, and Eastern European diacritics all at once.
On the other hand, if we use a different codepage for each forum, then those on the main screen will only be able to read correctly the text written in the currently-active codepage. This might be acceptable in the case of umlauts [only the special characters will be unreadable], but Russian is a totally different thing.
This is not convenient because:
- Each visitor can read only their own language and sees gibberish for all the others
- Those who speak several languages miss the chance to browse the 'foreign' stuff in a convenient way
- As a moderator, I speak multiple languages, and I manage more than one section of the forum. I will have to play with codepages each time I need to do something
This is why we decided to switch to UTF-8. It works perfectly, but it raises the problem of the old threads in the old forum, which are stored in an ISO codepage, not Unicode.
I hope the situation is more clear now.
The software on the server is this:
PHP 4.3.9
Apache 1.3.33
MySQL 3.23.52
It runs on an older version of SuSE Linux, 8.something, if I recall correctly...
|
This issue really bothers me. We purchased VB about 2 months ago, and we still haven't implemented it because of this legacy problem.
There has to be a better way than manually reposting all the 'problematic' messages. Perhaps someone wrote a tool that interacts directly with the database?
Well, this is it. I hope someone can help me figure out which direction to go in order to solve this; or at least 'scientifically prove' that there is no solution.
As I understand it, the 'bottleneck' is MySQL, which cannot work with Unicode files [I'm not an expert on that, so this might be wrong]. But in that case, how does VB use UTF-8 for the text?
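
For reference, here is a minimal sketch of the dump conversion I have in mind, in case it helps someone spot what I am doing wrong. It is not a tested solution for VB. It assumes the dump is plain text encoded entirely in ISO-8859-5 and that UTF-8 without a byte-order mark is what the new forum needs. My guess (only a guess) is that the editor's "Unicode" option wrote UTF-16 rather than UTF-8, which could be why the server rejected the file. The function names and file names are hypothetical, and any binary columns in the dump would probably be damaged by this kind of re-encoding.

```python
SOURCE_ENCODING = "iso-8859-5"  # assumption: codepage of the old forum, as stated above
TARGET_ENCODING = "utf-8"       # Python's "utf-8" codec writes no byte-order mark


def transcode_bytes(raw):
    """Re-encode a byte string from ISO-8859-5 to UTF-8."""
    return raw.decode(SOURCE_ENCODING).encode(TARGET_ENCODING)


def transcode_dump(src_path, dst_path, chunk_chars=65536):
    """Stream a text dump from src_path to dst_path, changing only the encoding.

    newline="" keeps the original line endings untouched.
    """
    with open(src_path, "r", encoding=SOURCE_ENCODING, newline="") as src, \
         open(dst_path, "w", encoding=TARGET_ENCODING, newline="") as dst:
        while True:
            chunk = src.read(chunk_chars)
            if not chunk:
                break
            dst.write(chunk)
```

Usage would be something like `transcode_dump("old_forum.sql", "old_forum_utf8.sql")`, where both file names are placeholders. Whether the server then accepts and displays the converted dump correctly is exactly what I cannot confirm.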