Hi everyone,
We've set up a test server to try out converting our MySQL database from latin1 to UTF-8.
The main purpose is to support our non-English forums.
We've modified MySQL's configuration and inserted the following:
Quote:
[CLIENT]
default-character-set=utf8
[MYSQLD]
default-character-set=utf8
collation_server=utf8_unicode_ci
character_set_server=utf8
init-connect='SET NAMES utf8'
Next, under Languages & Phrases, we changed the language settings to UTF-8.
Lastly, we ran a conversion on all vBulletin tables, changing every latin1 instance to utf8_unicode_ci. We also changed the database collation to utf8_general_ci.
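For anyone wanting to reproduce the table conversion step, here is a minimal sketch of how the statements could be generated. It assumes the conversion is done with MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET` syntax; the helper name `build_convert_statements` and the table list are only illustrative, not necessarily what we ran.

```python
def build_convert_statements(tables, charset="utf8", collation="utf8_unicode_ci"):
    """Build one ALTER TABLE statement per table name.

    Hypothetical helper: it only produces SQL text, it does not run it.
    """
    statements = []
    for name in tables:
        # Quote the identifier with backticks, doubling any embedded backtick.
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(
            f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # Example table names only; substitute the real list from SHOW TABLES.
    for sql in build_convert_statements(["post", "thread"]):
        print(sql)
```

The output can be reviewed and then fed to the mysql client. Note that `CONVERT TO` rewrites the stored column data as well as the column definitions, so it is worth trying on a copy first.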
Things seem to work fine. All non-English data appears exactly as entered (no HTML entities were detected!)
But we noticed 2 peculiar issues:
1. Line break issues with non-English data when posting new threads and replies
E.g., assuming the following is Unicode text:
Quote:
Para 1. XXXXXXX
Para 2. YYYYYYYY
Any text AFTER the LINE BREAK doesn't get inserted into the database. I've checked the tables: only the text before the line break goes in. The rest simply disappears (Para 2 onwards is not saved to the db).
Is this a bug? Should I submit a ticket?
2. When we post new threads, random data gets inserted on its own. The data seems to be pulled at random from other posts.
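For issue 1, one thing I want to rule out is that the submitted text is not valid UTF-8 at the point where it gets cut off. This is only a guess on my part: as far as I know, MySQL in non-strict mode truncates a string at the first byte sequence that is invalid for the column's character set. A minimal sketch of that check, where `first_invalid_utf8_offset` is a hypothetical helper and the sample bytes are made up:

```python
def first_invalid_utf8_offset(payload: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence.

    Returns None when the whole payload decodes cleanly as UTF-8.
    """
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first offending byte.
        return err.start
    return None


if __name__ == "__main__":
    # Made-up samples: valid UTF-8 with a line break, then a stray latin1 byte.
    print(first_invalid_utf8_offset("Para 1\nPara 2".encode("utf-8")))
    print(first_invalid_utf8_offset(b"abc\xe9def"))
```

If the offset matches the place where the saved post ends, the problem is more likely in the encoding of the submitted data than in the line break itself.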
Any advice?
Thanks
--------------- Added 2008-01-22 at 13:18 UTC ---------------
I'm thinking of using iconv to convert the encoding manually. I've also thought of re-creating the database from scratch and loading the iconv-ed data into it.
But that seems pretty irrelevant to the two issues above.
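For reference, the manual conversion I have in mind is roughly what `iconv -f ISO-8859-1 -t UTF-8` does to a dump file. A minimal sketch of the same idea, assuming the dump really contains latin1 bytes; the function name `reencode_to_utf8` is hypothetical, and since MySQL's latin1 is closer to Windows-1252, `cp1252` may be the better source encoding:

```python
def reencode_to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Decode bytes from the source encoding and re-encode them as UTF-8.

    Raises UnicodeDecodeError if the bytes are not valid in source_encoding.
    """
    return raw.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Made-up sample: "café" with the é stored as the single latin1 byte 0xE9.
    sample = b"caf\xe9"
    print(reencode_to_utf8(sample))
```

This should only be applied to data that is still in the old encoding. Running it on text that is already UTF-8 would double-encode it.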