Bug? Unicode post gets truncated / Random insertion of data. [Archive]

evannn

01-22-2008, 11:14 AM

Hi everyone,

We've set up a test server to try out converting our MySQL database from latin1 to Unicode (UTF-8).

The main purpose is to support our non-English forums.

We've modified MySQL's configuration file and added the following:

[CLIENT]
default-character-set=utf8

[MYSQLD]
default-character-set=utf8
collation_server=utf8_unicode_ci
character_set_server=utf8
init-connect='SET NAMES utf8'

Next, under Languages & Phrases, we changed the language's character set to UTF-8.

Lastly, we ran a conversion on all vBulletin tables, changing every instance of latin1 to utf8_unicode_ci. We also changed the database collation to utf8_general_ci.
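
For reference, the table conversion step can be sketched roughly as below. This is a minimal Python sketch, not the exact procedure we ran: the table names are hypothetical examples and `build_convert_statements` is a helper name made up for illustration. It only builds `ALTER TABLE ... CONVERT TO CHARACTER SET` statements as text and does not connect to a database.

```python
def build_convert_statements(tables, charset="utf8", collation="utf8_unicode_ci"):
    """Return one ALTER TABLE ... CONVERT TO statement per table name."""
    statements = []
    for table in tables:
        # Backtick-quote the identifier; double any backtick inside the name.
        quoted = "`" + table.replace("`", "``") + "`"
        statements.append(
            f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for statement in build_convert_statements(["post", "thread", "user"]):
        print(statement)
```

Note that `CONVERT TO` re-encodes column contents based on the character set the column is currently declared with, so this assumes the existing data really is latin1.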

Things seem to work fine. All non-English data appears as is (no HTML entities were detected!)

But we noticed 2 peculiar issues:

1. Line break issues with non-English data when posting new threads and replies

E.g., assume the following is Unicode text:

Para 1. XXXXXXX

Para 2. YYYYYYYY

Any text AFTER the LINE BREAK doesn't get inserted into the database. I've checked the tables: only the text before the line break goes in. The rest simply disappears (Para 2 onwards is not saved to the db).
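
One possibility we have not confirmed is that the submitted text contains bytes that are not valid UTF-8 at the point where it gets cut off. A minimal Python sketch for checking a captured byte string follows; the function name `first_invalid_utf8_offset` and the sample strings are hypothetical, made up for illustration only.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that failed to decode.
        return err.start
    return None


if __name__ == "__main__":
    good = "Para 1. XXXXXXX\n\nPara 2. YYYYYYYY".encode("utf-8")
    # Hypothetical bad input: a lone Latin-1 byte (0xE9) before the line break.
    bad = b"Para 1. XXXXXXX\xe9\n\nPara 2. YYYYYYYY"
    print(first_invalid_utf8_offset(good))
    print(first_invalid_utf8_offset(bad))
```

If the reported offset lines up with where the stored post ends, the encoding of the incoming data would be the first thing to look at.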

Is this a bug? Should I submit a ticket?

2. Whenever we post a new thread, random data gets inserted on its own. The data seems to be pulled from other posts at random.

Any advice?

Thanks

--------------- Added 1201007886 at 1201007886 ---------------

I'm thinking of using iconv to convert the encoding manually. I've also thought of re-creating the database from scratch and loading in the iconv-converted data.

But that seems pretty irrelevant.
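
In case it is useful, the manual conversion idea could be sketched roughly like this in Python instead of iconv. It is only a sketch under assumptions: it assumes the dump file is plain ISO-8859-1 (Latin-1), and the function names and file paths are hypothetical.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def convert_file(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src_path to dst_path, re-encoding Latin-1 to UTF-8.

    Latin-1 uses exactly one byte per character, so splitting the
    input into fixed-size chunks never cuts a character in half.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for chunk in iter(lambda: src.read(chunk_size), b""):
            dst.write(latin1_to_utf8(chunk))


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    convert_file("dump_latin1.sql", "dump_utf8.sql")
```

This only re-encodes the bytes; any character set declarations inside the dump itself would still need to be changed separately.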