Hi Andrew,
I did some tests.
With German umlauts or French accents, my fix above seems to work...
BUT
with Chinese... AAAAAAAAAAAAAAAGH... you are right, that is a pain.
But I think I have narrowed it down. I would need some help, though.
OK, the problem is that in UTF-8, with multibyte characters, there is a preg_replace problem
(don't ask me why).
The problem is in this line:
PHP Code:
// Apply highlighting to each of the substrings
$resstrings = preg_replace($find, $replace, $substrings);
(It's the $find pattern that doesn't work for UTF-8 Chinese.)
I ran some examples with dummy strings (notice that the /u modifier was added).
Example:
PHP Code:
$resstrings = preg_replace("/(\b???\b)/iu", " Replaced: $1 ", "test ??? test");
will give:
test Replaced: ??? test
==> works!!!
but:
PHP Code:
$resstrings = preg_replace("/(\b欢迎您\b)/iu", " Replaced: $1 ", "欢迎欢迎您 欢迎");
will give:
欢迎欢迎您 欢迎
==> no highlight
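
A possible explanation, as a small sketch rather than a definitive diagnosis: \b only matches at a position where a word character sits on one side and a non-word character on the other. Chinese text has no spaces between words, so 欢迎您 is directly preceded by 迎, which belongs to the same character class as 欢, and there is no boundary there. The variable names below are made up for the demonstration.

```php
<?php
// The subject from the failing example: 欢迎您 follows 迎 with no space between.
$subject = "欢迎欢迎您 欢迎";

// With \b: no match. 迎 and 欢 are both word characters (or both non-word
// characters on builds without Unicode classes), so no boundary lies between them.
$withBoundary = preg_match('/\b欢迎您\b/u', $subject);

// Without \b: the term is found as a plain substring.
$withoutBoundary = preg_match('/欢迎您/u', $subject);

// When the term is surrounded by spaces, \b has a chance to match. This relies
// on /u also switching on Unicode character classes, which is an assumption
// about the PHP build in use.
$spaced = preg_match('/\b欢迎您\b/u', "欢迎 欢迎您 欢迎");

var_dump($withBoundary, $withoutBoundary, $spaced);
```

So the /u modifier itself is fine here; it is the boundary requirement that cannot be satisfied inside a run of Chinese characters.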

------------
The problem with characters like Chinese seems to be the word boundary \b, so I tried:
PHP Code:
$resstrings = preg_replace("/(" . preg_quote("欢迎您") . ")/iu", " Replaced: $1 ", "欢迎欢迎您 欢迎");
will give:
欢迎 Replaced: 欢迎您 欢迎
==> works, the Chinese text is highlighted!!!
BUT my question: is there a drawback to not using the boundary \b?
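
One drawback that can be shown directly: without \b, a short keyword is also highlighted inside longer words. The sketch below demonstrates that, and then one possible compromise: keep \b for ordinary terms and drop it only when the term contains Han characters. The function highlight_pattern is hypothetical and not part of LDM, and the \p{Han} check is an assumption that covers Chinese only, not other scripts written without spaces.

```php
<?php
// Without \b, "cat" is also highlighted inside "concatenate".
$plain = preg_replace('/(cat)/iu', '[$1]', 'cat concatenate');

// Hypothetical helper (not from LDM): choose the pattern per search term.
function highlight_pattern(string $term): string
{
    $quoted = preg_quote($term, '/');
    // Terms containing Han characters get no \b, because there are no
    // spaces between Chinese words for the boundary to latch onto.
    if (preg_match('/\p{Han}/u', $term) === 1) {
        return '/(' . $quoted . ')/iu';
    }
    // Everything else keeps the whole-word boundary.
    return '/(\b' . $quoted . '\b)/iu';
}

$latin   = preg_replace(highlight_pattern('cat'), '[$1]', 'cat concatenate');
$chinese = preg_replace(highlight_pattern('欢迎您'), '[$1]', '欢迎欢迎您 欢迎');

echo $plain, "\n", $latin, "\n", $chinese, "\n";
```

With this split, Latin keywords still match whole words only, while Chinese keywords are matched as substrings.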
I used preg_quote just to avoid $1 errors, but I guess it's not really needed.
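
A note on preg_quote, as a sketch: it does not affect $1 in the replacement string, but it does escape regex metacharacters in the search term, so it is probably worth keeping. Passing the delimiter as the second argument also escapes a / inside the keyword. The search term below is invented for the demonstration.

```php
<?php
// A made-up keyword containing a regex metacharacter (+) and the delimiter (/).
$term = 'a+b/c';

// preg_quote escapes both, giving a\+b\/c
$quoted = preg_quote($term, '/');

// The keyword is now matched literally.
$result = preg_replace('/(' . $quoted . ')/iu', '[$1]', 'solve a+b/c now');

echo $quoted, "\n", $result, "\n";
```

Without the quoting, the + would act as a quantifier and the unescaped / would end the pattern early.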
To implement this in LDM:
in local_links_include.php, in the function ldm_make_highlight_regex,
find:
PHP Code:
$find[] = '/(\b'.$w.'\b)/i';
replace with:
PHP Code:
$find[] = '/('.preg_quote($w, '/').')/iu';
Actually, I would also (maybe) add @ in front of the preg_replace in $resstrings = preg_replace($find, $replace, $substrings);
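
An alternative to @, as a rough sketch: @ only hides warnings, whereas a failed preg_replace returns null, which would wipe out the text. Checking for null and falling back to the original string keeps the output intact. The wrapper safe_highlight is hypothetical and works on a single string, not on the $substrings array used in LDM.

```php
<?php
// Hypothetical wrapper (not from LDM): return the input unchanged on failure.
function safe_highlight(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_replace failed, for example because the subject is not valid
        // UTF-8 while the pattern uses /u. Keep the text without highlighting.
        return $subject;
    }
    return $result;
}

$ok  = safe_highlight('/(欢迎您)/iu', '[$1]', '欢迎欢迎您 欢迎');
$bad = safe_highlight('/(x)/iu', '[$1]', "x\xFF"); // \xFF is not valid UTF-8

echo $ok, "\n";
```

preg_last_error() can be called after a null result to find out which error occurred.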
I have run some tests with keywords in Chinese and they highlight. I also tried message text: that works too.
PLEASE TEST!!!
Felix