Diacritic marks in Search engine


01-25-2001, 05:36 PM
Hi all!

I run vBulletin as a regional forum in Czech. Because many users are lazy and write without diacritic marks, I need to modify the search engine so that it ignores diacritics when searching. It is hard to explain in words, so here is an example:

string in post#1: Digitální váha
string in post#2: digitalni vaha

I would like both posts to show up when I type either "digitalni" or "digitální" into the search field.

Please help me! Thank you for any ideas on how to do it.

01-25-2001, 05:44 PM
Use regular expressions to replace the accented letters with their unaccented base letters.

01-25-2001, 06:27 PM
I must admit that I don't understand you. I thought I needed to hack search.php. Are you recommending replacement variables in the admin control panel?

Sorry, but I misunderstood you.

01-25-2001, 06:57 PM
That occurred to me too. But how to do it... :)

01-25-2001, 10:10 PM
I meant PHP's regular expressions, not vBulletin's replacement variables.

Look at: http://vbulletin.com/forum/showthread.php?threadid=7205
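
A minimal sketch of the regular-expression idea, assuming the posts are stored as UTF-8 and PCRE is available in your PHP build. The function name strip_diacritics_regex is hypothetical (it is not part of vBulletin), and only three letters are mapped here to keep it short:

```php
<?php
// Hypothetical helper, not part of vBulletin.
// Assumes UTF-8 input; the /u modifier makes PCRE treat the pattern as UTF-8.
function strip_diacritics_regex($text)
{
    // One pattern per accented letter, paired by position with its replacement.
    $patterns     = array('/á/u', '/í/u', '/é/u');
    $replacements = array('a', 'i', 'e');

    return preg_replace($patterns, $replacements, $text);
}

echo strip_diacritics_regex('digitální váha'), "\n";
```

For plain one-letter substitutions like these, a non-regex function such as str_replace or strtr does the same job with less overhead.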

01-26-2001, 01:14 PM
OK. I understand what regular expressions are.
I checked search.php and found the string "combinedwords". So what I have to do is run the regular expressions on this string. But the search is processed by MySQL, not by the vB PHP script.
The database contains a mix of posts with diacritics and posts without them. I don't want to strip the diacritics from the posts permanently.

01-27-2001, 04:40 PM
Any help from the development team? I really need this feature.

02-18-2001, 03:40 PM
Because nobody wants to help, I have researched the vB 1.x PHP code and found my own way to do it.
I want to ask any expert or vB developer whether I'm on the right track. I haven't tried it yet, but I think it should work:

I will replace the letters with:

$string = str_replace("á","a",$string);
$string = str_replace("í","i",$string);
...and so on
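
The chain of replacements can also be written as one lookup table. This is only a sketch: the function name remove_czech_diacritics is hypothetical, the text is assumed to be UTF-8 (on a forum running ISO-8859-2 or windows-1250 the source file would need to be saved in that encoding instead), and the table covers Czech letters only.

```php
<?php
// Hypothetical helper, not part of vBulletin.
// Maps each accented Czech letter to its unaccented base letter.
function remove_czech_diacritics($string)
{
    static $map = array(
        'á' => 'a', 'č' => 'c', 'ď' => 'd', 'é' => 'e', 'ě' => 'e',
        'í' => 'i', 'ň' => 'n', 'ó' => 'o', 'ř' => 'r', 'š' => 's',
        'ť' => 't', 'ú' => 'u', 'ů' => 'u', 'ý' => 'y', 'ž' => 'z',
        'Á' => 'A', 'Č' => 'C', 'Ď' => 'D', 'É' => 'E', 'Ě' => 'E',
        'Í' => 'I', 'Ň' => 'N', 'Ó' => 'O', 'Ř' => 'R', 'Š' => 'S',
        'Ť' => 'T', 'Ú' => 'U', 'Ů' => 'U', 'Ý' => 'Y', 'Ž' => 'Z',
    );

    // strtr with an array replaces every key in a single pass
    // and returns the new string; the argument itself is not modified.
    return strtr($string, $map);
}

echo remove_czech_diacritics('Digitální váha'), "\n";
```

Like str_replace, strtr returns the converted string rather than changing its argument, so the result has to be assigned back, for example $subject = remove_czech_diacritics($subject);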


in the $subject, $pagetext and $combinedwords strings, in the following files:


misc.php [Building index part]

--------
$subject=wordsonly($subject);
$pagetext=$subject." ".wordsonly($pagetext);
$usernames=wordsonly($usernames);
--------
HERE

global.php [function indexthread($threadid) part]

--------
$subject=wordsonly($subject);
$pagetext=$subject." ".wordsonly($pagetext);
$usernames=wordsonly($usernames);
--------
HERE

search.php [MySQL Search SELECT]

HERE
--------
$searchresults=$DB_site->query("SELECT DISTINCT
threadid,
lastpost
FROM thread
WHERE visible=1 $checkforum $subjectonly $checkuser $checkdate $combinedwords
ORDER BY lastpost DESC");
--------
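
The plan only works if the indexed words and the typed search words go through the same conversion. A small sketch of that idea, independent of the vBulletin code: search_key is a hypothetical name, UTF-8 text is assumed, and only a few letters are mapped for brevity.

```php
<?php
// Hypothetical helper, not part of vBulletin.
// Produces the form of a string that would be stored in the index
// and also the form a typed search term is converted to.
function search_key($text)
{
    $map = array(
        'á' => 'a', 'í' => 'i', 'é' => 'e',
        'Á' => 'a', 'Í' => 'i', 'É' => 'e',
    );

    // Strip the diacritics first, then lowercase the remaining ASCII letters.
    return strtolower(strtr($text, $map));
}

$indexed = search_key('Digitální váha');   // what the index would hold
$typedA  = search_key('digitalni');        // user typed without diacritics
$typedB  = search_key('digitální');        // user typed with diacritics

// Both typed forms reduce to the same key, so both find the indexed post.
var_dump($typedA === $typedB);
var_dump(strpos($indexed, $typedA) !== false);
```

Because only the index and the search term are converted, the original post text keeps its diacritics. Posts indexed before the change would need the index rebuilt.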


What do you think, will it work?