TosaInu
08-09-2004, 10:41 AM
Hello,
Our site has about 500,000 posts. It's not the biggest site, but some posts are pages long, so it's pretty important to have an efficient searchlog (all the more since we want other hacks as well). I read about the fulltext search: in some ways it's interesting for us and in others it's not.
-It's possible to exclude some forums from the searchlog, but the SQL fulltext is all or nothing (as far as I understand it). Our board has an Off Topic forum whose content is 'volatile'. The topics shouldn't be deleted, but storing all those posts in the searchlog, where they hurt searches for real content, isn't a jolly good idea either.
A post database with an SQL fulltext search index is about as large as a post database with a searchlog. The searchlog can be made smaller though (people with access to the SQL config can probably gain there). An optimized searchlog is better for storage, and I guess it will beat the SQL fulltext in speed.
-The SQL search omits small words. With the searchlog it's easy, and for us necessary, to add some site-specific ones. I estimate we have 50 words shorter than 3 letters that are prime subjects of our site. The searchlog allows us to index them.
The searchlog lacks some options, though, that would make it the perfect solution for us. The main one is a badword list. Storing tens of thousands of records for variants of $@#!, cowstuff, horsetool, dollar and pound sterling prices of products, words merged with &tags like &34my, numbers, yes..yes, yeh, yes, yes? and dozens of their variants, hello, hallo, ciao, current, altogether, nice, mine, yours and so on is not efficient. The word mine alone has 3631 records.
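The filtering described above (keep a few site-specific short words, drop badwords, numbers, prices and tag debris) could be sketched roughly like this in Python. This is only an illustration of the idea, not vBulletin code: the names MIN_LENGTH, SHORT_WORDS_TO_KEEP, BAD_WORDS, should_index and words_to_index are made up for this example, and the word lists are placeholders.

```python
# Hypothetical settings, invented for this sketch (not vBulletin options).
MIN_LENGTH = 3                       # shorter words are normally skipped
SHORT_WORDS_TO_KEEP = {"ab", "xy"}   # placeholder site-specific short words
BAD_WORDS = {"hello", "hallo", "ciao", "yes", "yeh", "nice", "mine", "yours"}


def should_index(word: str) -> bool:
    """Return True if a word deserves a record in the search index."""
    word = word.lower()
    if word in SHORT_WORDS_TO_KEEP:
        return True                  # whitelisted short word
    if len(word) < MIN_LENGTH:
        return False                 # ordinary short word
    if word in BAD_WORDS:
        return False                 # common word that only bloats the index
    if not word.isalpha():
        return False                 # numbers, prices, '&34my', 'yes..yes'
    return True


def words_to_index(post_text: str) -> list[str]:
    """Split a post on whitespace and return the distinct indexable words."""
    tokens = (t.strip(".,;:!?()\"'") for t in post_text.split())
    kept = {t.lower() for t in tokens if t and should_index(t)}
    return sorted(kept)
```

For example, words_to_index("Hello, yes..yes my Tosa costs $25 &34my xy") keeps only "costs", "tosa" and "xy". The whitelist is checked before the length test, so the short site-specific words survive while other short words are dropped.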
A tool to delete such entries from an existing searchlog would also be great. I know it's possible to run SQL queries in, say, phpMyAdmin, but that's error-prone and time-consuming.
A PHP script that lists the wordlog and lets you select the words you want to strip would be convenient. The script would store the array of selected word IDs and delete the corresponding records from the postindex, running something like this for each one (deleteWORDID stands for a selected word ID):
DELETE FROM vb_postindex
WHERE WORDID = deleteWORDID
I lack the knowledge to create even the most basic PHP script. I guess such a tool would be of great help in optimizing the searchlog, and I would surely use it. Someone, please?
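A minimal sketch of the cleanup tool described above, written in Python against an in-memory SQLite database so that it runs on its own. It is not a vBulletin script: the table vb_word with columns wordid and title is an assumption about how the wordlog is stored, the sample rows are invented, and a real version would connect to the board's MySQL database and take the selected IDs from a form.

```python
import sqlite3


def list_words(conn):
    """Return (wordid, title, record_count) rows, most frequent words first."""
    return conn.execute(
        """SELECT w.wordid, w.title, COUNT(p.wordid) AS records
           FROM vb_word AS w
           LEFT JOIN vb_postindex AS p ON p.wordid = w.wordid
           GROUP BY w.wordid, w.title
           ORDER BY records DESC, w.title"""
    ).fetchall()


def delete_words(conn, word_ids):
    """Delete all postindex records for the selected word IDs; return the count."""
    word_ids = [int(i) for i in word_ids]   # reject anything that is not a number
    if not word_ids:
        return 0
    placeholders = ",".join("?" * len(word_ids))
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            f"DELETE FROM vb_postindex WHERE wordid IN ({placeholders})",
            word_ids,
        )
    return cur.rowcount


# Invented demo data; vb_word and its columns are assumed, not confirmed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vb_word (wordid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE vb_postindex (wordid INTEGER, postid INTEGER);
    INSERT INTO vb_word VALUES (1, 'mine'), (2, 'tosa'), (3, 'hello');
    INSERT INTO vb_postindex VALUES (1, 10), (1, 11), (1, 12), (2, 10), (3, 11);
""")
```

Calling list_words(conn) shows which words carry the most records, and delete_words(conn, [1, 3]) then removes every postindex record for the chosen IDs in a single statement. This only clears existing records: unless the same words are also kept out of future indexing, new posts would add them again.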