PDA

View Full Version : Fulltext boolean search v.2.2 for vB


JohnWoo
03-07-2004, 10:00 PM
Hello all!

Moving here from beta forum
https://vborg.vbsupport.ru/showthread.php?t=62218

This hack makes nearly same for vB3 as
[vB 2.2.x] - Mysql 4 Search hack
https://vborg.vbsupport.ru/showthread.php?t=51716
for vB2 :)

You will need MySQL server v4.0.1 or better (but sometimes it may work on 3.23.xx). After installing you will be able to search with empty native vB index (word and postindex tables) and using modifiers.
Allowed modifiers + are ,-, * and "
All modifiers except * should be used only once for one word (in the beginning and without space).
* it should be used at the end of a word.
For example:
windows unix -> will find messages containing at least one these words.
+windows +unix -> will find messages with both this words.
windows* -> will find "windows", "windowss", "windowssauce" or "windowst".
*indows will NOT find "windows"
"some words" -> will find "some words of wisdom", but will not find "some extra words".
Search phrase length limitations replaced with results number limitation.
Value of old "Search Index Maximum Word Length" used to limit number of posts in the result returned by fulltext search (control panel/Message Searching Options)
Supposed that it must run faster then native vB search :)

History:
v.2.2 [5 Apr 2004]
- search words relevance (when sort by relevance) added at last :)
but little different then native vB (it may not work when searching with * modifiers)
- attempt to fix incompatibility with other hacks
=to upgrade replace code block #5 in search.php with latest one :)

v.2.1 [4 Apr 2004]
- Excluding from search forums with "Index New Posts in Search Engine" option set to "No"

v.2.0 [30 Mar 2004]
-"Similar Threads" now must start working :) (to move from 1.x to 2 just change one more script - functions_search.php)

v.1.9 [29 Mar 2004]
-checking if $query string is not empty before running fill text sql

v.1.8 [20 Mar 2004]
- line numbers and higlight code changed for VB3 Gold
- more tests and error explanations

v.1.7 [9 Mar 2004]
- MySQL error for administrators bug fixed
checking is $not_forumid string exixts before adding it to query

v.1.6 [9 Mar 2004]
- national letters bug fixed
preg_replace("~[^\w\"\-+\* ]~i", "", $query);
was replaced by
preg_replace("~[^\w\xC0-\xFF\"\-+\* ]~i", "", $query);

v.1.5 [8 Mar 2004]
- TABLE_PREFIX bug fixed
- slightly optimised SQL requests

v.1.4 [8 Mar 2004]
- delete_post_index function turned off
- more tests and error explanations

v.1.3 [7 Mar 2004]
- less code because of using native vB $postQueryLogic and $threadQueryLogic conditions
- more tests and error explanations

v.1.2 [7 Mar 2004]
- boolean mode can be turned off in AdminCP ("Allow Search Wild Cards" setting)
- "titles only" search fixed
- limiting number of matches retunned by fulltext search AFTER applying search conditions

v.1.1 [7 Mar 2004]
- HighLight support added

gmarik
03-08-2004, 01:22 PM
This goes quick

MrNase
03-08-2004, 04:16 PM
i'll give it a try ;)

SDB
03-09-2004, 07:55 AM
Anyone have any idea how efficient this is going to be with a very large board... 3 million posts.

Does it require firstly to index all of the current posts? If so, can anyone estimate how long that is likely to take?

thanks in advance

JohnWoo
03-09-2004, 08:34 AM
when you will say
ALTER TABLE post ADD FULLTEXT (title)
it will " index all of the current posts" :)
And not sure about 3 millions, but that request takes about 370 seconds on P4 2400/1G RAM/two SATA 120G Maxtor disks/MySQL 4.0.16 and ~427.000 posts :)

Archer
03-09-2004, 08:37 AM
This part of code from block 2 strip national characters in query. E.g., russian characters not passed.


$query = preg_replace("~[^\w\"\-+\* ]~i", "", $query);


If I comment it - search works.
I'm not guru in RegExp, so can't fix it by myself. Need help.

JohnWoo
03-09-2004, 09:02 AM
On all servers that I have ever seen (since php2 :) "\w" match national characters :) But not on in latest PHP versions...
but if you want, you may say
$query = preg_replace("~[^\w\xC0-\xFF\"\-+\* ]~i", "", $query);
:)

Archer
03-09-2004, 09:31 AM
Old expression strip russian characters at all.
Second one - left only first character.

JohnWoo
03-09-2004, 10:43 AM
true...
Just tested it on 7 hosts and on 3 of them it don't work... I need to update PHP version on my servers more frequently.
And
$query = preg_replace("~[^\w\xC0-\xFF\"\-+\* ]~i", "", $query);
go to v.1.6 :)

Archer
03-09-2004, 10:51 AM
Thanks!

ImportPassion
03-09-2004, 03:16 PM
anyone have a test site to try with lots of posts?

Archer
03-09-2004, 03:22 PM
forum.wbfree.net
~270,000 posts

heynurse
03-10-2004, 05:02 AM
It looks like you have to register to search on that site? Any other sites? I doubt you want hundreds of us registering just to test out the search ;)

webrats
03-10-2004, 05:18 AM
ive got 40000 post not much
forums.webrats.com
i thinking of installing this

JohnWoo
03-10-2004, 07:18 AM
heynurse :)
Go to AdminCP and select

Usergroup Manager
[Edit] Unregistered / Not Logged In
check yes for
- Can View Forum
- Can View Others' Threads
- Can Search Forum
after it go to Forum Permissions and select what forums Unregistered users will be able to see, read and search :)

msimplay
03-12-2004, 10:18 AM
[Note for #3]
you may leave functions_databuild.php file without changes, but comment all build_post_index and delete_post_index
functions calls in all scripts

build_post_index line numbers around:

editpost.php
420

postings.php
701
773
901
1042
1246
1247
1511
1515

admincp/misc.php
144

includes/functions_newpost.php
402


delete_post_index line numbers around:

editpost.php
406

postings.php
699
900
1041
1244
1245
1510
1514

includes/functions_databuild.php
230
443
723

does the above in the instructions mean
either edit the databuild.php or do the editing in the files in [note 3] ?

JohnWoo
03-12-2004, 10:24 AM
sorry for my English :)
it means that you may edit functions_databuild.php or comment all calls of that functions in all scripts or don't touch #3 (but then your unused word and postsindex will grow and it will take server processor time:)

msimplay
03-12-2004, 10:31 AM
sorry for my English :)
it means that you may edit functions_databuild.php or comment all calls of that functions in all scripts or don't touch #3 (but then your unused word and postsindex will grow and it will take server processor time:) thank you
so basicly if i edit databuild my search index will not grow right ?
also what does this querie do
FLUSH TABLE post

as i have no permissions to execute it
but the hack seems to work without it
also does this have an affect on similar threads ?

JohnWoo
03-12-2004, 10:53 AM
yes - after editing databuild, index will not grow :)
and FLUSH just clear mysql cache for post table. On some servers
ALTER TABLE post ADD FULLTEXT (title)
runs more then 10 minutes and die with error without FLUSH befor? and finish fine in 3 minutes if FLUSH was executed before. Can't explain why it happen, but it happen sometimes :)

Natch
03-12-2004, 11:32 AM
Thansk for this hack John - it's gonna help no end I think :)

Great job!

* Natch installs

I also don't have the permissions to run the FLUSH query, but as you say, unless the board is HUGE, and the post table is way way bloated, the FLUSH query is not a *required* query to run for functionality ...

Nitesh
03-12-2004, 11:34 AM
Thanks for the hack :)

I put this hack on my forums which has 1.1 million posts... the search works ok but unfortunately the server load has gone up now.
Current Server Load Averages: 2.75 2.48 2.45 | 161 Users Online (119 members and 42 guests)
It used to be around 1 before this hack was installed. Have you got any ideas on this?

JohnWoo
03-12-2004, 01:42 PM
hmm...
Sorry Nitesh, but I had no chance to test it with more then 400.00 posts... And just now have no idea how to fight with it...
do you remember your server load numbers before instslling?

Nitesh
03-12-2004, 02:06 PM
Its aright John :)

It seems my server load is fluctuating now...
Server Load Averages: 1.10 1.40 2.16 | 174 Users Online (135 members and 39 guests).
I?m on a dedicated server so there aren?t any other scripts... I think it maybe just being odd loll... I'll keep an eye on it and keep you informed.

I couldn't get simular threads to work with this either, is that included with this hack?

Anyway thank you for the hard work you've put on this script, saved me about 20 hours of search index rebuilding!

JohnWoo
03-12-2004, 04:28 PM
:)
But I still thinking at "simular threads" and have no good (and not heavy for server) solution for it. But feel that it is somewhere near :)

Erwin
03-12-2004, 10:14 PM
I have almost 2 million posts. Anyone installed this hack on a larger forum? I don't want to have to reindex my search table if this doesn't work. :)

ImportPassion
03-13-2004, 02:05 AM
aww...c'mon, be a sport Erwin. Erwin! Erwin!

JohnWoo
03-13-2004, 06:22 AM
besides why not to change scripts without touching functions_databuild.php and current words and posts index? :) after it you will be able to return back to standard without reindexing :)

Erwin
03-13-2004, 12:18 PM
besides why not to change scripts without touching functions_databuild.php and current words and posts index? :) after it you will be able to return back to standard without reindexing :)
That's true. :) I may try this when I find time.

mtha
03-23-2004, 09:40 PM
2. [That HighLight part can be buggy and need more testing.... Be carefull.]
you may skip this step and loose find words hightlighing

yes it is buggy :D

I tried on my board, with UTF-8 text, and the highlight is broken (when highlighting utf-8 chars)

:D I may wait for a solution for it, before puting this cool hack in use.

JohnWoo
03-24-2004, 05:19 AM
Sorry, but just now I have no idea how to test it with UTF-8 :)
Will think, but can't promice someting...

Dontom
03-24-2004, 06:27 PM
Hello, i tried this hack but somehow it showed far less results than the normal search (12 <-> 1000).
Currently our forums is on rc2 and I replaced search.php with vb3 gold hacked search - which might be a reason.
Tom

JohnWoo
03-24-2004, 06:37 PM
you tried to change number in "Search Index Maximum Word Length" setting in adminCP?
Sorry - asking becase it looks like the most possible reason :)
And if it will continue returning too few results after setting large number there, you may try to drop fulltext index and create it again with repair and optimise requests before and after creation.

Dontom
03-25-2004, 09:45 AM
I don't understand this ;)
I think the Search Index Maximum Word Length in admincp has nothing to do with mysqls fulltext index (this is limited only by entries in my.cnf regarding the fulltext index) - the maximum word length in vbs acp only limits the entries of the word table...

Recreating the mysqls fulltext index might be an idea - i have to shut down our forums tonight and will try this.
Thank you
Tom

JohnWoo
03-25-2004, 10:53 AM
I don't understand this ;)
I think the Search Index Maximum Word Length in admincp has nothing to do with mysqls fulltext index (this is limited only by entries in my.cnf regarding the fulltext index) - the maximum word length in vbs acp only limits the entries of the word table...
Tom

yes :) but see #2 in extra changes part

2. Value of old "Search Index Maximum Word Length" used to limit number of posts in the result returned by fulltext search

I used value of that setting just to free database from searces that may return too much results - sorry :)

Dontom
03-25-2004, 11:16 AM
JohnWoo, thank you!
LOL, didn't see this ;)

will try it again...

JonP
03-25-2004, 09:39 PM
In the process of installing, 1.3 million posts.

alter tables are taking ages ... :(

Will keep you updated.

JonP
03-25-2004, 10:34 PM
Still altering... sigh.

Dontom
03-26-2004, 03:02 AM
Ok, this hack seems to work on our 1.1mio posts board. :up: :up:

I will be able to say something about load after sunday evening, which is one of
our most freuqented days
Thank you for this hack!
Tom

JonP
03-26-2004, 04:03 AM
Had to reindex, still indexing, forum's been down most the day ...

Dontom
03-26-2004, 04:10 AM
This took iirc ~ less than half an hour.. you know that you do not have to reindex using vbulletin? Mysql will create this automatically for you as soon as you enter the sql commands as instructed...
Tom

JohnWoo
03-26-2004, 05:11 AM
yes... if it take more then 30-40 minutes, it is better to drop fulltext index, do flush, repair table, optimise table and after it try to create index again.
PS on one my friend's forum with ~1.6mln posts it took 28 minutes to create index :)

JonP
03-26-2004, 12:24 PM
Took me 9 hours and 30 minutes to index.

Now I can't post or delete threads :(

JonP
03-26-2004, 12:31 PM
Btw, was reindexing through a mysql shell prompt not vbulletin.

JohnWoo
03-26-2004, 01:43 PM
then what error message you have when you "can't post or delete threads"?
and i still think that 9 hours is not normal...
and still recommend to try again

ALTER TABLE post DROP INDEX pagetext
after it
ALTER TABLE post DROP INDEX title
after it
FLUSH TABLE post
after it
REPAIR TABLE post
after it
ALTER TABLE post ADD FULLTEXT (title)
after it
ALTER TABLE post ADD FULLTEXT (pagetext)
and after it
OPTIMIZE TABLE post

Nearly sure that it must take less then hour summary :)

JonP
03-26-2004, 03:09 PM
Running a dedicated server only hosting the forum, P4 2.4ghz 1.5 gigs RAM, 1.3 million posts.

Right now I'm getting ready to revert back to my backed up database.

Takes forever to try and drop indexes, and/or repair table.

msimplay
03-27-2004, 02:30 PM
hmm question whats in this hack that makes getting unanswered posts impossible

ie this hack should be simple but each time it returns 0 results

https://vborg.vbsupport.ru/showthread.php?p=491095#post491095

rikman
03-27-2004, 07:02 PM
Took me 9 hours and 30 minutes to index.

Did you shut down your board while altering the tables? If there are queries waiting for access to the post table while it is locked, your machine load will grow high and higher. This has an affect on ALL tasks running on your machine, incl. creating the index columns at your post table.

rikman

Erwin
03-29-2004, 09:34 AM
Any other large forums tried this yet? :)

msimplay
03-29-2004, 10:07 AM
Any other large forums tried this yet? :) http://www.wass-up.com/forum/

has it installed i helped with the installation of it and we've not had any problems with it but there is one sacrifice which is similar threads
they don't seem to work with this hack

Boofo
03-29-2004, 10:25 AM
The "Unanswered Threads" hack doesn't work with it, either, I am told.

msimplay
03-29-2004, 10:47 AM
The "Unanswered Threads" hack doesn't work with it, either, I am told. lol yeh thats coz i tested it for ya :p

Boofo
03-29-2004, 11:29 AM
And a good job you did. ;)

JohnWoo
03-29-2004, 03:57 PM
it must start working after fix in latest v.1.9 :)

msimplay
03-29-2004, 04:17 PM
any chance of upgrade instructions ?

JohnWoo
03-29-2004, 04:31 PM
:)

1. after

// code block 5 added by [FullText Search hack] (getting list of posts with match against search phrase)

add

if (strlen($query) > 0) {

2. before

// end of code block 5 added by [FullText Search hack]

add

}


it is all :)

msimplay
03-29-2004, 05:14 PM
thanks i didnt fancy rehacking the whole thing
does this now allow unanswered topics mod and similar threads ?

JohnWoo
03-29-2004, 05:51 PM
unanswered topics mod - yes
similar threads - not, but thinking and think that it is near :)

msimplay
03-29-2004, 07:20 PM
unanswered topics mod - yes
similar threads - not, but thinking and think that it is near :)
good work :D

JohnWoo
03-29-2004, 09:25 PM
ok :) Think that "Similar Threads" works with v.2.0 :)

msimplay
03-29-2004, 10:34 PM
ok :) Think that "Similar Threads" works with v.2.0 :)yayyyyyy it works
but u made a tiny mistake

instead of searching for

function fetch_similar_threads($threadtitle, $threadid = 0) {

you should search for and replace

function fetch_similar_threads($threadtitle, $threadid = 0)

on step 4 of the install otherwise you get a parse error

JohnWoo
03-30-2004, 06:34 AM
true... Uploading fixed txt

JonP
03-31-2004, 04:49 AM
Did you shut down your board while altering the tables? If there are queries waiting for access to the post table while it is locked, your machine load will grow high and higher. This has an affect on ALL tasks running on your machine, incl. creating the index columns at your post table.

rikman

Yep, I completely shut down the forum, and the forum is the only thing on the dedicated server.

Erwin
04-01-2004, 10:21 AM
Ooh... this is so tempting...

sabret00the
04-01-2004, 10:26 AM
what does it do exactly and what's the advantage of having this installed?

msimplay
04-01-2004, 10:37 AM
Ooh... this is so tempting... well it seems to be working perfect now
i have it installed with similar threads and unanswered posts
u can check at my website
what does it do exactly and what's the advantage of having this installed? it removes the need for a search index hence making your database smaller and faster
extremely good for large sites
it will be standard in vb3.1 but don't know how long that will take
and large sites need it now

check this link for a better description of its capabilities
https://vborg.vbsupport.ru/showthread.php?t=51716&highlight=mysql

Boofo
04-01-2004, 11:16 AM
Ooh... this is so tempting...

If you install this, Erwin, please let me know what you think of it. If you're brave enough, I may have a go at it. ;)

sabret00the
04-01-2004, 11:54 AM
well it seems to be working perfect now
i have it installed with similar threads and unanswered posts
u can check at my website
it removes the need for a search index hence making your database smaller and faster
extremely good for large sites
it will be standard in vb3.1 but don't know how long that will take
and large sites need it now

check this link for a better description of its capabilities
https://vborg.vbsupport.ru/showthread.php?t=51716&highlight=mysql thanks for explaining that, it's fairly exciting after reading that thread :)

Natch
04-01-2004, 02:22 PM
I am running it - small board, but definite performance improvement ...

Plus I just like the fulltext searching bonus!

and I'm a sucker for a hack not many have installed ...

msimplay
04-03-2004, 08:30 AM
ok one problem theres an option when u make a new forum where it says do not index posts
traditionally this stops people from searching those forums
but with this hack to be able to do that
u have to be able to exclude forums from fulltext searching with the same setting

reason is because i have the unanswered post hack installed and also the news syndication hack installed

which means that all posts in the news area will always be unanswered
so i do not want them to be searchable

Dontom
04-03-2004, 04:26 PM
hey sorry for bugging you :p
i have found a new bug
when you search this thread it returns results from other threads
Same here, too.
Tom

dslteam
04-03-2004, 07:36 PM
Hello!

Great Hack! We use vb3 gold with about 320.000 posts and I have just installed it. On our Dual AMD MP 2.5 Ghz - the installation (with these DB queries) took about 23 seconds.

The search performance is indeed much better now.

Thanks!

JohnWoo
04-03-2004, 09:45 PM
Yes.. You are right msimplay. I missed that thing.
Uploading updated zip and to upgrage just change in search.php

//fast list of forums visible and searchable for user
foreach ($bbuserinfo['forumpermissions'] AS $forumid => $fperms) {
if (!($fperms & CANVIEW) OR !($fperms & CANSEARCH) OR !verify_forum_password($forumid, $forum['password'], false)) {
$not_forumid .= $forumid." ";
}
}
to
//fast list of forums not visible or searchable for user
foreach ($bbuserinfo['forumpermissions'] AS $forumid => $fperms) {
if (!($fperms & CANVIEW) OR !($fperms & CANSEARCH) OR !verify_forum_password($forumid, $forum['password'], false) OR !($forumcache[$forumid]["options"] & $_FORUMOPTIONS["indexposts"])) {
$not_forumid .= $forumid." ";
}
}

msimplay
04-03-2004, 09:57 PM
Yes.. You are right msimplay. I missed that thing.
Uploading updated zip and to upgrage just change in search.php

//fast list of forums visible and searchable for user
foreach ($bbuserinfo['forumpermissions'] AS $forumid => $fperms) {
if (!($fperms & CANVIEW) OR !($fperms & CANSEARCH) OR !verify_forum_password($forumid, $forum['password'], false)) {
$not_forumid .= $forumid." ";
}
}
to
//fast list of forums not visible or searchable for user
foreach ($bbuserinfo['forumpermissions'] AS $forumid => $fperms) {
if (!($fperms & CANVIEW) OR !($fperms & CANSEARCH) OR !verify_forum_password($forumid, $forum['password'], false) OR !($forumcache[$forumid]["options"] & $_FORUMOPTIONS["indexposts"])) {
$not_forumid .= $forumid." ";
}
} that doesn't seem to work i can still search those forums
i tested with a another account thinking maybe it only applies to not admins
but no that doesn't seem to work

JohnWoo
04-03-2004, 10:13 PM
hmm..
You are able to search in that forums typing "Key Words" in field on search page or you can see that forums after clicking some link (like unanswered post hack have)?

msimplay
04-03-2004, 10:15 PM
i have a staff forum which has a post called reported in there
i used the nav drop down
and typed in report*

but it seemed to work on showthread
but again not with unanswered posts

obviosly normal registered members can't see it due to permissions
coz thats a staff forum

but what about forums that are not protected by permissions and are still viewable but just don't want them to be searchable
for example a test forum where people test signatures avatars bbcode etc

just tested again your hack works on
showthread
advanced search
forumdisplay

not navbar
not unanswered

JohnWoo
04-03-2004, 10:28 PM
Think that I need to think a little :)
But can you please try to replace 2 strings

unset ($postQueryLogic);
$postQueryLogic = array('post.postid IN(' . $postids . ')');

with one

$postQueryLogic[] ='post.postid IN(' . $postids . ')';

and say how it will go?

PS using unset because I still trying to give less job to database - sorry :)

msimplay
04-03-2004, 10:37 PM
hey sorry for bugging you :p
i have found a new bug
when you search this thread it returns results from other threads

msimplay
04-03-2004, 10:59 PM
Think that I need to think a little :)
But can you please try to replace 2 strings

unset ($postQueryLogic);
$postQueryLogic = array('post.postid IN(' . $postids . ')');

with one

$postQueryLogic[] ='post.postid IN(' . $postids . ')';

and say how it will go?

PS using unset because I still trying to give less job to database - sorry :)
this doesnt seem to work :disappointed:

JohnWoo
04-04-2004, 07:07 PM
and how it is now? after v.2.2 changes ?

jb605
04-05-2004, 02:43 AM
I just installed this hack on vB 3.0.0 gold version. I am running a chinese website. Chinese is multibyte language, and the words are not delimited by space. I thougt by using fulltext search, the searched keyword is matched against the fulltext index directly, so there is no need to build word index on post. Am I correct?

But now I got no search result at all. Can anybody give any hints?

Thanks a lot.

JohnWoo
04-05-2004, 07:50 AM
sorry, but I can't test it with multibyte language...
But problem may be in too strict input filtering.
Please try to replace
$query = trim(preg_replace("~[\.,:;@=']~", " ", $query));
$query = preg_replace("~[^\w\xC0-\xFF\"\-+\* ]~i", "", trim($query));
with
$query = preg_replace("~['<>]~i", "", trim($query));

but i still don't think that it will help...

jb605
04-05-2004, 06:07 PM
No, it didn't work either. It does not work even if I remove that part completely.

I think the issue lies between fulltext itself and my language character. I tried to do an search using

SELECT * FROM post WHERE MATCH(title, pagetext) AGAINST('汉字')

where inside the AGAINST is some chinese character, it returns nothing. Although I know that there are posts with these words. Is fulltext doing a partial match be default?

Thanks for your help.

motorhaven
04-08-2004, 01:10 PM
Installed two days ago on a 1.5 million post VB. Works like a charm.

On our database server (dual AMD 2400MP system, 3 gigabytes RAM) we're seeing a load of about .25 during peak now (12.5% load). Used to be about .50. I was more concerned with overall response time, not load, and this has helped with both.

Our front end system (dual AMD 2800MP system, 4 gigabytes RAM) still has a crap load, 1.50 - 2.50 and that's the one regret I have with moving from VB 2.x to VB 3.x. Our front end server used to run about a .75 to 1.5 load. :(

Trigunflame
04-10-2004, 12:26 PM
I have 8+ years in system administration, and could try to help you lower that frontend system load if you wanted, with dual amd 2800, your load should be no where near that high.

motorhaven
04-10-2004, 08:19 PM
I've been programming since 1979, sys admin since the early 90s. This is not a case of lack of experience, the system really is loaded down with VBulletin. We're quickly approaching a terabyte of bandwidth per month from the front end server. 10s of millions of page views each month, 1000-1500 users at once in the forums (except this week has been slower due to the Easter/Spring Break holiday).

Vbulletin eats up an enormous number of CPU cycles compared to the prior version. For instance, instead of having the language in the templates, language is now parsed in, adding overhead to every template built.

In the last 24 hours, I've done a few things to reduce load not directly associated with VBulletin. For one, I'm running mod_gzip and it looks like image.php and attachment.php where being compressed (avatar.php was left over from the prior httpd.conf config and is now removed). Not a huge amount of CPU overhead reduced by not compressing these, but every bit helps.

Additionally, at any given time we will have between 20-100 Google, Yahoo and Inktomi spiders crawling the site. That eats up a lot because they fetch more pages per minute than the average real user. I certainly don't want to turn them away, search engine spidering is crucial to success on the web these days....

kmike
04-11-2004, 06:57 AM
We are confident that this hack causes some unnecessary mysql tables locking, raising server load to the roof. It was not so with vb2 version of fulltext search hack. Could it be that LEFT JOIN with thread table in the main SELECT causes that locking? VB2 version had a simple


SELECT postid FROM post WHERE MATCH(pagetext) AGAINST ('+word' IN BOOLEAN MODE)
ORDER BY dateline DESC


without any check for forums id's or permissions, and then applied permissions and forums conditionals at the search results display time.

JohnWoo
04-11-2004, 07:42 AM
LEFT JOIN don't only add server load :) (besides it don't add too much - you can try to remove that lLEFT JOIN and look yourself at timings with debug=1)
If you have large forum with many different users and user groups and with many forums invisible for some of that usergroups or users, that LEFT JOIN make fulltext search run faster (and sometimes much faster) because some (or many) posts will be excluded from MATCH => AGAINST (and that part of request takes the most part or server load)
Second reson is limiting mumber or results returned by search. vb2 version of fulltext search first do lookup in ALL posts with limiting number of matches and after it cut (from that already limited number) more posts looking at user permissions. So, I don't think that vb2 fulltext search works as it must work - sorry :)

JohnWoo
04-11-2004, 08:29 AM
and if you start talking about server load, here I have one thing that really takes too much processor time.. My pain in that hack is relevance when seaching with modifiers (IN BOOLEAN MODE).. If you feel that you can live without relevance when seaching with modifiers, you may try to replace string starting from $fulltext_sql with something like it:
$isboolean="'";
$limit_final = "";
if ($vboptions['allowwildcards'] ==1 && preg_match("~[\"\-\*+]~i", $query)) {
$isboolean="' IN BOOLEAN MODE";
}
if (strlen($limit_conditions) > 0) {
$limit_final = "WHERE 1".$limit_conditions;
}
$fulltext_sql = "SELECT postid,MATCH (".$what_field.") AGAINST ('".$query.$isboolean.") as relevance FROM ".TABLE_PREFIX."post AS post LEFT JOIN ".TABLE_PREFIX."thread AS thread ON post.threadid=thread.threadid ".$limit_final." HAVING relevance>0 ORDER BY relevance DESC LIMIT 0, ".$vboptions['maxresults'];

And it will save up to 40% server time on some combination of search words :)

kmike
04-11-2004, 08:31 AM
Yes, forum is quite large, about 2,000,000 posts and over 1000 users online in the daytime.

I'll try to dig further, but fact stands as it is: only yesterday on the same hardware with vb2 fulltext search server load was around 2-3 in daytime and less than 1.0 in night time, whereas today it sometimes spikes to 30.0, and I'm seeing hundreds of locked mysql threads besides running search query.

Not acceptable - I had to disable search completely for now (BTW, global "enablesearches" setting in admin CP is currently broken).

Erwin
04-11-2004, 11:47 AM
Yes, forum is quite large, about 2,000,000 posts and over 1000 users online in the daytime.

I'll try to dig further, but fact stands as it is: only yesterday on the same hardware with vb2 fulltext search server load was around 2-3 in daytime and less than 1.0 in night time, whereas today it sometimes spikes to 30.0, and I'm seeing hundreds of locked mysql threads besides running search query.

Not acceptable - I had to disable search completely for now (BTW, global "enablesearches" setting in admin CP is currently broken).
Mmm... I have exactly the same forums, with 2.2 mil posts and up to 1200 users online at once. I was just about to install this hack, but your tale has given me cause to hesitate. Let me know how it goes.

motorhaven
04-11-2004, 01:26 PM
My database server has had no increase in load with the hack installed. 1.5 mil posts and 1000-1500 users online. My front end server is my concern, its been a dog since the day I upgraded from VB2 to VB3.

msimplay
04-11-2004, 01:40 PM
My database server load has had no increase in load with the hack installed. 1.5 mil posts and 1000-1500 users online. My front end server is my concern, its been a dog since the day I upgraded from VB2 to VB3.
hmm i thort i was the only one that noticed vb3 seems more server intensive then 2 :rolleyes:

ImportPassion
04-11-2004, 04:57 PM
motorhaven: you think ur is bad, i run always around 5 prolly and get up to 20 alot. I don't have the $$ to get a sep db server or http server. just running dual xeon 2.8 2gb ram. still on vb2 tho with too many hacks.

kmike
04-11-2004, 06:44 PM
Database server was not issue with vb2 and is not issue with vb3 - db server load rarely reaches 2.0, though we didn't see whole 1300+ users online yet with vb3. Will see how it goes on Monday.
When I wrote about server load 30, I meant frontend server - http processes were stacking up on locked mysql queries, causing peaks of load.

With search disabled, frontend server is humming nicely with average load around 1-2. Maybe a little higher than with vb2 at the same day/time, but acceptable.

motorhaven
04-11-2004, 06:57 PM
Our front end server had no increase in load since adding fulltext. Its just as crappy after fulltext as it was before! Here's a portion of my.cnf:

max_connections = 200
key_buffer_size = 496M
# myisam_sort_buffer_size is used when repairing tables only.
#myisam_sort_buffer_size = 48M
join_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 2M;
sort_buffer_size = 4M
table_cache = 1536
thread_cache_size = 250
wait_timeout = 3000
connect_timeout = 60
max_allowed_packet = 8M
max_connect_errors = 10
query_cache_limit = 2M
query_cache_size = 24M
query_cache_type = 1
thread_concurrency = 8
# Full text search fine tuning
ft_min_word_len=3
ft_max_word_len=25
ft_stopword_file=/home/database/mysql/stopwords
flush_time=86400

3 gigs of RAM on the server.... every bit of the database is either in MySQL's buffers or the Linux disk cache. :)

ft_max_word_len=25 really helped to decrease the size of the post index file. It specifies the maximum word length to index. There aren't very many useful words longer than 25!

Erwin
04-11-2004, 11:23 PM
motorhaven: you think ur is bad, i run always around 5 prolly and get up to 20 alot. I don't have the $$ to get a sep db server or http server. just running dual xeon 2.8 2gb ram. still on vb2 tho with too many hacks.
Upgrade to 4 Gb of RAM - that would help a lot.

kmike
04-12-2004, 07:26 PM
Upgrade to 4 Gb of RAM - that would help a lot.
I guess it is offtopic, but irony aside, our 2G RAM frontend server really could use some RAM upgrade. It started swapping after upgrade to vb3. It had a measly 10-15Mb swap size allocated with vb2, but now it is nearing 500Mb.

motorhaven
04-14-2004, 04:20 AM
Here's an idea for those with disk space (and index RAM to spare)...

What about 2 post tables? One without the fulltext index, and the other with it.

Searches would be pulled from the secondary post table with the fulltext index which would eliminate locking issues with the primary post table. Of course this would require every insert, update and delete to the post table to be duplicated to the secondary table.

I'm beginning to suspect the post table lock mentioned may, as noted previously, cause lock issues causing our front end server to sit in an I/O wait for the data. This would explain why the front end server sees a large load and the database doesn't.

I increased the key buffer size on the database server from 500 meg to 1 gigabyte and the query cache to 48 megabytes and saw no real difference on the database server but did see a load decrease on the front end server!

Just a thought.....

kmike
04-14-2004, 08:10 AM
Having 2 post tables won't help if they're updated simultaneously - locking just shifts from first to second post table. Now, if new posts/edits are queued and dropped into second table in batches (via INSERT DELAYED or UPDATE LOW PRIORITY), it would make more sense and indeed will eliminate most of locking.

kmike
04-14-2004, 02:35 PM
If you have large forum with many different users and user groups and with many forums invisible for some of that usergroups or users, that LEFT JOIN make fulltext search run faster (and sometimes much faster) because some (or many) posts will be excluded from MATCH => AGAINST (and that part of request takes the most part or server load)

Not so.
Typical limiting condition consists of something like:
thread.forumid [NOT] IN(25,197,68,159,193,191,120)
sometimes with posts.userid IN (100,200,300) added.

Now, if you run "EXPLAIN" on resulting queries, you'll see that first index used by mysql is FULLTEXT index on pagetext, and only then threadid index is applied using "where". It means ALL posts are being scanned first using FULLTEXT. The only search type where that LEFT JOIN really limits number of posts to search is a search within thread - EXPLAIN shows that first used index is threadid, and only then fulltext index on pagetext is used.
But... I suspect mysql has a bug here, though - actual search time is exactly the same as without "AND thread.threadid=nnn" condition in WHERE clause, which suggests that fulltext index is used here first anyway.


Second reson is limiting mumber or results returned by search. vb2 version of fulltext search first do lookup in ALL posts with limiting number of matches and after it cut (from that already limited number) more posts looking at user permissions. So, I don't think that vb2 fulltext search works as it must work - sorry :)

Fulltext hack for VB2 always used boolean mode search, so number of results was much much lower.
Also, now that vb3 has a logic for caching search results, querying all posts and limiting results afterwards depending on user's permission or search preferences suddenly appears quite logical, isn't it? Chances are that another user will run search with the same query, in that case we'll just pick up saved post ids and apply query logic filter to them.

kmike
04-15-2004, 08:50 AM
Thinking about two post tables idea, it could go really well along with mysql replication.
Idea: set up second mysql server (even on the same physical machine), set it up to replicate post table from main forums db. All updates to that second mysql server will be handled by only one slave thread, and it could be locked freely without any effect on first mysql server perfomance.
Since JohnWoo's hack uses two tables in the main query, it appears that thread table need to be replicated along with post table.
Changes to current hack code would be really small: initialize another db connection and fire query to a second db instead of main one.

I don't plan to implement this any time soon, though accidentally we have workng replication of our forum db to another server. I'm polishing a vb2-like hack of this (JohnWoo's) hack :) . It eliminates LEFT JOIN and always searches IN BOOLEAN MODE. Had to drop search by relevance though, I highly doubt someone used it even once though.

cerebro
04-16-2004, 08:23 AM
i will try..and see what happen. Allway can come back .

cerebro
04-16-2004, 09:43 AM
i dont have to wait to much...

and work Fine whit 3.0.1

mysql> ALTER TABLE post ADD FULLTEXT (title);
Query OK, 1762278 rows affected (4 min 2.67 sec)
Records: 1762278 Duplicates: 0 Warnings: 0

mysql> ALTER TABLE post ADD FULLTEXT (pagetext);
Query OK, 1762278 rows affected (14 min 59.70 sec)
Records: 1762278 Duplicates: 0 Warnings: 0

mysql> OPTIMIZE TABLE post;
+----------+----------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+----------+----------+----------+----------+
| vb3.post | optimize | status | OK |
+----------+----------+----------+----------+
1 row in set (1 min 21.15 sec)

cerebro
04-18-2004, 09:51 PM
mmm same problem whit server load...
Treads: 112,110, Post: 1,745,876, Members: 29,111

First day work fine whit 400 users online, second day server go to 8...11.11!!!!!!

i have to uninstall...some fix?

cuerty
04-22-2004, 03:06 AM
It's great, mysql load average goes as down as it can ;-) Thanks.

Erwin
05-01-2004, 06:27 AM
I've installed this!

It's working well... I'll look over the server loads over the next few days and report on it.

Overall, took me only 3 hours... most of the time it was trying to find out why I couldn't add a fulltext index to post, to find out that my secured tmp directory was too small, so I had to modify my.cnf to change the tmpdir temporarily. :)

JohnWoo
05-01-2004, 10:00 AM
Sorry for disappearing :)
Returning to discussion about SQL requests with or without LEFT JOIN :)
I recently had a chance to do a lot of tests on one large programming forum with about 900.000 posts. And yes - it is true that excluding some forums from searches do not make search run faster... But including left join don't make it perceptible slower :)
But after removing revelance it runs up to 10 times faster!
Here are results as requests and time in seconds below. Each request was executed 10 times with clearing DB cache after each. Fastest and slowest time below. Forums were not closed at that moment and there were about 120 online users. Think that numbers too different because of it :)

============
SELECT postid
FROM post AS post
LEFT JOIN thread AS thread ON post.threadid = thread.threadid
WHERE MATCH (
pagetext
)
AGAINST (
'+user +acces'
IN BOOLEAN
MODE
)
LIMIT 0 , 200

0.0277 - 0.3782 s

============

excluding security forums

============
SELECT postid
FROM post AS post
LEFT JOIN thread AS thread ON post.threadid = thread.threadid
WHERE MATCH (
pagetext
)
AGAINST (
'+user +acces'
IN BOOLEAN
MODE
) AND thread.forumid NOT IN (57, 64)
LIMIT 0 , 200

0.0197 - 0.2272 s
============

============
SELECT postid
FROM post AS post
WHERE MATCH (
pagetext
)
AGAINST (
'+user +acces*'
IN BOOLEAN
MODE
)
LIMIT 0 , 200

0.0144 - 0.1043 s
============

============
SELECT postid,
MATCH (
pagetext
)
AGAINST (
'user acces'
) AS relevance
FROM post AS post
LEFT JOIN thread AS thread ON post.threadid = thread.threadid
WHERE MATCH (
pagetext
)
AGAINST (
'+user +acces*'
IN BOOLEAN
MODE
) AND thread.forumid NOT
IN ( 57, 64 )
LIMIT 0 , 200

0.6938 - 1.3414 s
============

============
SELECT postid,
MATCH (
pagetext
)
AGAINST (
'user acces'
) AS relevance
FROM post
WHERE MATCH (
pagetext
)
AGAINST (
'+user +acces*'
IN BOOLEAN
MODE
)
LIMIT 0 , 200

0.5509 - 1.4414 s
============

Erwin
05-01-2004, 10:56 AM
Well, my current server loads seem a bit lower than before I installed the hack. But it's not peak time yet. Search appears to be noticeably faster.

John, just tell me which code to use that is the fastest and puts the least amount of stress on the server, and I'll use it. :) Do I just remove the RELEVANCE part in search.php?

JohnWoo
05-01-2004, 11:40 AM
yes :) to remove revelance replace

//fulltext search query
$fulltext_sql = "SELECT postid,MATCH (".$what_field.") AGAINST ('".$norm_query."') as relevance FROM ".TABLE_PREFIX."post AS post LEFT JOIN ".TABLE_PREFIX."thread AS thread ON post.threadid=thread.threadid WHERE MATCH (".$what_field.") AGAINST ('$query'".iif($vboptions['allowwildcards'] ==1 && preg_match("~[\"\-\*+]~i", $query), ' IN BOOLEAN MODE', '').")".$limit_conditions." LIMIT 0, ".$vboptions['maxresults'];

with

//fulltext search query
$isboolean="'";
if ($vboptions['allowwildcards'] ==1 && preg_match("~[\"\-\*+]~i", $query)) {
$isboolean="' IN BOOLEAN MODE";
}
$fulltext_sql = "SELECT postid FROM ".TABLE_PREFIX."post AS post LEFT JOIN ".TABLE_PREFIX."thread AS thread ON post.threadid=thread.threadid WHERE MATCH (".$what_field.") AGAINST ('".$query.$isboolean.") ".$limit_conditions." LIMIT 0, ".$vboptions['maxresults'];
and remove string
$postscores[$thispost["postid"]] = $thispost["relevance"] * $vboptions['multimatchscore'];
:)

cerebro
05-01-2004, 04:22 PM
Thanks again JohnWoo...just modify that in my file....in a few days will see :D

Erwin
05-02-2004, 12:10 AM
Upgraded. Thanks a lot! Searches are definitely FASTER than before.

I'd vote for this hack to be Hack of the Month, no problems. :) It's something that vBulletin should really have as default.

motorhaven
05-02-2004, 05:45 AM
I've put full text search back in. Works fine. Turns out a lot of the performance issues on the site were crawlers from China/Korea. I've completely blocked both countries from the server and load dropped in half during prime time. Because my site is about trucks, virtually no one from those countries is a legit visitor, and we don't ship to them so its not a loss if a handful can't visit. Until I can come up with a better long term solution blocking is the best route (these crawlers don't obey robots.txt).

Also, I'm in the stages of finishing out a "final" solution (for now) to performance issues related to search and the post table. Basically, every time a post is added, edited, moved or deleted, that post information is changed in a secondary table (postsearch). Searches are conducted in this table, all other activity is in the primary table. Off course, this means two tables are updated/locked every time there is a post/edit/move/delete but the number of these per day is minor compared to the number of table locks during load. Initial tests look very good, just have to make sure there are no bugs. :)

kmike
05-05-2004, 05:49 AM
We're running a cross-breed between this hack and fulltext search hack for vb2 (boolean queries only, no LEFT JOIN on thread table) for more than 2 weeks now. Runs smoothly and without any problems so far.

I think the total failure of this hack on our board is due to the fact that we have about 350 new threads per day, most of them happen in daytime. And mysql slow query log shows that some search queries take up to 30-40 seconds to complete. According to stats, there're about 15 of such slow queries per hour, likewise, with daytime peak. Couple that with LEFT JOIN on thread table, and you'll get quite a high chance of locked mysql process for new thread creation for every such search. What's worse, I think select queries will be locked as well as they wait in line for INSERT process to complete (captured mysql process list during lockup proves this).

I'm not sure why fulltext searches do not lock post table as badly - new posts should trigger post table locks much more often than thread table gets locked with new threads. Maybe there's some kind of row-level or page-level locking kicking in here?

motorhaven:
I see the benefit of your solution in that only search queries will be locked, instead of hundreds of common SELECT queries waiting for INSERT/UPDATE to complete. Like I said before, you can improve your hack by deferring new posts or updates to second table and executing them in batch, probably using cron job like the one which is updating deferred thread views in official vb3.

Erwin
05-06-2004, 10:03 AM
kmike, care to share your no LEFT JOIN version?

My slow_queries log shows that LEFT JOIN is a main culprit of queries >100 seconds.

kmike
05-07-2004, 05:00 AM
Erwin: well, I have to tidy up and clean the code from some custom tweaks first. Also, my hack changes search template a bit, like vb2 version of this hack did - there're 4 search fields instead of just one:
Search for items including these words:
and excluding these words:
and optionally including these words:
search for exact phrase:

That said, are you sure your search queries would run faster without LEFT JOIN? From my tests, that LEFT JOIN doesn't add substantial processing time, it only causes thread table locking which in turn causes peaks of load on frontend server.

Erwin
05-07-2004, 01:49 PM
I have only 2 queries showing up in my mysql slow_queries log - the LEFT JOIN in search.php for this hack is one of them. The other one cannot be help. I have a lot of threads on my forum.

Please do share your code. :) Anything to remove that slow_query.

Another thing, can you also share the IPs that you've banned? I suspect I have the same problem too.

kmike
05-08-2004, 03:34 AM
Erwin: ok, I'll post my changes after tidying up. Not before the end of next week though - I will be out of reach of computer until then.
Also, I want to point out that you won't eliminate that slow query altogether. I also have that search related slow query in mysql log, though its probably not so bad as for you - 20-40 seconds at max. It is limitation of mysql fulltext search engine, the more posts you have in your db the longer search proceeds. We can't help it either, just wait for mysql folks to improve their engine.

Erwin
05-08-2004, 05:38 AM
It's not too bad at the moment - ranges from 30-60 seconds for the LEFT JOIN query.

Thanks for sharing your code. I want to test it out and see if it make a difference.

Erwin
05-08-2004, 06:55 AM
Aarggh... just had another server crash because of the LEFT JOIN query in search.php - it locked up the whole mysql server, and the query was up to 400,000 rows before I nuked it.

msimplay
05-08-2004, 08:46 AM
Aarggh... just had another server crash because of the LEFT JOIN query in search.php - it locked up the whole mysql server, and the query was up to 400,000 rows before I nuked it.
keep us posted on your advanced with the leftjoin thing
ps can you search within threads using your version of this hack ?

because when i did it seemed to search outside of the thread in question aswell

Erwin
05-08-2004, 12:16 PM
I'm waiting for kmike's version - why reinvent the wheel? :)

kmike, did you just use the vB2 code? Coz if you did, I can try to modify it to fit too.

ImportPassion
05-08-2004, 08:39 PM
Here is some excellent info that everyone should look at.

http://dev.mysql.com/doc/mysql/en/Fulltext_Fine-tuning.html

Erwin
05-08-2004, 08:42 PM
My database is fine-tuned already. No matter how much you fine-tune, if your database is big, a LEFT JOIN is a bad query to have.

ImportPassion
05-08-2004, 09:46 PM
wasn't meant to be about that. About changing the minimum and maximum word length and other stuff like the stop word list.

Erwin
05-09-2004, 05:09 AM
Ahhh... thanks. I see now. :) Interesting... do you have any particular settings that work for you?

kmike, do post your code once you have them ready. :)

JohnWoo
05-09-2004, 08:17 AM
If you want to remove left joins together with revelance why not to say just
$isboolean="'";
if ($vboptions['allowwildcards'] ==1 && preg_match("~[\"\-\*+]~", $query)) {
$isboolean="' IN BOOLEAN MODE";
}
$fulltext_sql = "SELECT postid FROM ".TABLE_PREFIX."post WHERE MATCH (".$what_field.") AGAINST ('$query".$isboolean.") LIMIT 0, ".$vboptions['maxresults'];
? :)
or in one string
$fulltext_sql = "SELECT postid FROM
".TABLE_PREFIX."post WHERE MATCH
(".$what_field.") AGAINST
('$query'". iif($vboptions['allowwildcards'] ==1 && preg_match("~[\"\-\*+]~", $query), ' IN BOOLEAN MODE', '').")
LIMIT 0, ".$vboptions['maxresults'];

ScottW23
05-09-2004, 08:19 AM
Just wanted to say that we installed this hack on our 950,000 post VB3 board with good results so far. Searches were slow with relevance enabled so I disabled it and searches seem snappy.

ScottW23
05-09-2004, 08:22 AM
If you want to remove left joins together with revelance why not to say just

Yes why not?! <Slaps JohnWoo> :) Works great, even faster now.

JohnWoo
05-09-2004, 09:00 AM
Still want to say few words for left joins :)
Lets say that you have large board with many forums, usergroups and complex access level. After it lets suppose that somebody with common access level (just registered) will search for some common phrase (for example "+javascript +flash").
On one board, where i tested it, such search with no left joins return nothing because fulltext query find first 200 matches in forums invisible to beginner members and stop on it. Showresult page (looking on current member access level) hide all that 200 matches from search result and with great enthusiasm say that nothing found :) On same forum search with left joins return complete set of 200 matches (from other forums visible to beginner members) for same search phrase and may return more if "maxresult" will be set to larger number.
But if your forum access level structure is not too complex, you may remove left join part without visible consequence :)

Erwin
05-09-2004, 12:13 PM
Awesome, John Woo. Your new query has fixed my slow_query/ mysql locking/ apache crashing problem (as far as I can tell with testing).

I'm using the first code example, just because the original code is like that. :) Thanks!

Erwin
05-12-2004, 02:47 AM
Even without the LEFT JOIN, certain phrases do make the query go into my slow_query log, but it's only around 40 seconds now. :) Much better.

Another question for those with this hack - when I use the Advanced Search page, and look for all posts by username, it works.

But when I look for all threads started by username (from the dropdown menu under the box where you put the username) only really old threads come up.

Does this happen to you too? Any ideas?

Erwin
05-12-2004, 09:03 PM
Sorry to ask another question - does Search Thread work for you guys? My members are saying it searches the whole site instead of just the thread.

msimplay
05-12-2004, 09:14 PM
Sorry to ask another question - does Search Thread work for you guys? My members are saying it searches the whole site instead of just the thread.
i can confirm this because i did post about it earlier
but nothing has been done as of yet

cerebro
06-05-2004, 09:51 PM
Kmike, can you post that templates?

and JohnWoo, maybe make a new release whit Template MOD too, and left join together.

Please this project dont have to die, is really nice...come on :D

Erwin
06-05-2004, 10:41 PM
Kmike, can you post that templates?

and JohnWoo, maybe make a new release whit Template MOD too, and left join together.

Please this project dont have to die, is really nice...come on :D
This is the best thing that's come out for large vB sites. :)

cerebro
06-05-2004, 11:23 PM
yeah...i know!
i have a 3 Gigabyte Database...insall this hack and reduce to 800 Mg

But something i have to close search for a few day, becauso all forum goind down. I think that the hack need a little more work, but i dont know where. :P

Erwin
06-06-2004, 04:01 AM
As some may know, Search Thread does NOT work with this hack installed.

Here is my fix:

Open search.php:

[Removed code - I've made a new piece of code to replace this one that also makes Search Individual Forums work too - refer to post below.]

Erwin
06-06-2004, 04:04 AM
u deleted ur post! argh!
The post is back! I made it better! ;) I decided to share my optimization!

ImportPassion
06-06-2004, 04:15 AM
problem somewhere

Database error in vBulletin 3.0.1:

Invalid SQL: SELECT postid FROM post WHERE MATCH (post.pagetext) AGAINST ('hood) LIMIT 0, 500 mysql error: You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the right syntax to use near ''hood) LIMIT 0, 500' at line 1

mysql error number: 1064

Erwin
06-06-2004, 05:23 AM
problem somewhere

Database error in vBulletin 3.0.1:

Invalid SQL: SELECT postid FROM post WHERE MATCH (post.pagetext) AGAINST ('hood) LIMIT 0, 500 mysql error: You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the right syntax to use near ''hood) LIMIT 0, 500' at line 1

mysql error number: 1064
Are you sure you copied that line as it is? I just cut and pasted that line EXACTLY AS IT IS POSTED THERE just to test it, and it works on my site.

Morgalis
06-06-2004, 07:21 AM
i have +8million lines and 165 megs committed to postindex with +500k posts

i've felt searching drags my server so much, would this be a better alternative?

kmike
06-06-2004, 07:24 AM
Are you sure you copied that line as it is? I just cut and pasted that line EXACTLY AS IT IS POSTED THERE just to test it, and it works on my site.

Erwin: I believe there's an error in your pasted query here:
AGAINST ('$query".$isboolean.")
unmatched single quota. It should be:
AGAINST ('$query' ".$isboolean.")

ImportPassion
06-06-2004, 10:57 AM
Erwin,

Yes copied exactly.

I thik there is a single quote missing here
AGAINST ('$query".$isboolean.")

Erwin
06-06-2004, 11:55 AM
Erwin: I believe there's an error in your pasted query here:
AGAINST ('$query".$isboolean.")
unmatched single quota. It should be:
AGAINST ('$query' ".$isboolean.")
Nope, that doesn't work on my site.

The code that I've posted works as it is - don't ask me why. You might have to modify it for your site.

ImportPassion
06-06-2004, 01:10 PM
yup, i added the single quote and it works no prob, and fast

Showing results 1 to 25 of 342
Search took 0.06 seconds.

it's extremely weird that it works on your site. it's invalid SQL.

kmike
06-06-2004, 02:33 PM
It just means there's matching single quote added either at the end of $query variable or at the beginning of $isboolean somewhere.

cerebro
06-06-2004, 02:52 PM
Thanks for update!

* cerebro install :D

cerebro
06-06-2004, 04:43 PM
Nice mod...really nice, help a lot. Thanks Erwin for Sharing.

What d u think about stats cache...for optimize forum home.

https://vborg.vbsupport.ru/showthread.php?t=61581

some one install this hack and bollean search?

Erwin
06-10-2004, 01:56 PM
Okay, I've made the hack work for "Search Forum" now as well as "Search Thread" - before, "Search individual forums" did not work.

I've also cleaned up the code a lot more. Plus optimized the search query by removing the JOIN function when it is not required (unlike before when you had a JOIN query for every search for no real reason). I also removed relevance and other bits that slowed the search down and increased server load. Straight searching should be faster with this.

Remove my previous codes.

Then, open search.php and replace the line beginning with:

$fulltext_sql =

with this:


// Erwin's Add-On

$searchlimit = " WHERE";
$searchforumadd = "";

if ($searchthread) {
$searchlimit = " WHERE post.threadid=$searchthreadid AND";
}

if ($forumchoice) {
$searchforumadd = ", forumid";
$searchlimit = " LEFT JOIN thread AS thread ON post.threadid=thread.threadid WHERE forumid IN($forumchoice) AND";
}

$fulltext_sql = "SELECT postid$searchforumadd FROM post$searchlimit MATCH (".$what_field.") AGAINST ('$query".$isboolean.") LIMIT 0, ".$vboptions['maxresults'];

// Erwin's Add-On


Some people have had problems with this:

('$query".$isboolean.")

It works for me, but some others have had to modify this bit by removing or adding quotation marks - not sure why.

Hope it works for you! It's working well for me - Search Threads and Search Forums are now working well finally! Plus search is optimized even more!

Erwin
06-10-2004, 02:07 PM
Note, when I said removed my previous code above that applies only to people who have applied my prevuous Search Thread bug fix.

When I say the query is optimized even more, it's optimized compared to the original fulltext_sql query - it's basically similar to my previous fix, but written slightly different.

Just to clarify. :)

nexialys
06-10-2004, 02:12 PM
so, erwin... when will we see a final version of your own ?!.... it looks promissing, but we don't have the entire package... ;)

msimplay
06-10-2004, 02:55 PM
hmm weird i'm sure search relevancy was fixed for 2.2

i can search within forums and threads etc without those modifications

Erwin
06-10-2004, 08:35 PM
so, erwin... when will we see a final version of your own ?!.... it looks promissing, but we don't have the entire package... ;)
I've only modified the code for my site slightly to remove show results as threads, so all my results are shown as posts, as that's less server intensive than to have to join the threads in.

ImportPassion
06-10-2004, 08:53 PM
ya, i don't get it cause it's working fine for me too.

Erwin
06-10-2004, 10:04 PM
ya, i don't get it cause it's working fine for me too.
Weird, because it didn't work for me until I made that mod. I am talking about the search form at the top of a forum. The search forum on the Advance Search page does work. It's just the search form at the top of forums that didn't work for me.

msimplay
06-11-2004, 06:12 AM
Weird, because it didn't work for me until I made that mod. I am talking about the search form at the top of a forum. The search forum on the Advance Search page does work. It's just the search form at the top of forums that didn't work for me.
in 2.2 they all seem to be working for me now then again i uninstalled the hack

and when i saw the advanced you made i reinstalled the hack but i don't get it everything seems to be running in order now without modifying extra

do you have similar threads on by any chance because i don't :P

Erwin
06-11-2004, 01:32 PM
in 2.2 they all seem to be working for me now then again i uninstalled the hack

and when i saw the advanced you made i reinstalled the hack but i don't get it everything seems to be running in order now without modifying extra

do you have similar threads on by any chance because i don't :P
No, I never had similar threads on.

cerebro
06-30-2004, 03:41 PM
Some one have some upgrate or news about new version? jejeje
My board increise a lot...have to change something..CPU is on TOP! :P

Erwin
07-02-2004, 03:03 AM
Some one have some upgrate or news about new version? jejeje
My board increise a lot...have to change something..CPU is on TOP! :P
The next update of vB3 may have this - not sure.

ShiningArcanine
07-02-2004, 04:26 PM
The next update of vB3 may have this - not sure.
It is in 3.02 but not supported so the devs won't say how to enable it in 3.0.2. :(

Zachery
07-02-2004, 04:27 PM
It is in 3.02 but not supported so the devs won't say how to enable it in 3.0.2. :(
hehe You should do a search for new posts

ShiningArcanine
07-02-2004, 04:31 PM
I'll do that.

cerebro
07-02-2004, 07:17 PM
hehe You should do a search for new posts
is that oficial VBulletin TEam word? 3.0.2 whit have this hack ?

Boofo
07-03-2004, 04:56 AM
hehe You should do a search for new posts
Why don't you just post the link? ;)

cerebro
07-03-2004, 06:54 AM
Vb 3.0.2 release...whit Hack on :D

mtha
07-03-2004, 10:17 AM
Why don't you just post the link? ;)
heheh, he wants people to have/practise some skills in hacking community :D


ok, I give a hint (... or tip) :> it's in "Modification Hints and Tips" section :> :rolleyes: :ninja:

Boofo
07-03-2004, 10:38 AM
heheh, he wants people to have/practise some skills in hacking community :D


ok, I give a hint (... or tip) :> it's in "Modification Hints and Tips" section :> :rolleyes: :ninja:I already found it. I just thought it would be easier posting the link instead of reading through a bunch of "I couldn't find it" messages. ;)

Nitesh
07-05-2004, 10:30 AM
I just tryed the mysql search hack that comes with the new 3.0.2.... crashed my computer lol... So im coming back to this hack, I was wondering when will the next update be with all the optimisations in the thread?

Thank you :)

nexialys
07-06-2004, 06:57 PM
@Nitesh... FullText Search is NOW integrated to vB 3.0.2 and 3.0.3 ... so it's useless to add this hack... you will simply trash your installation.
you need to take the SQL query found in your Options screen to enable the Search in FullText... because there is a special feature to add to sql, the FULLTEXT thing... it may be the reason your installation trashed.

ImportPassion
07-06-2004, 07:10 PM
bleh, vb one didn't work for me. had to go back to this one, and yes i changed the admin hidden options. Not sure if vb.com has it enabled, but if they do, it doesn't work for "search this thread". that was a few days ago.

Nitesh
07-06-2004, 07:26 PM
nexialys, I tryed the vbulletin search with the extra code etc... but it was still running down my server. I installed this hack and its back to working faster then ever :) On a large board its definately best to use this hack rather then vbulletins fulltext search.

Erwin
07-06-2004, 10:27 PM
nexialys, I tryed the vbulletin search with the extra code etc... but it was still running down my server. I installed this hack and its back to working faster then ever :) On a large board its definately best to use this hack rather then vbulletins fulltext search.
That's my experience. :) That's why I'm still using this one.

nexialys
07-06-2004, 11:03 PM
hum... i see your points here... so the best system is the hack here, isn't it ?!

ok, then, if Erwin and some other posted different debugs, wouldn't it be cool to have a final version to date ?!

Erwin argues are the best i can read, but it seems complicated to read all the 12 pages to see if it's ok or not to do this ir that change... can someone post a final 2.3 version of this hack please?! and updated for vB 3.0.3 ??? because lines are changed a lot!

ImportPassion
07-07-2004, 12:03 AM
well, erwin says his wasn't working for search this forum and search this thread, but it is for me, so i didn't do his updates.
the original works great for me.

cerebro
07-07-2004, 04:45 AM
Well, i try to install this hacks in a fresh 3.0.3 and is not really easy.

Can some one make some v 2.3 whit a good install instruccion and Erwin addon :P

Nitesh
07-07-2004, 10:09 AM
Yeah the original 2.2 version of that hack works fine for me, but its a little tricky to do the file edits on search.php as some of the codes different. After messin about a bit its not too difficult to figure out :)

Which part are you stuck on?

nexialys
07-07-2004, 12:02 PM
the new 3.0.2+ version have already a part for the fulltext feature because it's now a feature of the board, BUT ... that feature is partial in 3.0.2... this hack have more changes to do with the code, so maybe someone can rewrite it to suite the new options ?!

i would make it, but i'm not ease with it... the best would be to take advantage of the new option of vB and activate this hack when fulltext is activated, and use the old code when it's not ... i've started to try it, but i don't understand the code really well... you guys have more experience on this search feature, i suppose this would help...

cerebro
07-07-2004, 06:12 PM
/me vote for that!

nexialys
08-09-2004, 06:03 PM
hum.. why this thread was not updated for the last month or so ?! is there someone using it, or we have to forget about it ???

i'd like to see it working properly for 3.0.3, and with everything fixed... is it possible to find it ?!

thanks!

nexialys
08-20-2004, 12:10 PM
bumping for the pleasure of it... someone ???

just update the thread please!

ImportPassion
08-20-2004, 12:14 PM
i only have one problem, but it works fine in 3.0.3 for me.

if ppl search for 17" rims, it will blow up cause of the ". I should be able to fix it tho.

nexialys
08-20-2004, 01:44 PM
the " is easy to deal with, it's a tag... point actually is the install instructions... 3.0.0 is really different from 3.0.3, because of the new fulltext search propriety... is someone updating the install so we can drive it cool ?!

M1th
09-04-2004, 03:21 PM
Yep, that'd be cool. It's quite confusing to follow the instructions for 3.0.3.

ImportPassion
09-15-2004, 11:30 PM
ok, this thing is now killing my site for some reason, especially when some one searches for say 17" rims.
The vb3 one also kills it, but even worse.

I don't know what to do anymore.

WotC_Tech
09-22-2004, 11:15 PM
I've installed this hack and are having good luck with it on vB 3.0.3 with 3 million posts. I did have a performance problem when searching for posts in forums rather than searching against all posts. It appears that Erwin's Add-On helped us a great deal when searching against all posts, but queries against specific forums would often take 30 or more seconds to complete.

We resolved this issue by modifying Erwin's Add-On a bit:


// Erwin's Add-On

$searchlimit = " WHERE";
$searchforumadd = "";

if ($searchthread) {
$searchlimit = " WHERE post.threadid=$searchthreadid AND";
}

if ($forumchoice) {
$searchforumadd = ", forumid";
// Modified by WotC_Tech
//$searchlimit = " LEFT JOIN thread AS thread ON post.threadid=thread.threadid WHERE forumid IN($forumchoice) AND";
$searchlimit = " LEFT JOIN thread AS thread ON post.threadid=thread.threadid AND thread.forumid IN($forumchoice) WHERE";
}

$fulltext_sql = "SELECT postid$searchforumadd FROM post$searchlimit MATCH (".$what_field.") AGAINST ('$query".$isboolean.") LIMIT 0, ".$vboptions['maxresults'];

// Erwin's Add-On


It appears that the MySQL optimizer isn't doing its job and running the query as fast as it could.

nexialys
10-08-2004, 01:10 PM
diguidi bump! -- want news for this !!!

alexi
11-29-2004, 02:25 AM
I'm with nexialys, any chance of getting an updated version of this for 3.03? I'm missing something trying to tie all of these threads together with the changes

Steve St.Lauren
12-27-2004, 02:58 PM
With this hack when you search on entire posts it won't search the titles as well. I was searching for a thread on my board and if I did the search on titles only it would come up but if I did the same search on entire posts it won't. Any ideas?

nexialys
02-20-2005, 11:02 AM
as msimplay released his new hack for this tool (https://vborg.vbsupport.ru/showthread.php?t=76678), can someone update the install script for 3.0.6 ???

there is a completely different file structure after 3.0.1+, so this would be more than efficient to update the script install process...

someone ?!

msimplay
02-20-2005, 11:33 AM
as msimplay released his new hack for this tool (https://vborg.vbsupport.ru/showthread.php?t=76678), can someone update the install script for 3.0.6 ???

there is a completely different file structure after 3.0.1+, sothiswould be more than efficient to update the script install process...

someone ?!

Someone has already used the same hack with vb3.06 its installed

nexialys
02-20-2005, 03:04 PM
Someone has already used the same hack with vb3.06 its installed
yeah, i know, but the hack require some changes to be following the 3.0.6 structure of codes... i know it's possible to install it...

msimplay
02-20-2005, 04:42 PM
yeah, i know, but the hack require some changes to befollowing the 3.0.6 structure of codes... i know it's possible toinstall it...

what exactly happens when u try to use the installer ?

nexialys
05-23-2005, 06:32 PM
hum.. little request here...

vB 3.5 will have the FullText integration complete, is there a beta tester here that knows if THIS hack is the same as the one they will introduce in 3.5 or if it's a complete different story ?!

:)

kmike
05-23-2005, 06:52 PM
Fulltext search is available in official vB release ever since version 3.0.2. Though it isn't supported - you have to know where to look to enable it.
Conceptually this hack should be very similar to vB's own implementation. Details will differ of course.
The point is, you can already have look at the official fulltext search code, and compare it to this hack.

nexialys
05-23-2005, 07:56 PM
... hum... you did not follow the topic, why do you answer this ?!...
and also, look at who you're talking to, you'll spare some words... (yes i'm the guy in perpetual bad mood)

JohnWoo
05-24-2005, 12:07 PM
And I can say only sorry :) I have never seen vb3 other then vb3.0.0 Gold and can't say something exactly...

David Bott
05-27-2005, 12:29 PM
Hello John...(Running 3.0.6 (this security updates applied.)

I have a question if I may. I see your full text search requires two different indexes in the post table. One is TITLE and the other is PAGETEXT (keynames). Looking at the structure for the POST table I see these two indexes with pagetext having a 1 after it.

Now looking at how Vb asked for it to be done, they want you to run...

ALTER TABLE post ADD FULLTEXT INDEX (title,pagetext);
ALTER TABLE thread ADD FULLTEXT INDEX (title);

This results, as you know, in a single new index in POST for TITLE that has both TITLE and PAGETEXT in one keyname. (title). Then they wanted another full text search index for the THREAD table for TITLE.

My questions, because I really do not know...

1) What is the difference having two keynames in the POST table, like yours, and only having one (theirs)? What does help with?

2) Does it matter, or will it help, with your code to have the full text search index in the THREAD table like they requested? (I made it just in case.)

I have your code now running on TiVo Community Forum and I am looking to run it on AVS Forum (Even though it is from last year, I hope the code is still ok with the current version Iam running). I am looking for solutions for we are getting killed with the searches more so on AVS Forum. (We went to VB3 last Sunday.)

Thank you for your time.

(Note...I noticed in the VB3 search.php it really needs help. Espically when you have it limite the return. It still searches and returns ALL HITS and then purnes it down to the limit number. Thus the return is still very large. I think your code actually does it correct by stopping when it reached the limit.)

Update...I just installed it on AVS Forum. It took my MySQL load from 1.4 (or more) to currently .38 with 2000 users on-line.

Just need to know why. :)

David Bott
05-30-2005, 01:50 PM
Hello Again...

Sad to see no answer to the above. So far the code seems to do very well. But still would love to know why it seems to have a lesswof an impact on MySQL vs what VB3 used.

kmike
05-30-2005, 02:10 PM
Now looking at how Vb asked for it to be done, they want you to run...

ALTER TABLE post ADD FULLTEXT INDEX (title,pagetext);
ALTER TABLE thread ADD FULLTEXT INDEX (title);
...
1) What is the difference having two keynames in the POST table, like yours, and only having one (theirs)? What does help with?

Two indexes are required to search both title AND post text in default search mode.

David Bott
05-30-2005, 02:14 PM
Hello...

Yes, I do see that and have it that way and all is well.

But th questionis as to why. Just wondering what the difference is and why do it that way vs the way VB asked for it to be done. It is more affective? Do I still need the "ALTER TABLE thread ADD FULLTEXT INDEX (title);" as they request?

I guess I am seeking why this hack works so much better than the VB3 version of the Full Text Search.

Thank you.

nexialys
05-30-2005, 02:56 PM
the "actual" version of fulltext inside vB is not working at all... 75% of the results are falsed and bug on most Mysql systems because not fixed... that'S why vB 3.5 will ahve a complete new fulltext integration, mostly based on this one... logically... 0:)

David Bott
05-30-2005, 03:27 PM
Ah. Why thank you very much. Thus the issue we had when we made the move to VB3 with the high mysql loads and frequent site outages.

I am guessing then I do not need their...

ALTER TABLE thread ADD FULLTEXT INDEX (title);

...querty and I can dump that index.

nexialys
05-30-2005, 03:43 PM
this actual hack was built before the full index modification was applied in vB 3.0.1, therefore you can delete that index... because vB added the fulltext in vB 3.0.2 ...

David Bott
05-31-2005, 12:24 PM
Well in the VB version of the code, Full Text seems to work well other than the high loads. Meaning....if someone looks for "the quick brown fox" it finds that. However...I just received this...


Is there any way to search for an exact phrase? On the Search FAQ page there are a couple of matching options, including "Match Complete Phrase", but on the Search Forums page there is only a keyword search. For example, if I search the forums for "whole house audio" it returns some 300 threads, all of which contain the three words somewhere, but not necessarily the whole phrase.

I've tried some things like using quotes and Boolean operators (AND, OR, NOT) -- with no luck. Am I missing something, or is this simply not an option?


I tested this and found he is correct. It does not hit like a full text search should.

Anyone have any thoughts on this?

Thank you.


*** Disreguard...NOTE...You need to have "Allow Search Wild Cards" set to YES in the Message Searching Options area of the Vb Options page.

David Bott
06-08-2005, 09:23 PM
Hello again...

I searched for the word RAID and the user name of BasementBob on my site in advanced mode. It retured no results even though their are threads by him with the word RAID in it. So the question is...Can you not use both the word box and the user name box at the same time with this hack running?

Can someome please test and advise.

Thanks

David Bott
06-24-2005, 11:59 AM
Ok, have some more issues and I have no clue why.

In this thread you can see a user reported an issue with not finding posts with the serach that clearly has the words in it.

http://www.tivocommunity.com/tivo-vb/showthread.php?t=246239

After testing I had found this to be true and droped the indexes and regreated them based on the information provided in the install text file.

even tried to find that thread above by seraching for "channels" and even "DirectTV" and could not even find that thread that reported the same issue.

Any thoughts on this?

MySQL version = MySQL 4.1.10a

David Bott
07-05-2005, 02:25 PM
Hello...Anyone? Is this hack no longer supported?

Thank you.

Steve St.Lauren
07-05-2005, 03:51 PM
Do you still have the hack installed? If so then you must have missed some part of the hack install. I just did a search on "+tivo +will +no +longer +change" in titles only in that forum to try to find the thread in question and it returned two threads that didn't include all of those terms in the search results - it shouldn't have returned those. My guess is that you missed some part of the hack install - check it over carefully again.

David Bott
07-08-2005, 08:37 PM
I have done so now more than once. Not new to this at all as you may see. ;) I just do not get it. Something justis not right and it happen both on TiVo Community Forum and AVS Forum.

Thank you kindly for the reply.

[Edited to removed requested found item.]

klamm
08-01-2005, 02:18 AM
Guys, you saved me! :)

I'm running a >2 Million Posts Board with 500 Users on ... heaviest load during searches using the standard algorithm. Now I tried this hack (vb 3.0.8, including Erwins improved Addon) and load is going down to nearly zero!
Thank God for this hack!

Btw: The built-in-hidden-debug-beta-fulltext-VB-search - don't even think of trying it @ big boards. It will only crash your machines ...

Steve St.Lauren
04-18-2007, 09:08 PM
Has anyone considered rewriting this for 3.6.5? The built in fulltext search is horrible and the sphynx search has it's issues (sorting by date, etc). When we were running this search on 3.0.9 searches returned within seconds. Now with the fulltext search it can take upwards of a minute if a search is done on all posts - >1.5 million posts. That's with running the searches on a slave server that's just handling searches.