Sphinx Search

Spinball
01-28-2008, 11:21 PM
Maybe we should chip in and pay someone to release a full Sphinx mod with optimisation.
I'd chip in $100 for a start.

andrewkhunn
01-29-2008, 12:24 AM
Xorlev is already working on it for 3.7 I believe.

Spinball
01-29-2008, 05:48 AM
Xorlev is already working on it for 3.7 I believe.
WooHoo!
Any decent programmer will not touch your site for fees like $100 per modification because those type of modifications can be done by yourself. You will not see me offering vBulletin installations or any type of minimal "hack".
Bing! You maxed out on the arrogantometer there, mate.

TECK
01-29-2008, 07:13 PM
Bing! You maxed out on the arrogantometer there, mate.
It was not intended to sound arrogant. If it sounded like that, I apologize. I simply wanted to let Andrew and others know that I do not deal with small code modifications, but with large projects that are highly tested before they are implemented. I just finished a project for a client that had the work effort evaluated at $7,000.

If you need Sphinx search implemented, you will not get only this product, but also a set of optimizations that will improve your overall board performance by 300%. We are talking about several classes and function files that manage an aggressive cache system intended to replace intensive queries performed in vBulletin. Since search is an issue in vBulletin, you can consider this an optimization, along with the other improvements.

This is a solution for large board owners, not the average Joe who has only one server.

Spinball
01-29-2008, 08:13 PM
Teck I might be interested in your services if, now you have the code and methods written, they may be quite a lot cheaper than $7,000. How 'pluginable' are they? By which I mean would upgrading vBulletin undo your changes, or do you apply them as plugins?
And what happens when a new version of vB is installed?
Take a look at my site www.avforums.com which has usually suffered from some kind of performance problems on a regular basis.

TECK
01-30-2008, 12:54 AM
Sphinx has so much potential, you can't even imagine.
I plan to do all heavy queries through Sphinx... for example the latest threads. :)

My new project will be similar to digg.com and will be 60% powered by Sphinx. I'm in the process of buying 4 servers (dual quad-core 5410, 8 GB RAM and two 146 GB 15K SAS disks in RAID 1, per server). The setup will cost me $12,000, which saves me $5,000 on this hardware deal. It was a really good deal, so I decided to buy 4 units to plan ahead. Colocation is the way I will go. Nginx will serve the data through an array of 2 servers while the DB will take the other 2.

I post those details not to show off, but to make you understand that I'm very serious about the work I do.
I don't want to be rude but the current approach to implement Sphinx is not good, IMO. I'm talking again about large forums who get hammered daily by a ton of hungry users.

mute
01-30-2008, 01:04 AM
Well, it may not be the optimal approach, but it's what everyone is using for lack of anything else. Without some type of sphinx implementation, we wouldn't have searching at all against our 41 million odd posts.

amcd
01-30-2008, 04:46 AM
Well, it may not be the optimal approach, but it's what everyone is using for lack of anything else. Without some type of sphinx implementation, we wouldn't have searching at all against our 41 million odd posts.
Exactly. We have only 7 million, but the problem is significant enough.

Most of us may not be first-rate programmers like TECK, but many of us do know quite a bit of programming. However, being an administrator of a large board takes up so much time that there is hardly any left for programming.

Marco van Herwaarden
01-30-2008, 07:05 AM
Guys,

Please do not turn this thread into advertising or discussing services offered by someone.

Spinball
01-30-2008, 07:59 AM
I don't want to be rude but the current approach to implement Sphinx is not good, IMO. I'm talking again about large forums who get hammered daily by a ton of hungry users.
This may be true, but stating it without offering some kind of solution (appropriate to this forum) is actually less than helpful because it unsettles people who then resent you for refusing to offer an insight into how the issues can be resolved.
It's like saying 'Oh I have the same car as you and I can make it run twice as far and twice as fast on the same gas but I'm not telling you how.' Nobody will thank you for it!

jwksite
01-30-2008, 11:41 PM
If it is a community forum for sharing, why they allow here paid services?
Well, for one, they allow REQUESTS for paid services, and those have specific forums you must post in. Coming into this topic and throwing stats around is a sales pitch, and if I had to interpret the Site Rules myself, I would say that's not allowed.

You have every right to try and sell your services, just not on vB.org.

# No posting messages anywhere on this site that are primarily for the promotion or advertising of any website, forums, email address, business, MLM, activity, or other entities that you have an affiliation with (ie. no self-promotion).


Discussion of commercial modifications is permitted on the forums at vBulletin.org as long as they are discussed with no promotion. Any promotional threads will be closed and/or deleted on sight.

1. Promotion of commercial modifications is not allowed.
2. Commercial modification authors are not permitted to create threads regarding their own products or promote their products in any way, shape or form, or to encourage another user to do so.

--------------- Added 1201743870 at 1201743870 ---------------

.... And I feel silly for somehow missing the last two pages of posts bashing on TECK. :) Wasn't trying to bring it back up, sorry! (I really just thought that page 35 was the last page.. haha)

Simetrical
02-18-2008, 08:05 PM
It currently supports non-exact checking. The nice thing about sphinx filters is you can pass it an array.

I'm actually working on it right now. Mostly I just ripped out vBulletin's search completely rather than the conditional "is the query blank or not?" and handling with vB's or Sphinx. Right now I'm working on adding the features sphinx.php lacks from the advanced search as well as all the sorting modes. If it's on the advanced search page, I'll add it.

After that it's most likely sanity checking to make sure it's all working.

Coventry: All we want to do is exclude posts/threads from userids in the list, right?
How's progress on this?

Edit: Actually, I got the instructions from earlier working, hurrah. I haven't seen any problems yet.

andrewkhunn
02-27-2008, 04:39 PM
Any update Xorlev? Hoping that something concrete emerges before 3.7 is released into the wild.

NickCat
03-03-2008, 12:30 AM
I just set up 0.9.8 on my 3.6.8 forum (nasioc.com) and we have just over 20 million posts. So far the performance increase is night and day. I'm using the sphinx.php from weeno and everything seems to be working well.

There is a results issue I can't quite understand.

I have a max_results of 5000 in my sphinx.conf, and my vbulletin is set to display 2500 results.

When I search a very common word (raw results from search on fulltext are 982507) on my site and display the results as threads, I get about 581 results. If I search the same term and display as posts I get the full 2500 results. I'm presuming this is because it's finding raw posts with my word in them and condensing those 2500 posts by grouping them into the proper threads. Is that correct?

Is there any way to ask sphinx to actually return 2500 threads instead?

Otherwise things seem to be running quite smoothly... to have gone from searches that were taking upwards of 60 seconds to this is a dream come true. Now it's just a matter of working out the kinks.

Has anyone produced a newer example sphinx.php for 0.9.8 yet?
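
The gap between the two counts can be illustrated with a small Python sketch. It assumes (this is not confirmed in the thread) that the older sphinx.php asks Sphinx for post matches up to the result cap and that those matches are collapsed into threads afterwards, whereas grouping by threadid collapses first and applies the cap to threads. The function names and sample data are hypothetical.

```python
def truncate_then_collapse(post_matches, cap):
    """Cap the post matches first, then collapse them into distinct threads."""
    seen = []
    for _postid, threadid in post_matches[:cap]:
        if threadid not in seen:
            seen.append(threadid)
    return seen


def collapse_then_truncate(post_matches, cap):
    """Collapse all post matches into distinct threads first, then apply the cap."""
    seen = []
    for _postid, threadid in post_matches:
        if threadid not in seen:
            seen.append(threadid)
    return seen[:cap]


# Hypothetical ranked matches as (postid, threadid) pairs.
matches = [(1, 10), (2, 10), (3, 11), (4, 10), (5, 12), (6, 13)]

print(truncate_then_collapse(matches, cap=4))  # [10, 11]
print(collapse_then_truncate(matches, cap=4))  # [10, 11, 12, 13]
```

With the cap applied to posts, several matches from the same busy thread use up slots, so fewer distinct threads survive. With the cap applied after grouping, every slot is a different thread.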

mlx
03-03-2008, 06:21 AM
Is there any way to ask sphinx to actually return 2500 threads instead?

Yes, there is. You need to group the results by threadid.

https://vborg.vbsupport.ru/showpost.php?p=1327370&postcount=430
https://vborg.vbsupport.ru/showpost.php?p=1328258&postcount=432

NickCat
03-03-2008, 02:15 PM
Thanks...

I just added this snippet from your file to the one I'm using:

    if (!$vbulletin->GPC['showposts'] && $vbulletin->GPC['searchthreadid'] == 0 && $vbulletin->GPC['titleonly'] == 0)
    {
        // show results as threads
        $cl->SetGroupBy('threadid', SPH_GROUPBY_ATTR);
        // end show results as threads
    }

That's pulling the proper number of results now. How else does that affect the search, though? The results it's pulling seem to be much different. I'm comparing the new and old versions of sphinx.php.

And since moving to 0.9.8 from 0.9.7 I'm getting the following warnings when I start searchd:
WARNING: key 'strip_html' is deprecated in /etc/sphinx.conf line 12; use 'html_strip (per-index)' instead.
WARNING: key 'sql_group_column' is deprecated in /etc/sphinx.conf line 28; use 'sql_attr_uint' instead.

For sql_group_column it looks like I can just rename every instance to sql_attr_uint, but for strip_html it looks like the new html_strip key is set per index and might need some other value.

Or does anyone just have an updated 0.9.8 sphinx.conf file to share?
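
As a rough aid for the two warnings above, here is a minimal Python sketch that renames `sql_group_column` to `sql_attr_uint` and only flags `strip_html` lines, since the warning says the replacement `html_strip` is a per-index key and so has to be moved by hand. The function name and the sample config lines are hypothetical, and the exact value `html_strip` expects should be checked against the 0.9.8 documentation rather than taken from this sketch.

```python
import re


def migrate_conf_lines(lines):
    """Rename sql_group_column and report strip_html lines that need a manual move."""
    migrated = []
    notes = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if re.match(r"sql_group_column\s*=", stripped):
            # Straight rename, as suggested by the searchd warning.
            migrated.append(line.replace("sql_group_column", "sql_attr_uint", 1))
        elif re.match(r"strip_html\s*=", stripped):
            # Left untouched: the replacement key belongs in the index section.
            notes.append(f"line {number}: replace with html_strip in the index section")
            migrated.append(line)
        else:
            migrated.append(line)
    return migrated, notes


# Hypothetical excerpt from a 0.9.7-style source block.
sample = [
    "source post",
    "{",
    "    strip_html = 1",
    "    sql_group_column = forumid",
    "}",
]

new_lines, todo = migrate_conf_lines(sample)
print("\n".join(new_lines))
print(todo)
```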

Kevlar
03-13-2008, 12:53 AM
Has anybody had any problems with certain users not getting indexed in particular forums? I.e. I have one user, and if you search for his name anywhere on the forum, it works fine, unless you search in one particular forum... and then it comes up blank.

Also... I have an issue where a particular user was indexed up until 2/28/08 ... but then nothing between 2/28/08 and today (3/12/08).

Finally ... is there any way to search for the term "K&N"? If I try to search for it now, it comes up blank.

Thoughts?

Kevlar
03-13-2008, 03:34 PM
Look at the html code, does it have an ampersand?
I can search fine for "K&N", on my test forums. I can also search with no problems in a specific forum for a specific username.

When I try to search for K&N ... I get sorry, no matches.

In regards to the username problem... for some reason, that user, in a certain forum just didn't get indexed. If I search for him in any other forum, all his posts show up fine. :confused:

Also... a different user for some reason did not get indexed between 2/28/08 and yesterday 3/12/08.

Marco van Herwaarden
03-14-2008, 07:04 AM
Thread cleaned from all posts discussing TECK's commercial venture.

Paul M
03-14-2008, 08:18 PM
Off topic post also removed, stick to the subject please.

Kevlar
03-19-2008, 12:28 PM
In regards to the username problem... for some reason, that user, in a certain forum just didn't get indexed. If I search for him in any other forum, all his posts show up fine. :confused:

Turns out that entire forum didn't get indexed... I'm not sure why. If you search for anything in that specific forum by username or by keyword, nothing shows up. The rest of the forums work fine.

EDIT: When I say forum, I mean forumid=15 not an entire vB installation of a forum.

amcd
03-19-2008, 12:33 PM
Turns out that entire forum didn't get indexed... I'm not sure why. If you search for anything in that specific forum by username or by keyword, nothing shows up. The rest of the forums work fine.
It may not be an indexing problem at all. Check the forum manager in vbulletin adminCP and make sure the forum is marked as searchable.

Kevlar
03-19-2008, 01:32 PM
It may not be an indexing problem at all. Check the forum manager in vbulletin adminCP and make sure the forum is marked as searchable.
Jackpot! I have no clue how that got turned off, especially since all my other forums are on.:confused: That is why we love technology.

Many thanks! :up:

Deriel
04-06-2008, 05:44 PM
No vB 3.7 implementation yet? vB Gold is almost here =]

Kevlar
04-07-2008, 12:16 PM
I am using it on vB 3.7 RC1 without problems.

Deriel
04-07-2008, 12:22 PM
I am using it on vB 3.7 RC1 without problems.

Hey, good to know :)

Are you using the instructions posted here in this topic? Any changes? Do all the searches work, including tags?

Thanks!

Kevlar
04-07-2008, 12:27 PM
There were a few small changes to make it work with RC1 if memory serves me correctly... but nothing difficult to figure out. As far as I know, all my searches work (none of my users have complained).

The tag searching works well so far... but then again, I haven't given it to my users yet, so probably only a handful or two of threads have tags to test the tag searching with.

mute
04-08-2008, 04:21 PM
What's the likelihood of getting an updated version together? It seems like there are fixes for various things (like "Find all posts by user") floating about, but nothing definitive since the last time Orban posted months ago.

I'm still curious if anyone has any insight into how we can improve this project overall. I've seen all these claims that the way Orban was doing things was overcomplicated, but not really anything concrete about what we can do to improve the process.

andrewkhunn
04-08-2008, 04:31 PM
Well, I know Xorlev *was* working on compiling a 3.7-ready complete version. It also appeared that he had been making some progress with similar threads and other outlying issues in the current version we're all using.

Thomas P
04-08-2008, 06:39 PM
Isn't this under GPL?

Deriel
04-14-2008, 11:34 AM
Just to reinforce something said earlier: I am really interested in a detailed and organized solution for Sphinx/vB 3.7. I could make a donation (not a very large sum, as our currency here in Brazil isn't very strong: 1 US dollar = 1.8 Brazilian reais).

:)

wtrk
04-14-2008, 08:12 PM
I'm using memcached with vBulletin. If I add Sphinx, will it take advantage of the memcached setup configured for vBulletin, or does it have to be separately configured to use memcached?

mlx
04-15-2008, 05:22 AM
I'm using memcached with vBulletin. If I add Sphinx, will it take advantage of the memcached setup configured for vBulletin, or does it have to be separately configured to use memcached?

No offense but why should Sphinx use or support memcached? It's fast enough the way it is, isn't it?

AFAIK vBulletin only stores the datastore cache in memcached. I don't think that Sphinx needs any of that data, so there's nothing to take advantage of here.

I'm quite sure that all search results will be written to vBulletin's database no matter if memcached is enabled or not. So if you want to cache those results in memcached custom coding would be required at that point anyway - no matter if you use Sphinx or not.

wtrk
04-15-2008, 01:30 PM
No offense but why should Sphinx use or support memcached? It's fast enough the way it is, isn't it?

I wasn't sure if it was needed or not. I didn't think it was needed, but I just wanted to be sure. Thanks.

Deriel
04-30-2008, 01:22 PM
Just to confirm one thing:

If I upgrade my 3.6.9 vB to 3.7.0, will I be able to use the info from this thread to use Sphinx with it?

Really important question

mlx
04-30-2008, 06:46 PM
Hasn't this been answered already?
I am using it on vB 3.7 RC1 without problems.

I didn't have time to look at 3.7.0 gold yet, but I'm not aware of any apparent problems in any of the RCs.

However, if it's that important ... nobody stops you from setting up a 3.7.0 test forum and giving it a shot yourself ... or looking at the search.php source, which hasn't changed much anyway.

Just my 2 cents.

kontrabass
05-01-2008, 06:19 PM
I guess I'll give it a test in the next few weeks and see if it works with 3.7. Only report so far that I found, is that it works "with some tweaks" with 3.7 RC1.

Is it me, or are other large board admins pretty concerned about the fact that their entire VBulletin managed web business is dependent on this hack (with instructions buried somewhere around page 26, and peppered throughout) for basic search functionality? And equally concerned that this thread seems to be losing its home and support. Heh. I've used VBulletin for the past 8 years, it's sad that such a ceiling is reached at about 7 mil posts, that the forum search 'breaks', leaving you with the two options of throwing thousands of dollars in hardware, or depending on 3rd party users to port Sphinx over as a hack bandage.

amcd
05-02-2008, 08:42 AM
Very true.

I am surprised no one has come forward and released a commercial solution. There are so many enterprises selling add-ons to VB, and some of the developers are really experts, but none have deemed this to be a feasible project.

NickCat
05-02-2008, 04:22 PM
You guys have to remember... we aren't the majority, we are sadly the very small minority of their install base, and frankly we aren't their target audience any longer. We've already bought the software, we pay a measly $30 a year for upgrades, so that puts us in the category of don't fix it.

It is really sad though. We are pushing 21.2 million posts, and without this hack the site would have eaten itself. As it was, people would have to wait an average of 15-60 seconds when someone posted a really nasty search. Towards the end I had literally hundreds of common, but still very useful, words disallowed in the search just to limp by.

Being probably just 1%, or less, of their customer base puts us at a real disadvantage when it comes to solving issues like this.

It seems as though the development team over there is resistant to change these days. The smallest and silliest example I saw the other day was an argument on .com about the feasibility of adding a gtalk IM to user profiles. They still don't believe enough people use it to add it in. Maybe that's changed in 3.7.0, but I'm still on the 3.6 branch while I wait for the rest of the hacks we use to be ported.

I know this thread isn't for solicitation, but I know Teck's posts were removed as he was deemed to be "vending." I personally would be very interested in his packaged commercial solution if he actually develops it.

Sphinx saved my butt, and I'd like to ensure that I have the option for many years to come.

Spinball
05-02-2008, 10:43 PM
I echo what you guys say. Sphinx saved our board without getting at least one more server - maybe two. Disappointing that integrating it isn't high on the vB team's priority list.

scanlover
05-03-2008, 07:33 AM
Hi,

I managed to integrate sphinx 0.9.8 rc2 with my vbulletin 3.7.0 board quite easily thanks to this thread.

I followed the instructions here:

https://vborg.vbsupport.ru/showpost.php?p=1283359&postcount=387

However the attached config file there was deprecated, I've updated it and attached my version below. I note that in vbulletin, the dateline and lastpost fields are stored as int(10) and not as a timestamp, hence I changed the config file to mark them as "sql_attr_uint" instead.

Also, as my board and database are running entirely on UTF-8 with some posts in CJK, I have enabled CJK support in the attached config file as well. I hope it is of use to someone.

The sphinx.php files I found here seem to be partially buggy ($coventry? empty arrays?). I would appreciate it if someone could post an updated working version, especially for vB 3.7.

Lastly, I have converted my thread and post tables to InnoDB, since there is now no need for the MyISAM FULLTEXT indexes. Hopefully this will alleviate the table locking problems (no stats to show for it, though). Not sure if there are any repercussions in doing so :)

Background:
My forum is not exactly big, only about 300K posts, but it is running on a very modest dedicated server with only 1 GB of RAM. A typical MySQL fulltext search takes around 3-5 seconds, but my users tend to use the search gratuitously, and it got so bad that I had to disable post body search.

Now with sphinx each search takes around 0.1-0.2 seconds, so everything's cool again.

PSS
05-03-2008, 08:40 PM
Just wanted to say I really appreciate your contribution, scanlover. I'll try that code with my big board in couple of weeks, I have to set up a test system first for 3.7.

--------------- Added 1209851068 at 1209851068 ---------------

Well, I know Xorlev *was* working on compiling a 3.7-ready complete version. It also appeared that he had been making some progress with similar threads and other outlying issues in the current version we're all using.

Has anyone more info? If there is a Sphinx-to-Vb 3.7 (and any new version) integration package you can buy I'm interested. PM me if you can't write here.

mack101
05-04-2008, 03:08 AM
I think I have this working on a 36M post board running latest Sphinx and VB 3.7 :) Great job to everyone who has contributed to this, I was able to get it up and running in about 5 hours (3.5 hours to index and sort)

However, I'm quite new to the process, is there anything that has to be done on the VB admincp side of things, or is it left as is? I removed the full text indexes from MySQL, and it looks to have reverted to internal search. Do the file edits in search.php override this?

Based on the test results in the forum it looks very promising so far. Thanks for any advice you folks can give me.

scanlover
05-04-2008, 03:15 AM
I don't think vbulletin will switch to internal search just because you dropped the index.
As far as your vb is concerned, it still thinks the fulltext index is there. For example, I believe the 'similar threads' feature will try to use the fulltext index and so if you have that enabled you might get some errors. I'm not sure where else vB will try to use the fulltext index.

To change to local search you will need to use the 'Search Type' setting from admincp.

mack101
05-04-2008, 03:24 AM
Thanks. I think my mistake was removing the fulltext indices from VB admincp, which reverts the system to the VB internal engine. Whether it's actually using it, I can't say for certain.

I can't say I sound very sure of myself at this point :o

I guess what I need to confirm everything is working is a couple of sanity checks to validate the Sphinx setup and that vB is interfacing with it correctly.

amcd
05-04-2008, 01:36 PM
Once you get sphinx working, you will know immediately. Your searches will run in 1/100th of the time it used to take, and your database load will ease out a lot.

Make sure vb adminCP is set to fulltext search. If it is set to default search, then your forum will keep running the code to update the index tables. Once you are sure sphinx is working properly, just drop the fulltext indexes (from mysql, not from vb admincp).

ivanp
05-04-2008, 01:47 PM
Not really sure if indexes should be dropped.

I think other functions such as "Similar threads" depend on them. Should be checked.

--------------- Added 1209912502 at 1209912502 ---------------

Scanlover, many thanks for your detailed explanation!

I've just installed Sphinx and it seems to be working fantastic!

Kevlar
05-04-2008, 01:48 PM
Upgraded from 3.7 RC1 to 3.7 Gold. Things seem to be moving along fine.

mack101
05-05-2008, 06:05 AM
Running into an issue on my production server install... the indexes grew to 62GB, at which point they maxed out my disk space.

On the test server, using the same db, the indexes only used 39GB... What could be the cause of this?

ivanp
05-05-2008, 06:14 AM
My guess is you were using the --rotate switch, which temporarily creates a copy of the indexes. When it finishes, it replaces the existing indexes. This is normal behavior.

orban
05-05-2008, 07:10 AM
I will be releasing an updated version of the vBulletin Sphinx integration for 3.6 and 3.7 without file modification (two plugins, two file uploads), later today. Contains copy paste code from search.php though (about 300 lines) but I do think it's worth it and necessary. Currently working on a few small problems (similar threads, tags, thread prefixes).

If somebody has a 3.7 install and integrated these with Sphinx and can help me out, I'd be very grateful. In more detail I'd like to know how thread prefixes and tags are stored in the thread table (I haven't set up a 3.7 test environment yet).

amcd
05-05-2008, 08:30 AM
Welcome back, orban.

orban
05-05-2008, 10:33 AM
Hello amcd!

Anyway, slowly getting back in business, updated the first post in this thread... just sharing what I have at the moment. Maybe I can get some feedback

It's kind of messy but yeah

As usual if you break your server it ain't my fault

mack101
05-05-2008, 11:07 AM
My guess is you were using the --rotate switch, which temporarily creates a copy of the indexes. When it finishes, it replaces the existing indexes. This is normal behavior.

Actually it was just the 'indexer --config sphinx.conf --all' command, same as I had run before.

Going to try again today.

--------------- Added 1209993523 at 1209993523 ---------------

Sorry, I think I understand what is going on now.. I just don't have enough physical disk to generate the temp files and then the indexes. Doh!

I will move my indexes from the test site to the production server then run an update on that.
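
A quick headroom check before a full reindex might look like the Python sketch below. It is only a back-of-the-envelope estimate: the assumption that temporary files need roughly as much space again as the finished index (`temp_factor=1.0`) is a guess, not a documented Sphinx figure, and the function names are hypothetical. The 39 GB and 62 GB numbers come from the posts above.

```python
import shutil


def has_indexing_headroom(free_bytes, expected_index_bytes, temp_factor=1.0):
    """Return True if free space covers the index plus an assumed temp-file overhead."""
    required = expected_index_bytes * (1.0 + temp_factor)
    return free_bytes >= required


def free_bytes_at(path):
    """Free bytes on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


GB = 1024 ** 3

# Index size seen on the test server, and the space production ran out at.
print(has_indexing_headroom(62 * GB, 39 * GB))  # False under the assumed factor
print(has_indexing_headroom(80 * GB, 39 * GB))  # True under the assumed factor
print(has_indexing_headroom(free_bytes_at("."), 39 * GB))
```

Building the index on a machine with enough scratch space and copying the finished files over, as described above, avoids needing that headroom on the production disk.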

Simetrical
05-05-2008, 12:26 PM
Not really sure if indexes should be dropped.

I think other functions such as "Similar threads" depend on them. Should be checked.
If you don't drop the indexes, you'll lose a good part of the benefit of switching to Sphinx (although you'll keep most of it, it's true). The indexes are very large and keeping them up-to-date will add to server load unnecessarily, although not half as much as the locks taken out by conducting the actual searches.

scanlover
05-05-2008, 04:35 PM
Thanks Orban!

I just tried your version 0.1 on my 3.7.0 board.
Tag search seems OK. Personally, I don't use thread prefixes or similar threads on my board.
Other than that, all that you said should work seems to be working fine.

A few points:
1) if no results returned, I get some extra text at the end of the error message :)
2) should there be a way to set weights for title/postbody?
3) how can we confirm that the search cache is being used?

orban
05-05-2008, 04:55 PM
1) Change line 514 to "$errors[] = array('searchnoresults', '');"
2) Add "$cl->SetFieldWeights(array('title' => 100, 'pagetext' => 10));" after line 410 and change those values to your liking (see here http://sphinxsearch.com/doc.html#weighting)
3) Search cache? It's definitely not using vB built-in search caching because that one is horribly ++++ed up

Tag search is not yet enabled but luckily for us Sphinx supports multi-valued attributes (http://sphinxsearch.com/doc.html#mva) so we can store all tags for every thread. As soon as I get my hands on a 3.7 installation I can implement that.

Prefix search... I can't believe how ++++ing retarded vB is. Prefixes don't get an id, they are just stored as strings (there is no unique integer id in the table). Sphinx doesn't support string attributes. I'm a bit at a loss here. Possibly a very simple hashing function... but it'd have to be implemented in MySQL... what were they thinking
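
One way the "very simple hashing function" idea could look is CRC-32, sketched here in Python. MySQL has a built-in CRC32() function that returns an unsigned 32-bit value, and as far as I know it is the same standard CRC-32 that `zlib.crc32` computes, so the search side could hash the chosen prefix the same way the indexing query does. Treat that equivalence as an assumption to verify on real data. The function name and the sample prefix strings are hypothetical, and collisions between different prefixes are possible in principle.

```python
import zlib


def prefix_to_uint(prefix):
    """Map a prefix string to an unsigned 32-bit integer with CRC-32."""
    return zlib.crc32(prefix.encode("utf-8")) & 0xFFFFFFFF


# Hypothetical prefix strings; real boards define their own.
for prefix in ("bug", "faq", "news"):
    print(prefix, prefix_to_uint(prefix))
```

The resulting integer fits a Sphinx unsigned integer attribute, so it could be filtered on like any other numeric id.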

scanlover
05-05-2008, 06:06 PM
thanks again.. works great.

Actually vB tags are implemented at thread level, so I don't really see the need to implement sphinxsearch on that unless someone does a non-trivial hack to make tags operate at post level.

What will be great would be to add a new advanced search form that caters specially to sphinxsearch. The old advanced search form can work alongside it (it can be used for title-search only queries).

orban
05-05-2008, 09:13 PM
That's a bit beyond what I'm trying to accomplish here but if somebody wants to give it a go :)

NickCat
05-06-2008, 03:49 AM
Thanks for the great work orban to keep this going!!!

I just installed .1 and everything seems to be working great.

I am getting an error on the redirect page before the results pop though, nothing bad, just informational.

Warning: in_array() [function.in-array]: Wrong datatype for second argument in /search_sphinx.php on line 160

I am running vb 368p2.

I was trying to hold out upgrading to 3.6.10 until I'm sure they aren't going to .11 in a week! :) I've had vb long enough to know better!

--------------- Added 06 May 2008 at 01:48 ---------------

I guess I spoke too soon.

I upgraded to 3.6.10, to try to alleviate any other issues.

Having a few other problems... and I'm not sure what's going on.

1) For some reason my updates on the deltas aren't working. It runs, but never finds anything to index. And there are definitely new posts and threads to index.

The result looks like this:

Sphinx 0.9.8-rc2 (r1234)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/etc/sphinx.conf'...
indexing index 'postdelta'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.011 sec, 0.00 bytes/sec, 0.00 docs/sec
indexing index 'threaddelta'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.010 sec, 0.00 bytes/sec, 0.00 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=22708).

The command I'm running is:
indexer --config /etc/sphinx.conf --rotate postdelta threaddelta

Am I wrong in thinking the deltas are still called postdelta and threaddelta?
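
A delta that silently collects nothing is easy to miss when indexer runs from cron. As a rough sketch, assuming the output format pasted above (the function name empty_indexes is made up for illustration), the indexer output can be scanned for indexes that collected 0 docs:

```shell
# Sketch: list indexes for which indexer reported "collected 0 docs".
# Reads indexer output on stdin. It relies only on the
# "indexing index '...'" and "collected N docs" lines shown above.
empty_indexes() {
    awk -F"'" '
        /^indexing index /  { name = $2 }     # remember the current index name
        /^collected 0 docs/ { print name }    # report it if nothing was collected
    '
}
```

Possible usage: indexer --config /etc/sphinx.conf --rotate postdelta threaddelta | empty_indexes. An empty delta is not always an error (there may simply be no new posts), so this is only a hint to go and look at the sphinx_counter table and the delta source queries.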

2) I'm getting an occasional error when people are searching:
Database error in vBulletin 3.6.10:

Invalid SQL:

INSERT INTO search
(userid, titleonly, ipaddress, personal, query, searchuser, forumchoice, /*prefixchoice,*/ sortby, sortorder, searchtime, showposts, orderedids, dateline, searchterms, displayterms, searchhash, completed)
VALUES
(78408,
1,
'71.119.247.53',
1,
'events',
'',
'20,102,133',
'lastpost',
'DESC',
0.01146, 0,
'1507990,1202618,1497231,1424353,1425729,1370173,1327330,1327625,1275317,1280585,1053389,1247121,1220128,1173630,1187708,1191528,1161131,1114547,1078119,1024452,947834,887728,855358,853807,816227,783190,718582,649979,649714,644579,593661,565031,556141,539177,532216,501639,476210,459248,426142,424279,369567,339477,300242,269273,252339,204641,179446,128824,109786,19493,19998',
1210052178,
'a:24:{s:5:\"query\";s:6:\"events\";s:10:\"searchuser\";s:0:\"\";s:9:\"exactname\";i:1;s:11:\"starteronly\";i:0;s:11:\"forumchoice\";a:1:{i:0;s:2:\"20\";}s:11:\"childforums\";i:1;s:9:\"titleonly\";i:1;s:9:\"showposts\";i:0;s:10:\"searchdate\";s:1:\"0\";s:11:\"beforeafter\";s:5:\"after\";s:6:\"sortby\";s:8:\"lastpost\";s:9:\"sortorder\";s:10:\"descending\";s:9:\"replyless\";i:0;s:10:\"replylimit\";i:0;s:14:\"searchthreadid\";i:0;s:9:\"saveprefs\";i:1;s:11:\"quicksearch\";i:0;s:10:\"searchtype\";i:0;s:7:\"exclude\";s:0:\"\";s:7:\"nocache\";i:0;s:4:\"ajax\";i:0;s:9:\"imagehash\";s:0:\"\";s:10:\"imagestamp\";s:0:\"\";s:6:\"userid\";i:0;}',
'a:8:{s:5:\"words\";a:1:{i:0;s:6:\"events\";}s:9:\"highlight\";a:1:{i:0;s:6:\"events\";}s:6:\"common\";a:0:{}s:5:\"users\";a:0:{}s:6:\"forums\";a:1:{i:20;i:20;}s:8:\"prefixes\";a:0:{}s:3:\"tag\";s:0:\"\";s:7:\"options\";a:3:{s:11:\"starteronly\";i:0;s:11:\"childforums\";i:1;s:6:\"action\";s:7:\"process\";}}',
'0b537c34de784ac889ce7bca2e26adeb',
1)
### SAVE ORDERED IDS TO SEARCH CACHE ###;

MySQL Error : Duplicate entry '0b537c34de784ac889ce7bca2e26adeb-lastpost-DESC' for key 2
Error Number : 1062
Date : Tuesday, May 6th 2008 @ 01:36:18 AM
Script : http://forums.nasioc.com/forums/search.php?do=process
Referrer : http://forums.nasioc.com/forums/search.php?do=process
IP Address : 71.119.247.53
Username : psychoskip
Classname : vB_Database
Any ideas... cause I'm lost here...

The database error was happening on both 3.6.8 and 3.6.10. The search is still working most of the time, but this seems to be having an issue thinking it's inserting a duplicate key into the table.

UPDATE: I reverted back to the original sphinx setup I was using for now. A couple of things I will have to say I noticed immediately between the two versions. The older version took significantly longer to index and took up more drive space, but the results seem to return much faster.

orban
05-06-2008, 08:56 AM
I am getting an error on the redirect page before the results pop though, nothing bad, just informational.

Wrap "if (is_array($vbulletin->GPC['prefixchoice'])) { .... }" around that block

1) For some reason my updates on the deltas aren't working. It runs, but never finds anything to index. And there are definitely new posts and threads to index.

Are you sure the sphinx_counter table is set up properly?

2) I'm getting an occasional error when people are searching:

Change the line with "$searchhash" to "$searchhash = md5(microtime());"

Try if that works

A couple of things I will have to say I noticed immediately between the two versions. The older version took significantly longer to index and took up more drive space, but the results seem to return much faster.

The newer Sphinx versions are much more complex... how large are your indices? I haven't noticed anything...

NickCat
05-06-2008, 03:11 PM
Wrap "if (is_array($vbulletin->GPC['prefixchoice'])) { .... }" around that block



Are you sure the sphinx_counter table is set up properly?



Change the line with "$searchhash" to "$searchhash = md5(microtime());"

Try if that works



The newer Sphinx versions are much more complex... how large are your indices? I haven't noticed anything...

I'm going to retry the upgrade this weekend.

I will make those edits and see what we get for changes as far as the errors.

The sphinx_counter was working properly, it was updating both counters each time I ran the indexer, it just wasn't adding any info to the deltas.

As far as the speed, my board is 21+ million posts. I could be mistaken, but the size difference seemed rather significant between the two setups.

My indices right now with the old version are as follows:
-rw-r--r-- 1 root root 6804366226 May 6 02:27 fulltext.spd
-rw-r--r-- 1 root root 276 May 6 02:27 fulltext.sph
-rw-r--r-- 1 root root 25592581 May 6 02:27 fulltext.spi
-rw------- 1 root root 0 May 6 02:28 fulltext.spl
-rw-r--r-- 1 root root 0 May 6 02:12 fulltext.spm
-rw-r--r-- 1 root root 1960560103 May 6 02:27 fulltext.spp
-rw-r--r-- 1 root root 0 May 6 12:05 fulltextdelta.spa
-rw-r--r-- 1 root root 1092441 May 6 12:05 fulltextdelta.spd
-rw-r--r-- 1 root root 276 May 6 12:05 fulltextdelta.sph
-rw-r--r-- 1 root root 88374 May 6 12:05 fulltextdelta.spi
-rw------- 1 root root 0 May 6 12:05 fulltextdelta.spl
-rw-r--r-- 1 root root 0 May 6 12:05 fulltextdelta.spm
-rw-r--r-- 1 root root 320815 May 6 12:05 fulltextdelta.spp
-rw-r--r-- 1 root root 0 May 6 02:28 thread.spa
-rw-r--r-- 1 root root 82741228 May 6 02:28 thread.spd
-rw-r--r-- 1 root root 221 May 6 02:28 thread.sph
-rw-r--r-- 1 root root 768559 May 6 02:28 thread.spi
-rw------- 1 root root 0 May 6 02:28 thread.spl
-rw-r--r-- 1 root root 0 May 6 02:28 thread.spm
-rw-r--r-- 1 root root 10011490 May 6 02:28 thread.spp
-rw-r--r-- 1 root root 0 May 6 12:05 threaddelta.spa
-rw-r--r-- 1 root root 9361 May 6 12:05 threaddelta.spd
-rw-r--r-- 1 root root 221 May 6 12:05 threaddelta.sph
-rw-r--r-- 1 root root 3139 May 6 12:05 threaddelta.spi
-rw------- 1 root root 0 May 6 12:05 threaddelta.spl
-rw-r--r-- 1 root root 0 May 6 12:05 threaddelta.spm
-rw-r--r-- 1 root root 1298 May 6 12:05 threaddelta.spp

I will report back with the size of the new ones once I get a chance. It takes about 30-45 minutes for me to reindex the entire site. I used your new config as well, making the necessary edits, when I ran the setup, so that obviously changed the name of some things, but postdelta and threaddelta remained the same. The speed difference may have just been perception though, since your script runs through the redirect and the search_sphinx.php I am currently using skips the redirect entirely.
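
To compare the old and new setups, it may help to total the bytes per index rather than eyeballing individual files. A small sketch, assuming the usual ls -l layout shown in the listing above (size in the fifth column, file name last); the function name index_sizes is just for illustration:

```shell
# Sketch: total bytes per index from `ls -l` output on stdin.
# Assumes the standard long-listing layout: size in field 5, name in the
# last field. Files are grouped by name with the extension stripped, so
# fulltext.spd, fulltext.spi, ... are all counted under "fulltext".
index_sizes() {
    awk '
        NF >= 9 {
            name = $NF
            sub(/\.[^.]*$/, "", name)      # drop .spd, .spi, .spp, ...
            total[name] += $5
        }
        END {
            # %.0f avoids integer overflow in awks with 32-bit %d
            for (n in total) printf "%s %.0f\n", n, total[n]
        }
    ' | sort
}
```

Possible usage: ls -l /path/to/sphinx/data | index_sizes, where the data directory is whatever the path settings in sphinx.conf point at. Running it against both setups gives one number per index to compare.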

eoc_Jason
05-07-2008, 01:28 PM
Hi Orban! Glad you are back.

I too plan on tinkering extensively with sphinx once I get my site upgraded to 3.7. I literally could not run my forum search if it wasn't for sphinx. Also I want to tinker with the vB search code itself because I hate how they filter legit results and can sometimes even give "no results found" when in fact results are returned (before being processed through their miles of code that does god knows what for god knows why)...

Yes, people that use the similar threads option, I know vB 3.6 will look at the fulltext index when creating the similar threads so you have some code to modify to redirect it to sphinx. I would assume 3.7 would do the same. If you drop your fulltext indexes you will probably get an error when creating a new thread.

I have to disagree with the person who posted above saying that large vB sites are a small percentage. I think there are a lot more than you realize, but people attack the search problem in different ways. I've seen some people use lots of slave servers with some serious hardware to try and alleviate the problem. Some people just disable their search entirely... or use google... I think someone might have hacked together dtsearch too... But I have to agree that Sphinx is the best (and fastest) choice...

I was really hoping that the vB team would implement some ability to use Sphinx with 3.7, but it seems like that request has gone unanswered...

The biggest problem I see when people do a search (not using sphinx) is that if it doesn't return results within a matter of seconds, they start clicking again, and again, and again... which queues up the same search over and over in mysql... It's transparent to the other members too, until someone posts a new message, at which point it locks the whole table and anyone just wanting to read another thread has to wait... and they start clicking refresh over and over, which again starts sending more and more requests to mysql... Eventually the server runs out of memory and things go ape... A serious problem indeed...

Anyhow, I would love to help out and contribute what I can once I get iTrader coded for 3.7 and those people off my back. I *think* I posted some cron scripts, log rotate, initd script, and other stuff on one of the previous pages. If I didn't and someone would like them let me know and I'll post what I have.


I have a cron script that updates the delta every 15min (you can change it to whatever) and once a day (like 5am) it rebuilds the whole indexes.
I also have an initd script (I run redhat) for starting & stopping sphinx. It creates a pid file and all that jazz... It's just a basic script but it works...
Along with the above initd script I wrote a logrotate script, since I keep logs for the searches & sphinx output.
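
For reference, the schedule described above (deltas every 15 minutes, full rebuild once a day at 5am) could look roughly like the crontab entries printed by this sketch. The indexer path, the config path and the delta index names postdelta/threaddelta are assumptions taken from elsewhere in this thread, not from the actual scripts:

```shell
# Sketch: print crontab entries for "deltas every 15 minutes,
# full rebuild daily at 05:00".
# $1 = path to indexer (assumed default), $2 = path to sphinx.conf (assumed default).
sphinx_crontab() {
    indexer=${1:-/usr/local/bin/indexer}
    conf=${2:-/etc/sphinx.conf}
    cat <<EOF
*/15 * * * * $indexer --config $conf --rotate postdelta threaddelta
0 5 * * * $indexer --config $conf --rotate --all
EOF
}
```

Possible usage: sphinx_crontab /usr/local/bin/indexer /etc/sphinx.conf, then paste the output into crontab -e. Note that the two jobs coincide at 05:00, so some kind of lock around indexer is probably a good idea.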

ivanp
05-10-2008, 09:00 AM
In vB 3.7.0 & vB Blog 1.0.4 the following tables have FULLTEXT indices:

post -> FULLTEXT KEY title (title, pagetext);
thread -> FULLTEXT KEY title (title);
blog -> FULLTEXT KEY title (title);
blog_text -> FULLTEXT KEY title (title,pagetext);
socialgroup -> FULLTEXT KEY name (name, description);

The idea is to find a way to use Sphinx to completely replace vB FULLTEXT queries and to drop all these indices.

And then to switch to InnoDB tables, to prevent the rest of the locking queries.

--------------- Added 1210415448 at 1210415448 ---------------

Here is my setup... We should bundle a similar script in the package.


Cron jobs:

Every week: /usr/sphinx/indexer.sh --all
Every hour: /usr/sphinx/indexer.sh thread
Every 5 minutes: /usr/sphinx/indexer.sh postdelta threaddelta


indexer.sh:


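#!/bin/sh

LOCKFILE=/var/lock/sphinx.cron.lock
INDEXER_CONF=/usr/sphinx/etc/sphinx.conf
INDEXER_BIN=/usr/sphinx/bin
INDEXER_LOG=/var/log/sphinx/indexer.log

# Skip this run if a previous indexer run still holds the lock.
[ -f "$LOCKFILE" ] && exit 0

# Remove the lock when the script exits, without forcing an exit code.
trap 'rm -f "$LOCKFILE"' EXIT

touch "$LOCKFILE"

# Pass all arguments (index names or --all) straight through to indexer.
"$INDEXER_BIN/indexer" --config "$INDEXER_CONF" --rotate "$@" >> "$INDEXER_LOG"

exit 0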

TechGuy
05-10-2008, 01:55 PM
I have installed Sphinx with the information provided here, however, now my unanswered threads search is showing answered threads. It's almost as if the SetFilterRange isn't actually doing anything. I've placed comments in the code to make sure it is getting to the point of setting the filter range, however, it still reveals posts that have more than 0 replies.

The url used to build the search is search.php?do=process&replyless=1&replylimit=0&searchdate=30&beforeafter=after&exclude=54

Jase2
05-10-2008, 07:35 PM
Hello,

Sorry if I seem to be missing something already said, but what exactly is this and what does it do ?

orban
05-10-2008, 09:42 PM
I have installed Sphinx with the information provided here, however, now my unanswered threads search is showing answered threads. It's almost as if the SetFilterRange isn't actually doing anything. I've placed comments in the code to make sure it is getting to the point of setting the filter range, however, it still reveals posts that have more than 0 replies.

The url used to build the search is search.php?do=process&replyless=1&replylimit=0&searchdate=30&beforeafter=after&exclude=54

If you are using the script posted in post 1, I wrote there it only works when searching threads, because Sphinx doesn't have replycount integer stored for posts. I don't think I'm checking for "replyless", must have missed that... maybe if you add a check for that variable and force the thread index it might work :)

TechGuy
05-10-2008, 11:22 PM
Orban,

Thanks for the quick reply. I'm not sure what you mean by check for the variable and force the thread index.

The search_sphinx.php has the following code which appears to check for replyless:
if ($vbulletin->GPC['titleonly'] and ($vbulletin->GPC['replyless'] OR $vbulletin->GPC['replylimit'] > 0))

if ($vbulletin->GPC['replyless'] == 1)
$cl->SetFilterRange('replycount', 0, $vbulletin->GPC['replylimit']);
else
$cl->SetFilterRange('replycount', $vbulletin->GPC['replylimit'], 9999999999);

And replycount is in the postsrc in sphinx.conf

orban
05-11-2008, 07:25 AM
You added replycount to your postsrc?

You do realize that this is really buggy, because when a thread with 0 replies is in the main post index, it will show up as having 0 replies until you update the main index?

This might be why you're getting answered threads...

You can try to uncomment the lines starting with # ("admin mode" and the two print_r()s) and try to figure out where it's going wrong: whether the replycount range is set, and whether all returned post/thread ids actually have replycount 0 or not.

ivanp
05-11-2008, 08:29 AM
In vB v3.7.0 fulltext search is used in:

- search.php: there are a bunch of hooks we can use. It would be a much better solution than patching search.php itself.

- includes/functions_search.php: in function fetch_similar_threads, there is a hook 'search_similarthreads_fulltext'.

vB Blog v1.0.5:

- includes/class_blog_search.php: no hooks

Jase2
05-11-2008, 09:26 AM
Anyone answer my previous question ? :)

ivanp
05-11-2008, 09:35 AM
Anyone answer my previous question ? :)

We are using Sphinx to replace slow vBulletin internal & fulltext search.

TechGuy
05-11-2008, 10:30 PM
Has any progress been made on getting the advanced search options working?

Jase2
05-14-2008, 02:25 PM
So, it replaces the default vBulletin search ?

Deriel
05-14-2008, 03:16 PM
So, it replaces the default vBulletin search ?

Yes, exactly.

Somewhere, above 1 million posts, the vB search and the Fulltext search became both too slow, too server intensive. Solution: Sphinx.

Jah-Hools
05-15-2008, 07:34 PM
I echo what you guys say. Sphinx saved our board without getting at least one more server - maybe two. Disappointing that integrating it isn't high on the vB team's priority list.

Wotcher Spinbal,

A quick question if I may.

I see you are using Google Custom search on your site..

So are you using Sphinx just for the VB Forum search?

Are most members using Google Custom as the default search? And only some using the VB search?

Thanks M8,

See - at 2 million posts search speed isnt a problem for me (so far) - what IS a problem is the forum members deep, deep unhappiness with the way the VB search works.. Google is the best way to search my forum (it handles 2 character content way better) so I am about to pull the trigger on Google Custom Business edition.. I wanted to make a last check to see if Sphinx would do the job better.. But all I can glean from these posts is 'use sphinx if VB search is slow' I don't see any raves about improved user functionality.. Does Sphinx simply improve speed for big forums and reduce the extra machines in a cluster of servers?

Its search functionality I am after.. Does Sphinx add functionality to the forum users experience - or just add speed?

Many thanks..

kmike
05-16-2008, 04:06 AM
Yes, Sphinx produces much more relevant search results than the default vB search engines, either builtin or MySQL fulltext. It takes into account keyword proximity when calculating relevance.
I'm not sure though how it compares to Google.

Deriel
05-16-2008, 10:49 AM
I think that Sphinx works better than Google CSE. You have total control over Sphinx but not over the content indexed by Google nor the time that Google takes to index new posts.

Jase2
05-16-2008, 02:05 PM
Can you still use Sphinx for a pretty small forum? I'm not a big fan of the default hehe.

Spinball
05-30-2008, 12:30 PM
Wotcher Spinbal,

A quick question if I may.

I see you are using Google Custom search on your site..

So are you using Sphinx just for the VB Forum search?

Are most members using Google Custom as the default search? And only some using the VB search?

Thanks M8,

See - at 2 million posts search speed isnt a problem for me (so far) - what IS a problem is the forum members deep, deep unhappiness with the way the VB search works.. Google is the best way to search my forum (it handles 2 character content way better) so I am about to pull the trigger on Google Custom Business edition.. I wanted to make a last check to see if Sphinx would do the job better.. But all I can glean from these posts is 'use sphinx if VB search is slow' I don't see any raves about improved user functionality.. Does Sphinx simply improve speed for big forums and reduce the extra machines in a cluster of servers?

Its search functionality I am after.. Does Sphinx add functionality to the forum users experience - or just add speed?

Many thanks..
I don't have details of how many people use the Google search or how. But it's a useful alternative to the Sphinx search. We can only use Sphinx for the vB forums search. I don't have the time to look at learning how Sphinx works unfortunately.

eoc_Jason
05-31-2008, 09:17 PM
See - at 2 million posts search speed isnt a problem for me (so far) - what IS a problem is the forum members deep, deep unhappiness with the way the VB search works.. Google is the best way to search my forum (it handles 2 character content way better) so I am about to pull the trigger on Google Custom Business edition.. I wanted to make a last check to see if Sphinx would do the job better.. But all I can glean from these posts is 'use sphinx if VB search is slow' I don't see any raves about improved user functionality.. Does Sphinx simply improve speed for big forums and reduce the extra machines in a cluster of servers?

Its search functionality I am after.. Does Sphinx add functionality to the forum users experience - or just add speed?

One big problem with the stock vB code is that it likes to do a lot of post-filtering of the returned results. It happened way too often (to me and others) that searches where I know a keyword should appear (because I put the word in a specific place to do the test) vB will return "No results found" because it drops the score so low that it rejects it, even if it was the ONLY result...

Sphinx has its own scoring system, and depending on how you set it up it can be extremely powerful. But if you want to gain the full benefit you will probably also need to peel through the search.php file and rip out some of the vB scoring so that it doesn't remove relevant results.

Also with sphinx it will return results in the same thread (or post) like view. So having results returned by date or a certain user or whatever is a lot more powerful than what google could probably do.

rebelde
06-02-2008, 04:24 PM
I was desperate to get tag search working, so I've worked it out myself. I used Orban's 0.1 "plug-in" version as a base, with Sphinx-0.9.8-rc2 on vBulletin 3.7.1. Only small changes were necessary to support tags.

I don't use prefixes in my forum, so the prefix implementation is incomplete. I just added the prefixes in the full-text index so they are searchable just like any other text in a thread or post. I would be happy to try to walk somebody (with basic PHP skills) through what I think is necessary to get it to work properly.

I have attached my versions of Orban's files. Here are the details of my changes:

In Orban's search_sphinx.php,

after:
#$vbulletin->GPC['prefixchoice']

add:
if (!empty($verified_tag))
    $cl->SetFilter('tagid', array($verified_tag['tagid']));


This isn't required, but I prefer 'extended' mode so people can use quotes in searches to find phrases:

Replace:
if (!$vbulletin->config['Sphinx']['mode']) $vbulletin->config['Sphinx']['mode'] = 'SPH_MATCH_ALL';
With:
if (!$vbulletin->config['Sphinx']['mode']) $vbulletin->config['Sphinx']['mode'] = 'SPH_MATCH_EXTENDED';



In sphinx.conf, I have made a number of changes.
- added tags and prefixes to the indexes
- changed post.visible=1 to post.visible > 0 so moderators can see deleted posts (bug fix)
- use a minimum word length of 1 (instead of 4). Sphinx can handle it without problem.


(Please excuse my new username. I haven't posted in this thread before, but I have been a member of vB.org for four years under a different username.)

And finally, thank you Orban! If you are around, turn on private messaging and I'll help however I can.

orban
06-03-2008, 07:57 AM
Thanks rebelde, adding prefixes to the full text index is a great idea, I don't know why I haven't considered this. Also re-enabled PMs ;)

ivanp
06-04-2008, 09:21 AM
Nice job orban and rebelde! It works great.

I think this code in search_sphinx.php:
if (!$vbulletin->config['Sphinx']['host']) $vbulletin->config['Sphinx']['host'] = 'localhost';

should be changed to:
if (!$vbulletin->config['Sphinx']['host']) $vbulletin->config['Sphinx']['host'] = $vbulletin->config['MasterServer']['servername'];


Two questions:
1) Is there a fix to make wildcards work?
2) Is there a transliteration support for Sphinx?

rebelde
06-04-2008, 12:03 PM
I'm glad it is working for you.
1) Is there a fix to make wildcards work?
From what I can tell from the Sphinx docs, you need to add:
min_prefix_len = 3
prefix_fields = title (maybe more?)
enable_star = 1

I'm not sure where you add these lines, though. I would think that they go in:
index thread : post - first two
index threaddelta : post - first two
searchd - enable_star = 1

Let us know if you find out where they go, and if it slows searches much.

Transliteration? I'm not sure. Check the Sphinx docs.

ivanp
06-05-2008, 10:31 AM
/search.php?do=finduser&u=... is not working properly.

It returns either:

Sorry - no matches. Please try some different terms. %1$s

or returns very old results.

--------------- Added 1212666226 at 1212666226 ---------------

From what I can tell from the Sphinx docs, you need to add:
min_prefix_len = 3
prefix_fields = title (maybe more?)
enable_star = 1

I am afraid of this:

http://www.sphinxsearch.com/doc.html
However, indexing prefixes will make the index grow significantly (because of many more indexed keywords), and will degrade both indexing and searching times.

Waiting to hear feedback if moderators really need this feature.

BillP
06-06-2008, 04:00 AM
We are running the new plugin with 3.7.1

When one clicks on a username and chooses "Find More Posts By <user>", the following warning appears on the "Your search is in progress" page:

Warning: in_array() [function.in-array]: Wrong datatype for second argument in [path]/search_sphinx.php on line 160

Then it refreshes, and returns numerous errors followed by the results:

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Any ideas on how to fix this?

rebelde
06-06-2008, 01:56 PM
When one clicks on a username and chooses "Find More Posts By <user>"

/search.php?do=finduser&u=... is not working properly.

I tested it, and have no trouble at all with these searches.
/search.php?do=finduser&u=1
/search.php?do=finduser&u=1&starteronly=1
It works properly and shows the threads in descending order for me.

BillP
06-06-2008, 02:53 PM
I tested it, and have no trouble at all with these searches.
/search.php?do=finduser&u=1
/search.php?do=finduser&u=1&starteronly=1
It works properly and shows the threads in descending order for me.


Did you use the search_sphinx.php directly from the download on the first post of this thread, or have you edited it with fixes that orban (or others) have suggested along the way?

We aren't running any other hacks in the system. These ARE warnings, maybe it's just the warning vs. error setting in PHP and these can be ignored?

DaiTengu
06-06-2008, 09:38 PM
Turning off warnings is a cosmetic fix, but the underlying issues still exist.


Did you update your sphinx.conf with the new settings suggested here: https://vborg.vbsupport.ru/showpost.php?p=1506564&postcount=541 ?

BillP
06-06-2008, 09:47 PM
I know! I'm just trying to figure out why my (unmodded) site is throwing warnings, and others are not! I've turned off error reporting for now (only in search.php), but don't want to do it forever.

I didn't modify anything, I used orban's main scripts. I will take a look at the others.

ivanp
06-09-2008, 08:03 AM
Should AdminCP/Message Searching Options/Search Result Sharing be turned on?

Why is
REPLACE INTO " . TABLE_PREFIX . "search
changed to
INSERT INTO " . TABLE_PREFIX . "search

I often get a 'duplicate entry' MySQL error in the database log.

--------------- Added 1213002737 at 1213002737 ---------------

Still having finduser issues.

I found that when a user has a space in their name, it doesn't return any results.

ivanp
06-10-2008, 11:00 AM
There is definitely some problem with finduser on vB 3.7.0.

With the Sphinx hooks disabled, everything works fine.

With them enabled, it returns a significantly smaller result set, completely skipping posts dating Jan-Jun 2008!?

Cyn
06-27-2008, 02:38 PM
We are having chronic search issues. We only have 11 million posts but this seems to be a resolution for us.

A couple of questions :

1 - Will the "New Posts" link still produce new posts results based on a member's last visit?

2 - What about searches of private forums? How does Sphinx handle these? We would not want our members and moderators to lose search functions for forums they have access to as a usergroup.

Thanks!

rebelde
06-27-2008, 02:43 PM
Should AdminCP/Message Searching Options/Search Result Sharing be turned on?

Why is
REPLACE INTO " . TABLE_PREFIX . "search
changed to
INSERT INTO " . TABLE_PREFIX . "search

I often get a 'duplicate entry' MySQL error in the database log.

I changed that to "REPLACE" and the database errors have stopped. I'm not sure why it was changed. It doesn't make any sense to me.

Cyn,
All of Sphinx's search results are still filtered through the user permissions, so nobody sees anything that he/she is not supposed to.

Cyn
06-27-2008, 02:53 PM
Thanks rebelde.

I assume New Posts works as usual?

rebelde
06-27-2008, 03:31 PM
I assume New Posts works as usual?

Yes. You would have seen something somewhere in this thread if it didn't work.

Cyn
06-27-2008, 03:46 PM
Excellent. Thanks!

ALanJay
06-30-2008, 03:59 PM
Thanks rebelde, adding prefixes to the full text index is a great idea, I don't know why I haven't considered this. Also re-enabled PMs ;)

Thanks Orban for the update - great work.

Do the current uploaded files include the various suggested modifications that have been made or will I still need to make them?

wtrk
07-08-2008, 01:45 AM
i have a question, can sphinx also search other databases and show the results in vbulletin? like for example can i consolidate a search of vbulletin, photopost and reviewpost into a single result displayed in vbulletin with sphinx?

amcd
07-08-2008, 04:29 AM
i have a question, can sphinx also search other databases and show the results in vbulletin? like for example can i consolidate a search of vbulletin, photopost and reviewpost into a single result displayed in vbulletin with sphinx?
Since sphinx is a general purpose search tool, I suppose it could be done, but not with the code posted in this thread. It would require major modification of search.php and related modules and templates.

wtrk
07-08-2008, 06:12 PM
Since sphinx is a general purpose search tool, I suppose it could be done, but not with the code posted in this thread. It would require major modification of search.php and related modules and templates.

i understand that that's way beyond the scope of this thread. i just wanted to know if it was possible. thanks.

RedWingFan
07-11-2008, 09:04 PM
I'm somewhat geeked: I have Sphinx running on my test forum! Finally got indexer and searchd to run and load. NO errors from what I can tell--this has to be a first! :D

I did the changes to vB 3.7 (uploaded search_sphinx.php, sphinxapi.php, added the two plugins), but I can't tell if I'm using Sphinx or not. Is there any way I can tell? Searches were fast to begin with on this smaller forum, so time is not a good benchmark here. Any other way to tell? I did enter a new post with the word "privileges" in it, and I see it did not show up in Sphinx (using search at the command line) after I ran the "indexer --all" command for a second time.

I just want to see IF it is working before I test HOW it is working. It seems like it went too easily for it to be working properly... :D

RedWingFan
07-13-2008, 03:08 PM
It's working. If I kill searchd, I get an error that it can't connect to "localhost". I now have cron running as well, and new posts are indexed within three minutes.
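
When repeating this kind of test, it helps to confirm whether searchd is actually up before and after. A minimal sketch, assuming searchd writes its pid to a file (the pid_file setting in the searchd section of sphinx.conf); the function name and the example path are made up for illustration:

```shell
# Sketch: succeed if the process recorded in the given pid file is alive.
# $1 = path to the searchd pid file (an assumption; use whatever your
# sphinx.conf pid_file setting points at).
searchd_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1          # no pid file: treat as not running
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

Possible usage: searchd_running /var/run/searchd.pid && echo "searchd is up". If searchd is down and the forum search still returns results without a connection error, the search is most likely not going through Sphinx.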

Speed is actually slightly slower for Sphinx on this smaller test forum, but we don't have that many posts. I may bite the bullet and try this on the main forum soon.

RedWingFan
07-15-2008, 03:39 PM
We are running the new plugin with 3.7.1

When one clicks on a username and chooses "Find More Posts By <user>", the following warning appears on the "Your search is in progress" page:


Then it refreshes, and returns numerous errors followed by the results:

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Any ideas on how to fix this?

We just had this same error. I can't duplicate it myself, but another user mentioned that he was doing an advanced search for "Threads started by user" and got multiple instances of this error. I tried the same search (no other options chosen), and it worked OK. Seems random...?

eoc_Jason
07-15-2008, 05:06 PM
i have a question, can sphinx also search other databases and show the results in vbulletin? like for example can i consolidate a search of vbulletin, photopost and reviewpost into a single result displayed in vbulletin with sphinx?

There's no reason you couldn't... But it would be custom coding for you... You would have to create extra sources to index first. Then deal with how you are going to search them.

P.S. Sphinx 0.9.8 Gold (or whatever you want to call the final) has been released today!

RedWingFan
07-16-2008, 01:09 PM
We just had this same error. I can't duplicate it myself, but another user mentioned that he was doing an advanced search for "Threads started by user" and got multiple instances of this error. I tried the same search (no other options chosen), and it worked OK. Seems random...?

Not random. If I choose "View Results as Posts", I get the errors. If I view results as threads, no errors.

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

The proper search results do display below, however...

mute
07-17-2008, 02:44 PM
I just upgraded to 0.9.8, but am running some incredibly old search.php code (on 3.6). So far so good. I had to do a major reorg of my sphinx.conf, but I couldn't resist some of the new options (unlink_old is nice because this box has limited diskspace).

I'm curious if anyone has a definitive status report on what is and is not working properly on 3.7. I see orban has updated the first post recently, but I wasn't sure if all of the quirks that are listed were still applicable.

We're not ready to upgrade to 3.7 yet, but I'd really like to have "Find all posts/threads by user" hitting Sphinx when we do, those queries are sort of brutal for us.

Also, does anyone have a decent guide on "How to search" laying around? Our users seem to have trouble figuring out search operands, searching for phrases, etc. I was hoping someone who uses sphinx might have a FAQ entry done up that explains what you can and can't do with this setup.

Simetrical
07-18-2008, 04:59 PM
We're not ready to upgrade to 3.7 yet, but I'd really like to have "Find all posts/threads by user" hitting Sphinx when we do, those queries are sort of brutal for us.
If it only searches for posts by the user and nothing else (no search terms added), you should just add an index on (userid, dateline) to your post table. Then it will be as fast as fetching the posts for a thread. You don't need an external engine like Sphinx to do this fast. The situation with threads is similar, (postuserid, dateline). Use Sphinx for fulltext searching; MySQL proper is still just fine for searches without any text aspect.
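
A minimal sketch of the composite-index idea, using Python's built-in sqlite3 in place of MySQL purely so that it runs anywhere. The table is a cut-down stand-in for vBulletin's post table, and the index name idx_post_userid_dateline is made up for illustration; the query planner output is SQLite's, not MySQL's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cut-down stand-in for the post table (assumption: only the columns needed here).
conn.execute(
    "CREATE TABLE post ("
    "postid INTEGER PRIMARY KEY, userid INTEGER, dateline INTEGER, pagetext TEXT)"
)
conn.executemany(
    "INSERT INTO post (userid, dateline, pagetext) VALUES (?, ?, ?)",
    [(i % 50, 1200000000 + i, "text") for i in range(5000)],
)

# Composite index: filter column first, sort column second.
conn.execute("CREATE INDEX idx_post_userid_dateline ON post (userid, dateline)")

query = "SELECT postid FROM post WHERE userid = ? ORDER BY dateline DESC LIMIT 25"

# Ask the planner how it would run the query, then run it.
plan = " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (7,)))
recent = [row[0] for row in conn.execute(query, (7,))]

print(plan)
print(recent[:5])
```

Because userid comes first and dateline second, the engine can jump to one user's entries and read them already sorted by date, so no separate sort step is needed. The same reasoning applies to (postuserid, dateline) on the thread table.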

mute
07-18-2008, 07:38 PM
If it only searches for posts by the user and nothing else (no search terms added), you should just add an index on (userid, dateline) to your post table. Then it will be as fast as fetching the posts for a thread. You don't need an external engine like Sphinx to do this fast. The situation with threads is similar, (postuserid, dateline). Use Sphinx for fulltext searching; MySQL proper is still just fine for searches without any text aspect.

We have 47 million posts, and 2.6 million threads. Trust me, it puts a hurting even on our 8-way Opteron box with 12gb of ram.

snakes1100
07-18-2008, 11:19 PM
Confirmed as working on vBulletin 3.7.2 Patch Level 1

cron:
postdelta - 5min
threaddelta - 5min
reindexed nightly

I seem to have run into one issue, though: only 500 results were returned when searching for posts by a user with 15k posts. I found a hard limit of 500 set in sphinx_search and raised it to 20000 to match everything else.

Oddly enough, it then returned only 1000 results, which still isn't right, as there is no other limit set to 1000. I will have to check it out some more.

A full index and restart didn't help.

Note: I found the 1000 hard limit set in sphinxapi.php.

A new search now returned about 7100 posts by that user, missing about 8k posts. Back to the drawing board.

Showing results 1 to 25 of 7074
Search took 0.52 seconds.

6 million post board w/about 2500 online.

Add-on:
It seems I may have found a bug somewhere. I'll post part of the error; if orban or anyone wants the full error message, I have it saved. A search for a keyword that has only about 3500 results produced a duplicate key entry and a database error page. After that, the browser reported a "script still running" error and wouldn't let me refresh the page. Firefox stopped the script, and a refresh then cleared the database error page. Doing the search again for the same keyword gave a good search and no error.


Invalid SQL:

INSERT INTO search
(userid, titleonly, ipaddress, personal, query, searchuser, forumchoice, /*prefixchoice,*/ sortby, sortorder, searchtime, showposts, orderedids, dateline, searchterms, displayterms, searchhash, completed)
VALUES
(641509,
0,
'',
1,
'bollywood',
'',
'',
'lastpost',
'DESC',
0.10879, 0,

### SAVE ORDERED IDS TO SEARCH CACHE ###;

MySQL Error : Duplicate entry '0089875ee33efd9d611cfff6ca28d9e4-lastpost-DESC' for key 2
Error Number : 1062

HFB
07-27-2008, 02:02 PM
It appears that Sphinx will be included in vB 4.0
http://www.vbulletin.com/forum/showpost.php?p=1588292&postcount=7

RedWingFan
07-28-2008, 03:41 AM
Not random. If I choose "View Results as Posts", I get the errors. If I view results as threads, no errors.

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

The proper search results do display below, however...

More info, if anyone cares to debug this. For now, I just put error_reporting(0); as the first line in includes/functions_search.php to kill the error message. It doesn't cure the problem, but at least removes a small amount of end-user panic. :D

Here is how to duplicate it:

1) Use no search terms
2) Enter a valid username
3) Choose "Threads started by user", and check Exact Name
4) View results as posts

The function's affected lines are here (strpos is the affected line):

// ###################### Start process_quote_removal #######################
function process_quote_removal($text, $cancelwords)
{
    $lowertext = strtolower($text);
    foreach ($cancelwords AS $word)
    {
        $word = str_replace('*', '', strtolower($word));
        if (strpos($lowertext, $word) !== false)
        {
            // we found a highlight word -- keep the quote
            return "\n" . str_replace('\"', '"', $text) . "\n";
        }
    }
    return '';
}


So, I'm thinking the $cancelwords array is empty when being passed to this function. (Follow my logic?) Not knowing enough about this or the vB search function, or the Sphinx file I uploaded, I don't know where I'd begin troubleshooting this. The function is used at around line 2817 in search.php:

$post['pagetext'] = preg_replace('#\[quote(=(&quot;|"|\'|)??.*\\2)?\](((?>[^\[]*?|(?R)|.))*)\[/quote\]#siUe', "process_quote_removal('\\3', \$display['highlight'])", $post['pagetext']);


If there's something that'll work, to fix this, I'm all ears.
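
One possible guard, sketched in Python rather than PHP, so treat it as an outline of the logic and not as a drop-in patch. A completely empty array would skip the loop altogether, so the warning fits better with the array holding an empty string, which is what the "highlight" entry looks like in the serialized display terms of an error report further down this thread. The sketch mirrors process_quote_removal and skips any word that is empty after the asterisks are stripped; whether that is the right fix for search_sphinx.php is an assumption.

```python
def process_quote_removal(text, cancelwords):
    """Keep a quoted block only if it contains one of the highlight words."""
    lowertext = text.lower()
    for word in cancelwords:
        word = word.replace("*", "").lower()
        if word == "":
            # PHP's strpos() warns on an empty search string, so skip it here.
            continue
        if word in lowertext:
            # Found a highlight word: keep the quote, unescaping \" as the original does.
            return "\n" + text.replace('\\"', '"') + "\n"
    # No highlight word matched: drop the quote.
    return ""


print(repr(process_quote_removal("Hello World", [""])))
print(repr(process_quote_removal("Hello World", ["wor*"])))
```

With the guard in place, a search with no keywords behaves the same as one whose keywords do not appear in the quote: the quote is dropped and no warning is raised.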

Beyond that, the search is working fine on our test forum...and I'm anxious to run this on our live forum as soon as I can get this ironed out.

Jah-Hools
08-21-2008, 02:29 PM
Sorry for being a total dunce but..

Does the Sphinx search display its results in an "in line" VB format? (so that in line moderation actions may be applied to selected results?)

Reason I ask is - right now I have Google custom search + the regular VB search + TAG search..

Newbies probably just use the Google CS as its the first box in the drop down menu - but that doesnt give moderators any tools to work with the results and do any 'in line' moderation (merging, renaming, changing prefixes, tagging etc)

Power users probably use both..:o

Regular users are probably confused about what to use...:confused:

The Google Custom Search is great for searching phrases

But its not a very clear solution with the various options - its a bit irksome that I may have to write a how to search guide that advises when to use what search engine..:down:

2 + million posts - is it time for Sphinx?

Is it a good 'all in one solution with an "in line" moderation functionality?

Server speed / load isn't an issue here, my quest is purely to improve search quality for the forum members.

Thanks in advance.. Sorry if this butts in... or is OT.

:)

--------------- Added 1219333047 at 1219333047 ---------------

It appears that Sphinx will be included in vB 4.0
http://www.vbulletin.com/forum/showpost.php?p=1588292&postcount=7

Holey moley!:up:

Jon
09-04-2008, 04:53 AM
We just upgraded from vBulletin 3.6.10 / Sphinx 0.9.7 to vBulletin 3.7.3 / Sphinx 0.9.8 using the scripts provided here. The sort order when displaying results as threads seems to be messed up now. I believe in our old setup we had this fixed by using the sort_search_items() function on the result set. Is this function still available in vBulletin 3.7.3? If so, how do I implement it?

eoc_Jason
09-09-2008, 05:44 PM
Jah-Hools - Yes, the search hack here basically uses Sphinx indexes to find the results, which are then returned to the vB search process and displayed like internal search results...

If performance isn't an issue, then it's probably not worth trying to implement unless you have some spare time and are bored. Yes, with Sphinx you can use phrases and various expressions to make a much more powerful search (read the Sphinx docs online, they go into detail about that). However, you still occasionally run into the issue of vBulletin liking to filter out some search results it thinks are not relevant...

Jon - Yes that function is still in the functions_search.php file.

I just came here to download the code someone posted using the new sphinx code + tag search... If I can integrate tag searching I will be happy...

I'm still using the old code that uses a separate file and edits the search.php to make it run. I'm not sure if I'm ready to switch over or not, like I said I'm about to take a look. I had the sort_search_items in my existing code, and plan to keep that functionality in my new code. I'll post a file or instructions once I'm done. I can't believe I've waited this long to upgrade to vB 3.7.x, but I suppose it's time...

I think it's great the vB team are finally addressing their search shortcomings and integrating in alternate search technologies... It doesn't take long for a site to reach a few million posts (especially with a lot of Off-Topic sites).

ferreo
09-10-2008, 03:33 AM
You guys have to remember... we aren't the majority, we are sadly the very small minority of their install base, and frankly we aren't their target audience any longer. We've already bought the software, we pay a measly $30 a year for upgrades, so that puts us in the category of don't fix it.

It is really sad though. We are pushing 21.2 million posts, and without this hack the site would have eaten itself. As it was, people would have to wait an average of 15-60 seconds when someone would post a really nasty search. Towards the end I had literally hundreds of common, but still very useful, words disallowed in the search just to limp by.

Being probably just 1%, or less, of their customer base puts us at a real disadvantage when it comes to solving issues like this.

It seems as though the development team over there is resistant to change these days. The smallest and silliest example I saw the other day was an argument on .com about the feasibility of adding a gtalk IM to user profiles. They still don't believe enough people use it to add it in. Maybe that's changed on 3.7.0, but I'm still on the 3.6 branch while I wait for the rest of our hacks we use to be ported.

I know this thread isn't for solicitation, but I know Teck's posts were removed as he was deemed to be "vending." I personally would be very interested in his packaged commercial solution if he actually develops it.

Sphinx saved my butt, and I'd like to ensure that I have the option for many years to come.


I would like to chip in to this thread, especially with the aim to provide a real customer testimonial to TECK's sphinx solution.

Our forums (http://www.purseforum.com/) had been struggling with the vB-internal search at about 1 million posts, and the MySQL fulltext search gave in at around 2.5 million. At that point the processes on our front ends were just eating up resources while waiting for responses from the SQL backends, so a Sphinx solution had to come into play, stat. A nice chap I found on here installed an implementation that followed the instructions from this thread. The solution was sufficient, though it lacked a lot of deeper functionality. I do not recall what branch I upgraded vB to in May, but the old Sphinx implementation was just not cutting it anymore.

I emailed TECK first about server optimization work, as my boxes were struggling badly during peak traffic times. It turned out that he had a Sphinx solution of his own and promised a commercial solution for large vB installs. I agreed to pay a premium price for it, which turned out to be the best investment I had yet made for my business and community.

I realize that it is not within every webmaster's means to pay several thousand dollars for a custom software solution. However, considering the performance of the product and TECK's dedication to his customers, I have no regrets whatsoever. Due to the painful circumstance that my host doesn't give root to managed boxes, the install of the search and optimizing my servers was a real pain in the f***ing ass. TECK often spent long nights until 3am working to get the solution working in my tedious environment. If I am to spend thousands for code, I expect proper customer service, and TECK delivered.

Now I have a perfectly working vB install and search solution with sphinx 0.9.8, my servers have now plenty of resources and room for future growth, and our members are delighted to have a fast and accurate, rock-solid search. Money well spent.

Feel free to contact me if you have any other questions.

Jon
09-10-2008, 09:44 AM
Jon - Yes that function is still in the functions_search.php file.

I just came here to download the code someone posted using the new sphinx code + tag search... If I can integrate tag searching I will be happy...

I'm still using the old code that uses a separate file and edits the search.php to make it run. I'm not sure if I'm ready to switch over or not, like I said I'm about to take a look. I had the sort_search_items in my existing code, and plan to keep that functionality in my new code. I'll post a file or instructions once I'm done. I can't believe I've waited this long to upgrade to vB 3.7.x, but I suppose it's time...

Jason, I would appreciate that (as would a significant number of my members :) ).

And you're not alone in postponing upgrades. It took me years to upgrade from vBulletin 2.x.. I believe SA.com is still running that version. Heavily modified I'm sure, but still :D

Kaelon
09-16-2008, 01:29 PM
I've been reading through these posts, but am wondering if there is a concise listing of the latest updates/guides to using Sphinx with vBulletin. I see that the first post was last modified in May 2008; have there been any bug-fixes/improvements on Sphinx with vBulletin since then?

xnetco
10-12-2008, 02:38 PM
Has anyone had any issues with not all posts being indexed?

Tim

TECK
10-14-2008, 04:01 AM
It is very easy to see if Sphinx indexed your posts, from console:
$ search "post title or content here" --config /etc/sphinx.conf

If it returns results, you know Sphinx is not the culprit.

Spinball
10-14-2008, 08:35 AM
From admin CP, find a user, from the drop-down menu near the top for that user, do 'find posts by user' and you get

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

--------------- Added 1223994206 at 1223994206 ---------------

I am also getting a couple of database errors per minute which look like this:
Database error in vBulletin 3.7.2:

Invalid SQL:

INSERT INTO search
(userid, titleonly, ipaddress, personal, query, searchuser, forumchoice, /*prefixchoice,*/ sortby, sortorder, searchtime, showposts, orderedids, dateline, searchterms, displayterms, searchhash, completed)
VALUES
(199144,
0,
'82.45.64.35',
1,
'',
'',
'',
'post.dateline',
'DESC',
0.41381, 1,
'7910162,7909683,7901421,7897556,7897435,7893091,7891793,7891237,7890849,7890831,7890681,7883569,7883453,7883405,7879058,7879034,7878857,7878747,7878701,7873052,7873025,7872710,7872139,7870002,7869307,7865412,7865397,7858431,7855677,7854580,7854561,7854552,7850714,7850693,7849703,7849688,7849004,7848755,7848736,7848689,7841987,7841959,7841885,7841384,7841345,7834603,7833936,7833119,7832472,7832123,7832111,7827946,7827737,7825843,7825294,7825061,7824531,7824480,7824437,7822528,7820735,7818312,7813687,7813676,7810657,7805266,7805174,7805125,7802452,7793552,7789673,7786793,7786728,7786477,7782994,7782960,7781194,7778313,7773823,7773802,7772909,7772858,7772774,7771517,7769068,7768999,7768966,7765669,7764148,7764145,7764140,7764128,7764121,7764114,7764111,7758989,7758968,7749807,7743705,7737046,7736969,7732475,7731262,7731233,7731188,7726538,7726485,7720900,7720785,7720776,7720637,7720591,7720531,7720440,7720289,7711014,7710978,7708282,7708017,7707792,7701848,7701797,7698239,7698089,7694948,7694687,7694620,7694595,7694518,7694495,7688325,7688277,7687918,7687759,7687660,7687625,7684177,7676849,7671353,7660707,7657314,7656013,7655765,7652852,7647751,7647741,7647460,7647453,7647396,7645990,7644340,7642598,7642102,7642011,7641985,7639370,7639367,7638532,7635655,7635552,7630940,7629198,7628772,7627899,7627725,7626677,7625896,7625789,7625778,7623074,7623051,7618844,7618819,7618798,7618753,7618718,7615635,7614744,7613102,7611940,7611906,7609673,7609187,7608659,7607236,7606473,7605869,7605585,7605558,7605534,7605440,7598045,7598029,7597878,7597560,7597466,7597393,7595869,7595668,7593681,7591317,7589223,7589076,7589063,7589047,7589030,7588341,7587985,7587606,7586505,7581381,7581368,7581226,7580732,7580720,7580687,7580678,7580628,7580580,7574523,7573417,7573332,7573302,7570499,7570479,7570475,7570404,7566942,7566921,7562532,7562417,7562406,7562310,7562212,7561992,7561600,7561591,7561275,7561232,7561209,7561098,7560716,7560686,7560315,7559990,7559864,7559818,7559679,7559664,7559637,7559531,7552998,7551379,7550063,7549983,7549968,7549938,7542743,7542408,7541111,7541097,7541084,7540205,7540199,7540152,7539856,7539838,7539802,7539754,7529899,7529880,7529867,7529514,7529478,7529415,7529341,7529310,7529249,7529241,7529231,7528448,7526898,7520976,7520378,7520366,7519948,7519917,7517303,7517259,7516412,7514267,7513816,7512122,7511965,7506440,7506431,7506429,7506361,7506348,7498172,7498144,7497922,7497541,7497510,7497479,7495560,7495546,7492065,7491901,7491881,7491869,7486804,7486783,7486768,7486720,7484166,7484157,7483547,7483525,7483439,7483430,7482868,7482845,7482831,7480220,7480186,7480169,7478760,7476151,7475535,7475530,7473277,7473249,7473017,7472970,7472940,7472927,7472913,7472906,7472901,7472893,7461283,7459990,7459938,7459905,7450408,7448863,7445663,7445648,7445616,7445587,7445523,7445087,7445047,7445035,7444995,7444974,7444963,7444946,7443716,7443704,7443157,7443134,7442854,7441626,7441502,7439405,7439061,7439056,7438995,7433513,7433466,7433449,7433438,7433420,7430835,7430827,7430812,7430762,7430750,7430646,7428899,7426674,7426379,7425996,7424770,7424739,7424727,7423765,7423747,7423731,7423676,7423671,7423664,7418334,7418322,7418311,7418208,7418196,7414469,7414431,7414416,7404690,7404646,7404622,7404599,7404485,7404448,7402360,7402340,7402219,7401855,7401838,7401829,7401820,7401818,7401804,7401790,7401772,7401761,7401752,7401085,7398252,7398246,7398027,7397717,7395856,7395842,7395694,7395676,7395675,7395635,7395580,7395564,7391565,7385480,7385460,7384731,7384632,7384584,7384529,7384500,7384486,7384475,7382921,7378980,7378900,7378058,7377876,7370600,7370470,7370438,7368364,7368339,7368298,7366974,7366863,7366852,7363126,7363116,7363104,7363096,7363056,7363048,7363047,7363040,7363033,7361440,7361420,7359385,7357792,7357784,7356178,7356104,7355478,7355242,7355214,7353450,7351221,7351183,7351169,7351154,7351093,7351059,7350646,7350602,7350565,7347493,7345110,7344986,7344941,7344921,7344901,7343615,7343606,7337949,7337926,7337907,7337823,7337783',
1223993854,
'a:25:{s:5:\"query\";N;s:10:\"searchuser\";N;s:9:\"exactname\";N;s:11:\"starteronly\";i:0;s:3:\"tag\";N;s:11:\"forumchoice\";a:0:{}s:12:\"prefixchoice\";N;s:11:\"childforums\";i:0;s:9:\"titleonly\";b:0;s:9:\"showposts\";b:1;s:10:\"searchdate\";N;s:11:\"beforeafter\";N;s:6:\"sortby\";N;s:9:\"sortorder\";N;s:9:\"replyless\";N;s:10:\"replylimit\";N;s:14:\"searchthreadid\";i:0;s:9:\"saveprefs\";N;s:11:\"quicksearch\";N;s:10:\"searchtype\";i:1;s:7:\"exclude\";N;s:7:\"nocache\";N;s:4:\"ajax\";i:0;s:11:\"humanverify\";N;s:6:\"userid\";i:199144;}',
'a:8:{s:5:\"words\";a:1:{i:0;N;}s:9:\"highlight\";a:1:{i:0;s:0:\"\";}s:6:\"common\";a:0:{}s:5:\"users\";a:1:{i:199144;s:6:\"Yemeth\";}s:6:\"forums\";a:0:{}s:8:\"prefixes\";a:0:{}s:3:\"tag\";s:0:\"\";s:7:\"options\";a:3:{s:11:\"starteronly\";i:0;s:11:\"childforums\";i:0;s:6:\"action\";s:7:\"process\";}}',
'2ebb9cb46ee5005c200207e9e3343161',
1)
### SAVE ORDERED IDS TO SEARCH CACHE ###;

MySQL Error : Duplicate entry '2ebb9cb46ee5005c200207e9e3343161-post.dateline-DESC' for key 2
Error Number : 1062
Request Date : Tuesday, October 14th 2008 @ 03:17:33 PM
Error Date : Tuesday, October 14th 2008 @ 03:17:34 PM
Script : http://www.avforums.com/forums/search.php?do=finduser&u=199144
Referrer : http://www.avforums.com/forums/xbox-360-game-previews/761672-dead-space-uk-release-24-oct-08-a.html
IP Address : 82.45.64.35
Username : Yemeth
Classname : vB_Database_Slave
MySQL Version :


--------------- Added 1224000005 at 1224000005 ---------------

Looks like the code is doing an insert onto the slave server causing the duplicate record error. Is this a bug in the code?
This is very urgent as we are having several database errors per minute.

--------------- Added 1224007861 at 1224007861 ---------------

There is also an error when running this:
http://www.avforums.com/forums/search.php?do=finduser&userid=1
(the my posts link in the menu)
which gives
Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

Warning: strpos() [function.strpos]: Empty delimiter in [path]/includes/functions_search.php on line 829

snakes1100
10-14-2008, 08:13 PM
@Spinball
Are you using the plugin version of the hack or the older manually edited version?

I run Sphinx on a few servers/sites and haven't seen this issue yet.

Spinball
10-16-2008, 06:38 AM
@Spinball
Are you using the plugin version of the hack or the older manually edited version?

I run Sphinx on a few servers/sites and haven't seen this issue yet.

The plugin as per the updated instructions in the first post in this thread.

I changed the INSERT back to a REPLACE in the search_sphinx.php script, which stopped the database errors, and I put error_reporting(0); as the first line in includes/functions_search.php to kill the error messages. There are still other anomalies, though, and I think I'm going to have to offer someone $ to fix this properly for us.
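
The duplicate-key failure and the REPLACE workaround can be reproduced in miniature. This is a sketch using Python's built-in sqlite3 instead of MySQL; the table is a cut-down stand-in for vBulletin's search table, the hash value is invented, and the unique key on (searchhash, sortby, sortorder) is inferred from the "Duplicate entry '...-post.dateline-DESC' for key 2" message rather than checked against the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cut-down stand-in for the search cache table (assumed unique key).
conn.execute(
    "CREATE TABLE search ("
    "searchid INTEGER PRIMARY KEY, "
    "searchhash TEXT, sortby TEXT, sortorder TEXT, orderedids TEXT, "
    "UNIQUE (searchhash, sortby, sortorder))"
)

cols = "(searchhash, sortby, sortorder, orderedids) VALUES (?, ?, ?, ?)"
row = ("examplehash", "post.dateline", "DESC", "1,2,3")

conn.execute("INSERT INTO search " + cols, row)

# Caching the same search a second time with a plain INSERT hits the unique key.
try:
    conn.execute("INSERT INTO search " + cols, row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# REPLACE removes the conflicting row and inserts the new one instead of failing.
conn.execute("REPLACE INTO search " + cols, row[:3] + ("4,5,6",))
rows = conn.execute("SELECT orderedids FROM search").fetchall()

print(duplicate_rejected, rows)
```

REPLACE hides the error by overwriting the cached row, so it does not explain why the same search was being saved twice in the first place.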

Lizard King
10-16-2008, 01:16 PM
The plugin as per the updated instructions in the first post in this thread.

I changed the INSERT back to a REPLACE in the search_sphinx.php script, which stopped the database errors, and I put error_reporting(0); as the first line in includes/functions_search.php to kill the error messages. There are still other anomalies, though, and I think I'm going to have to offer someone $ to fix this properly for us.
I'd suggest contacting Teck privately. His Sphinx script is working flawlessly on lots of sites.

Spinball
10-16-2008, 07:52 PM
I'd suggest contacting Teck privately. His Sphinx script is working flawlessly on lots of sites.

Will do, thanks

kerplunknet
10-23-2008, 02:14 AM
Has anyone figured out the "Find More Posts by UsernameHere" issue? It results in these error messages, but the results seem to work properly:

Warning: strpos() [function.strpos]: Empty delimiter. in /includes/functions_search.html on line 868

Warning: strpos() [function.strpos]: Empty delimiter. in /includes/functions_search.html on line 868

Warning: strpos() [function.strpos]: Empty delimiter. in /includes/functions_search.html on line 868

Warning: strpos() [function.strpos]: Empty delimiter. in /includes/functions_search.html on line 868

Warning: strpos() [function.strpos]: Empty delimiter. in /includes/functions_search.html on line 868

Warning: strpos() [function.strpos]: Empty delimiter. in /includes/functions_search.html on line 868

Warning: strpos() [function.strpos]: Empty delimiter. in /includes/functions_search.html on line 868

I really don't want to put Sphinx live until this is fixed... any advice would be welcome.

I am wondering if it has something to do with using an older 3.6 version of vBulletin.

RedWingFan
10-23-2008, 03:15 AM
Has anyone figured out the "Find More Posts by UsernameHere" issue?

I think I posted earlier in this thread (a couple pages back) that I put "error_reporting(0);" on one of the pages, which got rid of the error message. But it didn't fix the problem... :erm: I didn't have time to track through all of the code to find out what was missing in the query.

kerplunknet
10-23-2008, 04:39 AM
Hi RedWingFan,

Thanks for the tip. That is a good workaround so the warning does not get shown, but a *real* fix for this would be ideal. I am going to have our main site developer look at this code quickly. Maybe it's an easy fix. Thank you for the workaround, though.

Spinball
10-27-2008, 08:38 AM
You can turn off the errors that way, but doing a search for posts by a user is still broken. This is crippling the search capabilities and the abilities of the moderators to do their jobs.
This latest sphinx modification is broken.

ALanJay
10-27-2008, 02:44 PM
I noticed recently that if you want to search on any punctuation, the default settings will not do it, because they do not index characters such as + or &. If you change this in your sphinx.conf:

charset_type = sbcs
charset_table = 0..9, A..Z->a..z, +, &, _, a..z

it will search on letters, numbers, +, & and _. Obviously you can add other characters if you want to search on them.
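
To see why the character table matters, here is a toy tokenizer in Python. It is only an illustration of the principle that characters outside the table act as word separators; it is not Sphinx's tokenizer, and the function name and the extra_chars parameter are made up.

```python
def tokenize(text, extra_chars=""):
    """Split text into lowercase tokens; any character outside the table is a separator."""
    allowed = set("abcdefghijklmnopqrstuvwxyz0123456789" + extra_chars)
    tokens, current = [], []
    for ch in text.lower():  # mirrors the A..Z->a..z folding
        if ch in allowed:
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


# Letters and digits only: punctuation splits the words apart.
print(tokenize("AT&T c++ user_name"))
# With +, & and _ added to the table, the terms survive intact.
print(tokenize("AT&T c++ user_name", extra_chars="+&_"))
```

A term such as "c++" can only be matched if it was indexed as one token, so the index has to be rebuilt after the table is changed.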

LauraS
11-17-2008, 03:04 PM
Running into a problem with sphinx in terms of getting statistics for users. Whenever I try to click on Find all posts by username or Find all threads started by username in the statistics section for a user I always get the following message back:

1. Sorry - no matches. Please try some different terms. %1$s

Any of you guys have any ideas? I know for a fact there are threads/posts for the user I'm analyzing. It almost looks like something isn't being escaped in the Sphinx code. I installed Sphinx using the plugin method with the search_sphinx.php file. Normal search on the site seems to work. It just seems messed up when trying to analyze user stats.

Running vbulletin 3.7.3 and sphinx 0.9.8.1.

My output when I run all the indexing:

indexing index 'post'...
collected 27212 docs, 22.0 MB
sorted 3.0 Mhits, 100.0% done
total 27212 docs, 22008013 bytes
total 4.702 sec, 4680607.00 bytes/sec, 5787.38 docs/sec
indexing index 'postdelta'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.010 sec, 0.00 bytes/sec, 0.00 docs/sec
indexing index 'thread'...
collected 7214 docs, 0.2 MB
sorted 0.0 Mhits, 100.0% done
total 7214 docs, 195938 bytes
total 0.116 sec, 1687564.25 bytes/sec, 62132.35 docs/sec
indexing index 'threaddelta'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.010 sec, 0.00 bytes/sec, 0.00 docs/sec
distributed index 'fulltext' can not be directly indexed; skipping.
distributed index 'threadtitles' can not be directly indexed; skipping.
rotating indices: succesfully sent SIGHUP to searchd (pid=30404).

Danny

Deriel
12-11-2008, 10:38 AM
1. Sorry - no matches. Please try some different terms. %1$s

This will sound ridiculous (and it is, but...): I simply upgraded to 0.9.9 and the message is gone :confused:

LauraS
12-19-2008, 02:05 PM
This will sound ridiculous (and it is, but...): I simply upgraded to 0.9.9 and the message is gone :confused:

Thanks Deriel. When it becomes stable we will probably install it.

Danny

mute
01-01-2009, 03:49 AM
So.. looks like Jelsoft backed down from supporting Sphinx in 4.0. Frankly I'm pretty disappointed. I honestly don't see how, in however many years we've now been running Sphinx, they couldn't at least unofficially support it like they do memcache, etc.

For us it means we've got to wait even longer to even begin thinking about upgrading to 4.0 when it comes out, until someone has a Sphinx implementation that a) doesn't cost thousands, and b) is able to replicate stock functionality more closely than the current one does.

TechGuy
01-01-2009, 11:53 AM
That is a huge disappointment. Are they planning on doing something else with search? I mean, they've got to know there's a growing number of big boards!

mute
01-01-2009, 02:33 PM
That is a huge disappointment. Are they planning on doing something else with search? I mean, they've got to know there's a growing number of big boards!

From what I gather, there are "improvements" in the search (front end functionality I assume) on the way for 4.0, but a Sphinx backend won't be even unofficially supported in 4.0.

As I've stated over in the 4.0 discussion thread, I'm pretty flabbergasted that they haven't found the time to bang something out since we started using Sphinx in 2006. I simply can't run a version of vBulletin that does not have a Sphinx enabled search hack available (even if the one I'm currently running is a little wonky).

Simetrical
01-02-2009, 12:09 PM
That is a huge disappointment. Are they planning on doing something else with search? I mean, they've got to know there's a growing number of big boards!
They've said they'll make a search API that should make extensions that support Sphinx (or Lucene, etc.) considerably easier to write.
As I've stated over in the 4.0 discussion thread, I'm pretty flabbergasted that they haven't found the time to bang something out since we started using Sphinx in 2006.
Don't be. People running large sites are a small percentage of their customer base. They can't weigh our needs too heavily in the scheme of things: they need to focus on the ones who give them the most money.

Wayne Luke
01-02-2009, 12:17 PM
From what I gather, there are "improvements" in the search (front end functionality I assume) on the way for 4.0, but a Sphinx backend won't be even unofficially supported in 4.0.

It's more than "improvements"; the entire search architecture is being refactored and rewritten to be more scalable.

tmc
01-13-2009, 10:27 PM
Yeah, this basically isn't working for me at all.. It's pretty unfortunate.

Any hope of an update for 3.8? :)

I get the same problems with User search as the people above, but sorting just doesn't work at all like it should.

Kevlar
01-14-2009, 01:00 PM
It's more than "improvements"; the entire search architecture is being refactored and rewritten to be more scalable.
Is it going to be able to handle 15+ million posts in a timely fashion?

Simetrical
01-15-2009, 12:41 PM
I find it hard to believe that anything based on pure MySQL will be able to handle that many posts nearly as well as Sphinx at the moment. MySQL just doesn't have efficient searching right now. You could probably get it to more or less work given enough hardware, if you did stuff like keeping a separate search table and not running updates against it live (to avoid MyISAM's table-wide write locks). But you can't get around the fact that it's just really slow.

JamesN
01-16-2009, 02:01 AM
Once Sphinx Search is applied, are there any indexes you can remove or any FullText search settings, etc to switch off/delete?

amcd
01-16-2009, 02:22 AM
Yes. From vbulletin admincp, change search to fulltext index so that vb does not try to update the word tables. Then drop the fulltext indices from the post and thread tables.

JamesN
01-16-2009, 02:33 AM
Apologies for asking a stupid question, but how do I "drop the fulltext indices from the post and thread tables"?

I got my server support to install Sphinx for me.

amcd
01-16-2009, 02:42 AM
You have to run SQL queries to drop indices. If you do not know about this, ask for help from server support. Wrong commands can delete/damage data and it may not be reversible.

JamesN
01-16-2009, 06:28 AM
Thanks, this has been done.

Is there anything else that should be done?

--------------- Added 1232105322 at 1232105322 ---------------

I'm now getting these errors upon creating new threads. Any idea why?

Database error in vBulletin 3.8.0:

Invalid SQL:

SELECT thread.threadid, MATCH(thread.title) AGAINST ('test') AS score
FROM thread AS thread

WHERE MATCH(thread.title) AGAINST ('test')
AND thread.open <> 10


LIMIT 5;

MySQL Error : Can't find FULLTEXT index matching the column list

amcd
01-16-2009, 10:17 AM
Do you have similar threads turned on? If not, then this does not look like a default vbulletin query. Must be some addon or mod.

JamesN
01-16-2009, 10:31 AM
Yep, similar threads is turned on. Shouldn't it be for it to work with Sphinx?

amcd
01-16-2009, 11:09 AM
Unfortunately, no. The sphinx implementation posted in this thread does not work with similar threads.

JamesN
01-16-2009, 11:27 AM
Oh ok - thanks, is there anything else that needs to be switched off for it to work correctly?

mta
01-16-2009, 01:17 PM
I've installed and integrated sphinx search 0.98.1 w/ VB 3.7.4p1 using the included plugin and instructions on the first post of this thread.

I'm limiting sphinx and vb to 500 results.

Everything appears to be working fine, except clicking on "Find all posts/threads by user..." from within the member's profile when the user has hundreds ( vs tens ) of posts. This results in the following error:

The page you are trying to view cannot be shown because it uses an invalid or unsupported form of compression.

MySQL error logging is on, but no errors are showing up in the logs.

I'm using everything stock as provided in examples ( sphinx.conf ) and search_sphinx.php, save I uncommented prefixchoice for vb 3.7 and re-replaced INSERT INTO w/ REPLACE INTO.

I'll post configs if necessary.

Any suggestions/help?

Thanks!

--------------- Added 1232121975 at 1232121975 ---------------

It may be worthy to note that going to Search/Advanced Search and searching for posts/threads by user works without issue.

amcd
01-16-2009, 04:39 PM
That search does not use sphinx AFAIK. And the error you are reporting looks like a Firefox caching problem.

tmc
01-20-2009, 08:27 AM
Running into a problem with sphinx in terms of getting statistics for users. Whenever I try to click on Find all posts by username or Find all threads started by username in the statistics section for a user I always get the following message back:

1. Sorry - no matches. Please try some different terms. %1$s

Any of you guys have any ideas? I know for a fact there are threads/posts for the user I'm analyzing. Almost looks like something isn't being escaped out in the sphinx code. I installed sphinx using the plugin method with search_sphinx.php file. Normal search on site seems to work. It just seems messed up when trying to analyze user stats.
Anyone have a solution for this yet?? I basically can't turn Sphinx back on until I know how to fix it.. :/

I'm using 3.8.0

djxcee
01-30-2009, 08:10 AM
I'm sorry if it's already answered (can't go through 44 pages...)

How much CPU load would sphinx take off compared to the default search engine?

amcd
01-30-2009, 08:34 AM
That depends on a large number of factors and cannot be quantified by a rule of thumb. But believe us when we say that it will be worth the effort.

Kevlar
02-01-2009, 06:19 PM
I'm sorry if it's already answered (can't go through 44 pages...)

How much CPU load would sphinx take off compared to the default search engine?
It makes a huge difference ... but then again, I have 15 million posts to index.

JamesN
02-28-2009, 08:39 PM
If I want to revert back to the default vBulletin search engine for a while, what do I need to do?

ibautocommunity
03-09-2009, 02:29 PM
If I want to revert back to the default vBulletin search engine for a while, what do I need to do?

You would have to remove the sphinx plugin / hacks and rebuild the search index.

Simetrical
03-10-2009, 02:00 PM
I have a weird problem on my site. Compare these two searches:

http://www.twcenter.net/forums/search.php?do=process&query=Darth
http://www.twcenter.net/forums/search.php?do=process&query=Darth&showposts=1

They search for exactly the same thing, the word "Darth". The first one (results as threads) shows all the recent occurrences of the word, as expected. But the second one (results as posts) cuts off at August 2, 2007. The effect doesn't occur for other words, like "Vader":

http://www.twcenter.net/forums/search.php?do=process&query=Vader
http://www.twcenter.net/forums/search.php?do=process&query=Vader&showposts=1

Does anyone have any ideas?

amcd
03-10-2009, 05:23 PM
because it returns only 500 results?

Simetrical
03-11-2009, 12:03 PM
So you're saying that if the number of results is limited, they won't be sorted by date before they're limited, but after? Is there any way to fix that? Alternatively, if I just increase max_matches to 2000 or something, that's probably not going to be a big problem, right?
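Before changing the cap it helps to know what it is currently set to. A minimal sketch, assuming the limit is set with a `max_matches` line in the `searchd` section of sphinx.conf; the function name is made up, and the client side may pass its own lower limit per query, which this does not check.

```shell
#!/bin/sh
# Sketch only: print the max_matches value found in Sphinx config file $1.
# Prints nothing if the setting is absent (searchd then uses its default).
conf_max_matches() {
    sed -n 's/^[[:space:]]*max_matches[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1"
}

# Usage (the path is a placeholder):
#   conf_max_matches /usr/local/etc/sphinx.conf
```

Raising the value means searchd keeps more matches per query in memory, so it is worth watching memory use after a change.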

amcd
03-11-2009, 02:09 PM
I really don't know. I am just thinking aloud.

Spinball
03-22-2009, 08:54 AM
Just wanted to post a note here to thank the folks who pointed me to Floren Munteanu's Sphinx solution which is now installed at www.AVForums.com.
I was so impressed that I made a video at http://www.youtube.com/watch?v=zJkVHcqfEBQ (choose the HD version).
Thanks, guys.

mlx
03-22-2009, 01:07 PM
I was so impressed that I made a video at http://www.youtube.com/watch?v=WIdsOA_hFpM (choose the HD version).
Thanks, guys.

LOL unless you are too stupid to setup a cronjob correctly it's plain and simple wrong that the search index only updates every day using the free solution.

Spinball
03-22-2009, 03:58 PM
LOL unless you are too stupid to setup a cronjob correctly it's plain and simple wrong that the search index only updates every day using the free solution.

Ok it was every day when I last used it. Have things changed since then? Apologies if so. Please let me know and I will change the video.
As I understand it, though, the Sphinx reindex in the free solution was quite server intensive and took a while?
(By the way there is no need to be offensively defensive about the free solution - we used it for ages and I'm not running down the efforts that people have put in for free).

snakes1100
03-22-2009, 04:11 PM
You set the cron to run when you want it to run, it was never locked to 1 day that i was aware of and i've ran it since the first free release here.

Server "intensive", that would depend on your hardware.

mlx
03-22-2009, 04:17 PM
Ok it was every day when I last used it. Have things changed since then? Apologies if so. Please let me know and I will change the video.
As I understand it, though, the Sphinx reindex in the free solution was quite server intensive and took a while?
(By the way there is no need to be offensively defensive about the free solution - we used it for ages and I'm not running down the efforts that people have put in for free).

Sure, re-indexing all configured indexes will take some time and might be quite server intense on a database with millions of posts. So are you saying that the paid solution is running "indexer --all --rotate" to re-index all configured indexes every 20 minutes in no time?

Or is it updating the delta indexes every 20 minutes pretty much as described here in 2007 (https://vborg.vbsupport.ru/showpost.php?p=1207641&postcount=336)?

Spinball
03-22-2009, 06:04 PM
Sure, re-indexing all configured indexes will take some time and might be quite server intense on a database with millions of posts. So are you saying that the paid solution is running "indexer --all --rotate" to re-index all configured indexes every 20 minutes in no time?

Or is it updating the delta indexes every 20 minutes pretty much as described here in 2007 (https://vborg.vbsupport.ru/showpost.php?p=1207641&postcount=336)?

Honestly I don't know. And that's the point, really. The wife and me own our forums and with the development, marketing and admin jobs I do, I plain don't have the time to understand the details of the search, either the free one here or the paid one. That's why I paid someone else to sort it all out for me.
All I know is that the indexes are updated every 10 minutes and the load on the servers is not significant.
In the free solution running on our forums, it was reindexing every day because that's how I set it up based on what information I could get out of this thread a few months ago. Maybe it wasn't set up as optimally as it could, but then that's because I found this thread to be a bit overwhelming and the instructions a bit complex. Least ways that's how I felt about it.
Most forum admin are not programmers or database experts, or particularly want to be, I would imagine. Or, in many instances, have the funds to pay someone else to do it.

That's why the free plugins released here which install with zero technical ability are so popular.

mlx
03-22-2009, 06:17 PM
I see.

Still ... the point is that the free solution can be setup the same way. Our search index is updating every 5 minutes without any significant load too. Rebuilding all indexes is a totally different thing. That setup is totally based on the free information within this thread.
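The delta-only update described here can be sketched as a small wrapper script for cron. The paths are placeholders, and the index names postdelta and threaddelta are the ones used elsewhere in this thread; yours may differ.

```shell
#!/bin/sh
# Sketch only: paths are placeholders, override them via the environment.
INDEXER="${INDEXER:-/usr/local/bin/indexer}"
SPHINX_CONF="${SPHINX_CONF:-/usr/local/etc/sphinx.conf}"

# Rebuild only the small delta indexes and ask searchd to swap them in.
# The large main indexes are left alone, so this stays cheap.
rotate_deltas() {
    "$INDEXER" --config "$SPHINX_CONF" --rotate postdelta threaddelta
}

# A script that calls rotate_deltas could be run from cron every 5 minutes:
#   */5 * * * * /path/to/rotate.sh >/dev/null 2>&1
```

The full rebuild ("indexer --all --rotate") would then run separately and much less often, for example once a night.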

That's probably the reason for the slightly offensive post earlier today ... sorry for that ;)

Spinball
03-22-2009, 06:45 PM
I see.

Still ... the point is that the free solution can be setup the same way. Our search index is updating every 5 minutes without any significant load too. Rebuilding all indexes is a totally different thing. That setup is totally based on the free information within this thread.

That's probably the reason for the slightly offensive post earlier today ... sorry for that ;)

No problem :). Still the search keyword suggestions are cool, don't you think? Some useful info there for marketing purposes also.

Simetrical
03-23-2009, 12:38 PM
Just wanted to post a note here to thank the folks who pointed me to Floren Munteanu's Sphinx solution which is now installed at www.AVForums.com.
I was so impressed that I made a video at http://www.youtube.com/watch?v=zJkVHcqfEBQ (choose the HD version).
Thanks, guys.
I'm curious, did he suggest that you post here?
Ok it was every day when I last used it. Have things changed since then? Apologies if so. Please let me know and I will change the video.
As I understand it, though, the Sphinx reindex in the free solution was quite server intensive and took a while?
No. I run it every hour and don't even notice the load, either from indexing or from searching. I could probably run it every five minutes with no problem.
(By the way there is no need to be offensively defensive about the free solution - we used it for ages and I'm not running down the efforts that people have put in for free).
I think people are not so much defensive about the free solution as annoyed at TECK for continually advertising his paid solution for personal profit in a thread devoted to the free solution. He had a bunch of posts in this thread deleted some time ago (https://vborg.vbsupport.ru/showthread.php?p=1464439#post1464439) for advertising, and since then you're the second or third person to bump the thread and e-mail everyone watching it to advertise his solution as well.

Nothing wrong with paid solutions for those who have trouble with the free ones, but I don't think this is the place to repeatedly advertise them, which is what's been happening in this thread over the past few months. It's off-topic, this thread is about the free hack. People can put this stuff in their signatures if they want to advertise it, or run Google ads, or mention it as an option when they're giving free help or otherwise posting on-topic.

Spinball
03-23-2009, 01:15 PM
I'm curious, did he suggest that you post here?

No. I run it every hour and don't even notice the load, either from indexing or from searching. I could probably run it every five minutes with no problem.

I think people are not so much defensive about the free solution as annoyed at TECK for continually advertising his paid solution for personal profit in a thread devoted to the free solution. He had a bunch of posts in this thread deleted some time ago (https://vborg.vbsupport.ru/showthread.php?p=1464439#post1464439) for advertising, and since then you're the second or third person to bump the thread and e-mail everyone watching it to advertise his solution as well.

Nothing wrong with paid solutions for those who have trouble with the free ones, but I don't think this is the place to repeatedly advertise them, which is what's been happening in this thread over the past few months. It's off-topic, this thread is about the free hack. People can put this stuff in their signatures if they want to advertise it, or run Google ads, or mention it as an option when they're giving free help or otherwise posting on-topic.

Ok I understand your point completely and I want to stress that I made my own mind up to thank the people who directed me towards the other solution. I posted here merely cos I was so enthusiastic and maybe I let that get the better of me.
I understand also that you would get annoyed at someone pimping their solution and I don't want to be accused of being associated with that so I will not post in here any more about it. :o
If you could reindex every 5 minutes then I would suggest that you do. Seems to me that users will want their posts to appear in the search ASAP.

Simetrical
03-23-2009, 01:26 PM
Ah, it looks like it's actually every 20 minutes that I have the deltas built. No one's ever complained that it takes 10 minutes on average for their posts to appear in the index, so I'm fine with leaving it as it is.

It takes under 200 ms for rotate.sh to run, so actually I could probably run it every minute if I really felt like it.

TECK
03-24-2009, 10:24 AM
Sure, re-indexing all configured indexes will take some time and might be quite server intense on a database with millions of posts. So are you saying that the paid solution is running "indexer --all --rotate" to re-index all configured indexes every 20 minutes in no time?

Or is it updating the delta indexes every 20 minutes pretty much as described here in 2007 (https://vborg.vbsupport.ru/showpost.php?p=1207641&postcount=336)?
If you do an "indexer --all --rotate" every 20min, you will blow your servers. It takes on average 20-30min to rebuild the indexes from scratch on a 10GB database. In my product, the indexes are updated every 10min, regardless of the database size. It takes on average 30sec-1min to refresh the data, with no "indexer --all --rotate" commands ever used. Of course, that includes the threads/posts that were deleted or edited. What is the point of storing a deleted post in the indexes? Not to mention that if you edit the contents of a post, the deleted keywords should no longer be available in search. When you rebuild the indexes, all "errors" are gone... but that occurs only every 24hrs for the vB.org product.

Also, I don't know if you consider this advertising... because I don't. It is a logical explanation on how my system works. It is like talking about Community Server at vBulletin... yet I don't see anyone chasing the people who reply about an open asked question related to their product and replied by one of their reps.

mlx
03-24-2009, 11:45 AM
Yeah I'm sure your product is great for those who can afford it and probably superior compared to the free solution. However we knew that before. No need to repeat it over and over again. I guess that's why some people become pissed off here. Easy as that.

Just my 2 cents though.

TECK
03-24-2009, 12:13 PM
No need to repeat it over and over again.
I totally agree. In the previous posts, I only replied to your technical question and passed the general questions.

Regards,

Simetrical
03-24-2009, 12:51 PM
If you do an "indexer --all --rotate" every 20min, you will blow your servers. It takes on average 20-30min to rebuild the indexes from scratch on a 10GB database. In my product, the indexes are updated every 10min, regardless of the database size. It takes on average 30sec-1min to refresh the data, with no "indexer --all --rotate" commands ever used. Of course, that includes the threads/posts that were deleted or edited. What is the point of storing a deleted post in the indexes? Not to mention that if you edit the contents of a post, the deleted keywords should no longer be available in search. When you rebuild the indexes, all "errors" are gone... but that occurs only every 24hrs for the vB.org product.
That doesn't seem like a huge advantage to me. The deltas seem to work well enough. I don't actually know how Sphinx works, though ― you're saying that the solution posted here, rotating only the deltas, doesn't pick up edited posts? Any other disadvantages to it?

TECK
03-25-2009, 01:34 AM
Personally, I think it is important to have accurate results. If deleted or edited posts still show in search results (when they should not), and once a day you perform a full reindex (when you should not), that affects the overall search accuracy as well as the server performance. Think of it this way: you have several threads in a forum where the users change a price for their "to be sold" items published a few days ago. Because your search index is not updated right away (10min max), other forum users will never know that the price on certain items was revised until the next day, when the indexes are rebuilt entirely...

A better example, related to search accuracy. Let's presume you use the default vBulletin search, query the entire posts for 'spaghetti' (most intensive) and display the results as threads. Then you perform the same type of search with the vB.org Sphinx search. You will notice the missing results very easily when performing the search with the vB.org Sphinx product. There are many other aspects that I'd rather not cover, because it will sound like I'm trying to advertise my product... Feel free to ask more questions in my forum.

charlie71
04-18-2009, 05:06 PM
It's a great piece of software!
Maybe one of you can help me with my problem:

Search for word -> No results
Search for the same word again -> Results are found

kontrabass
04-20-2009, 12:58 PM
Is there a definite solution for the "duplicate key" errors? Had been running Orban's original Sphinx solution for 2 years... Migrated site to new servers, then implemented Orbans version .1 solution. Now I'm getting these errors like many others in this thread:


MySQL Error : Duplicate entry 'c7ff13943221ad39284628de371af860-lastpost-DESC' for key 2
Error Number : 1062


I've tried repairing, optimizing, and truncating the table, No change :(

Someone mentioned modifying the php to read "REPLACE" instead of "INSERT" ? I'm running 3.6

Thanks!

kmike
05-17-2009, 07:05 AM
Someone mentioned modifying the php to read "REPLACE" instead of "INSERT" ? I'm running 3.6
I'm not sure why there are duplicate keys, but looking at the script, I guess that changing INSERT to REPLACE at the end of the script will indeed help.


Another note for those running Sphinx search: if you have "finduser" action handled by Sphinx, it will _not_ find user's posts comprised entirely of separators, i.e. not containing any accepted characters from Sphinx charset_table.
Some examples of the posts ignored by Sphinx:
.....
:) :( :)
--------->
It may appear as not important, but it's something to remember when the user's post count differs from the number of his posts found by Sphinx.
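A quick way to check whether a given post body falls into this category. This is a minimal sketch that assumes a charset_table covering only 0-9, A-Z, a-z and underscore; the function name is made up, and the real accepted characters are whatever charset_table in your sphinx.conf says.

```shell
#!/bin/sh
# Sketch only: return 0 if text $1 contains at least one character that
# Sphinx would index under the ASSUMED charset (0-9, A-Z, a-z, _),
# and non-zero if the text consists of separators only.
has_indexable_chars() {
    printf '%s' "$1" | LC_ALL=C grep -q '[0-9A-Za-z_]'
}

# Usage:
#   has_indexable_chars 'nice post :)' && echo "will be indexed"
#   has_indexable_chars '--------->'   || echo "separators only"
```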

DaiTengu
05-28-2009, 08:08 PM
I'm looking to upgrade to 3.8, and I've seen a few posts stating that 0.1 is not working properly there. Can anyone confirm/deny this?

mlx
05-29-2009, 06:21 AM
We are still using the old instructions (https://vborg.vbsupport.ru/showpost.php?p=1283359&postcount=387) with vB 3.8.2 and didn't hear any complaints yet, so I believe it's still working nicely. Not sure about that plugin version though.

DaiTengu
05-29-2009, 09:26 PM
We are still using the old instructions (https://vborg.vbsupport.ru/showpost.php?p=1283359&postcount=387) with vB 3.8.2 and didn't hear any complaints yet, so I believe it's still working nicely. Not sure about that plugin version though.


Yeah, I'm using the plugin version.


Maybe I'll just have to spend some time running some more test upgrades.

mute
06-16-2009, 03:51 PM
Has anyone looked into the MySQL binary support in Sphinx 0.9.9? It seems to me like this would greatly simplify the integration of Sphinx into vB. The gist of it is:

"The ultimate new feature couple is MySQL binary protocol and SphinxQL query language. Meaning that searchd can now pretend it's mysqld. Meaning that you can use ye good olde mysql command-line client to connect to searchd and fire your queries using regular SELECT syntax!"

For more info: http://sphinxsearch.com/docs/current.html#sphinxql
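What that looks like from the command line, as a minimal sketch. It assumes searchd has a MySQL-protocol listener configured (a line such as "listen = localhost:9306:mysql41" in the searchd section), that the port is 9306, and that an index named post exists; all three are assumptions to adjust.

```shell
#!/bin/sh
# Sketch only: send one SphinxQL statement to searchd with the stock
# mysql client. Port and client path are assumptions.
MYSQL="${MYSQL:-mysql}"
SPHINXQL_PORT="${SPHINXQL_PORT:-9306}"

sphinxql() {
    "$MYSQL" -h 127.0.0.1 -P "$SPHINXQL_PORT" -e "$1"
}

# Usage ('post' is an assumed index name):
#   sphinxql "SELECT * FROM post WHERE MATCH('darth') LIMIT 5"
```

The host is given as 127.0.0.1 rather than localhost so that the client connects over TCP instead of the local MySQL socket.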

kmike
06-21-2009, 10:41 AM
SphinxQL isn't quite ready for the production (http://www.mysqlperformanceblog.com/2009/04/19/talking-mysql-to-sphinx/) at this moment.

RedWingFan
07-15-2009, 04:01 PM
I've had an odd problem come up.

We only have Sphinx running on our private testing-only forum, where our staff puts it through the paces a bit. I noticed a few days ago that, during a search, I came up with this error message:

unknown local index 'threaddelta' in search request

Another of our staff got that error yesterday. I've been trying other searches on and off, but I can't duplicate the error.

On the server side, here's what I found. In my Sphinx data directory, all the files look OK, except for the threaddelta.* files. In fact, there is a set of threaddelta.* files that hasn't been touched since May 29. However, there is now a new group of files, threaddelta.new.* in the same directory that are getting updated every three minutes by our cron job.

I realize I could delete all the files and rebuild the indexes, which I will do (this isn't exactly a critical forum, as it's just for testing), but I would still like to know why Sphinx is generating the "threaddelta.new.*" files vs. the original "threaddelta.*" files. None of the config files or cron entries have been touched since last year, when I set this up.

We're about to roll out Sphinx on our main forum as we're getting slammed with traffic lately, but I'm still hesitant due to unresolved bugs...

UK Jimbo
07-15-2009, 04:27 PM
On the server side, here's what I found. In my Sphinx data directory, all the files look OK, except for the threaddelta.* files. In fact, there is a set of threaddelta.* files that hasn't been touched since May 29. However, there is now a new group of files, threaddelta.new.* in the same directory that are getting updated every three minutes by our cron job.

Is the cronjob that's running the indexer creating any output?

Do you see any output if you run the indexer from the command line?

Can you copy/paste the command line you're using along with any output back here?

mute
07-15-2009, 04:40 PM
SphinxQL isn't quite ready for the production (http://www.mysqlperformanceblog.com/2009/04/19/talking-mysql-to-sphinx/) at this moment.

Well, a few months have gone by, hopefully Andrew has fixed most of the outstanding bugs by now. I'm still excited at the prospect of leaving most of the vB search code alone, and just hooking in before the queries get executed and diverting them to sphinx (or something along those lines).

Our sphinx implementation has been going strong for a few years now, and while we still don't have "Find all posts/threads" queries hitting it, or any of the new search functionality, I'm still enormously pleased with what it can do for you.

RedWingFan
07-15-2009, 07:20 PM
Is the cronjob that's running the indexer creating any output?

Do you see any output if you run the indexer from the command line?

Can you copy/paste the command line you're using along with any output back here?

Here is the command line I use (within cron):

/usr/home/xxxx/sphinx/bin/indexer --config /usr/home/xxxx/sphinx/bin/sphinx_rr.conf --rotate postdelta threaddelta

I just ran it with this output:

Sphinx 0.9.8-rc2 (r1234)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/home/xxxx/sphinx/bin/sphinx_rr.conf'...
indexing index 'postdelta'...
collected 10 docs, 0.0 MB
collected 10903 attr values
sorted 0.0 Mvalues, 100.0% done
sorted 0.0 Mhits, 100.0% done
total 10 docs, 2932 bytes
total 0.019 sec, 152923.38 bytes/sec, 521.57 docs/sec
indexing index 'threaddelta'...
collected 2 docs, 0.0 MB
collected 100 attr values
sorted 0.0 Mvalues, 100.0% done
sorted 0.0 Mhits, 100.0% done
total 2 docs, 40 bytes
total 0.010 sec, 4000.00 bytes/sec, 200.00 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=605).


It's frustrating because it is an intermittent error--I kept trying to reproduce it but had no luck.

The output looks OK. But, it's updating the threaddelta.new.* files, not the threaddelta.* files (which remain at zero bytes, dated 5/29/09).

It's not a huge deal, since I can dump and regenerate all the indexes, but I just don't want this to happen if we put this on our "production" forum, and was curious to know how it happened.

UK Jimbo
07-15-2009, 07:57 PM
Those .new files relate to the --rotate option of the indexer; read more about it here: http://sphinxsearch.com/docs/current.html#ref-indexer

I would suspect that something is preventing the indexer from rotating the files out, perhaps the file permissions?

I'd suggest deleting all of the threaddelta files and then re-indexing.

RedWingFan
07-15-2009, 09:19 PM
Those .new files relate to the --rotate option of the indexer; read more about it here: http://sphinxsearch.com/docs/current.html#ref-indexer

That was what I suspected--something like a temporary file, IOW.

I would suspect that something is preventing the indexer from rotating the files out, perhaps the file permissions?

Nothing has changed on the server, so I'd attribute it to some kind of glitch in one (or all?) of those files dating back to May. Permissions and ownership match all the others in the same directory, which was the first thing I checked.

I just tried deleting the threaddelta files, and reran the command line:

/usr/home/shtv/sphinx/bin/indexer --config /usr/home/shtv/sphinx/bin/sphinx_rr.conf threaddelta

This created a new set of threaddelta.* files. OK, so far so good. But then I go and retry with the --rotate option, and we're back to having the .new.* files, and the threaddelta.* files don't get updated.

In the same directory, the post.* and postmeta.* files are all working properly.

Deeply weird...I could see if I had changed the configuration of this mess awhile ago, but I actually haven't touched it since July last year, when I first installed it. No other changes on the server, and we have plenty of disk space.

Still poring over the Sphinx docs you pointed to...but am not seeing much else helpful yet.

Thanks!

UK Jimbo
07-15-2009, 09:35 PM
have you tried restarting searchd and then trying it all over again?

Sounds like a strange one. Perhaps worth posting on the forum over at sphinxsearch.com

RedWingFan
07-16-2009, 01:07 AM
have you tried restarting searchd and then trying it all over again?


I'm willing to try anything. Although, the problem is with the indexer...unless something in searchd is somehow preventing the files from rotating properly (maybe searchd reporting that it's "busy", in other words, so the .new.* files don't get rotated in).

Sounds like a strange one. Perhaps worth posting on the forum over at sphinxsearch.com

I'll do a search over there--thanks! I'll probably try dumping the entire index, regenerating a new one, restart searchd, etc., and start with a clean slate before pestering them too much over there. :D

UK Jimbo
07-16-2009, 07:00 AM
I'm willing to try anything. Although, the problem is with the indexer...unless something in searchd is somehow preventing the files from rotating properly (maybe searchd reporting that it's "busy", in other words, so the .new.* files don't get rotated in).

In my mind it's 50:50 whether it's the indexer or searchd which is causing the problem. The indexer seems to be creating the index happily with the .new file name. After that it's the job of the search daemon to rotate the new index in.

With some of those .new files there, what happens if you signal a restart to searchd? Not tested, but I think this is correct: killall -HUP searchd

That's the same method that the indexer uses to signal a rotate to searchd. I can't remember if searchd keeps a system log of its activity. If so, the restarts (and any possible problems) might be reported.

Good luck!

RedWingFan
07-18-2009, 01:57 AM
I did:

searchd --config /.../my.conf --stop

...to stop, then restarted searchd. First time I ran the indexer, the .new.* files all were deleted. One hour and about 20 rotations later, they're still gone.
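The stop/start cycle above as a minimal sketch. The binary and config paths are placeholders, and the pause is there on the assumption that --stop may return before the daemon has fully exited.

```shell
#!/bin/sh
# Sketch only: paths are placeholders, override them via the environment.
SEARCHD="${SEARCHD:-/usr/local/bin/searchd}"
SPHINX_CONF="${SPHINX_CONF:-/usr/local/etc/sphinx.conf}"

# Stop searchd, give it a few seconds to shut down, then start it again.
restart_searchd() {
    "$SEARCHD" --config "$SPHINX_CONF" --stop
    sleep "${STOP_WAIT:-5}"
    "$SEARCHD" --config "$SPHINX_CONF"
}
```

After the restart, the next indexer run with --rotate should swap the .new files in.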

Thanks much--it worked!

UK Jimbo
07-31-2009, 09:48 AM
Afraid I've not trawled the whole thread for this but here's a product that makes the install of the two small plugins even easier. Use at your own risk, etc...

RedWingFan
08-05-2009, 01:51 AM
Afraid I've not trawled the whole thread for this but here's a product that makes the install of the two small plugins even easier. Use at your own risk, etc...

Thanks! I'll try it out and let you know how it works. I'm about to pull the plug on fulltext and try Sphinx on our main (production) forum. Can't live w/o InnoDB tables, as I'm finding. I'll report back here, good or bad. :D

The hardest part is trying to remember what I did to get Sphinx running on our test forum. Working my way through the posts here, and my own notes...I think I'm getting it. ;)

UK Jimbo
08-05-2009, 07:08 AM
The hardest part is trying to remember what I did to get Sphinx running on our test forum. Working my way through the posts here, and my own notes...I think I'm getting it. ;)

I'd recommend (this is true of any production roll-out):


Backup test site
Place a full copy of the live code (and db dump too if you think it's needed)
Go through migration/roll-out process on your test platform making notes
Repeat the above until you're confident of the steps you need to make
Repeat on production. It'll be easy as you've just done it on the test platform!

RedWingFan
08-05-2009, 02:10 PM
I have it running OK now.

Really basic (OK, stupid :D ) question. What do I need to turn OFF in vB to keep it from using its own internal search engine? IOW, aren't the built-in vBulletin search index tables going to be populated with the newest posts as they are made? From the way I see it, Sphinx bypasses the whole vB system, so vB shouldn't need to store any of the search terms anymore, unless another search process uses it. I've already dropped the fulltext indexes and changed the search type from fulltext back to vB internal, and am switching selected tables over to InnoDB to clear up the load issues we're having.

amcd
08-06-2009, 07:34 AM
I have it running OK now.

Really basic (OK, stupid :D ) question. What do I need to turn OFF in vB to keep it from using its own internal search engine? IOW, aren't the built-in vBulletin search index tables going to be populated with the newest posts as they are made? From the way I see it, Sphinx bypasses the whole vB system, so vB shouldn't need to store any of the search terms anymore, unless another search process uses it. I've already dropped the fulltext indexes and changed the search type from fulltext back to vB internal, and am switching selected tables over to InnoDB to clear up the load issues we're having.
No.

First set the search type back to fulltext. This will tell VB not to populate the search tables (word and another one). The onus of maintaining search data now shifts to MySQL.

Then drop the fulltext indices from post and thread tables.

RedWingFan
08-06-2009, 01:33 PM
I'll give that a try, thanks! My post and thread tables are converted to InnoDB, so they can't accept fulltext anyway.

Will doing this generate any kind of error, either from MySQL or vB? I don't think it will, but our visitors have a way of shaking out any type of rare bug or hiccup, when I least expect it. ;)

amcd
08-06-2009, 02:58 PM
If your tables are InnoDB, there will certainly be an error when you try to set search back to fulltext.

RedWingFan
08-11-2009, 03:24 PM
Just looked at this again: if I try to change it back to fulltext, it will attempt to change the InnoDB tables back to MyISAM. Don't want that!

In the settings table, "fulltextsearch" has its "value" column set to "0". There are other parameters in that row. Think I'm safe to change "value" to "1" (and rebuild the datastore cache)?

I just have to comment that since changing tables back to InnoDB and implementing Sphinx, our forum runs SO much better now! I had fulltext previously. Searching for "Steve" as a search word, it would take 35-40 seconds to get results. (Steve is our forum's owner, so his name appears in most threads.) With Sphinx, the searches come back on average around 0.4 seconds. And with the InnoDB change, we don't have stacks of queries waiting in the queue anymore.

The only hiccup I've had is that I once again had a set of *.new.* files, this time for my post indexes. I killed and restarted searchd, and reran the update, and it all rotated properly. My clue was a forum member saying he couldn't search for his posts for the past few days. Sure enough, the stale indexes were dated around the time he was unable to find his posts. I may have to run a cron job to check for any *.new.* files in that directory, and possibly put together a shell script to kill and restart searchd.
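
A cron-driven check along those lines might look like the sketch below. It only reports leftover files and does not restart anything. The function names are made up for illustration, and the directory to scan is whatever your own sphinx.conf "path" settings point at.

```shell
#!/bin/sh
# Sketch only: report leftover *.new.* files in a Sphinx index directory.
# The directory is passed in by the caller (hypothetical; use your own).

# Print every regular file in the directory whose name contains ".new."
find_stale_new_files() {
    find "$1" -maxdepth 1 -type f -name '*.new.*' | sort
}

# Exit status 0 when the directory is clean, 1 when leftovers exist.
check_rotation() {
    stale="$(find_stale_new_files "$1")"
    if [ -n "$stale" ]; then
        echo "WARNING: unrotated index files found:"
        echo "$stale"
        return 1
    fi
    echo "OK: no leftover .new. files"
}
```

Calling check_rotation with the index directory from a cron-run script would surface a failed rotate within the hour instead of days later, while leaving the decision about restarting searchd to a human.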

amcd
08-12-2009, 04:55 AM
I may have to run a cron job to check for any *.new.* files in that directory, and possibly put together a shell script to kill and restart searchd.

I would rather look for the cause of the malfunction. If most people do not get such problems, why should you?

--------------- Added 1250056638 at 1250056638 ---------------

Think I'm safe to change "value" to "1" (and rebuild the datastore cache)?

I have no idea. Maybe someone from Jelsoft can answer. Or you can try reading the AdminCP PHP script which changes search type.

UK Jimbo
08-12-2009, 07:06 AM
I would rather look for the cause of the malfunction. If most people do not get such problems, why should you?

I agree on this one.

RedWingFan, do you have a log of what the indexer is doing, including any messages from the cronjobs?

Shamil.
08-13-2009, 06:10 PM
Is this verified to work with the latest vB, 3.8.4 ?

RedWingFan
08-16-2009, 06:12 PM
I have no idea. Maybe someone from Jelsoft can answer. Or you can try reading the AdminCP PHP script which changes search type.

Well, I changed the value in the database, regenerated the options datastore, and it's working properly (as though I had chosen "Fulltext"). So that's a success!

RedWingFan, do you have a log of what the indexer is doing, including any messages from the cronjobs?

Memory allocation error during rotate. It happened again last night, and I found the memory allocation error in the log. I changed this value in my conf file: "seamless_rotate=0". Searching will be interrupted a bit when the indexes rotate, but since I do this during off-hours, it won't affect many users at all. (Apparently, with seamless rotation enabled, searchd loads the new indexes into RAM while still serving searches from the old ones, which takes extra memory during the rotate.) The mem_limit is at 512MB right now, but I don't want to increase that and possibly starve everything else on the server. http://www.sphinxsearch.com/docs/current.html#conf-seamless-rotate

One other setting looks a bit deceptive: max_matches is set for 1000 in my conf file, but I'm only pulling in 500 in vB's search. What Sphinx does, for max_matches, is send back the BEST 1000 matches, not necessarily just running the search and returning the FIRST 1000 matches it finds. My point here is that I should be setting the maximum search results in the conf file and in vB to be the same number. Basically I'm searching for 1000 best matches, but throwing away 500 of them for visitors. I will probably bump the forum to display those 1000 matches. Visitors will think they're getting a bonus. ;) http://www.sphinxsearch.com/docs/current.html#conf-max-matches
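
To catch that kind of mismatch, a small check could read max_matches out of the conf file and compare it with the limit set on the vB side. This is only a sketch: conf_value and check_limits are made-up helper names, the vB limit is passed in by hand, and the parsing assumes simple "key = value" lines.

```shell
#!/bin/sh
# Sketch only: compare max_matches in sphinx.conf with the result limit
# configured on the vBulletin side. Both arguments come from the caller.

# Print the first value assigned to a key in a sphinx.conf-style file.
# Lines that start with "#" are skipped because the key must come first.
conf_value() {
    sed -n "s/^[[:space:]]*${1}[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$2" | head -n 1
}

# Usage: check_limits CONF_FILE VB_LIMIT
check_limits() {
    conf="$1"
    vb_limit="$2"
    max="$(conf_value max_matches "$conf")"
    if [ -z "$max" ]; then
        echo "max_matches is not set in $conf"
        return 2
    fi
    if [ "$max" -eq "$vb_limit" ]; then
        echo "match: both limits are $max"
    else
        echo "mismatch: max_matches=$max, vB limit=$vb_limit"
        return 1
    fi
}
```

With the numbers from this post, check_limits on a conf containing max_matches = 1000 and a vB limit of 500 reports the mismatch and returns a non-zero status.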

Here is something else: http://www.sphinxsearch.com/docs/current.html#conf-enable-star . You can search Sphinx using the asterisk ("star") as a wildcard. I have thought of enabling this, but I am thinking that the search_sphinx.php file (or vB itself) would strip out the asterisk and make no difference during searches. This would be a neat addition. Has anyone else here tried it?

I came across another option for the indexer. When indexing, there is a --merge option for the indexer, which will merge your delta indexes with your main indexes, rather than generating new indexes once per day. I have it running now where the indexes are regenerated once each night. Would there be any disadvantage to using merge? If it goes correctly, it should work just as well, since you have essentially the same indexes when you're finished. But I can also see a tiny opportunity for the main indexes to get corrupted. Otherwise, --merge takes less time and CPU cycles, which is attractive. http://www.sphinxsearch.com/docs/current.html#index-merging

Finally, I see that Sphinx also has a plugin for MySQL, where you can specify using SphinxSE as an additional engine in MySQL. It will not do us much good here, I know, but I could see a future use for it between vB and Sphinx. http://www.sphinxsearch.com/docs/current.html#sphinxse

IMHO, given how much Sphinx's popularity is growing, and after having pored over the documentation this afternoon, it is disappointing that there apparently will not be any built-in support for Sphinx in vB 4.0. Sphinx can do a lot, and is a lot more flexible, than the built-in vB search, as well as MySQL's own fulltext indexing. A shame we'll probably still have to patch these hacks together to use Sphinx...

amcd
08-17-2009, 04:42 AM
I came across another option for the indexer. When indexing, there is a --merge option for the indexer, which will merge your delta indexes with your main indexes, rather than generating new indexes once per day. I have it running now where the indexes are regenerated once each night. Would there be any disadvantage to using merge? If it goes correctly, it should work just as well, since you have essentially the same indexes when you're finished. But I can also see a tiny opportunity for the main indexes to get corrupted. Otherwise, --merge takes less time and CPU cycles, which is attractive. http://www.sphinxsearch.com/docs/current.html#index-merging

moved/merged/edited/deleted posts are a huge reason to regenerate the whole index periodically.

RedWingFan
08-17-2009, 03:35 PM
moved/merged/edited/deleted posts are a huge reason to regenerate the whole index periodically.

Oh jeez...yeah, you're right! I didn't even think of that!
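
For anyone weighing the two approaches, the trade-off can be sketched as three small jobs: a frequent delta rebuild, an optional merge, and a periodic full rebuild that picks up moved, merged, edited, and deleted posts. The function names are made up, the config path is an assumption, and the index names you pass in must match your own conf.

```shell
#!/bin/sh
# Sketch only. INDEXER and CONF are assumptions; point them at your own
# indexer binary and config file.
INDEXER="${INDEXER:-indexer}"
CONF="${CONF:-/etc/sphinx/sphinx.conf}"

# Frequent job: rebuild only a small delta index.
# Usage: rebuild_delta DELTA_INDEX
rebuild_delta() {
    "$INDEXER" --config "$CONF" --rotate "$1"
}

# Optional job: fold a delta index into its main index.
# Usage: merge_delta MAIN_INDEX DELTA_INDEX
merge_delta() {
    "$INDEXER" --config "$CONF" --merge "$1" "$2" --rotate
}

# Periodic job: regenerate everything from the database, so changes to
# old posts (moves, merges, edits, deletions) reach the main indexes.
rebuild_all() {
    "$INDEXER" --config "$CONF" --rotate --all
}
```

Even if merge_delta is used during the day, keeping rebuild_all on a nightly or weekly schedule covers the case raised above, since a merge never revisits rows that are already in the main index.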

RedWingFan
08-24-2009, 05:48 PM
Here is something else: http://www.sphinxsearch.com/docs/current.html#conf-enable-star . You can search Sphinx using the asterisk ("star") as a wildcard. I have thought of enabling this, but I am thinking that the search_sphinx.php file (or vB itself) would strip out the asterisk and make no difference during searches. This would be a neat addition. Has anyone else here tried it?

Just curious if anyone has tried this at all. (This was buried in my last lengthy post.) I believe Sphinx also has phrase searching, where you enter search terms in quotes, but I have a feeling that vB strips out anything that's not alphanumeric...

Both of these would be a welcome addition to vB search. :)

columbusgeek
09-09-2009, 07:47 PM
RedWingFan, are you using this on your music forum? I was trying to find examples of people using it, but I'm having a hard time finding out whether they are or not. It's a lot of work to get this installed on the forum, and I want to make sure it's worth it.

RedWingFan
09-09-2009, 08:27 PM
Yes, we're using it, and aside from a few minor quirks (which I can't even remember right now), it is working great! I don't know if it helped server loads or not (we had to do several changes at once to get us out of our overloads), but it returns searches much faster than using vB's own built-in search.

I still wish vB would develop something official for this. The built-in searching (using vB or fulltext) is just too hard on the server resources...

amcd
09-10-2009, 06:24 AM
RedWingFan, are you using this on your music forum? I was trying to find examples of people using it, but I'm having a hard time finding out whether they are or not. It's a lot of work to get this installed on the forum, and I want to make sure it's worth it.
Believe me, it is worth the trouble.

cobaku
09-29-2009, 01:24 AM
I somehow managed to set it up. It was difficult for me because I was using the latest stable version of Sphinx.

I have one problem: I can only search the latest data.
Is this line the problem?
$cl->SetLimits(0, $vbulletin->config['Sphinx']['limit'], 1000);


indexer --config /etc/sphinx/sphinx.conf --all

did not work for me.

Do you see anything strange below?

search araba
Sphinx 0.9.8.1-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/etc/sphinx/sphinx.conf'...
index 'post': query 'araba ': returned 1000 matches of 7898 total in 0.004 sec

displaying matches:
1. document=80773, weight=2, forumid=994, threadid=390272, userid=16943, postuserid=16943, dateline=1240060017
2. document=84269, weight=2, forumid=1086, threadid=391331, userid=16952, postuserid=16952, dateline=1240498824
3. document=86419, weight=2, forumid=955, threadid=391866, userid=16974, postuserid=16974, dateline=1240837075
4. document=91826, weight=2, forumid=955, threadid=393613, userid=16974, postuserid=16974, dateline=1241437858
5. document=94475, weight=2, forumid=955, threadid=394697, userid=16974, postuserid=16974, dateline=1241693582
6. document=97095, weight=2, forumid=955, threadid=395466, userid=18089, postuserid=18089, dateline=1241947814
7. document=102383, weight=2, forumid=986, threadid=396996, userid=18498, postuserid=18498, dateline=1242514332
8. document=103128, weight=2, forumid=989, threadid=397406, userid=17130, postuserid=17130, dateline=1242605991
9. document=107136, weight=2, forumid=955, threadid=399391, userid=18089, postuserid=18089, dateline=1242998153
10. document=116811, weight=2, forumid=1152, threadid=402536, userid=16952, postuserid=16952, dateline=1244043857
11. document=123798, weight=2, forumid=926, threadid=404828, userid=16952, postuserid=16952, dateline=1244710168
12. document=128288, weight=2, forumid=955, threadid=406456, userid=17037, postuserid=17037, dateline=1245165814
13. document=148144, weight=2, forumid=1190, threadid=413235, userid=21936, postuserid=21936, dateline=1208151468
14. document=210224, weight=2, forumid=955, threadid=427389, userid=16952, postuserid=16952, dateline=1246699018
15. document=210783, weight=2, forumid=939, threadid=427756, userid=16974, postuserid=16974, dateline=1246781891
16. document=214556, weight=2, forumid=944, threadid=430330, userid=18089, postuserid=18089, dateline=1247149305
17. document=227784, weight=2, forumid=955, threadid=438935, userid=18089, postuserid=18089, dateline=1248347435
18. document=243160, weight=2, forumid=1220, threadid=451956, userid=25234, postuserid=25234, dateline=1249068635
19. document=246529, weight=2, forumid=1220, threadid=455202, userid=25234, postuserid=25234, dateline=1249081047
20. document=258231, weight=2, forumid=946, threadid=466152, userid=25264, postuserid=25264, dateline=1249240240

words:
1. 'araba': 7898 documents, 11201 hits

index 'postdelta': query 'araba ': returned 0 matches of 0 total in 0.000 sec

words:
1. 'araba': 0 documents, 0 hits

index 'thread': query 'araba ': returned 772 matches of 772 total in 0.001 sec

displaying matches:
1. document=372217, weight=1, forumid=956, dateline=1222623543, replycount=0, postuserid=16925, firstpostid=30496, lastpost=1222623543
2. document=374052, weight=1, forumid=1152, dateline=1224631688, replycount=0, postuserid=16943, firstpostid=35303, lastpost=1224631688
3. document=374598, weight=1, forumid=955, dateline=1225042981, replycount=0, postuserid=16930, firstpostid=36528, lastpost=1225042981
4. document=375077, weight=1, forumid=989, dateline=1225303575, replycount=0, postuserid=17146, firstpostid=37667, lastpost=1225303575
5. document=375583, weight=1, forumid=960, dateline=1225568475, replycount=13, postuserid=17130, firstpostid=38699, lastpost=1225569202
6. document=376008, weight=1, forumid=1066, dateline=1225729012, replycount=0, postuserid=17130, firstpostid=39542, lastpost=1225729012
7. document=376943, weight=1, forumid=962, dateline=1226785215, replycount=0, postuserid=16927, firstpostid=42252, lastpost=1226785215
8. document=379723, weight=1, forumid=1042, dateline=1230140294, replycount=0, postuserid=16922, firstpostid=49398, lastpost=1230140294
9. document=380356, weight=1, forumid=1042, dateline=1230341207, replycount=1, postuserid=16922, firstpostid=50241, lastpost=1230343991
10. document=382927, weight=1, forumid=930, dateline=1234112980, replycount=0, postuserid=18089, firstpostid=57175, lastpost=1234112980
11. document=383068, weight=1, forumid=942, dateline=1234249993, replycount=0, postuserid=16952, firstpostid=57670, lastpost=1234249993
12. document=383861, weight=1, forumid=940, dateline=1234901559, replycount=3, postuserid=16943, firstpostid=60303, lastpost=1234944063
13. document=384377, weight=1, forumid=1128, dateline=1235439321, replycount=1, postuserid=16974, firstpostid=61944, lastpost=1235511123
14. document=385050, weight=1, forumid=1081, dateline=1235918502, replycount=3, postuserid=18089, firstpostid=64607, lastpost=1235990640
15. document=386488, weight=1, forumid=947, dateline=1236764155, replycount=0, postuserid=16974, firstpostid=69070, lastpost=1236764155
16. document=387599, weight=1, forumid=940, dateline=1237369853, replycount=0, postuserid=16974, firstpostid=71921, lastpost=1237369853
17. document=390272, weight=1, forumid=994, dateline=1240060017, replycount=0, postuserid=16943, firstpostid=80773, lastpost=1240060017
18. document=391331, weight=1, forumid=1086, dateline=1240498824, replycount=1, postuserid=16952, firstpostid=84269, lastpost=1240500212
19. document=391866, weight=1, forumid=955, dateline=1240837075, replycount=0, postuserid=16974, firstpostid=86419, lastpost=1240837075
20. document=393613, weight=1, forumid=955, dateline=1241437858, replycount=0, postuserid=16974, firstpostid=91826, lastpost=1241437858

words:
1. 'araba': 772 documents, 780 hits

index 'threaddelta': query 'araba ': returned 0 matches of 0 total in 0.000 sec

words:
1. 'araba': 0 documents, 0 hits


Is it normal that nothing is returned from postdelta and threaddelta?

--------------- Added 1254255717 at 1254255717 ---------------

If I can ever overcome this problem, I will write a "Sphinx search for dummies" post. This post is so messed up.

Raun
10-21-2009, 05:06 AM
If I am installing this on a vbulletin installation that is setup as follows what box would it go on?

1 load balancer
4 web servers
1 master DB
1 slave DB

DaiTengu
10-21-2009, 05:35 AM
If I am installing this on a vbulletin installation that is setup as follows what box would it go on?

1 load balancer
4 web servers
1 master DB
1 slave DB

The slave DB server would be your best bet.

UK Jimbo
10-21-2009, 07:22 AM
If I am installing this on a vbulletin installation that is setup as follows what box would it go on?

1 load balancer
4 web servers
1 master DB
1 slave DB

It depends on the load on the two db servers. My guess would be that the least loaded in this configuration will be the master db server. Both master and slave are handling the same number of write (INSERT/UPDATE/DELETE) queries but the slave will be handling a load of reads too.

With a bit of scripting wrapping index copying (rsync probably) and restarting searchd you could generate the indexes on one machine and then copy them out to another one. This might be handy if you want to build some kind of redundancy into your setup or be able to balance sphinx traffic.
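
A rough sketch of that wrapper is below. Everything in it is an assumption for illustration: the function names, the host and directory arguments, and the choice to stop searchd before copying rather than after. It also assumes ssh access between the machines and searchd on the remote PATH.

```shell
#!/bin/sh
# Sketch only. Host names, directories and the config path are supplied by
# the caller; nothing here is taken from a real setup.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: push_indexes LOCAL_DIR REMOTE_HOST REMOTE_DIR REMOTE_CONF
push_indexes() {
    local_dir="$1"
    remote="$2"
    remote_dir="$3"
    conf="$4"
    # Stop the remote searchd so it releases the old index files.
    run ssh "$remote" "searchd --config $conf --stop" || return 1
    # Copy the freshly built index files across.
    run rsync -a "$local_dir/" "$remote:$remote_dir/" || return 1
    # Start searchd again on the new files.
    run ssh "$remote" "searchd --config $conf"
}
```

Running it with DRY_RUN=1 first prints the three commands without touching anything. Search on the remote machine is down while the copy runs, and as far as I know searchd --stop returns before the daemon has fully exited, so a production script would probably wait for the process to disappear before starting it again.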

Raun
10-21-2009, 02:32 PM
If I were to install it on both DBs would that work the same as copying the indexes?

UK Jimbo
10-21-2009, 02:54 PM
If I were to install it on both DBs would that work the same as copying the indexes?

There are two separate parts of using Sphinx:

1. Indexing (building up your index of data from the database)
2. Serving results with searchd

If you were able to serve results from either of the two machines then you'd be able to balance some of the load and also handle a failure of one of the instances.

To save processing time it should be quicker to index on one of the two machines and then copy the indexes across. The alternative is to have each of the machines create their own index which will use more processing power.

Raun
10-21-2009, 03:33 PM
Sorry for all the noob questions but I am a little new to this.

Once Sphinx is installed on the DB and the config is set up, how is it hooked into vB to serve the results there? Is that built into the config somewhere? Do I need to install anything else to get the results served?

abuthabit
10-29-2009, 09:46 PM
Hello

Sphinx is installed on our forums and it works fine, but when you search for something that is not in the index, for example "jkbljkhbjk", we get the errors below.

FireFox:
Content Encoding Error

The page you are trying to view cannot be shown because it uses an invalid or unsupported form of compression.
* Please contact the website owners to inform them of this problem.

Safari

Safari can’t open the page.
Safari can’t open the page “http://www.****.com/ee/search.php?do=process”. The error is: “cannot decode raw data”

IE8
IE can't display this webpage

No errors in the logs: lighttpd, MySQL, or Sphinx.

sphinx/query.log :
[Fri Oct 30 01:36:31.808 2009] 0.000 sec [all/0/attr- 0 (0,500)] [threadtitles] jkbljkhbjk

Installation guide from the following post
https://vborg.vbsupport.ru/showpost.php?p=1283359&postcount=3877

How could I fix this issue?

profanitytalker
10-30-2009, 06:49 AM
Can someone send me their edited search.php file? Mine won't work. I get confused by the last line of the edit...

Comment out "unset($datecut);" (-> "#unset($datecut);") so includes/sphinx.php can use it (for date range search). About line 12xx. You have to add a "#".

Where do I add the "#"?

abuthabit
10-31-2009, 09:35 AM
Can someone send me their edited search.php file? Mine won't work. I get confused by the last line of the edit...

Comment out "unset($datecut);" (-> "#unset($datecut);") so includes/sphinx.php can use it (for date range search). About line 12xx. You have to add a "#".

Where do I add the "#"?

This line:
unset($datecut);

add # before unset in the same line

so it should be like this

#unset($datecut);

cobaku
11-13-2009, 09:53 PM
This thing is too hard to set up. I have failed on my 10th try again.

Searching over SSH works just fine,

but search.php refuses to show the results on the web!
My only guess is that search_sphinx.php does not work on 3.8.4.

Has anyone tried the new method from post #1 on 3.8.4? Please send me your search_sphinx.php.

The vBulletin Sphinx plugin is sold for $2000 :p

ATForums
12-05-2009, 02:26 PM
Hi,

Using this with great success - trying to allow 3 char words. I have changed sphinx to include 3 chars ("wii") and results are returned from the CLI, so Sphinx seems OK.

But, in VB, I always get a no results message when I search for "wii" - I changed the min search size in VB Options to 3, but it doesn't help.

Any ideas?

John

--------------- Added 1260032491 at 1260032491 ---------------

Hi,

Using this with great success - trying to allow 3 char words. I have changed sphinx to include 3 chars ("wii") and results are returned from the CLI, so Sphinx seems OK.

But, in VB, I always get a no results message when I search for "wii" - I changed the min search size in VB Options to 3, but it doesn't help.

Any ideas?

John

Err...stupid me....I needed to restart searchd....did that and now all is well... :)

Thought all I had to do was rebuild indexes...

John

RedWingFan
12-10-2009, 07:23 PM
Sphinx 0.9.9-Stable is now out. I just compiled it on our dedicated box, killed and restarted searchd, and am rebuilding the indexes. So far, so good! I will have to wait until next month to see what new parameters I can work with to optimize it further (I'm under the gun for a paying job right now...gotta take the money while I can!)

ivanp
12-22-2009, 07:15 PM
Is there any update for this plugin for vBulletin 4?

columbusgeek
12-22-2009, 07:29 PM
Is there any update for this plugin for vBulletin 4?

vB4 has been gold for less than 24 hours now. Are you serious?

kontrabass
12-29-2009, 06:19 PM
I'm going to get some guys on this - VB4.0's search is way too slow for our forum of 5M posts. It'll take us some time though, we just started playing around with 4.0 in a test enviro. Floren's commercial sphinx solution (Searchlight) will have 4.0 support in the future as well.

RedWingFan
12-29-2009, 06:48 PM
We have close to 4.5 million posts, and even in our private test board with only around 170,000 posts, search drags. I can't see going back to MySQL fulltext search any time soon. (Plus, there are too many other issues with 4.0 that will keep us from upgrading for quite a few months.)

snakes1100
12-29-2009, 07:00 PM
We have close to 4.5 million posts, and even in our private test board with only around 170,000 posts, search drags. I can't see going back to MySQL fulltext search any time soon. (Plus, there are too many other issues with 4.0 that will keep us from upgrading for quite a few months.)

I'd guess it's those Sheep wandering over from Novi & clogging your server up w/ wool! :D

RedWingFan
12-29-2009, 08:24 PM
I'd guess it's those Sheep wandering over from Novi & clogging your server up w/ wool! :D

True dat! :D

;)

--------------- Added 1262125966 at 1262125966 ---------------

P.S. I'd blame the Steelers, actually--our server is in Pittsburgh. It's just the boneheaded server admin who lives near the Sheep...erm, Whalers, Dead Wings, etc. ;)

eoc_Jason
01-02-2010, 01:19 PM
Yeah, but for most of us Searchlight is not very affordable...

I'm still super pissed at Jelsoft for not having any sort of sphinx plug-in / switch for vB 4.0... From all the previous development talk I was led to believe they were going to implement it (finally)...

It's like they don't really care about the big boards... They are implementing all these other crazy features that maybe 0.2% of people *might* use, but not taking a hard look at the core of their product... *sigh*

I'm about to install vB 4.0 on my test server. At the very least I'm going to try and code sphinx for the basic search stuff. To standardize things I'm going to try and code it with the PECL::Sphinx library.

RedWingFan
01-02-2010, 03:01 PM
Yeah, but for most of us Searchlight is not very affordable...

I'm afraid to ask how much... :( We run on donation funds ONLY.

I'm still super pissed at Jelsoft for not having any sort of sphinx plug-in / switch for vB 4.0... From all the previous development talk I was led to believe they were going to implement it (finally)...

I only got the idea that the search system was more "modular" now (??), meaning it would be easier to support alternate search engines like Sphinx. All I really noticed in the AdminCP was that you only have one option called "DB Search" to choose from, and there is NO clue as to whether this is a built-in vB search or fulltext or something new entirely. It couldn't be that difficult to code for Sphinx over MySQL fulltext...and to be honest, I'd rather have it done officially vs. using a third party hack or plugin. (Not knocking them, since I'm more than guilty of using many third party plugins myself.)

It's like they don't really care about the big boards... They are implementing all these other crazy features that maybe 0.2% of people *might* use, but not taking a hard look at the core of their product... *sigh*

Then again, the percentage of "big boards" is small also. But those of us who need Sphinx and use it are now at the point where we can't just fall back to fulltext without possibly bringing a server to its knees. We are at that point now. Sphinx also seems to have a lot more available in terms of configuring user searches AND indexing.

I'm about to install vB 4.0 on my test server. At the very least I'm going to try and code sphinx for the basic search stuff. To standardize things I'm going to try and code it with the PECL::Sphinx library.

I don't know if there's any code I can "borrow" from the current Sphinx hack I'm using. But I do know that the two deal breakers for us are Sphinx and iTrader (which gets a lot of use in our Marketplace area)...all the other plugins and template mods we're using can be easily replaced. Definitely will wait a few versions, too, since the 4.0.0 seems rushed and I'm finding at least a dozen problems with it already...

Kevlar
01-02-2010, 08:59 PM
I purchased vB4 on the premise that the new search engine would work for big forums, but after playing with it and inquiring with "the people" I have heard otherwise. Even the search on their own site with vB4 is slow, and it is nowhere near the size of my forum. If that is the case, I couldn't go back to vB search even if I wanted to ... my forum would come to a grinding halt.

So while I do have vB4 purchased... I'm currently in a holding pattern. :(

RedWingFan
01-03-2010, 02:30 AM
So while I do have vB4 purchased... I'm currently in a holding pattern. :(

We are as well. I put it on our private test forum so the staff could get used to it, but it'll be quite a few months before we can put it on our main forum. The only good thing is that we had an expired license, so I can at least upgrade to 3.8.x and get us current in that area.

Is it just me, or do some parts of vB4 appear unfinished? IMHO it seems like it could have gone through another RC version or two before going gold. I've got a list of about a dozen issues with it so far--really, some are very minor, but still make me hesitant to recommend it even for new applications just yet.

Kevlar
01-03-2010, 04:47 PM
vB4 seems unfinished... but then again, most software that comes out these days is far from finished. Is it ready for production environments? Probably, if you are within the 70% of average vB users. I think I (we) fall outside of the average, which is why we notice all the little problems more. I would have no problem installing vB4 on other small forums that I used to run, but those were smaller unmodified cookie cutter forums... nothing like the monster I deal with now.

eoc_Jason
01-07-2010, 12:06 PM
I dunno what the average size vB forum is... But you would think if you are selling a forum product, you *want* your clients to be successful. If that means they are going to hit a brick wall after a few million posts in their DB, then you are severely limiting your market. There are quite a few sites that are running very old versions of vB, opting to just patch and write custom code on their own rather than wait for Jelsoft to come up with the official features.

FYI, Searchlight base cost is $2,000 USD...

I haven't had a chance to install 4.0 yet, but I'll check through the code to see what they did to eliminate the table locking. It's still moot though because if search results don't return within a second then people start getting click-happy or give up (both unacceptable).

Imagine if Google took 5 minutes to return every search you make? How many people would be googling?

Kevlar
01-07-2010, 12:23 PM
This post gives some insight to what they did to eliminate locking...
http://www.vbulletin.com/forum/entry.php?2376-Part-1-vB4mance-Helping-communities-grow-performance-data-model-changes-in-vB-4-0

You can read it if you want, but basically it allows changing the table from MyISAM to InnoDB so it is no longer table locking but row locking instead (if I understand it correctly).

alexi
01-07-2010, 12:42 PM
I dunno what the average size vB forum is... But you would think if you are selling a forum product, you *want* your clients to be successful. If that means they are going to hit a brick wall after a few million posts in their DB, then you are severely limiting your market. There are quite a few sites that are running very old versions of vB, opting to just patch and write custom code on their own rather than wait for Jelsoft to come up with the official features.

FYI, Searchlight base cost is $2,000 USD...

I haven't had a chance to install 4.0 yet, but I'll check through the code to see what they did to eliminate the table locking. It's still moot though because if search results don't return within a second then people start getting click-happy or give up (both unacceptable).

Imagine if Google took 5 minutes to return every search you make? How many people would be googling?

It is a lot of money but in terms of happy customers it's one of the best investments we ever made. I also looked at it in terms of how much time we would have to spend producing a home grown sphinx solution and then the fact that it would have a lot of the glitches described in this thread. When I added it all up searchlight made sense. I do understand that a lot of boards, even big boards with high traffic run on donations and it's just not affordable.

RedWingFan
01-07-2010, 02:55 PM
vB4 seems unfinished... but then again, most software that comes out these days is far from finished.

Having written my own web apps in PHP (and a few in JS/AJAX), I can agree that just about any software out there is a "work in progress". Especially mine! ;) However, there are some minor but very noticeable quirks in the vB4 interface that should never have gotten past the beta stage. I will say that functionally, the whole package does work...but it's minor interface issues that make using the forum a pain.

One that irks me no end is that if you use Quick Reply, you can't tab once over to a "post" button anymore. When I'm replying to dozens of posts per day, I tend to stick to a keyboard, and I use all of the shortcuts. Having to hop constantly over to a mouse to click one button is a huge oversight IMHO.

Most of what I found are minor quirks like that. Having said that, though, I've already found some usable plugins that replace what I have in my 3.x installations.

I dunno what the average size vB forum is... But you would think if you are selling a forum product, you *want* your clients to be successful. If that means they are going to hit a brick wall after a few million posts in their DB, then you are severely limiting your market.

I may have said this earlier, but I do know search appears to be more modular now. IOW, even though Sphinx is not built in, it appears that you can now plug in a module for a search and it would appear in the search engine selector (under "dbsearch") in admincp.

The search files themselves are in a new location, from what I can tell, and there are also new tables in the database for search. So, I know it has been somewhat reworked but, like you say, it's still slow, and...

Imagine if Google took 5 minutes to return every search you make? How many people would be googling?

I don't think that search speeds of 5-10 seconds are all that bad--as a long-time forum user (as many of our members are), I have come to expect that over the dozen or so years I've been using online forums. BUT...when vB takes 20, 30, sometimes 40 seconds to retrieve an often-used word in our forum, and there's a tool (called Sphinx) that returns same in 0.4 seconds...what choice would I make? Yep, I'd find any way I could to get Sphinx running.

Beside speed, I've mentioned elsewhere in these forums (even in this thread perhaps?) that I can configure what Sphinx indexes. With MySQL, I'm stuck with whatever is set up in the my.cnf file. I could reconfigure it, but then I'm doing customizations on MySQL for only one application (vB) and possibly hampering the performance of others. With Sphinx, our members are happy simply because they can now search for three-letter terms (where previously, it was four), and I've eliminated a lot of words in the stopwords file so that some common music-based titles can now be searched. I'm actually a bona-fide hero now because members can search for "Who's Next"! :D

FYI, Searchlight base cost is $2,000 USD...


Ummm, no thanks. We'd pay a couple hundred bucks, but for something that costs almost what hosting does for a year, and far more than the vB product itself...? We can't justify that, not on our donation-based (and advertising-free) funding. Might be justifiable for forums that are far busier than ours, but we can barely scrape up enough donations to keep going for a year at a time...


This post gives some insight to what they did to eliminate locking...
http://www.vbulletin.com/forum/entry.php?2376-Part-1-vB4mance-Helping-communities-grow-performance-data-model-changes-in-vB-4-0

You can read it if you want, but basically it allows changing the table from MyISAM to InnoDB so it is no longer table locking but row locking instead (if I understand it correctly).

I just read it, and you read it correctly. Although, one thing they overlooked in their description of vB3 is that we had two search choices: built-in vB search, and fulltext. We already had InnoDB table types set up with vB's built-in search, so this is nothing new to us. In fact, it goes back even further when we changed hosts three or four years ago and we hit a brick wall in performance; this was also back when fulltext was not yet an option in vB3.

In a way, it seems vB searching has taken a step backward. *sigh* We're now back to relying on a built-in search.

The main problem is that MySQL just doesn't scale well! Sphinx helps because it bypasses MySQL entirely during the search process. Sphinx isn't that hard to set up, even without root access.

eoc_Jason
01-12-2010, 01:21 PM
Switching to InnoDB is not really a viable option... Can you imagine how many more threads there will be specifically for InnoDB tuning? As if the people who didn't have a clue about my.cnf before are going to magically make this work... lol.

I spent yesterday installing sphinx (again) on my test server, not sure how it got uninstalled... Anyhow I'm using 0.9.9 (released Dec 2, 09)... I also installed the latest 3.8.x vBulletin....

After looking over the search.php file, it seems there are quite a few hooks available, so I'm going to try and code using those instead of having to edit the search.php file directly. I'm going to code this from scratch rather than try to patchwork more of Orban's code...

My question to you guys, what version of vBulletin are you running? For security reasons you can PM me directly if you don't feel comfortable posting in the thread, or you can just say 3.6, 3.7, 3.8, 4.0 and skip the minor version. Also if you could tell me how many threads/posts your forum has that would be helpful too.

I'm not 100% sure how I will release this yet (free vs paid)... Once I get closer to a final working product I'll probably ask a couple of you guys to help beta test this for me. I do not plan on even looking at 4.0 for a while... People just have too many negative things to say right now and there are still too many bugs....

mute
01-12-2010, 01:59 PM
Before IB released info on 4.0, they said Sphinx was in the works. Later, when they came out and said "oh, no sorry, no sphinx" I called them out on it and finally got a more concrete answer. From what I've been able to gather, they DO have a Sphinx implementation for 4.0, but they aren't releasing it because they don't consider it finished.

My guess is that they aren't happy with the "search lag" associated with using Delta indexes, and that's why they are holding back. There are realtime updates for Sphinx in the pipeline, but it's not stable yet. Personally, I think it's really stupid that they are probably holding big boards back from upgrading because of the lag between indexes, which is an easy trade-off compared to the terrible search performance of the default search.

We're running vB 3.6 plus backported security fixes with ~61 million posts and ~3.1 million threads. Our Sphinx implementation is based around the one from this thread, but I never got around to finishing it so that "Find all posts/threads by user" searches go to Sphinx. I'd love to have that functionality, but I don't really have the time to do it myself. We do a delta index every 5 minutes, and our users are a lot happier than they were years ago using the fulltext search. Granted, the Sphinx search that most of us are using is sort of crappy and doesn't take advantage of boolean operators and all the cool new stuff available in newer versions of Sphinx, but I'll take 0.5s search results over that stuff any day of the week.

Like most of you, we're not touching 4.0 with a 10-foot pole until they've figured out what they are doing and fixed all of the performance issues 4.0 has, such as the search, or the hundreds of queries per page on the CMS side of things.

As for Searchlight, I like Floren and I think his product is probably great, but I'm not paying that either. Not because I don't have the money, but I just couldn't justify the expenditure with so little information about the product (I realize it's not final yet). Spending 10 times the cost of the vB license for a better search engine? Pass. I'll wait for someone in the community to work out a 4.0 version. Now that the search is full of hooks, you can hook in pre-search-query and return Sphinx results without a bunch of nasty file hacks. It's going to be a lot cleaner than it used to be. I think the only "difficult" part about the whole deal will be setting up Sphinx for people who are cpanel-junkies.

As for InnoDB, we've been running InnoDB on a handful of tables for probably 5 years now, and don't have many real complaints about it, even with the drawbacks associated with it.

kontrabass
01-12-2010, 02:39 PM
My question to you guys, what version of vBulletin are you running? For security reasons you can PM me directly if you don't feel comfortable posting in the thread, or you can just say 3.6, 3.7, 3.8, 4.0 and skip the minor version. Also if you could tell me how many threads/posts your forum has that would be helpful too.

3.6

7.7M posts
540k threads

Good luck!

eoc_Jason
01-12-2010, 02:40 PM
Thanks for the feedback mute. I would be interested to see what the vB team has come out with thus far (for 4.0) but I doubt they would release the file to anyone.

I've been writing things out and reading over the sphinx documentation with a fine-tooth comb. Sphinx DOES support Boolean mode, along with like half a dozen other methods. One thing I plan on doing is offering ALL the features on the vB search page, so no template edit required there to strip things down or remove features... Also I'm going to try and implement the various search modes based on how a user inputs their search criteria... e.g. if they use quotes "This must match exactly" then it will use a phrase search.
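For anyone wondering what picking the search mode from the user's input could look like, here is a minimal Python sketch. It only returns the name of a Sphinx matching mode as a string (SPH_MATCH_PHRASE, SPH_MATCH_BOOLEAN and SPH_MATCH_ALL are real mode names in the Sphinx API); pick_match_mode and its detection rules are my own assumptions, not anything from the planned mod.

```python
import re

def pick_match_mode(query):
    """Guess a Sphinx matching mode from how the user typed the query.

    Returns (mode_name, cleaned_query). The rules are illustrative only.
    """
    q = query.strip()

    # Whole query wrapped in one pair of double quotes: exact phrase.
    if len(q) >= 2 and q.startswith('"') and q.endswith('"') and q.count('"') == 2:
        return "SPH_MATCH_PHRASE", q[1:-1].strip()

    # Boolean operators: & | ( ) anywhere, or - / ! at the start of a word.
    # A hyphen inside a word ("well-known") does not count.
    if re.search(r'[&|()]|(^|\s)[-!]\w', q):
        return "SPH_MATCH_BOOLEAN", q

    # Otherwise require all words, like a plain keyword search.
    return "SPH_MATCH_ALL", q

print(pick_match_mode('"This must match exactly"'))
print(pick_match_mode("sphinx -mysql"))
print(pick_match_mode("delta index"))
```

The returned name would then be mapped to the matching constant of whatever Sphinx client the mod ends up using. The point of doing the detection server-side is that the search form itself does not need an extra "mode" dropdown.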

I have an idea on how to keep near-time up-to-date with the reply #, views, last poster, etc... So that you wouldn't have to wait for your full re-index to update those values.

I also need to make a note to route the "similar threads" search to sphinx. I do not know if there is a hook for that (I remember the code is buried somewhere), so that *might* require a file edit, or disabling similar threads if you want to delete your fulltext indexes.

mute
01-12-2010, 02:58 PM
Sphinx DOES support Boolean mode, along with like half a dozen other methods.

Yeah, what I meant to say is that the Sphinx implementation floating around doesn't support it despite having support in Sphinx. I don't think anyone is going to complain if there are some file edits necessary to have 100% of the vB search functionality replicated on Sphinx. I surely would not :)

RedWingFan
01-12-2010, 03:22 PM
Switching to InnoDB is not really a viable option... Can you imagine how many more threads there will be specifically for InnoDB tuning? As if the people who didn't have a clue about my.cnf before are going to magically make this work... lol.

A lot of us have already been using InnoDB for years, so it's nothing new. It is always accounted for when we submit to tweaking threads over at vb.com. At many hosts though, InnoDB is nowhere near optimized and, if you're on a shared account, you can't get MySQL tweaked anyway. Folks with boards that small really don't need InnoDB though. (Although I found with the old phpBB2, I did need to change a couple of tables to InnoDB since it was locking for a minute at a time.)

I do not plan on even looking at 4.0 for a while... People just have too many negative things to say right now and there are still too many bugs....

Unfortunately I'm only interested in a 4.0 implementation--Sphinx is the only major reason now that we won't switch our boards over. The 3.x version works well enough for us. I'm at the "don't give a crap" point now with the forum owner. The members can like what they have with Sphinx, or just shut the **** up, IMHO.

My guess is that they aren't happy with the "search lag" associated with using Delta indexes, and that's why they are holding back. There are realtime updates for Sphinx in the pipeline, but it's not stable yet.

I run deltas every two minutes. The server barely feels it--the posts and threads index in less than a second. If you think about it, how often do you go searching for posts that were made in the last few minutes? Rarely at all. I couldn't care less if there are real time updates via Sphinx. Nobody has complained they can't find a post they made five minutes ago.
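For anyone new to the main + delta idea, here is a toy Python model of it. This is not how Sphinx stores anything internally; MainPlusDelta and its methods are invented for illustration. It only shows why a post is invisible until the next delta run and why the delta rebuild stays cheap (it only touches posts newer than the last full index).

```python
class MainPlusDelta:
    """Toy model of a big 'main' index plus a small, frequently rebuilt 'delta'."""

    def __init__(self):
        self.main = {}        # post id -> text, rebuilt rarely (e.g. nightly)
        self.delta = {}       # post id -> text, rebuilt every few minutes
        self.main_max_id = 0  # highest post id covered by the main index

    def rebuild_main(self, posts):
        # Full reindex: expensive, covers everything, empties the delta.
        self.main = dict(posts)
        self.main_max_id = max(posts, default=0)
        self.delta = {}

    def rebuild_delta(self, posts):
        # Cheap reindex: only posts added since the last full rebuild.
        self.delta = {pid: text for pid, text in posts.items()
                      if pid > self.main_max_id}

    def search(self, word):
        # A query looks at both indexes and merges the hits.
        word = word.lower()
        hits = []
        for index in (self.main, self.delta):
            hits.extend(pid for pid, text in index.items()
                        if word in text.lower().split())
        return sorted(hits)


posts = {1: "sphinx is fast", 2: "mysql fulltext is slow"}
idx = MainPlusDelta()
idx.rebuild_main(posts)

posts[3] = "sphinx delta test"   # new post arrives after the full index
print(idx.search("sphinx"))      # post 3 is not searchable yet (the "search lag")

idx.rebuild_delta(posts)         # the every-few-minutes job
print(idx.search("sphinx"))      # now post 3 shows up alongside post 1
```

The lag is simply the time between a post being made and the next delta rebuild, so a two-minute or five-minute schedule puts an upper bound on it.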

I think the only "difficult" part about the whole deal will be setting up Sphinx for people who are cpanel-junkies.

Good point. I won't even consider a host that uses that Cpanel crap. I've had to use a few for clients who had existing sites I was working on, and it just frustrated me having a lack of control over many things. Give me SSH and SFTP and that's all I need. I've done it that way for 13 years.

I've been writing things out and reading over the sphinx documentation with a fine-tooth comb. Sphinx DOES support Boolean mode, along with like half a dozen other methods. One thing I plan on doing is offering ALL the features on the vB search page, so no template edit required there to strip things down or remove features... Also I'm going to try and implement the various search modes based on how a user inputs their search criteria... e.g. if they use quotes "This must match exactly" then it will use a phrase search.

I noticed the same thing when I went over Sphinx's documentation a couple of months ago--there are so many options for both searching AND indexing that it seems a shame we are only able to use part of it. Phrase-based searching (with quotes), boolean, etc., would be a huge jump over what the default offers.

If it doesn't cripple anything, I would not mind beta testing any new search facilities that crop up here on our 3.x production forum. (Specs below, per your request. ;) ) For the 4.0, since it is private, I'm willing to try anything at this point. The staff doesn't mind helping me out on the user end of things, and they seem to find quirks I can never generate on my own. :D

Our forum: 200,000 threads; 4.6 million posts; 22,000 members. Running vB 3.7.x but looking to upgrade to 3.8.latest :D since we got the license upgrade several weeks ago.

mute
01-12-2010, 03:38 PM
At some point when I actually have a test 4.0 install running with our real dataset, I'm willing to help test as well. Then again, if it works with 1000 posts it's going to work with 60 million, so I'm not really that useful :)

I would really love an updated version that runs on 3.6 that takes advantage of all the Sphinx goodies, since I don't see us moving to 4.0 for close to a year. All of the custom code we've written has to be ported and tested, and being the lone admin on a site this big has my hands full a lot of the time.