The Archive of Official vBulletin Modifications Site. It is not a VB3 engine, just a parsed copy!
#1
Sphinx Search
Sphinx Implementation for vBulletin:
Version 0.1

Hooray! Just sharing as usual, let the discussions begin (in b4 TECK "MINE IS BETTER").

Only tested with Sphinx-0.9.8-rc2 (r1234; Mar 29, 2008).

If you are upgrading from my old tutorial, back up your search.php (you know, just in case you need the old hacked-up version again) and restore the original from the zip/tar. No more file modifications!

Sphinx downloads: http://sphinxsearch.com/downloads.html

Tested on 3.6.10. It should work on 3.7 if you modify /*insert query*/ on line 522 (I removed the 'prefixchoice' field because it doesn't exist in 3.6).

No support for tags/thread prefixes yet, because I don't have access to a 3.7 installation at the moment.
Similar threads is also being worked on.
This is an alpha release for some feedback; hopefully it will be production ready soon.

I assume you already have Sphinx up and running. See the attached sphinx.conf.example for a minimalistic setup.
Installation notes are inside search_sphinx.php.

Well yeah, enjoy. And PM me if you need help.

The old post is here: https://vborg.vbsupport.ru/showpost....&postcount=387

The Good:
The Bad:
The Ugly:
*The Infamous Post Sorting Quirk

What happens here is that when you "Search Entire Posts" and "Show Results as Threads", do you want your threads sorted by:
Our Sphinx setup does not have the first post and last post dateline stored in its post index (and it would be pretty much useless too), so the first two options are not available. vBulletin offers a function called "sort_search_items()" (search.php:633 in 3.7) which could, in theory, be used to sort the threads by last post dateline. It does not fix the problem though.

Let's assume we set maxresults to 5 and we are searching threads for "funny". We have 7 threads created today (each thread is followed by its matching post):

1. Thread "Cows", Created 08:00, Last Post 17:00 | "Funny Cows", Created 09:00
2. Thread "Cats", Created 09:00, Last Post 14:00 | "Funny Cats", Created 14:00
3. Thread "Dogs", Created 10:00, Last Post 12:00 | "Funny Dogs", Created 11:00
4. Thread "Mice", Created 11:00, Last Post 15:00 | "Funny Mice", Created 13:00
5. Thread "Rats", Created 12:00, Last Post 13:00 | "Funny Rats", Created 12:00
6. Thread "Eels", Created 13:00, Last Post 19:00 | "Funny Eels", Created 18:00
7. Thread "Fish", Created 14:00, Last Post 18:00 | "Funny Fish", Created 17:00

Do we want to show threads 6, 7, 2, 4, 5 (Sphinx)? Or do we want to show threads 6, 7, 1, 4, 2 (vB)?

vBulletin finds all 7 posts, orders them by last post descending, and grabs the top 5. Sphinx finds the newest 5 matching posts and then returns the associated threads.

Reordering search results with "sort_search_items()" does not fix the problem, because there might be older threads with very recent replies that Sphinx won't even consider. Let's consider an 8th thread:

8. Thread "Bees", Created 2002, Last Post 20:00 | "Funny Bees", Created 2002

vBulletin will list this one on top; Sphinx will not consider it. So even re-sorting the search items will not make this thread appear.
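The example above can be replayed in a few lines of Python. This is only a sketch of the two selection strategies, not the actual vBulletin or Sphinx code: the Thread class, the function names, and the encoding of times as hours of the day (with -1 standing in for the 2002 post) are assumptions made up for illustration. Each thread has exactly one matching post here, so no de-duplication of threads is needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    number: int         # thread number from the list above
    name: str
    last_post: int      # hour of the thread's newest post (any post)
    matching_post: int  # hour of the post that matches "funny"


THREADS = [
    Thread(1, "Cows", 17, 9),
    Thread(2, "Cats", 14, 14),
    Thread(3, "Dogs", 12, 11),
    Thread(4, "Mice", 15, 13),
    Thread(5, "Rats", 13, 12),
    Thread(6, "Eels", 19, 18),
    Thread(7, "Fish", 18, 17),
]

# Thread 8: matching post from 2002 (encoded as -1, i.e. older than
# anything today), but the thread itself got a reply at 20:00.
WITH_BEES = THREADS + [Thread(8, "Bees", 20, -1)]


def vb_order(threads, maxresults):
    """vB-style: take every match, sort by last post, then cut."""
    ranked = sorted(threads, key=lambda t: t.last_post, reverse=True)
    return [t.number for t in ranked[:maxresults]]


def sphinx_pick(threads, maxresults):
    """Sphinx-style: cut to the newest matching posts first."""
    ranked = sorted(threads, key=lambda t: t.matching_post, reverse=True)
    return ranked[:maxresults]


def sphinx_order(threads, maxresults):
    return [t.number for t in sphinx_pick(threads, maxresults)]


def sphinx_resorted(threads, maxresults):
    """Sphinx pick, re-sorted by last post afterwards."""
    picked = sphinx_pick(threads, maxresults)
    ranked = sorted(picked, key=lambda t: t.last_post, reverse=True)
    return [t.number for t in ranked]


if __name__ == "__main__":
    print("vB:             ", vb_order(THREADS, 5))
    print("Sphinx:         ", sphinx_order(THREADS, 5))
    print("vB with Bees:   ", vb_order(WITH_BEES, 5))
    print("Sphinx resorted:", sphinx_resorted(WITH_BEES, 5))
```

With the seven threads this gives 6, 7, 1, 4, 2 for the vB-style ordering and 6, 7, 2, 4, 5 for the Sphinx-style one. Once "Bees" is added, the vB-style ordering puts thread 8 first, while the re-sorted Sphinx pick never contains it, because the cut to 5 happened before the re-sort.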
#2
Nice find! I'll play around with it once I get some time.
#3
Obviously the only options you will have on the advanced search page are:
Key Words:
Search In: Thread Titles/Posts
Sort Results by: Relevancy, Date Asc, Date Desc
Search in Forums:

And I guess searching by username will still be the built-in way. (As in, without a search term, just list his posts.)

Gonna try to hack that up, when I make it work I'll release it I hope.

But the fact you can index 4k posts/second is absolutely insane, and that was with 800 users online...
#4
Hmm, yes, that looks interesting, bookmarked for later.
#5
It also means I can remove that 400 MB fulltext index from the post table, making MySQL even faster.
The right tool for the job. Filtering by forumid already works, so does sorting by date. And it still says 0.000003 seconds. Incredible.
#6
Hmm, good timing. I got on here today to see if there were any other resources out there for searching and vBulletin, and this showed up in the results.
We've had soooo much trouble keeping our search up. We're using the fulltext search right now, with the search on its own server on tables reduced in size. Huge pain, and it still doesn't return some results. Keep us updated please, this looks cool.
#7
Awesome!
If I get some time tonight (probably not!) I will download Sphinx and give it a look. What kind of data do you have to test this with? We're looking at about 9 million records on our live post table (millions more archived). I'm very curious how well this would hold up to that amount of data.
#8
Can I get a peek at your sphinx.conf?
#9
wow, you are fast! thanks. I'm tossing it 24 million posts to see what it does
#10
*waits for post index to build*
So far so good. It ripped through 1,652,726 thread titles in about 2 minutes, on a machine replicating a very active forum, and one running a test upgrade from 3.5.5 to 3.6.1.

So far, I'm happy! I think with a little work this could be amazing. The API is a little unfriendly when it comes to errors and whatnot, but with some polishing, and figuring out the targeting of searches and searching by name, we're good to go.

Orban you are a hero among men!

Just FYI:

thread table:
collected 1658976 docs, 48.1 MB
sorted 5.1 Mhits, 100.0% done
total 1658976 docs, 48070959 bytes
total 148.426 sec, 323872.56 bytes/sec, 11177.16 docs/sec

post table:
collected 8860446 docs, 1416.9 MB
sorted 140.2 Mhits, 100.0% done
total 8860446 docs, 1416892676 bytes
total 3168.862 sec, 447129.84 bytes/sec, 2796.10 docs/sec

That is with a word length of 4 and no stopwords.
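The rates in the indexer output are just the totals divided by the elapsed time. A small Python sketch that recomputes them from the two "total" lines; the function names and the regular expressions are my own, written against the output pasted above, and are not part of Sphinx:

```python
import re

THREAD_TOTALS = """\
total 1658976 docs, 48070959 bytes
total 148.426 sec, 323872.56 bytes/sec, 11177.16 docs/sec
"""

POST_TOTALS = """\
total 8860446 docs, 1416892676 bytes
total 3168.862 sec, 447129.84 bytes/sec, 2796.10 docs/sec
"""


def parse_totals(text):
    """Pull (docs, bytes, seconds) out of the two 'total' lines."""
    docs_match = re.search(r"total (\d+) docs, (\d+) bytes", text)
    time_match = re.search(r"total ([\d.]+) sec", text)
    if docs_match is None or time_match is None:
        raise ValueError("indexer totals not found in text")
    docs = int(docs_match.group(1))
    size = int(docs_match.group(2))
    seconds = float(time_match.group(1))
    return docs, size, seconds


def throughput(text):
    """Return (docs per second, bytes per second)."""
    docs, size, seconds = parse_totals(text)
    return docs / seconds, size / seconds


if __name__ == "__main__":
    for label, text in (("thread", THREAD_TOTALS), ("post", POST_TOTALS)):
        docs_rate, byte_rate = throughput(text)
        print(f"{label}: {docs_rate:.2f} docs/sec, {byte_rate:.2f} bytes/sec")
```

The recomputed values agree with the reported ones to within rounding of the elapsed time, roughly 11177 docs/sec for thread titles and roughly 2796 docs/sec for posts.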