Go Back   vb.org Archive > Community Discussions > Forum and Server Management
FAQ Community Calendar Today's Posts Search

Reply
 
Thread Tools Display Modes
  #271  
Old 01-18-2007, 07:37 PM
DigitalCrowd's Avatar
DigitalCrowd DigitalCrowd is offline
 
Join Date: Nov 2001
Posts: 25
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Well, then you get into having to fetch the current lastpost field of all matching threads and that would be, on larger boards, significant overhead. Now you move away from just searching Sphinx, to now search the database as well, then sorting your resulting arrays and pretty soon... you have a SLOW search again.

After further testing..

I rebuilt my index, search results in order. I did a reply to a post about the third the way down, not using the word "the" (my search word) in it. Now, when I do a search for "the", the third post down is out of order.

The ONLY way for a thread to get bumped up, is if the search word has been used again at a later time in that thread.

The Best way to do search, IMHO is to give more weight to recent threads, but get away from sort by date search results. Even Google doesn't offer this, but we are so accustomed to it in the forum world that getting people to break from it is hard. The best way would be to optimize how Sphinx (or the code that makes the call to Sphinx) weighs results.

I did notice on the Sphinx Forums that the sort mode "SPH_SORT_TIME_SEGMENTS" was made to address this and that if it doesn't work to our liking, it can be modified to perform better. I will have too look into it.

Without adding overhead to the search process, I think the days of instant lastpost sorting are gone, unless you can rebuild your full index every 15 minutes or so and for some people, that index process might last that long or longer.
Reply With Quote
  #272  
Old 01-25-2007, 08:35 PM
kontrabass kontrabass is offline
 
Join Date: Feb 2002
Posts: 139
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Boy, if ever a hack was ripe for a commercial opportunity... A no-fuss, easy-to-install sphinx search for VBulletin with full search functionality and smart results ordering - how many big boarders would pay big money for this? (I would... it'd be a lot cheaper than buying another new server... ). I certainly HOPE this free development continues, but I look forward to the time when bugs like this aren't an issue.
Reply With Quote
  #273  
Old 01-29-2007, 06:26 PM
stinger2's Avatar
stinger2 stinger2 is offline
 
Join Date: Jul 2005
Posts: 274
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by kontrabass View Post
Boy, if ever a hack was ripe for a commercial opportunity... A no-fuss, easy-to-install sphinx search for VBulletin with full search functionality and smart results ordering - how many big boarders would pay big money for this? (I would... it'd be a lot cheaper than buying another new server... ). I certainly HOPE this free development continues, but I look forward to the time when bugs like this aren't an issue.
bookmarking this
Reply With Quote
  #274  
Old 01-29-2007, 07:32 PM
Nerudo Nerudo is offline
 
Join Date: Jul 2005
Posts: 6
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

It´s amazing. I´m bookmarking too.
Reply With Quote
  #275  
Old 02-14-2007, 05:34 PM
jason|xoxide jason|xoxide is offline
 
Join Date: Jul 2006
Location: Exton, PA
Posts: 42
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

I wouldn't classify this as a bug, just more of an oversight. The max_matches variable in the sphinx.conf file is ignored when using the PHP API script. It doesn't matter if it's left at the default 1000, the 1500 that orban's file has been modified to use, or 1000000 as the config file says not to do. If you want to change the number of results returned then you need to change line 15 of 'includes/sphinx.php'.

Old:
PHP Code:
$cl->SetLimits 0$vbulletin->options['maxresults'] ); 
New (replace '2500' with the number you want):
PHP Code:
$cl->SetLimits 0$vbulletin->options['maxresults'], 2500 ); 
Otherwise, great work! This has really sped up searching on the forums I have used for testing (6K posts, 820K posts, and 3.2M posts).
Reply With Quote
  #276  
Old 02-19-2007, 06:22 PM
eoc_Jason's Avatar
eoc_Jason eoc_Jason is offline
 
Join Date: Dec 2001
Location: Houston, TX
Posts: 493
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Just installed on a forum with ~5.4 million posts... Before some searches would literally take forever (I had a query kill script that would kill the thread after a minute), now searches come back in less than a second!

Slow searches are a killer on a forum since it locks the post table. Also users seem to have a habit of clicking the search button multiple times if results aren't returned within a few seconds.

One thing, search results I've noticed don't always come back sorted by date properly. I skimmed over a few posts talking about this, I guess I need to go back and read it more in-depth. I'm sure the fix wouldn't be too hard.

Here's the data from the initial build, even with such a large index file it is still super fast.

Quote:
indexing index 'post_index'...
collected 5562411 docs, 1881.1 MB
sorted 189.9 Mhits, 100.0% done
total 5562411 docs, 1881073642 bytes
total 523.946 sec, 3590205.25 bytes/sec, 10616.38 docs/sec
indexing index 'post_index_delta'...
collected 75 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 75 docs, 18407 bytes
total 0.294 sec, 62659.52 bytes/sec, 255.31 docs/sec
indexing index 'thread_index'...
collected 357824 docs, 10.6 MB
sorted 1.2 Mhits, 100.0% done
total 357824 docs, 10635814 bytes
total 17.875 sec, 595012.62 bytes/sec, 20018.19 docs/sec
indexing index 'thread_index_delta'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.010 sec, 0.00 bytes/sec, 0.00 docs/sec
skipping index 'fulltext_post_index' (distributed indexes can not be directly indexed)...
skipping index 'fulltext_thread_index' (distributed indexes can not be directly indexed)...
Reply With Quote
  #277  
Old 02-19-2007, 08:58 PM
kontrabass kontrabass is offline
 
Join Date: Feb 2002
Posts: 139
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by eoc_Jason View Post
Just installed on a forum with ~5.4 million posts... Before some searches would literally take forever (I had a query kill script that would kill the thread after a minute), now searches come back in less than a second!

Slow searches are a killer on a forum since it locks the post table. Also users seem to have a habit of clicking the search button multiple times if results aren't returned within a few seconds.

One thing, search results I've noticed don't always come back sorted by date properly. I skimmed over a few posts talking about this, I guess I need to go back and read it more in-depth. I'm sure the fix wouldn't be too hard.

Here's the data from the initial build, even with such a large index file it is still super fast.
Thanks for the report! Mind if I ask a couple questions (gathering info to see where I would stand): Are you running a slave DB server for searches? What kind of hardware is behind your database? Thanks
Reply With Quote
  #278  
Old 02-19-2007, 09:39 PM
Mickie D Mickie D is offline
 
Join Date: Jun 2002
Posts: 430
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

this is fantastic well done all involved

im having a small issue i need some help with

1) if i issue the search command from ssh it gives me alot of results with words and if i do a search in the forums it gives me 1 or 2 results which i know there is alot more

2) i have a few words that i like included in the vboptions that are 3 letter words how do i enable them in sphinx without enabling all 3 letter words ?

any ideas ?

cheers
Reply With Quote
  #279  
Old 02-20-2007, 04:58 PM
mute mute is offline
 
Join Date: Dec 2002
Location: Phoenixville, PA
Posts: 152
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by DigitalCrowd View Post
Well, then you get into having to fetch the current lastpost field of all matching threads and that would be, on larger boards, significant overhead. Now you move away from just searching Sphinx, to now search the database as well, then sorting your resulting arrays and pretty soon... you have a SLOW search again.

After further testing..

I rebuilt my index, search results in order. I did a reply to a post about the third the way down, not using the word "the" (my search word) in it. Now, when I do a search for "the", the third post down is out of order.

The ONLY way for a thread to get bumped up, is if the search word has been used again at a later time in that thread.

The Best way to do search, IMHO is to give more weight to recent threads, but get away from sort by date search results. Even Google doesn't offer this, but we are so accustomed to it in the forum world that getting people to break from it is hard. The best way would be to optimize how Sphinx (or the code that makes the call to Sphinx) weighs results.

I did notice on the Sphinx Forums that the sort mode "SPH_SORT_TIME_SEGMENTS" was made to address this and that if it doesn't work to our liking, it can be modified to perform better. I will have too look into it.

Without adding overhead to the search process, I think the days of instant lastpost sorting are gone, unless you can rebuild your full index every 15 minutes or so and for some people, that index process might last that long or longer.
Digital,

Have you managed to find a fix for this short of doing a full reindex? We're still just doing incremental updates, but I am still pretty annoyed about the "out of order" results.
Reply With Quote
  #280  
Old 02-20-2007, 05:49 PM
kmike kmike is offline
 
Join Date: Oct 2002
Posts: 169
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by DigitalCrowd View Post
Well, then you get into having to fetch the current lastpost field of all matching threads and that would be, on larger boards, significant overhead. Now you move away from just searching Sphinx, to now search the database as well, then sorting your resulting arrays and pretty soon... you have a SLOW search again.
....
Without adding overhead to the search process, I think the days of instant lastpost sorting are gone, unless you can rebuild your full index every 15 minutes or so and for some people, that index process might last that long or longer.
Actually there is a function in vB which does exactly that (sorts the results by the specified field):
sort_search_items() in includes/functions_search.php
It's just one query, and it runs on a slave server if it's set up. Also its overhead depends only on the number of returned search results.
And you don't even need to run through it on every search, only when listing the results as threads, sorted by lastpost.
Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT. The time now is 02:10 AM.


Powered by vBulletin® Version 3.8.12 by vBS
Copyright ©2000 - 2025, vBulletin Solutions Inc.
X vBulletin 3.8.12 by vBS Debug Information
  • Page Generation 0.06175 seconds
  • Memory Usage 2,300KB
  • Queries Executed 14 (?)
More Information
Template Usage:
  • (1)SHOWTHREAD
  • (1)ad_footer_end
  • (1)ad_footer_start
  • (1)ad_header_end
  • (1)ad_header_logo
  • (1)ad_navbar_below
  • (1)ad_showthread_beforeqr
  • (1)ad_showthread_firstpost
  • (1)ad_showthread_firstpost_sig
  • (1)ad_showthread_firstpost_start
  • (2)bbcode_php
  • (5)bbcode_quote
  • (1)footer
  • (1)forumjump
  • (1)forumrules
  • (1)gobutton
  • (1)header
  • (1)headinclude
  • (1)navbar
  • (3)navbar_link
  • (120)option
  • (1)pagenav
  • (1)pagenav_curpage
  • (4)pagenav_pagelink
  • (3)pagenav_pagelinkrel
  • (10)post_thanks_box
  • (10)post_thanks_button
  • (1)post_thanks_javascript
  • (1)post_thanks_navbar_search
  • (10)post_thanks_postbit_info
  • (10)postbit
  • (10)postbit_onlinestatus
  • (10)postbit_wrapper
  • (1)spacer_close
  • (1)spacer_open
  • (1)tagbit_wrapper 

Phrase Groups Available:
  • global
  • inlinemod
  • postbit
  • posting
  • reputationlevel
  • showthread
Included Files:
  • ./showthread.php
  • ./global.php
  • ./includes/init.php
  • ./includes/class_core.php
  • ./includes/config.php
  • ./includes/functions.php
  • ./includes/class_hook.php
  • ./includes/modsystem_functions.php
  • ./includes/functions_bigthree.php
  • ./includes/class_postbit.php
  • ./includes/class_bbcode.php
  • ./includes/functions_reputation.php
  • ./includes/functions_post_thanks.php 

Hooks Called:
  • init_startup
  • init_startup_session_setup_start
  • init_startup_session_setup_complete
  • cache_permissions
  • fetch_postinfo_query
  • fetch_postinfo
  • fetch_threadinfo_query
  • fetch_threadinfo
  • fetch_foruminfo
  • style_fetch
  • cache_templates
  • global_start
  • parse_templates
  • global_setup_complete
  • showthread_start
  • showthread_getinfo
  • forumjump
  • showthread_post_start
  • showthread_query_postids
  • showthread_query
  • bbcode_fetch_tags
  • bbcode_create
  • showthread_postbit_create
  • postbit_factory
  • postbit_display_start
  • post_thanks_function_post_thanks_off_start
  • post_thanks_function_post_thanks_off_end
  • post_thanks_function_fetch_thanks_start
  • post_thanks_function_fetch_thanks_end
  • post_thanks_function_thanked_already_start
  • post_thanks_function_thanked_already_end
  • fetch_musername
  • postbit_imicons
  • bbcode_parse_start
  • bbcode_parse_complete_precache
  • bbcode_parse_complete
  • postbit_display_complete
  • post_thanks_function_can_thank_this_post_start
  • pagenav_page
  • pagenav_complete
  • tag_fetchbit_complete
  • forumrules
  • navbits
  • navbits_complete
  • showthread_complete