Thread: Sphinx Search
11-01-2007, 03:05 PM
eoc_Jason

Didn't want to see this thread die, as I still rely heavily on Sphinx for my vB search. It's disappointing that the vB team *still* has not come up with a built-in search solution that is acceptable for large forums.

One bit of advice: the code in the vB search does a lot of weighting & filtering. I have many cases where I search for a specific word that I know is in a post. A raw search finds it, but after vB works its magic it gives 'no results found'. So don't assume something is broken. I guess technically it is, but it's 'by design'.

Anyhow, here are some bits of code that might help get your Sphinx running a little smoother.

One thing you really should do is run the search daemon & indexer under a non-root user; I use 'sphinx'. If yours is different, then simply adjust the files accordingly. You will probably need to create a few of the directories mentioned in the scripts below (for the lock file & log, for example) and have them owned by the user that Sphinx runs as.

I run on Red Hat / CentOS; here's my script that goes in the /etc/rc.d/init.d/ directory. I call it 'searchd'. Simply use 'chkconfig' to add it and have it start up when your system boots.

Code:
#!/bin/sh
#
# searchd   This script starts and stops the sphinx search engine
#
# chkconfig: - 80 15
#
# description: Stand Alone Search Engine
# processname: searchd
# config: /usr/local/etc/sphinx.conf
# pidfile: /var/run/searchd/searchd.pid


# Source function library.
. /etc/rc.d/init.d/functions

RETVAL=0

start() {
        echo -n "Starting Sphinx: "
        sudo -u sphinx /usr/local/bin/searchd --config /usr/local/etc/sphinx.conf > /dev/null 2>&1
        RETVAL=$?
        if [ $RETVAL -eq 0 ]; then
          success startup
          touch /var/lock/subsys/searchd
        else
          failure startup
        fi
        echo
        return $RETVAL
}

stop() {
        echo -n "Shutting Down Sphinx: "
        kill `cat /var/run/searchd/searchd.pid`
        RETVAL=$?
        if [ $RETVAL -eq 0 ]; then
          success shutdown
          rm -f /var/lock/subsys/searchd /var/run/searchd/searchd.pid
        else
          failure shutdown
        fi
        echo
        return $RETVAL
}

restart() {
        stop
        start
}

case "$1" in
  start)
        start
        ;;
  stop)
        stop
        ;;
  restart)
        restart
        ;;
  status)
        status searchd
        ;;
  *)
        echo $"Usage: $0 {start|stop|restart|status}"
        exit 1
esac

exit $RETVAL
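
For anyone setting this up from scratch, here is a rough sketch of the one-time steps described above: creating the user, making the pid and log directories, and registering the init script with chkconfig. It only prints the commands unless you set RUN=1 as root. The `run` and `sphinx_setup` helpers are my own names for illustration, and I'm assuming the 'sphinx' user doesn't exist yet. Your Sphinx index/data directory also has to be writable by that user, but its path depends on your sphinx.conf, so it isn't covered here.

```shell
#!/bin/sh
# Sketch only. Prints each command by default; set RUN=1 (as root) to execute.
SPHINX_USER=sphinx   # the non-root user used in this post

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

sphinx_setup() {
    # System account for the daemon and indexer (assumed not to exist yet)
    run useradd -r "$SPHINX_USER"
    # pid and log directories referenced by the scripts in this post
    run mkdir -p /var/run/searchd /var/log/searchd
    run chown "$SPHINX_USER" /var/run/searchd /var/log/searchd
    # Make the init script executable, register it, and enable it at boot
    run chmod 755 /etc/rc.d/init.d/searchd
    run chkconfig --add searchd
    run chkconfig searchd on
}

sphinx_setup
```

Run it once as-is to review the commands, then again with RUN=1 if they look right for your box.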

Second, here is my cron script. I made a directory called 'cron.quarterly' that runs every 15 minutes, similar to cron.hourly, cron.daily, etc. You can set it up however you like, but if you make that directory, be sure to edit your /etc/crontab file accordingly too. I added a time check so that once a day, at 5am, Sphinx does a full reindex; the rest of the time it only rebuilds the delta indexes. The script also makes a lock file, so if your reindex takes a long time, you won't run into issues with overlapping runs. I chose 5am because it's a low-usage period for my forum and server. By default the cron.daily scripts run at 4am, so you wouldn't want to do it then, since everything else will be eating CPU cycles.

Code:
#!/bin/sh

# the lockfile is not meant to be perfect, it's just in case the
# two sphinx cron scripts get run close to each other to keep
# them from stepping on each other's toes.

LOCKFILE=/var/lock/subsys/sphinx_indexer

# If the lockfile exists then exit!

if [ -f $LOCKFILE ]; then
  echo "Lockfile already exists, not running sphinx indexer!"
  exit
fi;

touch $LOCKFILE

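# current hour (no leading zero) and minute
compareh=$(date +%k)
comparem=$(date +%M)

# first run of the 5am hour: full reindex; every other run: deltas only
if [ $compareh -eq "5" ] && [ $comparem -le "14" ]; then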
  sudo -u sphinx /usr/local/bin/indexer --config /usr/local/etc/sphinx.conf --rotate --all > /dev/null 2>&1
else
  sudo -u sphinx /usr/local/bin/indexer --config /usr/local/etc/sphinx.conf --rotate post_index_delta thread_index_delta > /dev/null 2>&1
fi;

rm -f $LOCKFILE

exit 0
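
As for the /etc/crontab edit mentioned above, the entry would look something like this, assuming the stock Red Hat / CentOS layout where the other cron.* directories are driven by run-parts lines (check your own file before copying):

```
# /etc/crontab: run everything in /etc/cron.quarterly every 15 minutes
*/15 * * * * root run-parts /etc/cron.quarterly
```

With that schedule the runs land on :00, :15, :30 and :45, so the 5:00 run is the one that passes the minute check in the script and does the full reindex.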

I also keep my logs in their own /var/log/searchd/ directory; you set this in the Sphinx config (most people probably just have them in /var/log). Again, that directory needs to be owned by the user that Sphinx runs as. Here's my logrotate config for them:

Code:
/var/log/searchd/*.log {
        missingok
        compress
        postrotate
          if test -n "`ps acx|grep searchd`"; then
            /sbin/service searchd restart  2> /dev/null > /dev/null || true
          fi
        endscript
}
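
One caveat with the `ps acx|grep searchd` test above: it matches anything with that name in the process list. If you'd rather check the pid file from the init script, something like the following might work. This is just a sketch; `searchd_running` is a made-up helper name, and the default path is the pidfile from the init script header.

```shell
#!/bin/sh
# Returns 0 if the pid file exists, is non-empty, and names a live process.
searchd_running() {
    pidfile=${1:-/var/run/searchd/searchd.pid}
    [ -s "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the process can be signalled.
    # Run this as root (as logrotate does), otherwise a process owned by
    # another user looks dead. A stale pid reused by another process would
    # still pass, so this is not bulletproof either.
    kill -0 "$pid" 2>/dev/null
}

if searchd_running; then
    echo "searchd is running"
else
    echo "searchd is not running"
fi
```

You could save it as a small script and call that from the postrotate section instead of the ps/grep line.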

Oh, and for people wanting to know how to implement sort_search_items(), here's some code. It goes in the includes/sphinx.php file.

At the end where there is the following:
Code:
                        if ($vbulletin->GPC['titleonly'] == $vbulletin->GPC['showposts'])
				$orderedids[$docinfo['attrs'][$sphinx_switch_fields]] = $docinfo['attrs'][$sphinx_switch_fields];
                        else
				$orderedids[] = $doc;
                }       
        }       
        else
		$orderedids = array();
}
Replace that with:

Code:
                        if ($vbulletin->GPC['titleonly'] == $vbulletin->GPC['showposts'])
                        {
                                $orderedids[$docinfo['attrs'][$sphinx_switch_fields]] = $docinfo['attrs'][$sphinx_switch_fields];
                                $itemids[$docinfo['attrs'][$sphinx_switch_fields]] = $docinfo['attrs'][$sphinx_switch_fields];
                        }
                        else
                        {
                                $orderedids[] = $doc;
                                $itemids["$doc"] = true;
                        }
                }
        }
        else
        {       
                $orderedids = array();
        }
        
        // #############################################################################
        // now sort the results into order
        // #############################################################################
        if (!$vbulletin->GPC['titleonly'] OR $vbulletin->GPC['showposts'])
        {       
                // sort by database field
                if ($vbulletin->GPC['sortby'] == 'post.dateline' || $vbulletin->GPC['sortby'] == 'lastpost')
                {       
                        if (empty($itemids))
                        {       
                                $errors[] = array('searchnoresults', $displayCommon);
                        }
                        else
                        {       
                                // remove dupes and make query condition
                                $itemids = iif($vbulletin->GPC['showposts'], 'postid', 'threadid') . ' IN(' . implode(',', array_keys($itemids)) . ')';
                                
                                // sort the results and create the final result set
                                $orderedids = sort_search_items($itemids, $vbulletin->GPC['showposts'], $vbulletin->GPC['sortby'], $vbulletin->GPC['sortorder']);
                        }
                }
        }
        
// END Results  
}

Be sure to leave the unset line at the very bottom.


I'm about to upgrade to vB 3.6.8 or whatever, and I'm running PHP 5.2.4, so I'll let you know how the upgrade goes. I want to look over everyone's modifications and updates in this thread first. Then I'll post some more files and stuff if necessary.