Didn't want to see this thread die, as I still rely heavily on Sphinx for my vB search. It's disappointing that the vB team *still* has not come up with a built-in search solution that is acceptable for large forums.
One bit of advice: the code in the vB search does a lot of weighting & filtering. I have many instances where I search for a specific word that is in a post, and with a raw Sphinx search I can find it, but after vB works its magic it gives a 'no results found'. So if that happens, don't assume your Sphinx setup is broken. I guess technically something is, but it's 'by design'.
Anyhow, here are some bits of code that might help get your sphinx running a little smoother.
One thing you really should do is run the search daemon & indexer under a non-root user; I use 'sphinx'. If yours is different, then simply adjust the files accordingly.
You will probably need to create a few directories mentioned in the scripts below (like the ones for the pid file & log) and have them owned by the user that sphinx is running as.
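For the directories, something like the following is roughly what is needed. It's only a sketch: the two paths are the pidfile and log directories used in the scripts in this post, while the function name and the SPHINX_USER and PREFIX variables are made up for illustration (PREFIX just lets you try it in a scratch directory first). It assumes the user already exists; on Redhat something like 'useradd -r sphinx' as root should create one.

```shell
#!/bin/sh
# Sketch only: create the directories the scripts in this post expect
# and give them to the user searchd runs as.
# SPHINX_USER and PREFIX are assumptions of this sketch, not sphinx settings.
SPHINX_USER="${SPHINX_USER:-sphinx}"
PREFIX="${PREFIX:-}"    # leave empty on a real system

make_sphinx_dirs() {
    for dir in /var/run/searchd /var/log/searchd; do
        mkdir -p "$PREFIX$dir" || return 1
        # chown needs root and an existing user; skip it otherwise
        if [ "$(id -u)" -eq 0 ] && id "$SPHINX_USER" > /dev/null 2>&1; then
            chown "$SPHINX_USER" "$PREFIX$dir" || return 1
        fi
    done
    return 0
}
```

Call make_sphinx_dirs as root at the end of the file to actually create them.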
I run on Redhat / CentOS; here's my script that goes in the /etc/rc.d/init.d/ directory. I call it 'searchd'. Simply use 'chkconfig' to add it and have it start up when your system boots.
Code:
#!/bin/sh
#
# searchd This script starts and stops the sphinx search engine
#
# chkconfig: - 80 15
#
# description: Stand Alone Search Engine
# processname: searchd
# config: /usr/local/etc/sphinx.conf
# pidfile: /var/run/searchd/searchd.pid
# Source function library.
. /etc/rc.d/init.d/functions

RETVAL=0

start() {
    echo -n "Starting Sphinx: "
    sudo -u sphinx /usr/local/bin/searchd --config /usr/local/etc/sphinx.conf > /dev/null 2>&1
    RETVAL=$?
    if [ $RETVAL -eq 0 ]; then
        success startup
        touch /var/lock/subsys/searchd
    else
        failure startup
    fi
    echo
    return $RETVAL
}

stop() {
    echo -n "Shutting Down Sphinx: "
    kill `cat /var/run/searchd/searchd.pid`
    RETVAL=$?
    if [ $RETVAL -eq 0 ]; then
        success shutdown
        rm -f /var/lock/subsys/searchd /var/run/searchd/searchd.pid
    else
        failure shutdown
    fi
    echo
    return $RETVAL
}

restart() {
    stop
    start
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        restart
        ;;
    status)
        status searchd
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
esac

exit $RETVAL
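To hook the init script into the boot sequence, the chkconfig steps look roughly like this. It's a sketch, not the only way to do it: the run() wrapper, the DRY_RUN variable and the function name are made up here so you can print the commands before running them for real as root. Note that the 'chkconfig: - 80 15' header means no runlevels are enabled by default, which is why the explicit 'on' is there.

```shell
#!/bin/sh
# Sketch only: register the searchd init script with chkconfig.
# DRY_RUN and run() are assumptions of this sketch; set DRY_RUN=1
# to just print the commands instead of executing them.
INIT_SCRIPT="${INIT_SCRIPT:-/etc/rc.d/init.d/searchd}"

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$@"
    else
        "$@"
    fi
}

install_searchd_service() {
    run chmod 755 "$INIT_SCRIPT" &&
    run chkconfig --add searchd &&
    run chkconfig searchd on &&
    run service searchd start
}
```

Run it once with DRY_RUN=1 to see what it would do, then call install_searchd_service as root.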
Second, here is my cron script. I made a directory called 'cron.quarterly' whose scripts run every 15 minutes, similar to cron.hourly, cron.daily, etc. You can do whatever you like here, but if you make that directory be sure to edit your /etc/crontab file accordingly too. I added a time check so that once a day sphinx does a full reindex, and the script makes a lock file so that if your reindex takes a long time you won't run into issues. Basically, at 5am it does a full reindex; the rest of the time it only reindexes the deltas. I chose that time because it's a low-usage period for my forum and server. By default the cron.daily scripts run at 4am, so you wouldn't want to do it then since everything else will be eating CPU cycles.
Code:
#!/bin/sh
# the lockfile is not meant to be perfect, it's just in case the
# two sphinx cron scripts get run close to each other to keep
# them from stepping on each other's toes.
LOCKFILE=/var/lock/subsys/sphinx_indexer

# If the lockfile exists then exit!
if [ -f "$LOCKFILE" ]; then
    echo "Lockfile already exists, not running sphinx indexer!"
    exit
fi
touch "$LOCKFILE"

# Between 5:00 and 5:14 do the daily full reindex, otherwise only the deltas.
compareh=$(date +%k)
comparem=$(date +%M)
if [ $compareh -eq "5" ] && [ $comparem -le "14" ]; then
    sudo -u sphinx /usr/local/bin/indexer --config /usr/local/etc/sphinx.conf --rotate --all > /dev/null 2>&1
else
    sudo -u sphinx /usr/local/bin/indexer --config /usr/local/etc/sphinx.conf --rotate post_index_delta thread_index_delta > /dev/null 2>&1
fi

rm -f "$LOCKFILE"
exit 0
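For the /etc/crontab edit, the line would look something like the one in this sketch, modelled on the stock run-parts lines for cron.hourly and cron.daily. The function name and the CRONTAB_FILE variable are made up for illustration, so you can point it at a copy of the file first.

```shell
#!/bin/sh
# Sketch only: add a run-parts line for cron.quarterly to /etc/crontab.
# CRONTAB_FILE is an assumption of this sketch so it can target a test copy.
CRONTAB_FILE="${CRONTAB_FILE:-/etc/crontab}"
ENTRY='*/15 * * * * root run-parts /etc/cron.quarterly'

add_quarterly_entry() {
    # only append if the exact line is not there yet
    if ! grep -qF -- "$ENTRY" "$CRONTAB_FILE" 2> /dev/null; then
        echo "$ENTRY" >> "$CRONTAB_FILE"
    fi
}
```

With '*/15' the script fires at :00, :15, :30 and :45, so only the 5:00 run lands inside the 'minute 14 or less' window and the full reindex happens once a day.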
I also keep my logs in their own /var/log/searchd/ directory; you set this in the sphinx config (most people probably just have them in /var/log). Again, that directory will need to be owned by the sphinx user that you use. Here's the logrotate config I use for them:
Code:
/var/log/searchd/*.log {
    missingok
    compress
    postrotate
        if test -n "`ps acx|grep searchd`"; then
            /sbin/service searchd restart 2> /dev/null > /dev/null || true
        fi
    endscript
}
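If restarting searchd on every rotation bothers you, a gentler postrotate may be possible. As far as I know, searchd reopens its log files when it receives SIGUSR1, but check the docs for your sphinx version before relying on that. A sketch, assuming the pidfile path from the init script; the function name and the PIDFILE variable are made up here:

```shell
#!/bin/sh
# Sketch only: ask a running searchd to reopen its logs instead of restarting it.
# Assumes searchd handles SIGUSR1 that way; verify this for your sphinx version.
PIDFILE="${PIDFILE:-/var/run/searchd/searchd.pid}"

reopen_searchd_logs() {
    [ -r "$PIDFILE" ] || return 1               # no pidfile, nothing to signal
    pid=$(cat "$PIDFILE")
    [ -n "$pid" ] || return 1                   # empty pidfile
    kill -0 "$pid" 2> /dev/null || return 1     # process is gone
    kill -USR1 "$pid"
}
```

The function returns non-zero when there is nothing to signal, so in a postrotate block you would still want the '|| true' on the end.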
Oh, and for people wanting to know how to implement sort_search_items(), here's some code. It goes in the includes/sphinx.php file.
At the end, where there is the following:
Code:
            if ($vbulletin->GPC['titleonly'] == $vbulletin->GPC['showposts'])
                $orderedids[$docinfo['attrs'][$sphinx_switch_fields]] = $docinfo['attrs'][$sphinx_switch_fields];
            else
                $orderedids[] = $doc;
        }
    }
    else
        $orderedids = array();
}
Replace that with:
Code:
            if ($vbulletin->GPC['titleonly'] == $vbulletin->GPC['showposts'])
            {
                $orderedids[$docinfo['attrs'][$sphinx_switch_fields]] = $docinfo['attrs'][$sphinx_switch_fields];
                $itemids[$docinfo['attrs'][$sphinx_switch_fields]] = $docinfo['attrs'][$sphinx_switch_fields];
            }
            else
            {
                $orderedids[] = $doc;
                $itemids["$doc"] = true;
            }
        }
    }
    else
    {
        $orderedids = array();
    }

    // #############################################################################
    // now sort the results into order
    // #############################################################################
    if (!$vbulletin->GPC['titleonly'] OR $vbulletin->GPC['showposts'])
    {
        // sort by database field
        if ($vbulletin->GPC['sortby'] == 'post.dateline' || $vbulletin->GPC['sortby'] == 'lastpost')
        {
            if (empty($itemids))
            {
                $errors[] = array('searchnoresults', $displayCommon);
            }
            else
            {
                // remove dupes and make query condition
                $itemids = iif($vbulletin->GPC['showposts'], 'postid', 'threadid') . ' IN(' . implode(',', array_keys($itemids)) . ')';
                // sort the results and create the final result set
                $orderedids = sort_search_items($itemids, $vbulletin->GPC['showposts'], $vbulletin->GPC['sortby'], $vbulletin->GPC['sortorder']);
            }
        }
    }
    // END Results
}
Be sure to leave the unset line at the very very bottom.
I'm about to make the upgrade to vB 3.6.8 whatever, and I'm running php 5.2.4, so I'll let you know how the upgrade goes. I want to look over everyone's modifications and updates in this thread first. Then I'll post some more files and stuff if necessary.