Using Memcached to optimize vB hacks


MoMan
08-12-2010, 10:00 PM
As we all know, there are lots of great hacks here at vB.org, but some of them are certainly not written with large forums in mind. On my site, www.pentaxforums.com (http://www.pentaxforums.com/forums/), which averages 1,200-2,000 simultaneous members, I would love to install every useful hack that I come across, but I can't really afford to have silly statistics and gimmicks add global queries to the database.

Many hacks have one or more of these issues:
-Add global database queries
-Use slow/redundant database queries
-Repetitively perform strenuous computations

I've therefore turned to Memcached to cache frequently updated yet non-critical data in order to save queries and reduce page generation time. I've applied this to the following hacks, just to name a few:
-Top poster on forum home (5 minute caching)
-Forumhome social group stats (5 minute caching)
-Moderated posts / subscribed threads in notifications (5 minute caching)
-Cyb advanced new posts (5 minute caching)

In addition, I use this for:
-Fully caching the current forum activity in index.php
-Fully caching the output of showgroups.php

All in all, I've saved a total of 5 queries on forumhome and 3 global queries by applying the cache, and also significantly reduced page generation time, which is the real biggie. For example, without caching, my forumhome would take about 0.30 seconds to generate. With caching, however, the generation time falls to about 0.07 seconds - that's over 4x faster!

I'd like to share with you the code I wrote to accomplish the caching. It assumes you have Memcached installed on your server. This particular code is for the moderated posts and threads counts shown in notifications. As you can see, it's quite simple in structure, and can therefore easily be adapted to other hacks as well.

The general pseudocode is:


connect to cache
get data from cache
if data expired:
    get data from database
    update cache



// Cache data for a certain time period to reduce queries!
$cache = array();
$memcache = new Memcache;
// The port is an integer; memcached listens on 11211 by default
$memcache->connect('MEMCACHED SERVER IP GOES HERE', 'MEMCACHED PORT GOES HERE');

// Set for how long to cache, in seconds (600 = 10 minutes)
$cache['limit'] = 600;

// The keys used for caching
$cache['datafile'] = 'some_string';

// Check cache state
$data = $memcache->get($cache['datafile']);

if ($data === false)
{
    // Begin main add-on code
    $postcount = $db->query_first("
        SELECT COUNT(*) AS count
        FROM " . TABLE_PREFIX . "moderation AS moderation
        INNER JOIN " . TABLE_PREFIX . "post AS post ON (post.postid = moderation.primaryid)
        WHERE moderation.type = 'reply'
    ");

    $threadcount = $db->query_first("
        SELECT COUNT(*) AS count
        FROM " . TABLE_PREFIX . "moderation AS moderation
        INNER JOIN " . TABLE_PREFIX . "thread AS thread ON (thread.threadid = moderation.primaryid)
        WHERE moderation.type = 'thread'
    ");
    // End main add-on code

    // Save the data we just fetched to the cache
    $memcache->set($cache['datafile'], array($postcount['count'], $threadcount['count']), MEMCACHE_COMPRESSED, $cache['limit']);
}
else
{
    // Use data from the cache
    list($postcount['count'], $threadcount['count']) = $data;
}

// Code that uses the data
$vbulletin->userinfo['poststomoderate'] = $postcount['count'];
$notifications['poststomoderate'] = array(
    'phrase' => $vbphrase['posts_awaiting_moderation'],
    'link' => 'http://www.pentaxforums.com/forums/moderation.php?do=viewposts&type=moderated' . $vbulletin->session->vars['sessionurl_q'],
    'order' => 10
);
$vbulletin->userinfo['threadstomoderate'] = $threadcount['count'];
$notifications['threadstomoderate'] = array(
    'phrase' => $vbphrase['threads_awaiting_moderation'],
    'link' => 'http://www.pentaxforums.com/forums/moderation.php?do=viewthreads&type=moderated' . $vbulletin->session->vars['sessionurl_q'],
    'order' => 10
);

// Close the connection
$memcache->close();


Here, $postcount['count'] and $threadcount['count'] are the data from the queries that end up being cached. The nice thing is that even if the data is fetched from the cache and not the database, it can be accessed through the same variables. This works because you can store almost any value in memcache, even templates that have already been evaluated.

If you don't currently have the Memcache extension for PHP installed on your server, it's quite easy to install using PECL. To check whether it's installed, upload a script to your server that contains the following code:
<?php var_dump(class_exists('Memcache')); // bool(true) means the extension is loaded

A few notes:

-It's good practice to add vBulletin options for the cache timeouts so you can keep things centralized, and to use $vbulletin->config for the memcached servers.
-You can substitute time() calls with the constant TIMENOW if you're inside vBulletin.
-Only use my method for code with global queries or intensive computation! A trivial MySQL query is usually faster than connecting to memcache and reading from it. Try running some basic benchmarks using microtime() to find out if caching is worth it.
-Try to keep the individual data fragments you cache as small as possible. The smaller the data, the faster it can be fetched.
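
For the microtime() benchmark suggested in the notes, here is a minimal sketch. The helper average_runtime() and the stand-in sample_workload() are hypothetical names made up for illustration; swap the workload body for the real query or the real cache read and compare the two numbers.

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of $callback.
function average_runtime($callback, $iterations = 100)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++)
    {
        call_user_func($callback);
    }
    return (microtime(true) - $start) / $iterations;
}

// Stand-in workload; replace the body with the query or the cache read.
function sample_workload()
{
    return md5(str_repeat('x', 1000));
}

$seconds = average_runtime('sample_workload');
printf("%.6f seconds per call\n", $seconds);
```

Timing several iterations and averaging smooths out noise from a single run. If the cached path isn't clearly faster than the plain query, the caching probably isn't worth it.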


So, in conclusion, if used properly, this code can speed up your forum tremendously. If you have a big board, try giving it a spin! Also, if you're interested in reducing your server's load, look into installing APC for PHP. Note that in a production environment, it would be better to have a global memcache connection instead of initializing it every time you try fetching data.

I hope you guys found this useful!

ssslippy
08-28-2010, 07:21 PM
How hard is this to adapt to xcache? Running only 2 servers and no reason to run memcache yet.

MoMan
08-28-2010, 09:18 PM
I found the xcache API here: http://xcache.lighttpd.net/wiki/XcacheApi

So you'd basically want to change the memcache set and get calls to xcache_set and xcache_get
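
A rough, untested sketch of that swap, assuming XCache's variable cache is enabled (xcache.var_size greater than 0). The function name xcache_fetch_or_build() and the placeholder builder are made up for illustration; only xcache_isset, xcache_get and xcache_set come from the XCache API. When the extension isn't loaded, the sketch simply rebuilds the data every time.

```php
<?php
// Hypothetical helper: same get-or-regenerate flow, using XCache's variable cache.
function xcache_fetch_or_build($key, $builder, $ttl = 300)
{
    // Without the XCache extension, just build the data on every call.
    if (!function_exists('xcache_get'))
    {
        return call_user_func($builder);
    }

    // Use the cached copy while it exists
    if (xcache_isset($key))
    {
        return xcache_get($key);
    }

    // Otherwise rebuild it and store it for $ttl seconds
    $data = call_user_func($builder);
    xcache_set($key, $data, $ttl);
    return $data;
}

// Placeholder for the real queries (post count, thread count).
function build_moderation_counts()
{
    return array(3, 1);
}

$counts = xcache_fetch_or_build('moderation_counts', 'build_moderation_counts');
```

Unlike memcached, XCache stores its data in the web server's own memory, so there is no connection to open or close.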

monkeyboy1916
09-01-2010, 09:56 PM
I'd love to see a method working with xCache. I'm not able to at the moment but will try when I can, so if someone figures out this method, please post it for others to make use of~

Great idea btw MoMan.

abdobasha2004
09-03-2010, 06:31 PM
great
thanks

ssslippy
09-06-2010, 01:06 AM
Would it be possible to see the code you used for the showgroups page?

MoMan
09-11-2010, 02:40 PM
I'm not going to post the whole thing as it's site specific and would likely break things on others' sites, but the general steps I followed are (note that there are other approaches as well, such as caching the entire query, but I figured it would be best performance-wise to just cache everything):

1. Strip out the navbar/header/footer from the SHOWGROUPS template
2. In showgroups.php, cache the evaled output of SHOWGROUPS in a variable called $HTML using the code model at the top of this thread
3. At the end of the file, print the output using GENERIC_SHELL. This will prevent caching of the navbar/header/footer, so that they are always current.

Note that if you use multiple styles that don't share the same graphics, you either have to set $cache['datafile'] = 'something' . $styleid, or use a global replace for image folder paths.
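
The per-style key can be sketched like this. The helper name style_cache_key() and the style ID 12 are made up for illustration; in vBulletin you would pass in whatever style ID is active for the current user.

```php
<?php
// Hypothetical helper: one cache entry per style, so HTML that was
// generated with one style's image paths is never served under another.
function style_cache_key($base, $styleid)
{
    return $base . '_' . intval($styleid);
}

$cache = array();
$cache['datafile'] = style_cache_key('showgroups_html', 12);
```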

Shabcool
09-15-2010, 05:22 AM
thanks

compwhizii
09-15-2010, 02:41 PM
You should make use of the Datastore system: http://members.vbulletin.com/api/vBulletin/vB_Datastore.html

MoMan
09-18-2010, 04:15 AM
I've found that going through the datastore is much slower than this on-the-fly approach (which is still fine from a design standpoint as long as you use $vbulletin->config for the server info and an option setting for the caching time), since memcached isn't good at dealing with large amounts of data.

as7apcool
09-25-2010, 05:35 AM
thanks 4 this article

MoMan
10-25-2010, 01:20 PM
A small update: if your server has PHP stability issues, I recommend you wrap a class_exists('Memcache') conditional around the memcached calls.

abdobasha2004
10-29-2010, 01:07 PM
Extraordinary
will give this a try
I've got many mods that slow my forum down

labadora
07-10-2011, 04:56 AM
Would it be possible to see the code you used for the showgroups page?

Disasterpiece
07-10-2011, 05:47 AM
You are aware that there is an extra setting area for this in the config.php where you can switch the default datastore handler to memcached?

It's completely disastrous and unnecessary to re-code all hacks if this can be done with uncommenting a codeblock.

Read this: https://www.vbulletin.com/forum/entry.php/2391-Supercharge-your...

I highly recommend updating the first post.

kh99
07-10-2011, 10:28 PM
... there is an extra setting area for this in the config.php where you can switch the default datastore handler to memcached...

Well, if a mod isn't written to use the datastore then enabling memcached for the datastore won't change anything. And if it is written to use the datastore then it probably doesn't need to be changed. But I suppose you could argue that if you are going to change a mod then it makes more sense to change it to use the datastore.

MoMan
07-24-2011, 06:49 PM
One problem with memcached is that it can't store anything over 1 MB by default, and it's slow at reading out large chunks of data. Not surprisingly, I've thus found the database datastore to be better (faster) than the memcached datastore, as the datastore contains quite a bit of data, especially if you're running vBSEO. Therefore, caching the smaller stuff in memcache pays off, as does the extra effort taken to modify inefficient mods.
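
A rough way to check whether a value is likely to fit before trying to cache it. This is only a sketch: the constant and function names are made up, the 1 MB figure assumes memcached's default item size, and the estimate ignores key and item overhead as well as any savings from MEMCACHE_COMPRESSED.

```php
<?php
// Assumed limit: memcached's default maximum item size (1 MB).
define('CACHE_MAX_ITEM_BYTES', 1048576);

// Hypothetical helper: estimates the stored size from the serialized length.
function fits_in_memcached($data, $limit = CACHE_MAX_ITEM_BYTES)
{
    return strlen(serialize($data)) < $limit;
}

$small_ok = fits_in_memcached(array(12, 3));               // two counters
$large_ok = fits_in_memcached(str_repeat('x', 2000000));   // roughly 2 MB of HTML
```

Anything that fails the check is a candidate for splitting into smaller fragments or leaving in the database.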

What I've done is I added a global $vbulletin->memcache state via global.php. As such, storing data is very easy as I don't need to create an instance of the memcache object every time I want to use it.
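
One way such a shared handle could look, as a sketch only. The function name shared_memcache() and the default host and port are assumptions, not the code actually used in global.php. It returns null when the Memcache extension is missing or the server can't be reached, so callers can fall back to the database.

```php
<?php
// Hypothetical helper: returns one shared Memcache handle per request,
// or null when the extension is missing or the server is unreachable.
function shared_memcache($host = '127.0.0.1', $port = 11211)
{
    static $handle = false; // false means "not tried yet"

    if ($handle === false)
    {
        $handle = null;
        if (class_exists('Memcache'))
        {
            $candidate = new Memcache;
            // Suppress the connection warning; the return value tells us enough
            if (@$candidate->connect($host, $port))
            {
                $handle = $candidate;
            }
        }
    }

    return $handle;
}

$first  = shared_memcache();
$second = shared_memcache(); // reuses the first result, no second connection
```

In vBulletin the result could then be assigned once to $vbulletin->memcache, with the host and port read from $vbulletin->config.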

On my board, using memcached to cache the forumhome "who's online" block cuts index.php's generation time from 0.15s to about 0.07s. This is because fully caching the listings saves hundreds of evaluations of the forumhome_loggedinuser template. I think that in total, I cache 15 or so mods, and it really pays off. This includes my very own vb3 sidebar mod, which would otherwise take over a second to fetch the most recent data from the database.

One piece of advice I can give everyone is that you need to understand what memcache is good at prior to using it for caching. You don't want to write mods that rely on memcache to work, or that cache things for more than an hour, as you're then probably better off just using the database to begin with.

---------------------

Regarding showgroups.php, let me start by saying that nobody ever visits that page. But if you still want to cache it, there are two approaches: you can save the result of the query and fetch it from the cache, or you can cache the entire page. Note that unless you have hundreds of users on your showgroups page, you probably won't gain much from caching.

My forum has some 1,100 users listed, and takes between 1.5 and 2 seconds to generate if you don't cache. With the cache, it's spat out in under 0.1s, of which 0.07s can be attributed to memcache (again, on my server- this varies from site to site).

Here's the rough idea, with the actual vb code cut out as required by the license:

// $cache['timeoffset'] is assumed to be TIMENOW minus the timestamp stored under $cache['timefile']
if ($cache['timeoffset'] < $cache['limit'])
{
    // HTML for the output is obtained here if the cache is used
    $HTML = $vbulletin->memcache->get($cache['datafile']);
}
else
{
    // Execute the default script if needed

    // 2 is the default location field and the one we always use in the template
    // vbulletin code
    // *******************************************************

    // We're using GENERIC_SHELL, so populate the HTML variable
    eval('$HTML = "' . fetch_template('SHOWGROUPS') . '";');

    // Write the resulting HTML to the cache
    $vbulletin->memcache->set($cache['datafile'], $HTML, MEMCACHE_COMPRESSED);

    // Write the time of the caching
    $vbulletin->memcache->set($cache['timefile'], TIMENOW);
}

I'm pretty sure I modified my SHOWGROUPS template in order to be able to do full HTML caching, but I forget.

---------------------

Another idea for those who want to be even more efficient is to get both the time and data in a single $memcache->get call, which will save you 1 get per pageload at the expense of fetching unnecessary data if it turns out the cache needs to be updated. You can also try playing around with $memcache->add, although I've found the behavior of this to be somewhat unpredictable.
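
A sketch of the single-get idea: the timestamp and the data are stored together under one key. ArrayCacheStub is a made-up stand-in for the Memcache object so the example runs without a server, and fetch_with_timestamp() and build_page_html() are hypothetical names as well. With a real Memcache handle the get and set calls would look the same.

```php
<?php
// Stand-in for a Memcache handle; only get() and set() are mimicked,
// and the flag and expiry arguments are ignored.
class ArrayCacheStub
{
    private $items = array();

    public function get($key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set($key, $value, $flag = 0, $expire = 0)
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Hypothetical helper: one get() returns both the time and the data.
function fetch_with_timestamp($cache, $key, $limit, $now, $builder)
{
    $entry = $cache->get($key);
    if (is_array($entry) && ($now - $entry['time']) < $limit)
    {
        return $entry['data']; // still fresh
    }

    $data = call_user_func($builder);
    $cache->set($key, array('time' => $now, 'data' => $data));
    return $data;
}

// Placeholder for the expensive page generation; counts how often it runs.
function build_page_html()
{
    static $calls = 0;
    $calls++;
    return 'build #' . $calls;
}

$cache  = new ArrayCacheStub();
$first  = fetch_with_timestamp($cache, 'showgroups', 600, 1000, 'build_page_html');
$second = fetch_with_timestamp($cache, 'showgroups', 600, 1300, 'build_page_html');
$third  = fetch_with_timestamp($cache, 'showgroups', 600, 1700, 'build_page_html');
```

The second call lands inside the 600 second window and reuses the first build; the third falls outside it and rebuilds.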

MoMan
08-02-2011, 11:38 PM
FWIW I have come up with a new way to cache that doesn't have to check for a timestamp every time. This works especially nicely if you write your own memcached wrapper class.


$data = $memcache->get('key');

// Main content
if ($data === false)
{
    // generate $data
    $memcache->set('key', $data, MEMCACHE_COMPRESSED, 86400);
}

MoMan
08-09-2011, 09:54 AM
I have now updated the article in the first post with my new approach, which only makes 1 memcache call per regular pageload, and 2 when the cache needs updating. The old approach used 2 calls per pageload and 3 when the cache needed updating.

MoMan
10-31-2011, 07:13 AM
Here's my latest update to the caching function to make things even easier. Simply specify a cache key and a callback function - if the data exists in the cache, it will be fetched. Otherwise, the callback will be called and the returned data will be stored in cache and returned to the user.


/**
 * Attempts to fetch key from the cache. On failure,
 * calls the callback and stores the data
 */
public function fetch($key, $callback, $time = 3600)
{
    $data = $this->memcache->get($key);

    // Memcache::get returns false on a miss
    if ($data === false)
    {
        $data = call_user_func($callback);
        // Memcache::add takes the flag before the expiry time
        $this->memcache->add($key, $data, 0, $time);
    }

    return $data;
}


Usage:

$cacheobj->fetch('data_key', 'some_function');

See php's documentation on call_user_func for more info on what $callback can contain.
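
For reference, a quick sketch of the callback forms call_user_func accepts. The function and class names are made up for illustration; any of these values could be passed as $callback.

```php
<?php
// A plain function, referenced by name
function build_top_poster()
{
    return 'from a plain function';
}

class StatsBuilder
{
    public function topPoster()
    {
        return 'from an object method';
    }

    public static function groupStats()
    {
        return 'from a static method';
    }
}

$results = array(
    call_user_func('build_top_poster'),
    call_user_func(array(new StatsBuilder(), 'topPoster')),
    call_user_func(array('StatsBuilder', 'groupStats')),
    call_user_func(function () { return 'from a closure'; }), // closures need PHP 5.3+
);
```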

punchbowl
01-31-2015, 07:52 PM
Found this through googling. Fantastic insight and great to see a top quality, modern looking forum, still on vb3.8. Great work op and thanks for sharing.