Go Back   vb.org Archive > vBulletin Modifications > Archive > vB.org Archives > General > Big Board Discussions
FAQ Community Calendar Today's Posts Search

Reply
 
Thread Tools
Is this a solution for large threads? Details »»
Is this a solution for large threads?
Version: , by amcd amcd is offline
Developer Last Online: Apr 2018 Show Printable Version Email this Page

Version: Unknown Rating:
Released: 03-27-2007 Last Update: Never Installs: 0
 
No support by the author.

The topic of large threads being detrimental to mysql performance is known to all big board admins. But why is it this way with vbulletin? Do all forum software have this problem? Does Gaia online suffer from similar drawbacks?

vb stores the threadid and dateline in the post table. Whenever a bunch of posts needs to be selected, vb filters the post table by threadid and orders the results by dateline, and takes a few of the posts (of course there is an index involved). In other words the query is similar to

SELECT * FROM post WHERE threadid=$threadid ORDER BY dateline LIMIT 8575,15

which means that mysql will have to throw away the first 8575 results and then return the next 15. If my understanding is correct, this is the reason why large threads are bad. We need only 15 rows, but mysql has to analyse thousands.

The post table is read in this manner very frequently, every time showthread.php is run. Other accesses to the post table are far less frequent. They are showpost, newreply, newthread, edit post and miscellaneous moderator functions like delete post, move/copy post, merge posts and merge threads. vb's method of doing things means that the frequent requirement (showthread) is slow whereas the infrequent requirements are much simpler and faster.

Now consider an alternative.

What if the position of the post in the thread is stored in the post table? Lets call this field 'postpos'. The first post in a thread will have postpos=1, and the 19th post will have postpos=19.

Now, the query for showthread will be something like

SELECT * FROM post WHERE threadid=$threadid AND postpos BETWEEN 8576 AND 8590 ORDER BY postpos

This should be extremely fast, and should be equally fast for any page in a thread of any size.

Of course, this means that the postpos column must be updated by all relevant modules of vbulletin. It will need to be given a proper initial value when a new post is made, and it will need to be updated whenever a post is moved, copied, merged or deleted. It will also need to be updated if a thread is merged or split. The additional code on newreply will not be running any heavy query, so it will be quite fast. Other functions may be slow for large threads, but since all functions except adding a new post happen quite infrequently, it should not matter so much.

Am I making a fundamental mistake somewhere? Is this feasible? Can this be done using plugins and without code edits?

Show Your Support

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #12  
Old 04-11-2007, 12:46 PM
louis_chypher's Avatar
louis_chypher louis_chypher is offline
 
Join Date: Mar 2006
Location: Boise, Idaho
Posts: 130
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

An approach I took on the matter of a forum gaining a large amount of post was to first install the mod [https://vborg.vbsupport.ru/showthread.php?t=113324]Auto Delete/Move Thread After X days [/url] and explore this mods capibilities.

A close inspection of the mod as orginally wrote allowed post(s) to be moved from forum(s) to a single destination forum.

What I wanted was to have the main forums on my site with the latest post and sub forums nested within each main forum holding posts from 120 days past. Using theAuto Delete/Move Thread After X days as a model I then changed the code to https://vborg.vbsupport.ru/showthrea...=113324&page=3 post #35. Which then allowed me to specify a souce forum to destination forum moves (main forum and 'archive' forum) as parameters. Now my main forums contian the latest topics of discussion (small db return result sets) and those main forums contain 'archived data' (large db result sets).

The mod runs as a cron job.
Reply With Quote
  #13  
Old 05-01-2007, 01:45 AM
Simetrical Simetrical is offline
 
Join Date: Nov 2006
Location: New York City
Posts: 28
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Here's a thought. Assume we know the number of posts in a thread. We do, it's in the thread table which we presumably have already queried to see whether the user can even view anything (thread deleted, etc.). First of all, to fetch the last page of posts (very common usage case):

SELECT postid, visible, userid
FROM post AS post
WHERE threadid = N AND visible IN (...)
ORDER BY dateline DESC
LIMIT ($numposts % $postsperpage);

I don't know about "reverse indices", but just reversing the sort will do fine. Basically you should sort ASC for the first half of the thread, DESC for the second half. That's already worlds better than the current status. (The funny LIMIT is necessary for the last page, of course.) But now the sneaky part. Say you want to go to the next page from the current one. Use:

SELECT postid, visible, userid
FROM post AS post
WHERE threadid = N
AND visible IN (...)
AND dateline > $datelineoflastpostonpage
ORDER BY dateline
LIMIT $postsperpage;

So instead of a link like showthread.php?t=N&page=77, you get links like showthread.php?t=N&after=1177986896. Exactly the number of needed rows get retrieved. Yeah, it will screw up if in the interim someone's deleted some posts earlier in the thread, but a) that's very unlikely and b) the user still wants to see the "previous page", i.e., the stuff before what he just finished reading, so in a way this is more expected behavior. The same can be done for "previous page".

The only bad spot is "jump straight to page" links, but using the DESC sorting for the highest page numbers will alleviate that, so with these two things combined, the only bad thing will be if someone wants to jump directly to a page in the middle of a thread. But any surviving page= URLs could be sharply limited by auto-redirecting them to appropriate &after= URLs (i.e., never have &page= URLs appear in the URL bar, even if it's useful to have them in links sometimes), so cut-and-paste URLs by users would still be efficient. Then there'd be basically no problem that I see with arbitrarily big threads. Maybe set up a cache in memory to store datelines for middle-of-thread page numbers to avoid the situation of a lot of people clicking a link that brings them directly to the middle of the thread and hasn't been optimized to &after=, just for completeness.


This was all inspired by a remark in a presentation by Domas Mitzuas on the infrastructure of Wikipedia, by the way. I realized that indeed, you'll always find "offset" in paged features of MediaWiki, never page numbers.
Reply With Quote
  #14  
Old 05-03-2007, 05:37 PM
orban orban is offline
 
Join Date: Jan 2005
Posts: 445
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

SEO will like you

A possible 10 gazillion URLs for every thread
Reply With Quote
  #15  
Old 05-03-2007, 07:56 PM
Simetrical Simetrical is offline
 
Join Date: Nov 2006
Location: New York City
Posts: 28
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Heh, like we don't have that already? t=143297, or pick one of p=1213498, 1213537, 1213543, . . ., 1240736, [whatever this post number will be], add on any page=, perpage=, lastpost, you name it. You'll want to rely on appropriate nofollows/noindexes and site maps if you want proper indexing of vB regardless of this, I'm thinking. Threads don't lend themselves well to unique URLs (because who says post X will stay on page Y of the thread, or in the thread at all for that matter?).
Reply With Quote
  #16  
Old 05-04-2007, 04:21 AM
kmike kmike is offline
 
Join Date: Oct 2002
Posts: 169
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

The first part of your post, until the offset talk, is exactly what I was talking about in the post #4. You'll need an appropriate compound index ending in "dateline" to take advantage of that however, and scanning it in reverse order (DESC) is why it was called a "reverse index".

Now to the offsets idea, you're just shifting the problem to another place if you want to keep the traditional thread pagination. Now instead of finding a first post on the requested page *once* you need to find the first posts for every page surrounding current page in the navigation area.
For example, looking at the first page in a thread, we'll have these pages in the pagination nav area:
1 2 3 4 11 51 101 Last
and you'll need the dateline offsets for the first posts on the pages 2, 3, 4, 11, 51, 101, and the last page. Ok, getting them for the first 10 pages is easy (though less effective) - just fetch 150 posts instead of 15, but what about pages 51, 101?

Now let's load page 10, what do we have in the pagination nav area?
First Prev 7 8 9 10 11 12 13 20 60 110 Next Last
You'll need an additional query to fetch the navigational data for the preceeding pages 7,8,9, and you still have to handle the pages 60, 110 somehow.

And another thing, basing the navigation on the pure dateline offsets is wrong - for busy enough forums/threads, some posts in a thread may have the same datelines. You can ignore it of course, but to be consistent you have to sort on dateline DESC, postid DESC or dateline ASC, postid ASC everywhere, otherwise the two posts with the same dateline will switch positions when they gradually move in the first half of the thread.
Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT. The time now is 11:29 PM.


Powered by vBulletin® Version 3.8.12 by vBS
Copyright ©2000 - 2024, vBulletin Solutions Inc.
X vBulletin 3.8.12 by vBS Debug Information
  • Page Generation 0.07998 seconds
  • Memory Usage 2,262KB
  • Queries Executed 20 (?)
More Information
Template Usage:
  • (1)SHOWTHREAD
  • (1)ad_footer_end
  • (1)ad_footer_start
  • (1)ad_header_end
  • (1)ad_header_logo
  • (1)ad_navbar_below
  • (1)ad_showthread_beforeqr
  • (1)footer
  • (1)forumjump
  • (1)forumrules
  • (1)gobutton
  • (1)header
  • (1)headinclude
  • (1)modsystem_post
  • (1)navbar
  • (6)navbar_link
  • (120)option
  • (1)pagenav
  • (1)pagenav_curpage
  • (1)pagenav_pagelink
  • (6)post_thanks_box
  • (6)post_thanks_button
  • (1)post_thanks_javascript
  • (1)post_thanks_navbar_search
  • (6)post_thanks_postbit_info
  • (5)postbit
  • (6)postbit_onlinestatus
  • (6)postbit_wrapper
  • (1)spacer_close
  • (1)spacer_open
  • (1)tagbit_wrapper 

Phrase Groups Available:
  • global
  • inlinemod
  • postbit
  • posting
  • reputationlevel
  • showthread
Included Files:
  • ./showthread.php
  • ./global.php
  • ./includes/init.php
  • ./includes/class_core.php
  • ./includes/config.php
  • ./includes/functions.php
  • ./includes/class_hook.php
  • ./includes/modsystem_functions.php
  • ./includes/functions_bigthree.php
  • ./includes/class_postbit.php
  • ./includes/class_bbcode.php
  • ./includes/functions_reputation.php
  • ./includes/functions_post_thanks.php 

Hooks Called:
  • init_startup
  • init_startup_session_setup_start
  • init_startup_session_setup_complete
  • cache_permissions
  • fetch_threadinfo_query
  • fetch_threadinfo
  • fetch_foruminfo
  • style_fetch
  • cache_templates
  • global_start
  • parse_templates
  • global_setup_complete
  • showthread_start
  • showthread_getinfo
  • forumjump
  • showthread_post_start
  • showthread_query_postids
  • showthread_query
  • bbcode_fetch_tags
  • bbcode_create
  • showthread_postbit_create
  • postbit_factory
  • postbit_display_start
  • post_thanks_function_post_thanks_off_start
  • post_thanks_function_post_thanks_off_end
  • post_thanks_function_fetch_thanks_start
  • post_thanks_function_fetch_thanks_end
  • post_thanks_function_thanked_already_start
  • post_thanks_function_thanked_already_end
  • fetch_musername
  • postbit_imicons
  • bbcode_parse_start
  • bbcode_parse_complete_precache
  • bbcode_parse_complete
  • postbit_display_complete
  • post_thanks_function_can_thank_this_post_start
  • pagenav_page
  • pagenav_complete
  • tag_fetchbit_complete
  • forumrules
  • navbits
  • navbits_complete
  • showthread_complete