PDA

View Full Version : Post table is too big, how to reduce or split it?


chikkoo
05-23-2009, 08:30 PM
Hi..

My forum has been running for last 7 years, therefore the Post table has grown more than 13 gb. Due to the size of the post table, it is crashing very often, at least once or twice a month.

My server management company suggested me that there should be an in-built function in vbulletin or should be a Mod to move the very old and inactive threads of certain no. of days to move to a different table or database. So that my current table size will become very less.

Does anyone know how to reduce the Post table by moving inactive old threads to a separate database or table.

Thanks for your suggestion to save my forum from crashing every month.

Paul M
05-23-2009, 09:28 PM
There is no inbuilt function to do this, you would need to run a few manual queries to copy old threads/posts to "archive" tables and delete them from your live tables. Note that vbulletin would no longer have access to those threads/posts.

DragonBlade
10-08-2009, 03:12 AM
Sorry for reviving a months-old thread, but I find myself in a similar predicament here.

However, I'm attempting to modify it somewhat.

I have a 2.2GB InnoDB post table. (We had changed it over from MyISAM after using Sphinx to be able to get rid of the FULLTEXT search.) I find that it is much too large for our little 540 Linode to handle.

Instead, I am planning on creating an "Archive" system, where Mods can toss threads into an Archive forum (and maybe have the Admins able to batch enter them into a new table, unless it would be better to just do it each time a Mod moves a thread there). This new table would be MyISAM, as MyISAM has a higher read rate if I am not mistaken than InnoDB. Since there would not be very many writes to it (only when Mods archive a thread), I would not have to worry about table locking issues, which is why I considered InnoDB in the first place.

I say "this table," but it would actually be two tables: one for a post_archive and one for a thread_archive. It would take the content from the InnoDB "live" tables, insert it into the "dead" MyISAM tables, and kill the shit on the InnoDB "live" tables.

I would have to configure another search index (or dozen, lol) on Sphinx, I would need the two tables, and I would have to replicate the "showthread" and "showpost" pages for the archive... Am I missing anything here?

Also, has there ever been a Mod written for this? I'd be happy to share my steps if not, and my results.

drjarmin
11-19-2009, 03:40 AM
Do mySQL db's ever need 'shrinking'? Is there a function to do this? Mine is also growing too big, but not sure why.

Thanks

mikejp
11-19-2009, 03:44 PM
We would love such a mod.

The other part that it would have to do, is redirect the users to the archives from the old URL.

Have you started your project? We might be able to assist in it as well if you would like.

Mike

kris
01-17-2010, 02:04 PM
chikkoo , did you find any solutions ?

DragonBlade is there any progress in your development ?

We need to find good solution to this problem, big size of my post table keeps me away from adding more mods to my forum.

Amenadiel
01-22-2010, 11:16 AM
MySQL 5.1 supports table partitioning. BUT, you can't partition a table with fulltext indexes, so there's the catch.

raywjohnson
01-23-2010, 09:17 PM
A possible solution: A second install of vB.

Create a product for both the Live and Archive installs. (but have them use the same database, with different table prefixes)

The Live product:

daily cron (Scheduled Task) to do the archiving
have a cutoff (in days) for what threads will be archived (using the date of the last post)
allow the marking of threads "no archive" to prevent archiving
ignore archiving of sticky threads


The Archive product:

daily cron (Scheduled Task) to clean up the archived threads/forums (i.e. build_thread_counters, forums via build_forum_counters and build_forum_permissions)
lock down most scripts on the arcive install ( only these would need to be accessible: index, forumdisplay, faq, search, image, showthread, showpost, actions, archive, misc )


If you use Sphinx Search, you would have to do a seperate install for that as well.

This would need to be able to detect changes to the thread tables (and post table) due to other products.

To make this work flawlessly, you would need to find a way to allow access to the Archive by members that are already logged into the Live install.

If I had the time, I would see if this could actually be done.

--RayJ

Marco van Herwaarden
01-25-2010, 10:31 AM
That would also need a second license.

Carnage
01-25-2010, 11:21 AM
My instinct on how to deal with this would be two or more tables

archive_0_post, archive_1_post ... archive_n_post and live_post

Modify your board to make new posts to live_post; periodically move posts from live_post to archive_n_post based on last use (eg if they are in a thread that is updated often leave them there even if they themselves are fairly old)

Create a view called post joining all the archive tables and the live table the view should be updatable so post edits etc will still work. I don't know how sphinx deals with indexes, you may have to add indexes for each table separatly or have an index on the view.

I don't know what the performance will be like as I don't know exactly how mysql deals with views - it may well get worse but in your situation, this would be the first solution i'd look at in detail as it requires the least code edits/loss of functionallity.

raywjohnson
01-25-2010, 07:56 PM
That would also need a second license.

I forgot to mention that this might be the case.

--RayJ

raywjohnson
02-04-2010, 07:50 PM
After some thought on the fact that you would need two licenses to create an archive in the way I suggest (even though you would be using only one site, for one forum, accessing the same data), I think the solution is to use phpBB3 (or other free forum software) as the archive install.

--RayJ

forsanelhaq
10-02-2010, 04:27 AM
i have the same problem and always my table post crashing

please help me something like product because i am not expert in sql

Paul M
10-02-2010, 11:50 AM
Tables should not "always" crash, no matter what size they are. If this is your problem then the issue is not its size, but your mysql and server itself.

Alfa1
10-02-2010, 04:50 PM
Would it be possible to work with multiple databases and make all databases searchable with Sphinx?

kmike
10-04-2010, 06:21 PM
Yes, you can define multiple data sources, one for each database, and combine corresponding indexes using a single distributed index. There's one important limitation though - document ids should be unique throughout all databases.

HeLLCiTo
09-09-2011, 01:24 PM
up,

show up any solution to this?

Magnumutz
09-12-2011, 10:25 AM
The post table size is a problem for pretty much ALL big boards, because there are lots of posts and when performing database back-ups or restores, you might get timeouts because of the hugeness of the post table.
A solution would be awesome!

SteveC
11-16-2012, 06:19 AM
Has there ever been any solution found to this? What do you do when a vbulletin site gets large?

raywjohnson
11-16-2012, 11:15 PM
My solution was to setup a second forum software, then convert and archive all posts over two years old to that system. It makes the old post table read only and keeps the live post table at a manageable size.

--RayJ

joshskeety
07-31-2013, 05:42 PM
It astounds me that VB hasn't thought of this yet..

I saw one VB admin just say.. "Why not move everything to an archive forum (meaning add another forum to your forum called archive)." And I was like.. REALLY? So the vb post table stays JUST as large if you do that, what have you archived? Nothing..

I actually brought this to their attention a few years ago and was basically laughed at. I had nearly a 4 gig post database that made backing up the forum impossible, made searching posts horrible, any repair to the database took HOURS. I am a Systems Engineer and EVERY system we use including just PST's have an archive or some way to relieve a database. This is typical in almost anything that uses a SQL database backend.. Why VB hasn't come up with an easy solution or why no hack has been created to do this is beyond me.

The solution is easy in my mind. Allow a connection to a SECOND database and set up a VB cron job automatically (like in the options section) to 1 in the morning, or something similar to move posts from (X) days from one database to the other. Allow the users table to access that database and use VB Search function to pull up those archives. That way you keep your main database at a minimum, backing up is a cinch, your archive you can backup like once a month MAYBE. Your MYSQL runs better, your forum runs better and repairing or optimizing your tables take 2 minutes rather than 2 hours..

Really, this is something that should of come standard on like VB version 2. The fact that there isn't even a MOD just astounds me.. My forum is pretty big and massive, but what about guys who most likely have post tables that are 20-50 Gig in size.. How do they do it?

I know my solution was to piss off all my users and deleted all our posts from 2004 until 2010.. It was a damn shame to lose all of them, my statistics got all screwed up, the users can't go back and find a post. We had 3 GREAT users who passed away and we couldn't go back to their posts..

If I knew how to code, I would have made this 7 years ago..

raywjohnson
07-31-2013, 07:42 PM
The issue with a large MySQL table is when a new record is added (i.e. the table it written to). The table has to be locked before the new record is added. This takes memory and processor power to accomplish.

Reading data from a table is quick and easy (not nearly as memory or processor intensive).

Moving old records (thread/posts) to a second table (that is read only) allows your active table (read/write) to be small, and your archive table (read only) to be as large as you need.

Also, the use of Sphinx Search will make searching the large table simple (and very fast).

It is not the optimal solution, but for those of us on a budget (and cannot afford to pay for additional MySQL servers w/huge amounts for RAM), this works well.

The large table is still unwieldy for many normal operations. A better solution would be to have a table for every year of threads and posts. Or just divide it by size. Say, 1GB max per archive table.

Since I set this up, my forum and server run smooth and efficient. Even as the archive table has grown to 7GB+.

--RayJ

Zachery
07-31-2013, 08:23 PM
Actually, that's not an issue with MySQL tables, that is an issue with MySQL tables in the MyISAM format. Innodb has row level locking.

The large post table isn't a big deal.

raywjohnson
07-31-2013, 08:39 PM
Yes. Thanks!

I forgot to mention I was talking about MyISAM tables.

When I first looking into changing to Innodb, it required a prohibitive about of RAM (as far as cost goes). I think that has changed, especially with cloud servers, and the cost is now very reasonable. So even for large multi-gigabyte tables, you can afford the RAM needed to run Innodb.

--RayJ

AusPhotography
07-31-2013, 10:40 PM
Which version of vB?
We use InnoDB on vB4.2.1; MySQL 5.5 without any issues.

raywjohnson
07-31-2013, 10:53 PM
v 3.8

I will be upgrading to v4 (or maybe 5) soon. After I convince the membership. And setup a cloud server (possibly with a separate MySQL server). Then I will use the InnoDB tables.

--RayJ

joeychgo
08-01-2013, 10:43 AM
How many posts do you have to get a 7gb post table?

I have 2.5 million posts and the post table is only 1.6gb. I have no issues with how it runs.

Paul M
08-01-2013, 12:15 PM
Really, this is something that should of come standard on like VB version 2. The fact that there isn't even a MOD just astounds me.. My forum is pretty big and massive, but what about guys who most likely have post tables that are 20-50 Gig in size.. How do they do it?
There is absolutely no reason for this to be "standard", and the fact no one has spent a load of time trying to make a Mod certainly doesnt astound me.

I have two forum databases in front of me, both have between 2.1 and 2.5 million posts in them, both Post tables are less than 2GB. To get tables the size you mention would reqire forums with 20/30 Million posts (or more). You could probably only find 2 or 3 of them in existance, and they simply make sure they have the hardware to cope with such large databases. As your forum gets bigger, you need to increase your system environment to keep up.

joeychgo
08-01-2013, 12:42 PM
My forum has been running for last 7 years, therefore the Post table has grown more than 13 gb. Due to the size of the post table, it is crashing very often, at least once or twice a month.


Since I set this up, my forum and server run smooth and efficient. Even as the archive table has grown to 7GB+.



That's why IM confused. The OP is talking 13gb and raywjohnson is talking just his archive is 7gb. Im wondering what is causing these tremendously large DB tables, or if there is something wrong with their setup.

final kaoss
08-01-2013, 03:12 PM
Really? Is that really what you think? In my forums niche, we have about 25K posts and the post table size is already at 250+ MB :D So at that rate it'll hit 2GB at around 200K posts:)

To get tables the size you mention would reqire forums with 20/30 Million posts (or more).

raywjohnson
08-01-2013, 06:48 PM
The forum I run currently has 326,955 threads and 7,518,307 posts.

--RayJ

Simon Lloyd
08-01-2013, 11:46 PM
You'll probably find that there is (or has been) an issue with the post table, when a repair was attempted it failed and left an unfinished temp file behind which you need to delete prior to repairing again then all will be good!, you'll need to use ssh to find the postindex.TMD file and delete it, when a file is attempting to repair they make another temporary file, if the databse crashes for some reason when this is taking place it leaves that file unfinished, trying to repair again does nothing as that file still exists, i suggest you backup your postindex table as it is then via ssh find that file, delete it, and repair again, you could run the Check table or analyse table first, when you see the full text it will give you the name of the file that already exists.

final kaoss
08-02-2013, 06:18 PM
No there isn't any such issue. With the Niche my forum is in, alot of content goes into each post, thanks though.

Gripi
10-01-2013, 05:44 PM
I'm running forum with:

Threads: 415,471
Posts: 11,706,710
Members: 181,871

In my old web hosting, crash once or twice every 1-2 month, but quick repair always fix the problem.

After moving to other web hosting with new configuration, already running for 2 month and never have any database issue.