Unoptimized query [Archive]

cobaku

09-25-2009, 11:42 AM

is below query normal it sometimes takes 27 seconds on some big forum sections which kills performance like crazy

from what i understand it only counts number of threads in certain forums
yes there are alot of threads on my forum but is there anything i can do to optimize this query, there is 1 more but i can live with the other.

Somehow i feel like noone can help me.

SELECT COUNT(*) AS threads, SUM(IF(thread.lastpost > 1253852688 AND open <> 10, 1, 0)) AS newthread

FROM thread AS thread

WHERE forumid = 1220
AND sticky = 0

AND visible IN (0,1,2)

id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE thread range forumid forumid 6 119685 Using where

Time Before: 0.24812 seconds
Time After: 3.76783 seconds
Time Taken: 3.51971 seconds

Memory Before: 3,558.711 KB
Memory After: 3,558.566 KB
Memory Used: -0.145 KB

Dean C

09-25-2009, 01:06 PM

How many threads are we talking? Also where did you pull that query from?

sockwater

09-25-2009, 01:26 PM

How many threads are we talking? Also where did you pull that query from?
forumdisplay.php line 720 in 3.8.4.

Dean C

09-25-2009, 02:23 PM

I don't have a copy of vB's database handy to check, so check this index doesn't already exist. Try creating an index on the thread table with these fields in this order:

INDEX(forumid, sticky, visible)

Saying that, that query shouldn't be be particularly show. Can you also paste the full profiling output.

kmike

09-25-2009, 02:40 PM

The query is already using that index, according to EXPLAIN in the first post:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE thread range forumid forumid 6 119685 Using where
It's just that the number of matching rows is too big.
I'm afraid nothing could be done from the vBulletin side of things. Maybe some MySQL server tweaking will help to speed up that query.

Dean C

09-25-2009, 05:35 PM

The query is already using that index, according to EXPLAIN in the first post:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE thread range forumid forumid 6 119685 Using where
It's just that the number of matching rows is too big.
I'm afraid nothing could be done from the vBulletin side of things. Maybe some MySQL server tweaking will help to speed up that query.

Yep I did say check the index doesn't already exist ;)

KEY forumid (forumid, visible, sticky, lastpost),

Adrian Schneider

09-25-2009, 05:46 PM

All code optimizations aside, you could edit this forum so it only shows threads from the past 6 months or year instead. This should significantly reduce the # of threads and still utilize the index.

kmike

09-25-2009, 06:18 PM

Oh yes.
Hopefully the default sort order for that forum is by the last post time, otherwise the index wouldn't work.

cobaku

09-25-2009, 07:53 PM

thanks so much for the replies , i think i understand the solution from the replies
i am on it right now

that forumid had 120.000 threads

--------------- Added 1253994808 at 1253994808 ---------------

i could only do siradrian's solution and it works.

Do you think this mod will help me with only single query
https://vborg.vbsupport.ru/showthread.php?t=210897
it looks a little hard.

Adrian Schneider

09-28-2009, 09:11 AM

Afraid not. That mod is just to help automate common administrative queries. The easiest solution to implement is changing the option I mentioned above.

If you do not want to do that (and honestly I don't see why not...) then here is my thought.... Why on earth would you want to browse through 120,000 threads? The data does not get lost; you can still search and find it. I don't see people going past about 5-10 pages. Maybe 20 if they are bored. People use searching.

If your forum is that busy, where you cannot restrict posts by the past 6-12m because users will want to browse further, then I would suggest that you create some subforums and shift the content there instead. This way you can index by forumid and get way faster results without losing any user experience.

Anyway, if you insist on returning huge datasets back to the user, then here are more ideas:

Upgrade your hardware - This is generally expensive and will only solve the problem temporarily.

Implement caching - If you use 3rd party caching, I suppose it's of medium difficulty to set up. If you use MySQL's query cache, it would be pretty simple. In this example, all you'd need to do is some normalization on the linux timestamp it uses in the field list. If you round it to the nearest thousand or even hundred, it would be utilized. The way it's set up now, it can't be used. What this would do is have the query results re-used (instant repeat queries) until the data changes and invalidates it.

Again, especially if you use MySQL query caching, this is a band-aid solution.

Implement Sphinx -This is quite difficult / expensive to set up, but it's probably the most beneficial in terms of performance and growth support. Sphinx is a search daemon which is often used to replace keyword searching, but it can also perform full data scans VERY quickly. You can have it scan through hundreds of millions of rows in the time it would take MySQL to scan tens of thousands. For a board your size, it would be bringing the data back in under a tenth of a second easily.

cobaku

09-28-2009, 08:37 PM

i was already using your suggested solution, but i feel like i have to eliminate the problem completely.
thanks for your amazing detailed reply. I will definitely try sphinx.
i am using query caching and will also search what normalization on the linux timestampis

i am currently stuck at configuring
what i am trying to optimize is this

$threadscount = $db->query_first_slave("
SELECT COUNT(*) AS threads, SUM(IF(thread.lastpost > $lastread AND open <> 10, 1, 0)) AS newthread
$hook_query_fields
FROM " . TABLE_PREFIX . "thread AS thread
$tachyjoin
$hook_query_joins
WHERE forumid = $foruminfo[forumid]
AND sticky = 0
$prefix_filter
$visiblethreads
$globalignore
$limitothers
$datecut
$hook_query_where
");

what i have to configure is
at /etc/sphinx/sphinx.conf
below

source src1
{
type = mysql

sql_host = localhost
sql_user = test
sql_pass =
sql_db = test
sql_port = 3306 # optional, default is 3306

sql_query = \
SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content \
FROM documents

sql_attr_uint = group_id
sql_attr_timestamp = date_added

sql_query_info = SELECT * FROM documents WHERE id=$id
}

i found an example at https://vborg.vbsupport.ru/attachment.php?attachmentid=80138&d=1210237129 i hope it works

--------------- Added 1254179115 at 1254179115 ---------------

ok it seems this config thing is much bigger than i can ever understand

was worth to try

but i may still implement search sphinx

Adrian Schneider

09-28-2009, 11:40 PM

You'll have to make code modifications and play with the configuration quite a bit to get it to work. It's definitely not simple to set up. If I get some time in the future (probably not anytime soon, but maybe within a few months) I can post a tutorial for this.

However, I'm still of the opinion that browsing through over 10,000 rows of anything is not useful.

All normalization of the timestamps means is rounding... since the timestamp will increase every second, it can never re-use queries in the cache. If you round a few digits on the timestamps, the cache can be utilized.

Dean C

09-29-2009, 04:39 AM

However, I'm still of the opinion that browsing through over 10,000 rows of anything is not useful.

Certainly not to find out newpost counts :)

cobaku

09-29-2009, 08:09 AM

i never knew biggest boards uses sphinx indexing
other than search.

Thanks alot for the big tip.
I somehow managed to setup a sphinx search altough it does not list my older posts yet :) i have to fix it soon

I am quite happy with my new search system and may try to achive that indexing heavy queries with sphinx