SELECT DISTINCT problems


Mistah Roth
03-06-2007, 10:02 PM
My script displays the last 5 threads a user has posted in. The main query is as follows:

SELECT distinct threadid FROM ". TABLE_PREFIX . "post WHERE userid = " . $vbulletin->userinfo['userid'] . " ORDER BY postid DESC LIMIT 5

That should work, because if you remove DISTINCT it shows the last posts you made, including duplicates of the same thread if you posted in it more than once.

For some reason, when you add DISTINCT, it does not return the last 5 threads posted in; it skips any older threadids. If I remove DISTINCT from the query, it shows the last 5 posts properly. The problem is we can't have repeats, and DISTINCT is supposed to solve that.

Anyone have any idea what's going wrong? If all else fails I'll just filter out the doubles with PHP code... not as efficient as I'd like, though, and there should be no reason the query doesn't work.

Zachariah
03-06-2007, 11:40 PM
SELECT distinct threadid FROM ". TABLE_PREFIX . "post WHERE userid = '" . $vbulletin->userinfo['userid'] . "' ORDER BY postid DESC LIMIT 5

I would say you need to add single quotes around " . $vbulletin->userinfo['userid'] . "

Marco van Herwaarden
03-07-2007, 06:40 AM
SELECT distinct threadid FROM ". TABLE_PREFIX . "post WHERE userid = '" . $vbulletin->userinfo['userid'] . "' ORDER BY postid DESC LIMIT 5

I would say you need to add single quotes around " . $vbulletin->userinfo['userid'] . "
It is a numeric column, so no quotes needed.

SELECT DISTINCT threadid FROM post WHERE userid = 1 ORDER BY dateline DESC LIMIT 5

This seems to work correctly, but it might need more testing with different data.

What MySQL version are you using?

Mistah Roth
03-08-2007, 04:26 AM
MySQL version 4.1.18

If I run the following query, here are my results (I increased the limit to show more rows):

SELECT threadid FROM post WHERE userid = 1 ORDER BY postid DESC LIMIT 15

Gives me the following:

threadid
1355
140
1321
1338
140
1334
1333
1311
1332
1319
1311
1284
1312
1311
1308

Those are my actual last 15 posts. Now if I add distinct, this is what I get:

SELECT distinct threadid FROM post WHERE userid = 1 ORDER BY postid DESC LIMIT 15

threadid
1355
1321
1338
1334
1333
1332
1319
1284
1312
1311
1310
1309
1308
1300
1288

Notice that all the lower threadids are not included.

Any ideas?

TECK
03-08-2007, 04:55 PM
Of course it isn't showing the same threadids: you removed all the duplicates, so that makes room for other ids. For example 140 (2 times), 1311 (3 times), etc.
Also, you forgot to set a dateline in your query. :)
You must set a dateline, or else you end up scanning the whole table for your ids, then dropping all of them except the 5-15 you want.
This is very unorthodox from a coder's point of view, not to say very bad. :)

Paul M
03-08-2007, 05:13 PM
I suggest you output the postids so you can see better what is going on.
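
A rough sketch of what that could look like, assuming the unprefixed post table and userid 1 from the queries above. The second query is only an illustration for a single thread (140 is taken from the results posted earlier); swap in whichever threadid looks wrong.

```sql
-- Show postid next to threadid so the ordering is visible.
SELECT postid, threadid
FROM post
WHERE userid = 1
ORDER BY postid DESC
LIMIT 15;

-- List every post this user made in one thread, newest first,
-- to see which postids a DISTINCT/GROUP BY row could be picked from.
SELECT postid, dateline
FROM post
WHERE userid = 1
  AND threadid = 140
ORDER BY postid DESC;
```

If a thread's newest postid is high but the thread sorts low once duplicates are removed, the sort is probably using one of that thread's older posts.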

Mistah Roth
03-09-2007, 09:01 PM
Of course it isn't showing the same threadids: you removed all the duplicates, so that makes room for other ids. For example 140 (2 times), 1311 (3 times), etc.

It still shows some of the doubled ones; 1311 is still in the second query's results. It's just the lower values that get dropped, and I don't know why.

Also, you forgot to set a dateline in your query. :)
You must set a dateline, or else you end up scanning the whole table for your ids, then dropping all of them except the 5-15 you want.
This is very unorthodox from a coder's point of view, not to say very bad. :)
Good call, thanks for pointing that out haha

TECK
03-09-2007, 09:31 PM
It still shows some of the doubled ones; 1311 is still in the second query's results. It's just the lower values that get dropped, and I don't know why.
Show me the query and the results.
Use EMS MySQL Manager to see everything about your query (and post screenshots); it's free. :)

EMS SQL Manager 2005 Lite for MySQL, Windows edition (full installation package)
http://sqlmanager.net/en/products/mysql/manager/download

Mistah Roth
03-12-2007, 01:20 AM
My 4th post in this thread has the queries and results.

Cap'n Steve
03-12-2007, 01:58 AM
I don't know anything about DISTINCT, but this might work:

SELECT threadid FROM post WHERE userid = 1 GROUP BY threadid ORDER BY postid DESC LIMIT 5

hambil
03-12-2007, 03:08 AM
DISTINCT does not work the way you think it does. What it is doing, if you look closely enough, is eliminating all but the last duplicate row.

Now in your case, we can assume you have more than two posts in the thread with the id 140. Since the DISTINCT eliminates all but the last one, the oldest post is kept, not the newest, and 140 drops out of your top 15.

So, this gets a little tricky. I have to go do some stuff, but I'll work out the query for you and post it later tonight (early tomorrow).

There may be a way to do this with just SQL, but I can't come up with it at the moment. You're going to need to process the results. Since you're dealing with all of a user's threads, this could be a large query result to step through.

Depending on your specific performance issues, it might be more efficient to keep a separate table with userid and threadid that you update when a user posts. It would only keep the latest 5. You could also timestamp it and actually show the correct order if a user posts in a thread twice.

For example, I post in threadid 210, then I post in two other threads, then I post in 210 again. I wouldn't put 210 into the table twice, but I would update 210's timestamp.

Otherwise, you could just read in the entire result set (all their postids and threadids) and then loop through it until you have 15 unique threadids...
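
A minimal sketch of the separate-table idea, assuming MySQL 4.1 or later. The table name user_recent_thread, its columns, and the index name are all made up for illustration; nothing like this ships with vBulletin as far as this thread shows. It does not prune down to the latest 5 rows per user; it simply reads the newest 5.

```sql
-- Hypothetical side table: one row per (user, thread) pair.
CREATE TABLE user_recent_thread (
    userid   INT UNSIGNED NOT NULL,
    threadid INT UNSIGNED NOT NULL,
    lastpost INT UNSIGNED NOT NULL,          -- Unix timestamp of the user's newest post
    PRIMARY KEY (userid, threadid),
    KEY userid_lastpost (userid, lastpost)   -- supports the lookup below
);

-- Run whenever a user posts: insert the pair, or just refresh
-- the timestamp if the user has already posted in that thread.
INSERT INTO user_recent_thread (userid, threadid, lastpost)
VALUES (1, 210, UNIX_TIMESTAMP())
ON DUPLICATE KEY UPDATE lastpost = UNIX_TIMESTAMP();

-- Last 5 threads the user posted in, newest first, no duplicates.
SELECT threadid
FROM user_recent_thread
WHERE userid = 1
ORDER BY lastpost DESC
LIMIT 5;
```

Posting in thread 210 a second time only updates its lastpost value, which matches the example above: 210 is stored once but moves back to the top.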

Mistah Roth
03-12-2007, 05:47 PM
Hey, thanks for the help guys,

Cap'n Steve, I tried your query. I started by posting in a really old thread. With the old query that actually shows the last posts (including duplicates) I got:

SELECT threadid FROM post WHERE userid = 1 ORDER BY postid DESC LIMIT 15

140
1434
1332
1434
1445
1412
1434
1412
1412
1406
1091
1305
1368
1408
1378

These are the actual threadids of the last 15 posts I made, and in order, I checked.

When I added GROUP BY threadid to the query (and changed the limit to 5), it gave me:

1445
1434
1406
1305
1368

Now it really makes no sense lol...
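
One possible explanation, offered as a sketch rather than a confirmed diagnosis: with GROUP BY threadid (or DISTINCT threadid), each thread collapses to a single row, and since postid is neither grouped nor aggregated, MySQL is free to sort on whichever postid it kept for that thread, which need not be the newest one. Asking for MAX(postid) explicitly and sorting on that removes the ambiguity. The alias newest_postid is made up for this example; the table and userid are the ones used above.

```sql
-- One row per thread, carrying the user's newest postid in that thread.
SELECT threadid, MAX(postid) AS newest_postid
FROM post
WHERE userid = 1
GROUP BY threadid
ORDER BY newest_postid DESC   -- sort by the aggregate, not an arbitrary postid
LIMIT 5;
```

Going by the 15 rows listed above, this should put 140 first, since it holds the newest post, but that is worth checking against real data.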

TECK
03-14-2007, 12:43 AM
You need to order it by dateline, postid. That's why it's not working.

hambil
03-14-2007, 11:55 PM
You need to order it by dateline, postid. That's why it's not working.
That's not going to help.

TECK
03-15-2007, 05:10 PM
SELECT threadid
FROM thread
WHERE postusername = 'John'
AND dateline > (UNIX_TIMESTAMP(NOW()) - 172800)
AND visible = 1
ORDER BY lastpostid, threadid DESC
LIMIT 15;
That will return the threads from the last 48 hours, with less scanning of the tables.
I did this off the top of my head, so play with the ORDER BY there if it's not right.

EDIT: you should definitely consider forum-based perms for your query, or else anyone could see private threads in public areas.

Mistah Roth
03-17-2007, 05:29 PM
That query gets the latest threads made by the user, I want the latest threads the user posted in.

Riku Yuizaki
03-19-2007, 01:39 AM
SELECT DISTINCT(post.threadid), thread.title FROM post, thread WHERE post.userid=1 AND thread.visible=1 AND thread.threadid=post.threadid ORDER BY post.dateline DESC LIMIT 5

returned the following for me:

threadid thread.title
5333 Holy vs. Suija (Omega Arc vs. Sakura)
5332 Hatsuharu vs. Black Rose (Sasuke vs. Sakura)
5270 Xepher Bladewing vs. Black Rose (Xepher Bladewing ...
5255 Xepher Bladewing vs. Black Rose (Xepher Bladewing ...
5186 RPG Palace:: THE GAME!

Removing DISTINCT causes the duplicate threadids to appear.
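
Since this query also sorts on a column that is not in the DISTINCT list, it could run into the same ordering problem on other data. A hedged variant that groups per thread and sorts on the user's newest post, with a dateline cutoff along the lines suggested earlier in the thread. The 30-day window and the alias user_lastpostid are arbitrary choices for this sketch, and forum permissions are not handled here.

```sql
SELECT p.threadid, t.title, MAX(p.postid) AS user_lastpostid
FROM post AS p
INNER JOIN thread AS t ON t.threadid = p.threadid
WHERE p.userid = 1
  AND t.visible = 1
  AND p.dateline > UNIX_TIMESTAMP() - 2592000   -- last 30 days (assumed window)
GROUP BY p.threadid, t.title
ORDER BY user_lastpostid DESC                   -- newest post by this user first
LIMIT 5;
```

In a real script the table names would carry TABLE_PREFIX, as in the first post.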