PDA

View Full Version : mySQL - rename Post `title` to match Thread `title`


Rik Brown
01-20-2009, 05:18 AM
Can anyone advise some mysql queries to loop through the title of each thread in the thread table and, if the thread's corresponding posts in the post table have a different title, to update the post title to match the thread title?

Basically, I cannot understand how to loop through a table (have been Googling many pages on the matter). I'm assuming one would make a join on the two tables and then simply loop row-by-row based on thread title.

For testing/viewing purposes on one forum, I've made the following query:

SELECT thread.threadid, thread.title, post.postid, post.title
FROM thread
JOIN post
WHERE post.threadid = thread.threadid
AND thread.forumid IN (174)
AND thread.title != post.title
ORDER BY thread.threadid, post.postid
LIMIT 30


This finds the post titles that no longer match their corresponding thread titles. It appears all I would need to do now is loop through the table row-by-row and "update" the post title to match the thread title.

Does anyone know how to accomplish this using mysql?

Thanks. -- Rik

Farcaster
01-21-2009, 12:18 AM
If I understand your goal correctly and you want to update EVERY post in your database to have the same title as the parent thread, then I believe this code will work for you:

UPDATE post, thread
SET post.title = thread.title
WHERE post.threadid = thread.threadid
AND post.title <> thread.title

Try it on your test forum first to make sure it works as you expected.

Rik Brown
01-21-2009, 03:54 AM
If I understand your goal correctly and you want to update EVERY post in your database to have the same title as the parent thread, then I believe this code will work for you:

UPDATE post, thread
SET post.title = thread.title
WHERE post.threadid = thread.threadid
AND post.title <> thread.title

Try it on your test forum first to make sure it works as you expected.

Farcaster:

I tried it first with a single thread and then an entire forum. It worked perfectly in both instances and was very fast. However, running it against the entire vb database brought the system to a virtual crawl (we have 1.2 million posts). Since limiting it to a forum was quick, I'm going to batch it up that way in the future.

I really banged my head against the wall on this matter for about 10 hours until I read your reply.

Thank you so very much! -- Rik

--------------- Added 1232525499 at 1232525499 ---------------

Further to my post above....

I was able to bring into sync all thread titles and their corresponding post titles. I had accumulated over 100,000 mis-matches in 4-5 years. I had to do it forum-by-forum as mentioned above in order to limit the demands on the server. But now that its done, I can run the query against all forums quickly -- as long as I keep everything fairly up-to-date (perhaps via cron).

Thanks again, Farcaster. It should help our Google ranking. -- Rik

Marco van Herwaarden
01-21-2009, 08:33 AM
AND post.title <> thread.title
This is not really needed, as it would not change the result.

But why do you want to have all posts have the same title as the thread? Doesn't make much sense to me.

Rik Brown
01-21-2009, 09:46 PM
AND post.title <> thread.title
This is not really needed, as it would not change the result.

But why do you want to have all posts have the same title as the thread? Doesn't make much sense to me.

Because users leave messages like "Help" or "Does anyone else have this problem?" or "Newbie question" which aren't very useful to our users or the search engines and do not match up to the pagetext. So we tend to change those threads to titles that are more accurate and appropriate to the search engines such as:

2001 Toyota Camry - When to change the serpentine belt?
Making a Cross-Wind Landing at London Gatwick Airport
Bikes: Shimano derailleurs vs. Campy - Which are Best?

Some threads may already have tons of replies. We can just change the thread title (and even en masse via queries) and run the query above to update all those post titles. Like I said above, yesterday we fixed well over 100,000 changes that have accumulated over the years. And, our forums look much more better. :)

Regards. -- Rik

ps: I forgot to mention about the "AND post.title <> thread.title". I believe its quite necessary. Otherwise all posts (we have 1.2 million) would be updated each time the query is run and that would be a real drag on the server. Instead, with this limiting line of code it just updates those posts that do not match and thus limits the load on the server and the time to run the query.

Farcaster
01-21-2009, 10:45 PM
ps: I forgot to mention about the "AND post.title <> thread.title". I believe its quite necessary. Otherwise all posts (we have 1.2 million) would be updated each time the query is run and that would be a real drag on the server. Instead, with this limiting line of code it just updates those posts that do not match and thus limits the load on the server and the time to run the query.

SQL performance can be a funny thing. It really depends. If it cannot make use of an index on the post and thread titles, then it might actually be less expensive to do a blind update. You'd have run an EXPLAIN on this to see how it is building the query.

Rik Brown
01-22-2009, 02:04 AM
SQL performance can be a funny thing. It really depends. If it cannot make use of an index on the post and thread titles, then it might actually be less expensive to do a blind update. You'd have run an EXPLAIN on this to see how it is building the query.

Will do. Thanks again. -- Rik

Marco van Herwaarden
01-22-2009, 08:32 AM
Instead of duplicating the thread title to all posts, it might have been better to remove the title for these posts.

Rik Brown
01-27-2009, 11:11 PM
Instead of duplicating the thread title to all posts, it might have been better to remove the title for these posts.

Marco:

I can't imagine any advantage to having no post title in contrast to having the correct post title. :) We are simply making sure that the thread title we either fix or enhance (for Google purposes, etc.) is likewise applied across the thread instead of having post titles like "Newbie Question."

Regards. -- Rik

--------------- Added 1233105651 at 1233105651 ---------------

As I mentioned above, I decided to spread the query forum-by-forum so it doesn't attempt to do everything at once and bring the server to its knees. Here is the code for a mysql STORED PROCEDURE that I'm currently using:

BEGIN
DECLARE count INT default 0;
SET count = 1;

WHILE count <= 550 DO
UPDATE post, thread SET post.title = thread.title WHERE post.threadid = thread.threadid AND post.title <> thread.title AND thread.forumid = count;
SET count = count + 1;
END WHILE;
END

We only have a little over 100 forums but they are spread over forumids 1-514 at the moment. The above simple procedure bypasses forumids that do not exist and runs the one line query only on each forum that does exist.

I'm sure this could be improved with more coding (please feel free to advise) but it does what I need without knowing much about writing stored procedures.

I hope that this helps someone else.

Regards. -- Rik

Marco van Herwaarden
01-28-2009, 08:28 AM
99% of the post on a typical forum does not have a title. What you are doing now is only increasing your database size (probably a lot even) with duplicate data.

Rik Brown
01-28-2009, 08:57 PM
99% of the post on a typical forum does not have a title. What you are doing now is only increasing your database size (probably a lot even) with duplicate data.

Marco:

I'd just assume not debate SEO matters in this thread. Suffice it to say that we have seen our largest ever jump in average daily traffic of over 400% in less than 2 months when we started working on this project. The difference (increase) in traffic is larger than our entire forum growth going back more than a few years.

I might mention that our sitemap software creates a URL reference for EVERY page of our site. Google Analytics shows us that people are not always entering our site via the top of a thread. Instead, they enter at an indexed page of the thread that Google or another engine "likes." Those page titles ARE being indexed in Google. I can track them back and see for myself.

I also might add that since we can now mass change post titles, we have played with them doing things like adding a single keyword to titles in a specific forum (say dealing with Cars) and watching the effect. Normally, within about 4 days of propagating our new sitemaps, we'll find those particular posts climbing quickly up in our internal Google Analytics rankings and bypassing those posts (in the same forum) for which we didn't change the titles in the same manner.

We have found the growth to be so fast paced that we don't worry about thousands of threads/posts (out of about 1.4 million posts) temporarily dropping out of Google etc. due to being renamed -- as we know that they will come back within a week or so at a much higher ranking.

So all I can say is that this is working fantastically for our forums and nothing we have done in the past has even had a fraction of this effect. Its quite a thrill to see new record highs in traffic virtually daily and huge jumps week-to-week. We're loving it!

So let's agree to disagree! :)

Cheers! -- Rik

ps: I guess I should also mention that we use vbseo to name our thread URLs to match their titles.

Marco van Herwaarden
01-29-2009, 07:30 AM
I am not a SEO expert, so can not really comment on that. But it does surprise me that setting the title on each post on the thread (page) helps. The thread title once on a page should be enough i would guess.

Rik Brown
01-29-2009, 11:16 AM
I am not a SEO expert, so can not really comment on that. But it does surprise me that setting the title on each post on the thread (page) helps. The thread title once on a page should be enough i would guess.

Hi, Marco:

I would agree with that if users always came in at the top of the thread. But with Google sitemaps from vbseo, every page gets indexed separately and users are very likely to be referred to us in the middle of the thread (vbseo embeds permalinks, etc. for each post that again matches up with the post/thread titles and the vbseo-created sitemaps).

However, one very interesting thing we have found is that it appears Google often analyzes the number of posts in a thread and is often (not always) more likely to index threads with less posts than large megathreads. It sometimes even totally skips indexing some of those larger threads (but we do see notable exceptions that we can't yet explain on our site).

I don't have a number here but we often see threads with say 60-120-250 posts that get virtually no indexing or even totally bypassed (Google may somehow pay attention to the post counts) while the smaller threads on the same day are indexed and produce considerable incoming search engine traffic.

I'm just guessing that Google, like most users, knows that some of those large threads tend to have a lot of junk in them so they don't even go down those roads or Google simply puts such megathreads at a much lower priority for indexing (at least in regards to our site).

So we are going to be experimenting on some of our larger threads by paring them back from say 250 messages to lesser amounts (perhaps in increments of 60 -> 45 -> 30 -> 25 -> 20 messages) and see if their search engine popularity increases.

If we are right, in those cases, "less will be more" for us and help us clean dead-weight or polluting messages from our forums while further increasing our overall traffic.

Cheers! -- Rik