You asked for a miracle, so here is one. I changed one query in the deletion of duplicated data and now it is very fast. On 100 000 rows with 1000 duplicated rows, that one query took 10 minutes before the change; now it takes less than 1 second!!!
So cleaning each table takes 5 queries. 2 are instant, 2 are very fast, and one took me 7 seconds on 100 000 rows. That last one cannot be optimized: it has no subquery, we simply have to ask the DB for that data.
I hope Dave will now make a new official release that does not allow data duplication in 2 of our 3 cache tables, which will make the whole mod work faster and your DB smaller.
Below you again have the full, updated description of how to clean duplicated data. I was able to run it from my browser client, but be aware that with a large database the server can "go away" (drop the connection) when you use such a client. In that case you will need a client other than a web-based one.
Also note that if you made the changes described here (https://vborg.vbsupport.ru/showpost....&postcount=242), which I hope will be included in the official release, you will no longer have to remove duplicated data from wt_cache_short and wt_cache_medium (only once, during the procedure described here). After that, only wt_cache will need its duplicated data deleted from time to time.
So here again are the description and the procedure, but much faster this time:
If you would like to check whether you have duplicated data in your cache, execute these queries (they are time consuming). Execute them one by one; each one works on a different cache table and tells you which data is duplicated and how many times (1st column: duplication counter, 2nd: the originaltext, 3rd: the language):
Code:
select count(*) counter, originaltext, tl from wt_cache group by originaltext, tl having count(*) > 1 order by counter desc;
select count(*) counter, originaltext, tl from wt_cache_short group by originaltext, tl having count(*) > 1 order by counter desc;
select count(*) counter, originaltext, tl from wt_cache_medium group by originaltext, tl having count(*) > 1 order by counter desc;
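If you want to see what this check reports before running it on the real database, here is a small sketch in Python. It is only an illustration: it uses an in-memory SQLite database instead of MySQL, the sample rows are made up, and the `translated` column and the `find_duplicates` helper are my assumptions, not part of the mod.

```python
import sqlite3

# Only the three cache tables named in this post are accepted, because the
# table name is interpolated into the SQL text.
CACHE_TABLES = {"wt_cache", "wt_cache_short", "wt_cache_medium"}

def find_duplicates(conn, table):
    """Return (counter, originaltext, tl) for every duplicated text/language pair."""
    if table not in CACHE_TABLES:
        raise ValueError(f"unexpected table name: {table}")
    sql = (
        f"SELECT COUNT(*) AS counter, originaltext, tl FROM {table} "
        "GROUP BY originaltext, tl HAVING COUNT(*) > 1 ORDER BY counter DESC"
    )
    return conn.execute(sql).fetchall()

# Hypothetical miniature of wt_cache; the real table layout is not shown here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wt_cache ("
    "id INTEGER PRIMARY KEY, tl TEXT, originaltext TEXT, translated TEXT)"
)
sample_rows = [
    ("de", "Hello", "Hallo"),
    ("de", "Hello", "Hallo"),
    ("de", "Hello", "Hallo"),
    ("fr", "Hello", "Bonjour"),
    ("fr", "Hello", "Bonjour"),
    ("de", "Bye", "Tschuess"),
]
conn.executemany(
    "INSERT INTO wt_cache (tl, originaltext, translated) VALUES (?, ?, ?)",
    sample_rows,
)

duplicates = find_duplicates(conn, "wt_cache")
print(duplicates)  # [(3, 'Hello', 'de'), (2, 'Hello', 'fr')]
```

The text stored only once ("Bye" in German) does not show up, so an empty result means that table has no duplicates.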
If you want to delete the duplicated data, you first need to create 2 tables for temporary data:
Code:
CREATE TABLE saver (
id INT,
tl VARCHAR(10),
originaltext VARCHAR(65000)
) ENGINE = MYISAM, CHARACTER SET utf8 COLLATE utf8_general_ci;
CREATE TABLE cleaner (
id INT
) ENGINE = MYISAM, CHARACTER SET utf8 COLLATE utf8_general_ci;
Once you have those, you can execute the cleaning queries. Note that they keep the first translation in your database and remove only the later translations of the same text and language.
Code:
-- wt_cache: saver gets the lowest id of each duplicated (originaltext, tl), cleaner gets the ids to delete
delete from cleaner;
delete from saver;
insert into saver (SELECT min(id) as id, tl, originaltext from wt_cache group by originaltext,tl having count(*) > 1);
insert into cleaner (SELECT cache.id from saver, wt_cache cache where saver.originaltext=cache.originaltext and saver.tl=cache.tl and saver.id<>cache.id);
DELETE FROM wt_cache USING wt_cache INNER JOIN cleaner ON wt_cache.id = cleaner.id;
-- wt_cache_short: same steps
delete from cleaner;
delete from saver;
insert into saver (SELECT min(id) as id, tl, originaltext from wt_cache_short group by originaltext,tl having count(*) > 1);
insert into cleaner (SELECT cache.id from saver, wt_cache_short cache where saver.originaltext=cache.originaltext and saver.tl=cache.tl and saver.id<>cache.id);
DELETE FROM wt_cache_short USING wt_cache_short INNER JOIN cleaner ON wt_cache_short.id = cleaner.id;
-- wt_cache_medium: same steps
delete from cleaner;
delete from saver;
insert into saver (SELECT min(id) as id, tl, originaltext from wt_cache_medium group by originaltext,tl having count(*) > 1);
insert into cleaner (SELECT cache.id from saver, wt_cache_medium cache where saver.originaltext=cache.originaltext and saver.tl=cache.tl and saver.id<>cache.id);
DELETE FROM wt_cache_medium USING wt_cache_medium INNER JOIN cleaner ON wt_cache_medium.id = cleaner.id;
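To make it clearer which rows survive, here is a rough sketch of the same saver/cleaner steps in Python against an in-memory SQLite database. Treat it as a demonstration only, not as the mod's code: SQLite stands in for MySQL, the final delete uses `WHERE id IN (...)` because SQLite has no `DELETE ... USING`, and the sample rows, the `translated` column and the `remove_duplicates` helper are assumptions of mine.

```python
import sqlite3

# The table name is interpolated into the SQL, so only known names are allowed.
CACHE_TABLES = {"wt_cache", "wt_cache_short", "wt_cache_medium"}

def remove_duplicates(conn, table):
    """Keep the lowest id per (originaltext, tl); delete the other copies."""
    if table not in CACHE_TABLES:
        raise ValueError(f"unexpected table name: {table}")
    cur = conn.cursor()
    cur.execute("DELETE FROM cleaner")
    cur.execute("DELETE FROM saver")
    # Step 1: remember the first (lowest) id of every duplicated pair.
    cur.execute(
        f"INSERT INTO saver (id, tl, originaltext) "
        f"SELECT MIN(id), tl, originaltext FROM {table} "
        f"GROUP BY originaltext, tl HAVING COUNT(*) > 1"
    )
    # Step 2: collect the ids of all other rows with the same text and language.
    cur.execute(
        f"INSERT INTO cleaner (id) "
        f"SELECT cache.id FROM saver JOIN {table} AS cache "
        f"ON saver.originaltext = cache.originaltext "
        f"AND saver.tl = cache.tl AND saver.id <> cache.id"
    )
    # Step 3: delete the collected rows (SQLite form of the DELETE ... USING join).
    cur.execute(f"DELETE FROM {table} WHERE id IN (SELECT id FROM cleaner)")
    removed = cur.rowcount
    conn.commit()
    return removed

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE wt_cache (id INTEGER PRIMARY KEY, tl TEXT, originaltext TEXT, translated TEXT);
    CREATE TABLE saver (id INTEGER, tl TEXT, originaltext TEXT);
    CREATE TABLE cleaner (id INTEGER);
    """
)
sample_rows = [
    ("de", "Hello", "Hallo"),    # id 1, kept
    ("de", "Hello", "Hallo"),    # id 2, removed
    ("de", "Hello", "Hallo"),    # id 3, removed
    ("fr", "Hello", "Bonjour"),  # id 4, kept
    ("fr", "Hello", "Bonjour"),  # id 5, removed
    ("de", "Bye", "Tschuess"),   # id 6, never duplicated
]
conn.executemany(
    "INSERT INTO wt_cache (tl, originaltext, translated) VALUES (?, ?, ?)",
    sample_rows,
)

removed = remove_duplicates(conn, "wt_cache")
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM wt_cache ORDER BY id")]
print(removed, remaining_ids)  # 3 [1, 4, 6]
```

Splitting the work into saver and cleaner means the final delete is a plain join on id, with no subquery over the big cache table.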