Sample Header Ad - 728x90

Table design comment_seen vs comment_unseen — migrate data from one to another

0 votes
1 answer
104 views
I have system that allows comments posted by users and tracks which users seen which comments . The read/unread state is tracked within comments_seen table. Surely enough, it grow tremendously. My goal is to move from comments_seen table to comments_unseen. Read/Unread status is only tracked for comment author and users in admin group (eg. admins see all new comment on site and should react respectively, while users only see answers to their comments or comments to their orders). Current comments_seen table is almost 36M rows, while it's current definition is as simple as
CREATE TABLE comments_seen
(
    user_id    int(10) NOT NULL,
    comment_id int(10) NOT NULL,
    CONSTRAINT user_id
        UNIQUE (user_id, comment_id)
)
The comments_unseen table could be very much the same. What would be your suggestions on my steps to moving from one table design to another? Any triggers to keep data consistent? Any deployment/migration steps? All queries in code could be rewritten at once, I guess, so time to support both tables can be minimized to, say, seconds or minutes. The overall application load isn't very high and everything could be done in night times for most users to not affect performance or UX. Any thought, suggestions and shared experience is very much appreciated. The DB is MySQL 5.5 **UPDATE 2021-04-01**: As pointed out in comments 35M rows isn't huge, but the culprit must be the query. So here's the slowest query (as reported by Percona's pt-query-digest ):
SELECT
			SUM(is_deleted IS NOT TRUE) AS total_count
			, SUM(parent_id = 0) AS branch_count
			, SUM(parent_id = 0 AND is_can_answer) AS opened_branch_count
			, SUM(parent_id = 0 AND NOT is_can_answer) AS closed_branch_count
			, SUM(cs.comment_id IS NULL AND is_deleted IS NOT TRUE) AS new_count , SUM(cs.comment_id IS NULL AND is_deleted IS NOT TRUE AND module = 5) AS order_comment_count , SUM(cs.comment_id IS NULL AND is_deleted IS NOT TRUE AND module = 4) AS items_comment_count 
		FROM (
                            (
								SELECT DISTINCT
									5 AS module
									, 'm' AS item_type
									, co.id AS item_id
								FROM cart_orders AS co
								WHERE co.author = 65
								OR co.manager_id = 65
								
							)
							UNION
							(
								SELECT DISTINCT
									5 AS module
									, 'i' AS item_type
									, pr.id AS item_id
								FROM cart_orders AS co
									JOIN cart_package AS cp
										ON cp.order_id = co.id
									JOIN pictures_relations AS pr
										ON pr.module_id = 5
											AND pr.item_id = cp.id
											AND pr.item_type = 'opkg'
											AND pr.is_file IS FALSE
								WHERE co.author = 65
								OR co.manager_id = 65
								
							)
							UNION
							(
								SELECT DISTINCT
									4 AS module
									, 'm' AS item_type
									, cat.id AS item_id
								FROM catalog AS cat
									INNER JOIN (
										cart_positions AS cp
										JOIN cart_orders AS co
											ON cp.order_id = co.id
									)
										ON cp.cat_id = cat.id
								WHERE co.author = 65
								OR co.manager_id = 65
								
							)
							) AS source JOIN comments AS a USING (module, item_id, item_type) 
			LEFT JOIN comments_seen AS cs ON cs.comment_id = a.id AND cs.user_id = 65 
		WHERE ((TRUE))
Execution time for the whole query, and it's UNIONed parts:
complete - 8s 451ms

1 - 775ms
2 - 1s 214ms
3 - 3s 623ms

2+3 - 4s 891ms
1+3 - 4s 314ms
1+2 - 1s 945ms
Clearly, there's only single JOIN by both comment_id and user_id columns, so the issue seem to be not in the comments_seen table as I initially thought, but rather in overall poor table design.
Asked by Eduard Sukharev (129 rep)
Mar 31, 2021, 06:28 PM
Last activity: Apr 1, 2021, 06:30 PM