
Query to delete records with lower eff_date in a large table with 400 million records

5 votes
1 answer
892 views
I have a table with the structure below:

    create table TEST_TAB
    (
      activity_type CHAR(1),
      tracking_code NUMBER,
      eff_date      DATE
    );

Sample data for this table:

    insert into TEST_TAB (activity_type, tracking_code, eff_date) values ('A', 1, to_date('01-11-2020', 'dd-mm-yyyy'));
    insert into TEST_TAB (activity_type, tracking_code, eff_date) values ('A', 1, to_date('02-01-2024', 'dd-mm-yyyy'));
    insert into TEST_TAB (activity_type, tracking_code, eff_date) values ('B', 2, to_date('01-08-2023', 'dd-mm-yyyy'));
    insert into TEST_TAB (activity_type, tracking_code, eff_date) values ('B', 2, to_date('02-08-2023', 'dd-mm-yyyy'));
    insert into TEST_TAB (activity_type, tracking_code, eff_date) values ('B', 2, to_date('03-08-2023', 'dd-mm-yyyy'));

This is just sample data; the original table holds nearly 400 million records.

For each group of (activity_type, tracking_code), I need to keep the record with the highest eff_date and delete the rest. So for activity_type = 'A' and tracking_code = 1, I need to keep the record with eff_date = 02-01-2024 (dd-mm-yyyy) and delete the other one.

What I have for now is the query below:

    delete from test_tab
     where rowid in (select rid
                       from (select rowid as rid,
                                    row_number() over (partition by activity_type, tracking_code
                                                       order by eff_date desc) as row_num
                               from test_tab)
                      where row_num > 1);

However, this seems very slow. Could you suggest a better solution?

The original table is partitioned on eff_date and has an index on the other two columns. Another point is that there might be a gap of more than one year between the eff_dates of records in a single group.

Thanks in advance
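As a sanity check on the requirement, the rows that should survive the cleanup can be listed with a plain aggregate. This is only a minimal sketch: it assumes the table has just the three columns shown above, and the alias keep_eff_date is illustrative, not part of the original schema.

```sql
-- Sketch only: lists the eff_date each group should keep after the cleanup.
-- Assumes TEST_TAB has just the three columns from the question.
select activity_type,
       tracking_code,
       max(eff_date) as keep_eff_date  -- illustrative alias, not in the schema
from   test_tab
group  by activity_type, tracking_code
order  by activity_type, tracking_code;
```

On the sample data this returns two rows: ('A', 1) with 02-01-2024 and ('B', 2) with 03-08-2023 (dd-mm-yyyy). Once the delete has run, the table should contain exactly these rows, one per group.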
Asked by Pantea (1510 rep)
Nov 11, 2024, 01:10 PM
Last activity: Nov 16, 2024, 09:37 AM