
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes
1 answer
100 views
Stop rolling back transactions by committing them
I have a SQL Server process that runs annually, creating around 400 million records in several tables. The process operates in a loop, committing every 20,000 records for each of the tables. I recently discovered that someone changed the script and put a `begin tran` in the initial lines, which was only committed in the last lines, and there was an error in the process. Now the database is rolling back each of the 20,000-row batches that were committed in the nested transactions. Is there any way to prevent this rollback? How can I identify the SPID of the transaction and commit it before the rollback reaches it? There were only about 50 million rows left when the error occurred due to lack of space in *tempdb*. We estimate that it will take around 20 hours to roll back everything in four different tables.
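A rollback that is already in flight cannot be committed or cancelled, but the session driving it can be identified and its progress monitored; a minimal sketch using standard DMVs (the SPID 56 in the KILL example is hypothetical):

```sql
-- Sessions currently rolling back, with SQL Server's own progress estimate
SELECT session_id, command, percent_complete,
       estimated_completion_time / 60000.0 AS estimated_minutes_remaining
FROM sys.dm_exec_requests
WHERE command LIKE 'ROLLBACK%' OR command LIKE 'KILLED%';

-- For a session that was killed, rollback progress can also be checked with:
KILL 56 WITH STATUSONLY;
```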
Jair Pedro Assis (1 rep)
Jan 29, 2025, 11:55 PM • Last activity: Jan 30, 2025, 10:54 PM
2 votes
2 answers
285 views
Is it possible to store SQL path instead of uploading entire BLOB data?
I have a C# site that uploads images into a SQL database, and that SQL table is now around 300 GB. Is it possible to store the path of those images instead of storing the entire BLOB data when uploading images from the C# site? I would also like to store those images on an NTFS file system. Along with that, I would like to transfer the existing BLOB data to the NTFS file system and store the path of those images. If so, how could we achieve this using a SQL script, without using FILESTREAM or FileTable?
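The schema side of storing a path instead of the BLOB is straightforward; a minimal sketch (the Images table, ImagePath column and UNC share below are assumptions, not from the question, and the actual file export still has to happen outside plain T-SQL, e.g. `bcp ... queryout` via `xp_cmdshell` or an external job):

```sql
-- Add a column to hold the file location instead of the BLOB itself
ALTER TABLE dbo.Images ADD ImagePath nvarchar(400) NULL;

-- Record where each image will live on the NTFS share, keyed by the row's Id
UPDATE dbo.Images
SET ImagePath = '\\fileserver\images\' + CAST(Id AS varchar(20)) + '.jpg'
WHERE ImagePath IS NULL;

-- Once the files are confirmed on disk, the BLOB column can be dropped or nulled
-- and the space reclaimed with a table/index rebuild.
```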
James (21 rep)
Oct 11, 2024, 01:49 AM • Last activity: Oct 12, 2024, 10:05 AM
1 vote
1 answer
1512 views
how to call lo_unlink(lo) in batches
In **Postgres 9.5** I am trying to reproduce the Postgres tool vacuumlo to get rid of unreferenced large objects and free disk space. References are stored in **text** type columns by Hibernate, so using the tool as-is is not an option. Looking at the vacuumlo sources ( vacuumlo.c ) I have created a *vacuum_l* table like so:

```
CREATE TABLE vacuum_l AS SELECT oid AS lo, false as deleted FROM pg_largeobject_metadata;
```

and proceeded to delete from the *vacuum_l* table all rows whose oids are referenced by user tables in my database. Now it is safe to call *unlink(lo)* for all items remaining in the *vacuum_l* table. Unfortunately the table is very big and, as reported in the vacuumlo.c source code:

> We don't want to run each delete as an individual transaction, because the commit overhead would be high. However, since 9.0 the backend will acquire a lock per deleted LO, so deleting too many LOs per transaction risks running out of room in the shared-memory lock table. Accordingly, we delete up to transaction_limit LOs per transaction.

So far I tried to unlink all oids with this SQL function:

```
CREATE OR REPLACE FUNCTION unlinkOrphanedLobs() returns void AS
$BODY$
DECLARE
    count_variable int;
begin
    loop
        SELECT COUNT(*) INTO count_variable FROM vacuum_l WHERE deleted=false;
        EXIT WHEN count_variable
```

which fails with:

> ERROR: out of shared memory
>
> HINT: You might need to increase max_locks_per_transaction.

Why is clearly explained by the code comments reported above: this function won't close the transaction at the end of the loop. What would be the correct way to do it? Thanks for any help.
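On 9.5 a PL/pgSQL function cannot commit mid-execution, so the batching has to be driven from the client, one transaction per chunk. A minimal sketch, assuming the vacuum_l table above and a batch size of 1000 (run it repeatedly, e.g. from psql or a shell loop, until it reports 0 rows):

```sql
BEGIN;
WITH batch AS (
    SELECT lo
    FROM vacuum_l
    WHERE NOT deleted
    LIMIT 1000
    FOR UPDATE
)
UPDATE vacuum_l v
SET deleted = true
FROM batch b
WHERE v.lo = b.lo
  AND lo_unlink(v.lo) = 1;   -- server-side large-object unlink; one lock per LO, per transaction
COMMIT;
```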
Luky (121 rep)
Mar 22, 2021, 04:43 PM • Last activity: Jun 2, 2024, 09:07 PM
1 vote
1 answer
140 views
Weird behaviour with batch and trigger (MariaDB)
I found a weird issue involving MariaDB batch updates from Hibernate. The original question was posted here: https://stackoverflow.com/questions/78204056/hibernate-sqlintegrityconstraintviolation-duplicate-data-entry-issue-after-upg?noredirect=1#comment138316709_78204056

Table t_instancetest:

```sql
CREATE TABLE t_instancetest (
    RATEID BIGINT(10) NOT NULL,
    BID VARCHAR(25) NULL DEFAULT NULL COLLATE 'utf8_bin',
    ASK VARCHAR(25) NULL DEFAULT NULL COLLATE 'utf8_bin',
    MID VARCHAR(25) NULL DEFAULT NULL COLLATE 'utf8_bin',
    UPDATEDDATE DATETIME(6) NULL DEFAULT NULL,
    INSTANCEUPDATEDDATE DATETIME(6) NULL DEFAULT NULL,
    UPDATEDBY VARCHAR(30) NULL DEFAULT NULL COLLATE 'utf8_bin',
    CRUD VARCHAR(1) NULL DEFAULT NULL COLLATE 'utf8_bin',
    CREATEDBY VARCHAR(30) NULL DEFAULT NULL COLLATE 'utf8_bin',
    INSTANCEUPDATEDBY VARCHAR(30) NULL DEFAULT NULL COLLATE 'utf8_bin',
    LOCATIONID INT(3) NOT NULL,
    PRIMARY KEY (RATEID, LOCATIONID) USING BTREE
) COLLATE='utf8_bin' ENGINE=InnoDB;
```

Audit log table:

```sql
CREATE TABLE t_datalog (
    Id BIGINT(10) NOT NULL AUTO_INCREMENT,
    RateId BIGINT(10) NOT NULL,
    CreatedDate DATETIME(6) NOT NULL,
    PRIMARY KEY (Id) USING BTREE
) COLLATE='utf8_bin' ENGINE=InnoDB AUTO_INCREMENT=232;
```

Trigger (BEFORE UPDATE) — but this field was later removed to make it easier to test, and this trigger was dropped:

```sql
CREATE DEFINER=`USERADMIN`@`%` TRIGGER TRG_BU_T_INSTANCE
BEFORE UPDATE ON t_instance
FOR EACH ROW
BEGIN
    SET NEW.ATTRCHANGEID=UUID();
END
```

Trigger (AFTER UPDATE):

```sql
CREATE DEFINER=`USERADMIN`@`%` TRIGGER TRG_AU_T_INSTANCETEST
AFTER UPDATE ON t_instancetest
FOR EACH ROW
BEGIN
    INSERT INTO t_datalog (RateId, CreatedDate) VALUES (OLD.RATEID, sysdate());
END
```

Batch update from the application (batch size = 7). As can be seen here, it updates 7 different rateids:

```
2024-05-10 12:31:59.814 INFO 3472 --- [nio-9012-exec-2] n.t.d.l.l.SLF4JQueryLoggingListener : {"name":"reliableDataSource", "connection":3, "time":13, "success":true, "type":"Prepared", "batch":true, "querySize":1, "batchSize":7, "query":["update t_instancetest set ask=?, bid=?, createdby=?, crud=?, instanceupdateddate=?, locationId=?, mid=?, updatedby=?, updateddate=? where locationid=? and rateId=?"], "params":[["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7181","naga","2024-05-10 12:03:55.0","101","7181"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7239","naga","2024-05-10 12:03:55.0","101","7239"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7254","naga","2024-05-10 12:03:55.0","101","7254"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7265","naga","2024-05-10 12:03:55.0","101","7265"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7370","naga","2024-05-10 12:03:55.0","101","7370"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7453","naga","2024-05-10 12:03:55.0","101","7453"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7643","naga","2024-05-10 12:03:55.0","101","7643"]]}
```

The MariaDB logs, however, show that we are only updating one record in t_instancetest and executing the trigger 7 times on the same rateid; the other rateids were not updated.
```
#240509 17:27:01 server id 2  end_log_pos 127297238 CRC32 0x5e6a5fa7  Write_rows: table id 3067 flags: STMT_END_F
### UPDATE reliabledb.t_instance
### WHERE
###   @1=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2='12.23' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @3='116.13' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @4='' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @5='2024-05-09 16:49:54.962888' /* DATETIME(6) meta=6 nullable=1 is_null=0 */
###   @6='2024-05-09 17:22:52.812000' /* DATETIME(6) meta=6 nullable=1 is_null=0 */
###   @7='naga' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @8='U' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
###   @9='SYSTEM' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @10='naga' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @11=101 /* INT meta=0 nullable=0 is_null=0 */
### SET
###   @1=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2='12.23' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @3='116.13' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @4='7181' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @5='2024-05-09 16:49:54.962888' /* DATETIME(6) meta=6 nullable=1 is_null=0 */
###   @6='2024-05-09 17:27:01.116000' /* DATETIME(6) meta=6 nullable=1 is_null=0 */
###   @7='naga' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @8='U' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
###   @9='SYSTEM' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @10='naga' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @11=101 /* INT meta=0 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=98 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=99 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=100 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=101 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=102 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=103 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
# Number of rows: 7
# at 127297238
#240509 17:27:01 server id 2  end_log_pos 127297269 CRC32 0xd7928418  Xid = 205571784
COMMIT/*!*/;
```

If we drop the trigger, the insert completes normally (all records updated). Also, if we remove the batching, the insert completes normally as well. The issue only happens if the batch size is > 3 and the trigger is in effect. Can anyone share any idea why this is happening, or maybe point to a MariaDB bug that might interfere with this? It only happens for this table in the schema; the other tables are not affected.
Version details:

- MariaDB 10.5
- MariaDB JDBC client 3.3.3 (also tried with 2.7.4)
- Spring Boot (Java) 2.7.18
- Hibernate 5.6.15
Rudy (151 rep)
May 10, 2024, 07:40 AM • Last activity: May 16, 2024, 07:31 AM
1 vote
1 answer
132 views
How can I see how many records have been affected by a transaction before it is committed, or how costly the rollback would be if I have to kill it?
I am using sqlwatch to monitor some of my servers, and it has a job that deletes in batches from a table called dbo.sqlwatch_logger_snapshot_header. I had lots of problems with locking and blocking, as this table is used by different processes, so I disabled all other jobs except the one that deals with this table. I used the following script to disable all other sqlwatch jobs:
use distribution

set nocount on
set transaction isolation level read uncommitted

declare @Job_id uniqueidentifier
declare @job_name varchar(300)
declare @category_name varchar(300)
declare @body varchar(max)
declare @flag tinyint

declare @enabled bit = 0;  --0 - disable 1-enable
declare @subject varchar(300) = case when @enabled = 0 then 'Disabling sqlwatch jobs on ' + @@servername else 'Enabling sqlwatch jobs on ' + @@servername end;
declare @job_enabled bit;

set @flag = 0
set @body = 'The following jobs are going to be ' +
          case when @enabled = 0 then 'Disabled' else 'Enabled' end
          + ' : '+char(10)+Char(13)


IF OBJECT_ID('tempdb.[dbo].[#LogReaderAgents]') IS NOT NULL 
DROP TABLE [dbo].[#LogReaderAgents] 

CREATE TABLE [dbo].[#LogReaderAgents] ( 
[job_id]         UNIQUEIDENTIFIER                 NOT NULL,
[job_name]       SYSNAME                          NOT NULL,
[category_name]  SYSNAME                          NOT NULL,
[enabled]        TINYINT                          NOT NULL)

INSERT INTO [dbo].[#LogReaderAgents] ([job_id],[job_name],[category_name],[enabled])
select job_id, job_name = sj.name, category_name=sc.name, sj.enabled
from  msdb.dbo.sysjobs sj
inner join msdb.dbo.syscategories sc
        on sj.category_id = sc.category_id
where 1=1
  and sj.name like 'SQLWATCH%'
 -- and sc.category_id in (10,13)
 and sj.name not in ('SQLWATCH-INTERNAL-RETENTION')

-- exec sp_gettabledef 'dbo.#LogReaderAgents'
-- exec sp_GetInsertList 'TEMPDB','DBO.#LogReaderAgents'


DECLARE c1 CURSOR FOR
       SELECT Job_id,job_name, category_name, [enabled]
       FROM #LogReaderAgents
OPEN c1
FETCH NEXT FROM c1
       INTO @Job_id,@job_name,@category_name,@job_enabled
       WHILE @@FETCH_STATUS = 0
       begin

              if (select top (1) stop_execution_date from msdb.dbo.sysjobactivity ja
                     where Job_ID = @Job_id
                           and
                     ja.start_execution_date IS NOT NULL
                     order by Start_execution_date desc) is not NULL
              begin
                     set @flag = 1
                     Print @job_name +' is ' + case when @job_enabled=1 then 'Enabled' else 'disabled' end 
                     exec msdb..sp_update_job @job_id = @job_id, @enabled = @enabled
                     set @Body = @Body + char(10)+char(13) + @job_name + ' -- ' + @category_name +' is ' + case when @job_enabled=1 then 'Enabled' else 'disabled' end 
              end
              
              FETCH NEXT FROM c1
       INTO @Job_id,@job_name,@category_name,@job_enabled
       end
CLOSE c1
DEALLOCATE c1
However, after that the job is still running, but I don't see any changes in the number of records in this table. Here is the job executing a procedure and doing a delete in batches (screenshot). Progress has been slow but steady: there were 37.2 million rows, and now there are 36.9 million. So basically it is working, so all good. Now my question: suppose I had to stop this job for any reason. It is a batch delete; if I had to stop this job now, how many rows are affected, and how costly would it be to roll back what has not been committed yet?
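For reference, a way to gauge an open transaction's footprint before killing it (a minimal sketch using the transaction DMVs; substitute the deleting job's session id for @@SPID):

```sql
SELECT st.session_id,
       dt.database_transaction_begin_time,
       dt.database_transaction_log_record_count AS log_records,
       dt.database_transaction_log_bytes_used   AS log_bytes_used
FROM sys.dm_tran_session_transactions AS st
JOIN sys.dm_tran_database_transactions AS dt
      ON dt.transaction_id = st.transaction_id
WHERE st.session_id = @@SPID;   -- replace with the job's session id
```

The log-record count and bytes used are a rough proxy for how much work a rollback would have to undo.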
Marcello Miorelli (17274 rep)
Feb 23, 2024, 04:56 PM • Last activity: Feb 25, 2024, 01:35 AM
1 vote
0 answers
160 views
SQL script to run from batch jobs
I have a SQL script that truncates a table and inserts into the database from a linked server. Since the server is SQL Express, I will need to use batch jobs to schedule the work. Below is a snippet of the SQL script that the batch job calls:

```sql
TRUNCATE TABLE [TABLE1].[dbo].[TableData]
WAITFOR DELAY '000:00:01'
INSERT INTO [TABLE1].[dbo].[TableData]
SELECT * FROM [SourceServer].[SourceDatabase].[dbo].[VW_View]
```

The batch job works fine if I use the code below, but since osql is deprecated I will need to use sqlcmd, which runs in a loop and never stops.

```
osql -S myserver -E -i "c:\test.sql" -o "c:\output.txt"    --- Runs fine
sqlcmd -S myserver -E -i "c:\test.sql"                     ----- Does not work and goes into a loop that does not end.
```

I would appreciate it if someone could provide me with the correct syntax for sqlcmd.
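For comparison, a sqlcmd invocation equivalent to the working osql line (a sketch; -S, -E, -i, -o and -b are standard sqlcmd switches, with -b making it exit on errors instead of continuing):

```
sqlcmd -S myserver -E -i "c:\test.sql" -o "c:\output.txt" -b
```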
SQL_NoExpert (1117 rep)
Feb 14, 2024, 12:23 PM
1 vote
1 answer
470 views
Handle partial failures in a transaction (commit selectively) in MSSQL
I am doing a batch update where I process records sequentially (User 1, User 2, etc.). Each record has multiple update queries associated with it. I need to skip a record if there is a data issue for that individual record (a user in this case). Ideally, I would like to process them in parallel, but I haven't reached that level yet (lots of challenges). Is it possible to do it such that:

1. There is a single transaction.
2. If there is a failure, the statements associated with **that** record get rolled back, **without** affecting others.
3. Commit.

For example, I have 4 users in a CSV file. If 3 are good and 1 is bad, 3 should get committed (or aborted) atomically; 1 should get skipped with errors.

Observed:

~~~
do everything for user 1;
do everything for user 2;  --> if there is a failure, it rolls back the *entire* transaction
do everything for user 3;
~~~

In fact, any error with severity >= 16 rolls back the entire transaction.

Expected:

~~~
do everything for user 1;
do everything for user 2;  --> if there is a failure, roll back this *block* only
do everything for user 3;
do everything for user 4;
~~~

It's a normal try-catch requirement in any programming language; however, I couldn't find a SQL Server equivalent (involving transactions). I have read about checkpoints, but I'm not sure if that's an option to consider.
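One pattern that matches this shape is a savepoint per record inside the single outer transaction; a minimal sketch (`SAVE TRANSACTION` and `XACT_STATE()` are standard T-SQL, but it only works while `XACT_ABORT` is OFF and the errors are not batch-aborting; the table and variable names are hypothetical):

```sql
BEGIN TRANSACTION;

-- repeated for each user in the batch
DECLARE @UserId int = 1;

SAVE TRANSACTION user_block;
BEGIN TRY
    -- all update statements for this user
    UPDATE dbo.UserData SET Status = 'processed' WHERE UserId = @UserId;
END TRY
BEGIN CATCH
    IF XACT_STATE() = 1
        ROLLBACK TRANSACTION user_block;   -- undo only this user's statements
    -- log the error and move on to the next user
END CATCH;

COMMIT TRANSACTION;   -- keeps the work of every user that was not rolled back
```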
Nishant (899 rep)
Oct 23, 2023, 03:07 PM • Last activity: Oct 26, 2023, 02:51 AM
-1 votes
1 answer
788 views
What is the safe way to update production data if data is found to be inconsistent?
I am in a small company and need to fix a bunch of data in production that is inconsistent. I have written the script to handle the fix. I understand that writing SQL to fix data is far riskier than working on fixes to normal front-end or back-end code. For front-end or back-end code, we can write thousands of tests to check the result automatically, and even if there is a small bug the effect might not be as dramatic. How do you normally fix data in production if a lot of the data is inconsistent? Are there any methods or strategies that can reduce the risk?
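One low-tech risk reducer is to wrap the fix in an explicit transaction, compare the affected row count against the expected number, spot-check a few rows, and only then commit; a minimal sketch (the table and predicate are hypothetical):

```sql
BEGIN TRANSACTION;

UPDATE dbo.Orders
SET Status = 'closed'
WHERE Status = 'clsed';              -- hypothetical inconsistency being fixed

SELECT @@ROWCOUNT AS rows_changed;   -- compare against the expected count

-- After spot-checking the result, run exactly one of:
-- COMMIT TRANSACTION;
-- ROLLBACK TRANSACTION;
```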
LittleFunny (99 rep)
Apr 2, 2020, 10:13 PM • Last activity: Aug 29, 2023, 09:14 AM
0 votes
2 answers
289 views
Delete all rows if input parameter is 0 otherwise delete only those entries which match the input param
I have a stored proc which receives an id as input. If the id received is 0, we need to delete all the rows; otherwise we need to delete only those rows which match the input id. I can do this by splitting my query using an if-else, but that's the catch here: we want to do it in a single query by using JOINs and CASE (maybe). Here is my table:

```sql
CREATE TABLE InfineonFiles
(
    infimax INTEGER NOT NULL,
    neonId INTEGER NOT NULL,
    eTime INTEGER NOT NULL,
    pTime INTEGER NOT NULL,
    cisId INTEGER NOT NULL
)
```

The input attribute is, say:

```sql
DECLARE @inpId INT = [ 0 OR POSITIVE INTEGER]
```

The deletion is like this:

```sql
IF @inpId = 0
    DELETE TOP (5000) AIA
    FROM InfineonFiles
    WHERE pTime < @timeUnitInSeconds
ELSE
    DELETE TOP (5000) AIA
    FROM InfineonFiles
    WHERE pTime < @timeUnitInSeconds
      AND cisId = @inpId
```

The above needs to be in a single query. How can I do that?
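A minimal sketch of the single-statement form (the trick is just an OR on the parameter; the AIA alias is dropped since no alias is defined in the FROM clause):

```sql
DELETE TOP (5000)
FROM InfineonFiles
WHERE pTime < @timeUnitInSeconds
  AND (@inpId = 0 OR cisId = @inpId);
```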
Himanshuman (197 rep)
Aug 11, 2023, 09:46 AM • Last activity: Aug 11, 2023, 11:40 AM
0 votes
0 answers
91 views
Search from text file containing string and replace row
My database table named VmIpAddress has the following 12 columns: id, hosting_id, server_id, vm_id, ip, mac_address, subnet_mask, gateway, cidr, trunks, tag, net.

A row might contain, for example:

```
"3471","3654","16",NULL,"137.74.189.22","02:00:00:55:a1:57","255.255.255.224","137.74.189.254","27",NULL,NULL,"net0"
```

I have a .txt file containing the following data:

```
137.74.189.22, 193.15.125.110, 02:00:00:20:a2:10
137.74.189.23, 193.15.125.111, 02:00:00:20:a2:11
137.74.189.24, 193.15.125.112, 02:00:00:20:a2:12
137.74.189.25, 193.15.125.113, 02:00:00:20:a2:13
```

Note that the $5 "ip" must be exactly what the text file contains — 137.74.189.22 in this case; there may also be 137.74.189.222, which I do not want changed. From this data, the table row containing 137.74.189.22 should be modified to the new IP address and MAC, and a few other columns ($3, $5, $6, $7, $8, $9) should be changed, so that it becomes:

```
"3471","3654","19",NULL,"193.15.125.110","02:00:00:20:a2:10","255.255.255.0","193.15.125.1","24",NULL,NULL,"net0"
```

- $3 = always 19
- $5 = first variable in the text file
- $6 = third variable in the text file
- $7 = always 255.255.255.0
- $8 = always 193.15.125.1
- $9 = always 24

OS: Linux, MySQL 5.7
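A sketch of one way to do this in MySQL 5.7, assuming the .txt file can be loaded with LOAD DATA (the temporary table name and file path are assumptions; following the example row above, the new IP is taken from the file's second field and the MAC from the third, matching on the old IP exactly):

```sql
CREATE TEMPORARY TABLE ip_map (
    old_ip  VARCHAR(15),
    new_ip  VARCHAR(15),
    new_mac VARCHAR(17)
);

LOAD DATA LOCAL INFILE '/tmp/ip_map.txt'
INTO TABLE ip_map
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

UPDATE VmIpAddress AS v
JOIN ip_map AS m ON v.ip = TRIM(m.old_ip)     -- exact match, so 137.74.189.222 is untouched
SET v.server_id   = 19,
    v.ip          = TRIM(m.new_ip),
    v.mac_address = TRIM(m.new_mac),
    v.subnet_mask = '255.255.255.0',
    v.gateway     = '193.15.125.1',
    v.cidr        = 24;
```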
Toodarday (113 rep)
Jan 25, 2023, 06:34 PM • Last activity: Jan 26, 2023, 05:14 AM
2 votes
2 answers
1623 views
CTE vs. temp table for batch deletes
Two-part question here. We have a large table (between 1-2 million rows) with very frequent DML operations on it. During low-volume periods, we have an agent job to remove older rows to keep the table's size in check. It uses the CTE below, which is causing lots of blocking when it runs:

```sql
;with agent_cte (queue_id) as
(
    select top (3000) queue_id
    from Table_A with (updlock, readpast)
    where state in ('success','error','reject','failed','deleted')
      and modify_timestamp <= DATEADD(dd,- 5,getdate())
    order by modify_timestamp asc
)
delete from agent_cte;
```

CTE Query Plan

I re-wrote this to use a temp table to speed up the query and reduce blocking, but I'm finding a large performance difference between using IN vs. EXISTS to determine which rows to delete.

IN version:

```sql
-- Create temp table
create table #Table_AToDelete (Queue_id uniqueidentifier, PRIMARY KEY(Queue_id));

-- grab top 3k queue_id's older than 5 days, insert into temp table
Insert into #Table_AToDelete
select top (3000) queue_id
from Table_A with (nolock)
where state in ('success','error','reject','failed','deleted')
  and modify_timestamp <= DATEADD(dd,- 5,getdate())

-- delete the rows from Table_A based on rows in temp table
delete from Table_A
where queue_id in (select queue_id from #Table_AToDelete)
```

Temp Table Plan

This version runs in 40-50 seconds; however, when I replace the last line of the delete statement with the line below, it is still running after 2 minutes, so I cancelled it:

```sql
where exists(select queue_id from #Table_AToDelete)
```

So, the questions:

1. I've seen temp tables used to help with blocking and performance before, but I don't fully understand WHY this would perform better than using a CTE? We do have an index on Queue_id.
2. Why is there a large performance difference between IN and EXISTS in the delete?

Any feedback on how to tune this to perform better or reduce blocking further? Some more notes:

- Both the CTE and temp table plans are available using Paste the Plan (links above).
- There are FKs to two other tables with cascading update and delete, and the CTE plan spends more time on the delete there, while the temp table version spends more of its time on the main table.
- Generally speaking, are there performance benefits from using a temp table like this vs. a CTE?
- I'm not allowed to post table schemas, so I apologize if the plans are not enough info. I'm also going to test using a view like in this article.
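Worth noting about the EXISTS timing: as written, the subquery has no correlation to Table_A, so the predicate is true for every row whenever the temp table is non-empty, and the delete targets the whole table. A correlated form would look like this (a sketch):

```sql
delete a
from Table_A as a
where exists (select 1
              from #Table_AToDelete as d
              where d.Queue_id = a.queue_id);
```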
DBA Greg14 (265 rep)
Nov 30, 2022, 02:42 PM • Last activity: Dec 1, 2022, 01:35 PM
2 votes
2 answers
176 views
Why do batch updates to a CQL list merge the data?
I have created the following table:

```
CONSISTENCY LOCAL_QUORUM;

drop keyspace if exists cycling;

CREATE KEYSPACE IF NOT EXISTS cycling
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 }
and durable_writes = true;

CREATE TABLE IF NOT EXISTS cycling.rider (
    rider_id int PRIMARY KEY,
    rider_name list<text>,
    rider_name2 frozen<list<text>>
);
```

Then I do the following sequence of batch operations:

```
CONSISTENCY LOCAL_QUORUM;

BEGIN BATCH
    UPDATE cycling.rider SET rider_name = ['a2'], rider_name2 = ['b2'] WHERE rider_id = 100;
    UPDATE cycling.rider SET rider_name = ['a3'], rider_name2 = ['b3'] WHERE rider_id = 100;
APPLY BATCH;
```

I get merged data in my list (rider_name):

```
select * from cycling.rider;

 rider_id | rider_name   | rider_name2
----------+--------------+-------------
      100 | ['a2', 'a3'] | ['b3']
```

Do you know why? I was expecting:

```
 rider_id | rider_name | rider_name2
----------+------------+-------------
      100 | ['a3']     | ['b3']
```
gudge (133 rep)
Nov 15, 2022, 09:31 PM • Last activity: Nov 21, 2022, 01:17 PM
2 votes
5 answers
18924 views
Delete from table in chunks
I want to clean up one of my tables, which has 4 million rows. So I prepared this procedure to clean it, but it's throwing an error; can anyone please help me fix it?

```sql
DELIMITER $$
DROP PROCEDURE IF EXISTS archive_table $$
create procedure archive_table ()
    set @min=(select min(dob) from test where dob < DATE_SUB(CURDATE(), INTERVAL 30 day));
    while @min is not null
    begin
        delete from test where dob = @min;
        commit;
        set @min=(select min(dob) from test where dob > @min and dob < DATE_SUB(CURDATE(), INTERVAL 30 day) );
    END WHILE;
END$$
```

**ERROR:**

```
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your
MySQL server version for the right syntax to use near 'while @min is not null begin delete from test
where dob = @min; commi' at line 1
```
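For reference, MySQL's loop construct inside a stored procedure is `WHILE ... DO ... END WHILE` wrapped in a `BEGIN ... END` compound body, which is what the 1064 error is pointing at; a sketch of the same chunked delete with valid syntax:

```sql
DELIMITER $$
DROP PROCEDURE IF EXISTS archive_table $$
CREATE PROCEDURE archive_table ()
BEGIN
    SET @min = (SELECT MIN(dob) FROM test
                WHERE dob < DATE_SUB(CURDATE(), INTERVAL 30 DAY));
    WHILE @min IS NOT NULL DO
        DELETE FROM test WHERE dob = @min;
        COMMIT;
        SET @min = (SELECT MIN(dob) FROM test
                    WHERE dob > @min AND dob < DATE_SUB(CURDATE(), INTERVAL 30 DAY));
    END WHILE;
END$$
DELIMITER ;
```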
TheDataGuy (1986 rep)
Oct 27, 2017, 04:55 PM • Last activity: Oct 30, 2022, 08:11 PM
0 votes
1 answer
465 views
PostgreSQL: does the WAL group commit affect synchronous replication performance?
I'm curious whether PostgreSQL's group commit mechanism (controlled by the commit_delay and commit_siblings parameters) only affects disk fsync calls, or whether it also allows multiple in-flight transactions to be batched concurrently in WAL synchronous replication. From [here](https://wiki.postgresql.org/wiki/Group_commit) it should affect synchronous replication ("The proposed implementation this page describes is **heavily based on the existing synchronous replication implementation**."). But [here](https://www.postgresql.org/docs/current/wal-configuration.html) it says: "No sleep will occur **if fsync is not enabled**", so it seems that the group commit mechanism only reduces local fsyncs and does not support batched confirmation of remote WAL synchronous replication?

--------UPDATE:

My understanding: in synchronous (strongly consistent) replication, each commit on the local node waits until the remote standby node returns an ack before continuing. So my question is: during this waiting time for the ack (at least one RTT - round trip time):

**[A]** will the transactions on other clients' concurrent connections continue to be replicated and committed to the standby node **without blocking**?

**[B]** Or does each transaction have to **wait** for the end of the previous transaction before initiating a commit to the remote standby node (a stop-and-wait protocol)?

As you can see, behaviour [A] is much like group commit, and [B] is like having no group commit optimization. Coupled with the description on the PostgreSQL wiki (it says "The group commit is heavily based on the existing synchronous replication implementation."), that's why I associate it with the group commit optimization.
ASBai (103 rep)
Sep 27, 2022, 10:48 PM • Last activity: Sep 28, 2022, 09:47 PM
0 votes
0 answers
59 views
Deleting large amounts of rows in SQL Server 15
I'm trying to execute a batch delete, but I'm having trouble because I'm not using the clustered index properly. The clustered index consists of three fields (int, int, datetime2). The table consists of four columns (PK [integer, integer, datetime2], decimal). What I have tried so far is a TOP(X) delete, which slows down a lot towards the end, and deleting by days (one batch is one day), which is also not a good solution. Below I have attached the "by day" deletion script.
DECLARE
    @ITERATION INT = 0
    , @TOTALROWS INT = 0
    , @MSG VARCHAR(500)
    , @STARTTIME DATETIME
    , @ENDTIME DATETIME
    , @StartValue DATETIME = (SELECT MIN(DateTimeValue) FROM TableX)
    , @EndValue date
    , @MaxValue date = '2022-04-12';

SET NOCOUNT ON;
SET DEADLOCK_PRIORITY LOW;

WHILE @StartValue = @StartValue AND DateTimeValue > ' + CAST(DATEDIFF(millisecond, @STARTTIME,@ENDTIME) AS VARCHAR);

    RAISERROR (@MSG, 0, 1) WITH NOWAIT;

    SET @StartValue = @EndValue;

END;
Is there any way to use the clustered index to my advantage here to speed up the deletion?
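Since the datetime2 column is the trailing key of the clustered index, a date-only predicate cannot seek on it. One common approach (a sketch; the index name and 50,000 batch size are assumptions to be tuned) is a supporting nonclustered index plus a TOP-based loop that keeps each transaction small:

```sql
-- Hypothetical supporting index; lets each batch seek instead of scanning
CREATE NONCLUSTERED INDEX IX_TableX_DateTimeValue
    ON TableX (DateTimeValue);

DECLARE @Rows int = 1;
WHILE @Rows > 0
BEGIN
    DELETE TOP (50000)
    FROM TableX
    WHERE DateTimeValue < '2022-04-12';

    SET @Rows = @@ROWCOUNT;

    IF @Rows > 0
        WAITFOR DELAY '00:00:01';   -- let other work through between batches
END;
```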
elitra (1 rep)
Jul 12, 2022, 03:13 PM • Last activity: Jul 12, 2022, 03:35 PM
1 vote
0 answers
668 views
Postgres batch insert with custom return value
I use Postgres 13 with Golang and the pgx/v4 package for batch inserting items. Now we're migrating from UUIDs to serial IDs. For one table with a parent ID, I need to generate a sort of mapping of parent IDs, to put them on the children's rows. I split this insert into 2 batches. In the first one, I do something like this:
batch.Queue(fmt.Sprintf("%s, %d", query, entity.TmpId), record.Values()...)
I think the generated output is similar to this:
insert into foo(txt) values ('a') returning id, 'key1' as externalKey;
After the insert, I can generate this map of entity.TmpId and serialId. The problem is performance: before, inserting 1.5k rows took ~2 sec. Now, inserting only the parent rows, which is fewer than 1.5k, takes ~1 min. I did a test returning only the id column, or id and some static value, and the performance is good, comparable to what we have at the moment. At the moment, I want to generate this map using the order of insert, although the documentation does not specify that this order is guaranteed. Or maybe you can suggest a better solution? Here I created a SQL fiddle.
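One alternative that avoids relying on row order is a data-modifying CTE that joins the returned rows back to the input values; a minimal sketch (it assumes txt is unique within a single batch so the join back is unambiguous):

```sql
WITH input (txt, external_key) AS (
    VALUES ('a', 'key1'),
           ('b', 'key2')
), ins AS (
    INSERT INTO foo (txt)
    SELECT txt FROM input
    RETURNING id, txt
)
SELECT ins.id, input.external_key
FROM ins
JOIN input USING (txt);
```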
Darii Petru (142 rep)
Jul 11, 2022, 09:20 AM
3 votes
2 answers
9426 views
Copy millions of rows to another table in batches mySQL
Table A is always getting updated (records being inserted or updated) and contains millions of records. I'd like to copy some of these records to a new table, Table B. Table A and Table B have exactly the same schema. How can I copy records from Table A to Table B? I don't want to consider the data that keeps getting updated in Table A; I only want to copy the data that was there when I first queried Table A. I'm trying to copy data in batches, so every time I query for a batch of 500 records from Table A and copy them to Table B. The next time, I query Table A to get the next 500 records using an offset. There is no guarantee that the new set of records is exactly the next batch of 500 records, since Table A is always getting updated. The task is to be able to fetch the batches sequentially and guarantee that we get exactly the next 500 records.
INSERT INTO Table B FROM SELECT * FROM Table A WHERE ...
doesn't work, because as I mentioned, Table A has a lot of data and running this query times out. It needs to be carried out in batches. Creating a temporary table would also require copying it in batches. I tried to use MySQL views, but they have the same problem: the view fetches data from the underlying table, so if the underlying table gets updated, the view fetches the updated data.
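One way to make the batches deterministic is keyset pagination on the primary key, bounded by the maximum key captured when the copy starts; a minimal sketch (the auto-increment column id and the table names TableA/TableB are assumptions):

```sql
-- Capture the upper bound once, so rows inserted after this point are ignored
SET @max_id  = (SELECT MAX(id) FROM TableA);
SET @last_id = 0;

-- Repeat, setting @last_id to the largest id copied so far, until 0 rows are inserted
INSERT INTO TableB
SELECT *
FROM TableA
WHERE id > @last_id
  AND id <= @max_id
ORDER BY id
LIMIT 500;
```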
BountyHunter (33 rep)
Nov 10, 2021, 09:29 AM • Last activity: Nov 11, 2021, 12:20 AM
0 votes
1 answer
5509 views
How can I create a mysql database (if it doesn't exist) or replace it (if it exists)?
My dump:

```
mysqldump --host=localhost --user=root --password=whatever --single-transaction --routines --triggers --log-error=error.txt --databases mydatabase > out.sql
```

My restore:

```
mysql --host=localhost --user=root --password=whatever -Dtargetdatabasename < out.sql
```

As I am restoring a single database, I need to make sure that if it already exists it is deleted, and if it does not exist I must create it (empty) in order to import my dump file without errors. How do I do this from the cmd console with batch? Thanks.

Update — solved:
mysql --host=localhost --user=root --password=whatever -e "DROP DATABASE IF EXISTS database_name";
mysql --host=localhost --user=root --password=whatever -e "CREATE DATABASE IF NOT EXISTS database_name";
If someone has a better idea, publish it so it can be selected as the correct answer; otherwise you can vote to close.
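For what it's worth, the two statements can also be issued in a single call (a sketch; the mysql client's -e option accepts semicolon-separated statements):

```
mysql --host=localhost --user=root --password=whatever -e "DROP DATABASE IF EXISTS database_name; CREATE DATABASE database_name;"
```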
acgbox (157 rep)
Aug 25, 2021, 11:04 PM • Last activity: Aug 26, 2021, 11:11 PM
0 votes
1 answer
920 views
How to assign GUID to batches of data based on row counts and tunable batch size
I need to assign batches of rows a common GUID id (for consumption by an external process). Below is a simplified setup describing what I'm looking for. BASE_TABLE represents one of the pre-existing tables in this scenario, much simplified obviously. Values in the "real" table should be considered random; they're not consecutive in real life (even the ID is not really an int):

```sql
CREATE TABLE BASE_TABLE
(
    ID int
    ,VALUE1 varchar(10)
    ,VALUE2 varchar(255)
)
```

TARGET_TABLE represents the "output" of the process I'm working on, which is just the input to something else that requires batches of data to have a uniqueIdentifier BATCHID for each batch. Batches in real life are ~1000 rows each.

```sql
CREATE TABLE TARGET_TABLE
(
    ID int
    ,VALUE1 varchar(10)
    ,VALUE2 varchar(255)
    ,BATCHID uniqueIdentifier
)
```

(Add some dummy data)

```sql
DECLARE @DATA1 int
set @DATA1=0
WHILE @DATA1<100
BEGIN
    INSERT INTO BASE_TABLE (ID,VALUE1,VALUE2)
    VALUES (@DATA1,'v1'+CONVERT(varchar,@DATA1), 'v2'+CONVERT(varchar,@DATA1))
    SET @DATA1=@DATA1+1
END
```

Example desired result, with batch size of 5:

|ID|VALUE1|VALUE2|BATCHID|
|--|------|------|------------------------------------|
|0 |v10   |v20   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|1 |v11   |v21   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|2 |v12   |v22   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|3 |v13   |v23   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|4 |v14   |v24   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|5 |v15   |v25   |41122454-A743-4545-8F0C-D7B461E072AE|
|6 |v16   |v26   |41122454-A743-4545-8F0C-D7B461E072AE|
|7 |v17   |v27   |41122454-A743-4545-8F0C-D7B461E072AE|
|8 |v18   |v28   |41122454-A743-4545-8F0C-D7B461E072AE|
|9 |v19   |v29   |41122454-A743-4545-8F0C-D7B461E072AE|
|10|v110  |v210  |41122454-A743-4545-8F0C-D7B461E072AE|
|11|v111  |v211  |FBDE5513-C869-4F2D-AC4D-40CBEF4A2D48|

etc. The important result is that for each batch of size N there are N rows with the same BatchID, which must be a GUID. I'm pretty sure I could do what I want with a cursor, but I want to do something like this if possible:

```sql
select ID,VALUE1,VALUE2,
       NEWID() OVER(PARTITION BY ROW_NUMBER() OVER (ORDER BY ID)) AS BatchID
from BASE_TABLE
```

but it's not valid because NEWID() isn't an aggregate function.
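A set-based sketch of one way to get a single GUID per batch: bucket the rows with ROW_NUMBER, then generate one NEWID() per bucket in a separate step so every row in the bucket shares it (the batch size of 5 matches the example; scale it up as needed):

```sql
DECLARE @BatchSize int = 5;

;WITH numbered AS (
    SELECT ID, VALUE1, VALUE2,
           (ROW_NUMBER() OVER (ORDER BY ID) - 1) / @BatchSize AS batch_no
    FROM BASE_TABLE
),
batches AS (
    SELECT batch_no, NEWID() AS BATCHID
    FROM numbered
    GROUP BY batch_no        -- one GUID per bucket
)
INSERT INTO TARGET_TABLE (ID, VALUE1, VALUE2, BATCHID)
SELECT n.ID, n.VALUE1, n.VALUE2, b.BATCHID
FROM numbered AS n
JOIN batches AS b ON b.batch_no = n.batch_no;
```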
Gus (133 rep)
Mar 2, 2021, 06:19 PM • Last activity: Mar 2, 2021, 10:51 PM
7 votes
1 answer
2144 views
How to avoid Table Lock Escalation?
I've got a task to update 5 million rows in a production table without locking down the whole table for an extended time. So I used an approach that has helped me many times before: updating the top (N) rows at a time, with a 1-N second interval between chunks.

This time I started with updating the top (1000) rows at a time, monitoring the Extended Events session for lock_escalation events in the process. lock_escalation showed up during each update operation, so I started lowering the row count per chunk: 1000 -> 500 -> 200 -> 100 -> 50 rows, and so on down to 1.

Before (not with this table, and for delete operations, not update), lowering the row count to 200 or 100 helped to get rid of lock_escalation events. But this time, even with 1 row per update operation, table lock_escalation still shows up. The duration of each update operation is about the same regardless of whether it's 1 row or 1000 rows at a time.

How do I get rid of table lock escalations in my case? @@TRANCOUNT is zero.

Extended event: (screenshot)

Code:
set nocount on

declare 
	@ChunkSize				int = 1000,							-- count rows to remove in 1 chunk 
	@TimeBetweenChunks		char(8) = '00:00:01', 				-- interval between chunks
	
	@Start					datetime,
	@End					datetime,
	@Diff					int,
	
	@MessageText			varchar(500),
	
	@counter				int = 1,
	@RowCount				int = 1,
	@TotalRowsToUpdate		bigint,
	@TotalRowsLeft			bigint
	


-- total row count to update
set @TotalRowsToUpdate = (select count(*)
							from [Table1]
								join [Table2] on
									btid = tBtID
							where	btStatusID = 81)


set @TotalRowsLeft = @TotalRowsToUpdate
set @MessageText = 'Total Rows to Update = ' + cast(@TotalRowsLeft as varchar)
raiserror (@MessageText,0,1) with nowait
print ''



-- begin cycle
while @RowCount > 0 begin

	set @Start = getdate()

	-- update packages
	update top (@ChunkSize) bti
		set	btstatusid = 154,
			btType = 1
	from [Table1] bti
		join [Table2] on
			btid = tBtID
	where	btStatusID = 81
	

	set @RowCount = @@ROWCOUNT

	-- measure time
	set @End = getdate()
	set @Diff = datediff(ms,@Start,@End)

	set @TotalRowsLeft = @TotalRowsLeft - @RowCount
	set @MessageText = cast(@counter as varchar) + ' - Updated ' + cast(@RowCount as varchar) + ' rows in ' + cast(@Diff as varchar) + ' milliseconds - total ' + cast(@TotalRowsLeft as varchar) + ' rows left...'

	-- print progress message
	raiserror (@MessageText,0,1) with nowait


	set @counter += 1

	WAITFOR DELAY @TimeBetweenChunks

end
Plan: https://www.brentozar.com/pastetheplan/?id=SyozGWMLw
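If the escalation itself (rather than the number of rows touched per statement) is the problem, one knob worth knowing about is disabling lock escalation at the table level; a minimal sketch (weigh the memory cost of holding many individual row/page locks before leaving this in place):

```sql
ALTER TABLE [Table1] SET (LOCK_ESCALATION = DISABLE);
```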
Aleksey Vitsko (6195 rep)
Sep 30, 2020, 11:02 AM • Last activity: Oct 1, 2020, 03:43 PM