
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes
1 answer
100 views
Stop rolling back transactions by committing them
I have a SQL Server process that runs annually, creating around 400 million records in several tables. The process operates in a loop, committing every 20,000 records for each of the tables. I recently discovered that someone changed the script and put a `begin tran` in the initial lines, which was only committed in the last lines, and there was an error in the process. Now the database is rolling back each of the 20,000-row batches that were committed in the nested transactions. Is there any way to prevent this rollback? How can I identify the SPID of the transaction and commit it before the rollback reaches it? There were only about 50 million rows left when the error occurred due to lack of space in *tempdb*. We estimate that it will take around 20 hours to roll back everything in four different tables.
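A rollback that is already in flight cannot be committed or cancelled, but the session driving it can be identified and its progress monitored; a minimal sketch using standard DMVs (the SPID 56 in the KILL example is hypothetical):

```sql
-- Sessions currently rolling back, with SQL Server's own progress estimate
SELECT session_id, command, percent_complete,
       estimated_completion_time / 60000.0 AS estimated_minutes_remaining
FROM sys.dm_exec_requests
WHERE command LIKE 'ROLLBACK%' OR command LIKE 'KILLED%';

-- For a session that was killed, rollback progress can also be checked with:
KILL 56 WITH STATUSONLY;
```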
Jair Pedro Assis (1 rep)
Jan 29, 2025, 11:55 PM • Last activity: Jan 30, 2025, 10:54 PM
2 votes
2 answers
285 views
Is it possible to store SQL path instead of uploading entire BLOB data?
I have a C# site that uploads images into a SQL database, and that SQL table is now around 300 GB. Is it possible to store the path of those images instead of storing the entire BLOB data when uploading images from the C# site? I would also like to store those images on an NTFS file system. Along with that, I would like to transfer the existing BLOB data to the NTFS file system and store the path of those images. If so, how could we achieve this using a SQL script, without using FILESTREAM or FileTable?
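The schema side of storing a path instead of the BLOB is straightforward; a minimal sketch (the Images table, ImagePath column and UNC share below are assumptions, not from the question, and the actual file export still has to happen outside plain T-SQL, e.g. `bcp ... queryout` via `xp_cmdshell` or an external job):

```sql
-- Add a column to hold the file location instead of the BLOB itself
ALTER TABLE dbo.Images ADD ImagePath nvarchar(400) NULL;

-- Record where each image will live on the NTFS share, keyed by the row's Id
UPDATE dbo.Images
SET ImagePath = '\\fileserver\images\' + CAST(Id AS varchar(20)) + '.jpg'
WHERE ImagePath IS NULL;

-- Once the files are confirmed on disk, the BLOB column can be dropped or nulled
-- and the space reclaimed with a table/index rebuild.
```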
James (21 rep)
Oct 11, 2024, 01:49 AM • Last activity: Oct 12, 2024, 10:05 AM
1 vote
1 answer
1512 views
how to call lo_unlink(lo) in batches
In **Postgres 9.5** I am trying to reproduce the Postgres tool vacuumlo to get rid of unreferenced large objects and free disk space. References are stored in **text** type columns by Hibernate, so using the tool as-is is not an option. Looking at the vacuumlo sources ( vacuumlo.c ) I have created a *vacuum_l* table like so:

```
CREATE TABLE vacuum_l AS SELECT oid AS lo, false as deleted FROM pg_largeobject_metadata;
```

and proceeded to delete from the *vacuum_l* table all rows whose oids are referenced by user tables in my database. Now it is safe to call *unlink(lo)* for all items remaining in the *vacuum_l* table. Unfortunately the table is very big and, as reported in the vacuumlo.c source code:

> We don't want to run each delete as an individual transaction, because the commit overhead would be high. However, since 9.0 the backend will acquire a lock per deleted LO, so deleting too many LOs per transaction risks running out of room in the shared-memory lock table. Accordingly, we delete up to transaction_limit LOs per transaction.

So far I tried to unlink all oids with this SQL function:

```
CREATE OR REPLACE FUNCTION unlinkOrphanedLobs() returns void AS
$BODY$
DECLARE
    count_variable int;
begin
    loop
        SELECT COUNT(*) INTO count_variable FROM vacuum_l WHERE deleted=false;
        EXIT WHEN count_variable
```

which fails with:

> ERROR: out of shared memory
>
> HINT: You might need to increase max_locks_per_transaction.

Why is clearly explained by the code comments reported above: this function won't close the transaction at the end of the loop. What would be the correct way to do it? Thanks for any help.
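On 9.5 a PL/pgSQL function cannot commit mid-execution, so the batching has to be driven from the client, one transaction per chunk. A minimal sketch, assuming the vacuum_l table above and a batch size of 1000 (run it repeatedly, e.g. from psql or a shell loop, until it reports 0 rows):

```sql
BEGIN;
WITH batch AS (
    SELECT lo
    FROM vacuum_l
    WHERE NOT deleted
    LIMIT 1000
    FOR UPDATE
)
UPDATE vacuum_l v
SET deleted = true
FROM batch b
WHERE v.lo = b.lo
  AND lo_unlink(v.lo) = 1;   -- server-side large-object unlink; one lock per LO, per transaction
COMMIT;
```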
Luky (121 rep)
Mar 22, 2021, 04:43 PM • Last activity: Jun 2, 2024, 09:07 PM
1 vote
1 answer
140 views
Weird behaviour with batch and trigger (MariaDB)
I found a weird issue involving MariaDB batch updates from Hibernate. The original question was posted here: https://stackoverflow.com/questions/78204056/hibernate-sqlintegrityconstraintviolation-duplicate-data-entry-issue-after-upg?noredirect=1#comment138316709_78204056

Table t_instancetest:

```sql
CREATE TABLE t_instancetest (
    RATEID BIGINT(10) NOT NULL,
    BID VARCHAR(25) NULL DEFAULT NULL COLLATE 'utf8_bin',
    ASK VARCHAR(25) NULL DEFAULT NULL COLLATE 'utf8_bin',
    MID VARCHAR(25) NULL DEFAULT NULL COLLATE 'utf8_bin',
    UPDATEDDATE DATETIME(6) NULL DEFAULT NULL,
    INSTANCEUPDATEDDATE DATETIME(6) NULL DEFAULT NULL,
    UPDATEDBY VARCHAR(30) NULL DEFAULT NULL COLLATE 'utf8_bin',
    CRUD VARCHAR(1) NULL DEFAULT NULL COLLATE 'utf8_bin',
    CREATEDBY VARCHAR(30) NULL DEFAULT NULL COLLATE 'utf8_bin',
    INSTANCEUPDATEDBY VARCHAR(30) NULL DEFAULT NULL COLLATE 'utf8_bin',
    LOCATIONID INT(3) NOT NULL,
    PRIMARY KEY (RATEID, LOCATIONID) USING BTREE
) COLLATE='utf8_bin' ENGINE=InnoDB;
```

Audit log table:

```sql
CREATE TABLE t_datalog (
    Id BIGINT(10) NOT NULL AUTO_INCREMENT,
    RateId BIGINT(10) NOT NULL,
    CreatedDate DATETIME(6) NOT NULL,
    PRIMARY KEY (Id) USING BTREE
) COLLATE='utf8_bin' ENGINE=InnoDB AUTO_INCREMENT=232;
```

Trigger (BEFORE UPDATE) — but this field was later removed to make it easier to test, and this trigger was dropped:

```sql
CREATE DEFINER=`USERADMIN`@`%` TRIGGER TRG_BU_T_INSTANCE
BEFORE UPDATE ON t_instance
FOR EACH ROW
BEGIN
    SET NEW.ATTRCHANGEID=UUID();
END
```

Trigger (AFTER UPDATE):

```sql
CREATE DEFINER=`USERADMIN`@`%` TRIGGER TRG_AU_T_INSTANCETEST
AFTER UPDATE ON t_instancetest
FOR EACH ROW
BEGIN
    INSERT INTO t_datalog (RateId, CreatedDate) VALUES (OLD.RATEID, sysdate());
END
```

Batch update from the application (batch size = 7). As can be seen here, it updates 7 different rateids:

```
2024-05-10 12:31:59.814 INFO 3472 --- [nio-9012-exec-2] n.t.d.l.l.SLF4JQueryLoggingListener : {"name":"reliableDataSource", "connection":3, "time":13, "success":true, "type":"Prepared", "batch":true, "querySize":1, "batchSize":7, "query":["update t_instancetest set ask=?, bid=?, createdby=?, crud=?, instanceupdateddate=?, locationId=?, mid=?, updatedby=?, updateddate=? where locationid=? and rateId=?"], "params":[["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7181","naga","2024-05-10 12:03:55.0","101","7181"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7239","naga","2024-05-10 12:03:55.0","101","7239"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7254","naga","2024-05-10 12:03:55.0","101","7254"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7265","naga","2024-05-10 12:03:55.0","101","7265"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7370","naga","2024-05-10 12:03:55.0","101","7370"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7453","naga","2024-05-10 12:03:55.0","101","7453"], ["116.13","12.23","SYSTEM","U","2024-05-10 12:31:59.665","101","7643","naga","2024-05-10 12:03:55.0","101","7643"]]}
```

The MariaDB logs, however, show that we are only updating one record in t_instancetest and executing the trigger 7 times on the same rateid; the other rateids were not updated.
```
#240509 17:27:01 server id 2  end_log_pos 127297238 CRC32 0x5e6a5fa7  Write_rows: table id 3067 flags: STMT_END_F
### UPDATE reliabledb.t_instance
### WHERE
###   @1=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2='12.23' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @3='116.13' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @4='' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @5='2024-05-09 16:49:54.962888' /* DATETIME(6) meta=6 nullable=1 is_null=0 */
###   @6='2024-05-09 17:22:52.812000' /* DATETIME(6) meta=6 nullable=1 is_null=0 */
###   @7='naga' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @8='U' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
###   @9='SYSTEM' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @10='naga' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @11=101 /* INT meta=0 nullable=0 is_null=0 */
### SET
###   @1=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2='12.23' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @3='116.13' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @4='7181' /* VARSTRING(75) meta=75 nullable=1 is_null=0 */
###   @5='2024-05-09 16:49:54.962888' /* DATETIME(6) meta=6 nullable=1 is_null=0 */
###   @6='2024-05-09 17:27:01.116000' /* DATETIME(6) meta=6 nullable=1 is_null=0 */
###   @7='naga' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @8='U' /* VARSTRING(3) meta=3 nullable=1 is_null=0 */
###   @9='SYSTEM' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @10='naga' /* VARSTRING(90) meta=90 nullable=1 is_null=0 */
###   @11=101 /* INT meta=0 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=98 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=99 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=100 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=101 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=102 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
### INSERT INTO reliabledb.t_datalog
### SET
###   @1=103 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=7181 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @3='2024-05-09 17:27:01.000000' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
# Number of rows: 7
# at 127297238
#240509 17:27:01 server id 2  end_log_pos 127297269 CRC32 0xd7928418  Xid = 205571784
COMMIT/*!*/;
```

If we drop the trigger, the insert completes normally (all records updated). Also, if we remove the batching, the insert completes normally as well. The issue only happens if the batch size is > 3 and the trigger is in effect. Can anyone share any idea why this is happening, or maybe point to a MariaDB bug that might interfere with this? It only happens for this table in the schema; the other tables are not affected.
Version details:

- MariaDB 10.5
- MariaDB JDBC client 3.3.3 (also tried with 2.7.4)
- Spring Boot (Java) 2.7.18
- Hibernate 5.6.15
Rudy (151 rep)
May 10, 2024, 07:40 AM • Last activity: May 16, 2024, 07:31 AM
1 vote
1 answer
132 views
How can I see how many records have been affected by a transaction before it is committed, or how costly the rollback would be if I have to kill it?
I am using sqlwatch to monitor some of my servers, and it has a job that deletes in batches from a table called dbo.sqlwatch_logger_snapshot_header. I had lots of problems with locking and blocking, as this table is used by different processes, so I disabled all other jobs except the one that deals with this table. I used the following script to disable all other sqlwatch jobs:
use distribution

set nocount on
set transaction isolation level read uncommitted

declare @Job_id uniqueidentifier
declare @job_name varchar(300)
declare @category_name varchar(300)
declare @body varchar(max)
declare @flag tinyint

declare @enabled bit = 0;  --0 - disable 1-enable
declare @subject varchar(300) = case when @enabled = 0 then 'Disabling sqlwatch jobs on ' + @@servername else 'Enabling sqlwatch jobs on ' + @@servername end;
declare @job_enabled bit;

set @flag = 0
set @body = 'The following jobs are going to be ' +
          case when @enabled = 0 then 'Disabled' else 'Enabled' end
          + ' : '+char(10)+Char(13)


IF OBJECT_ID('tempdb.[dbo].[#LogReaderAgents]') IS NOT NULL 
DROP TABLE [dbo].[#LogReaderAgents] 

CREATE TABLE [dbo].[#LogReaderAgents] ( 
[job_id]         UNIQUEIDENTIFIER                 NOT NULL,
[job_name]       SYSNAME                          NOT NULL,
[category_name]  SYSNAME                          NOT NULL,
[enabled]        TINYINT                          NOT NULL)

INSERT INTO [dbo].[#LogReaderAgents] ([job_id],[job_name],[category_name],[enabled])
select job_id, job_name = sj.name, category_name=sc.name, sj.enabled
from  msdb.dbo.sysjobs sj
inner join msdb.dbo.syscategories sc
        on sj.category_id = sc.category_id
where 1=1
  and sj.name like 'SQLWATCH%'
 -- and sc.category_id in (10,13)
 and sj.name not in ('SQLWATCH-INTERNAL-RETENTION')

-- exec sp_gettabledef 'dbo.#LogReaderAgents'
-- exec sp_GetInsertList 'TEMPDB','DBO.#LogReaderAgents'


DECLARE c1 CURSOR FOR
       SELECT Job_id,job_name, category_name, [enabled]
       FROM #LogReaderAgents
OPEN c1
FETCH NEXT FROM c1
       INTO @Job_id,@job_name,@category_name,@job_enabled
       WHILE @@FETCH_STATUS = 0
       begin

              if (select top (1) stop_execution_date from msdb.dbo.sysjobactivity ja
                     where Job_ID = @Job_id
                           and
                     ja.start_execution_date IS NOT NULL
                     order by Start_execution_date desc) is not NULL
              begin
                     set @flag = 1
                     Print @job_name +' is ' + case when @job_enabled=1 then 'Enabled' else 'disabled' end 
                     exec msdb..sp_update_job @job_id = @job_id, @enabled = @enabled
                     set @Body = @Body + char(10)+char(13) + @job_name + ' -- ' + @category_name +' is ' + case when @job_enabled=1 then 'Enabled' else 'disabled' end 
              end
              
              FETCH NEXT FROM c1
       INTO @Job_id,@job_name,@category_name,@job_enabled
       end
CLOSE c1
DEALLOCATE c1
However, after that the job is still running, but I don't see any changes in the number of records in this table. Here is the job executing a procedure and doing a delete in batches (screenshot). Progress has been slow but steady: there were 37.2 million rows, and now there are 36.9 million. So basically it is working, so all good. Now my question: suppose I had to stop this job for any reason. It is a batch delete; if I had to stop this job now, how many rows are affected, and how costly would it be to roll back what has not been committed yet?
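For reference, a way to gauge an open transaction's footprint before killing it (a minimal sketch using the transaction DMVs; substitute the deleting job's session id for @@SPID):

```sql
SELECT st.session_id,
       dt.database_transaction_begin_time,
       dt.database_transaction_log_record_count AS log_records,
       dt.database_transaction_log_bytes_used   AS log_bytes_used
FROM sys.dm_tran_session_transactions AS st
JOIN sys.dm_tran_database_transactions AS dt
      ON dt.transaction_id = st.transaction_id
WHERE st.session_id = @@SPID;   -- replace with the job's session id
```

The log-record count and bytes used are a rough proxy for how much work a rollback would have to undo.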
Marcello Miorelli (17274 rep)
Feb 23, 2024, 04:56 PM • Last activity: Feb 25, 2024, 01:35 AM
1 vote
0 answers
160 views
SQL script to run from batch jobs
I have a SQL script that truncates a table and inserts into the database from a linked server. Since the server is SQL Express, I will need to use batch jobs to schedule the work. Below is a snippet of the SQL script that the batch job calls:

```sql
TRUNCATE TABLE [TABLE1].[dbo].[TableData]
WAITFOR DELAY '000:00:01'
INSERT INTO [TABLE1].[dbo].[TableData]
SELECT * FROM [SourceServer].[SourceDatabase].[dbo].[VW_View]
```

The batch job works fine if I use the code below, but since osql is deprecated I will need to use sqlcmd, which runs in a loop and never stops.

```
osql -S myserver -E -i "c:\test.sql" -o "c:\output.txt"    --- Runs fine
sqlcmd -S myserver -E -i "c:\test.sql"                     ----- Does not work and goes into a loop that does not end.
```

I would appreciate it if someone could provide me with the correct syntax for sqlcmd.
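For comparison, a sqlcmd invocation equivalent to the working osql line (a sketch; -S, -E, -i, -o and -b are standard sqlcmd switches, with -b making it exit on errors instead of continuing):

```
sqlcmd -S myserver -E -i "c:\test.sql" -o "c:\output.txt" -b
```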
SQL_NoExpert (1117 rep)
Feb 14, 2024, 12:23 PM
1 vote
1 answer
470 views
Handle partial failures in a transaction (commit selectively) in MSSQL
I am doing a batch update where I process records sequentially (User 1, User 2, etc.). Each record has multiple update queries associated with it. I need to skip a record if there is a data issue for that individual record (a user in this case). Ideally, I would like to process them in parallel, but I haven't reached that level yet (lots of challenges). Is it possible to do it such that:

1. There is a single transaction.
2. If there is a failure, the statements associated with **that** record get rolled back, **without** affecting others.
3. Commit.

For example, I have 4 users in a CSV file. If 3 are good and 1 is bad, 3 should get committed (or aborted) atomically; 1 should get skipped with errors.

Observed:

~~~
do everything for user 1;
do everything for user 2;  --> if there is a failure, it rolls back the *entire* transaction
do everything for user 3;
~~~

In fact, any error with severity >= 16 rolls back the entire transaction.

Expected:

~~~
do everything for user 1;
do everything for user 2;  --> if there is a failure, roll back this *block* only
do everything for user 3;
do everything for user 4;
~~~

It's a normal try-catch requirement in any programming language; however, I couldn't find a SQL Server equivalent (involving transactions). I have read about checkpoints, but I'm not sure if that's an option to consider.
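One pattern that matches this shape is a savepoint per record inside the single outer transaction; a minimal sketch (`SAVE TRANSACTION` and `XACT_STATE()` are standard T-SQL, but it only works while `XACT_ABORT` is OFF and the errors are not batch-aborting; the table and variable names are hypothetical):

```sql
BEGIN TRANSACTION;

-- repeated for each user in the batch
DECLARE @UserId int = 1;

SAVE TRANSACTION user_block;
BEGIN TRY
    -- all update statements for this user
    UPDATE dbo.UserData SET Status = 'processed' WHERE UserId = @UserId;
END TRY
BEGIN CATCH
    IF XACT_STATE() = 1
        ROLLBACK TRANSACTION user_block;   -- undo only this user's statements
    -- log the error and move on to the next user
END CATCH;

COMMIT TRANSACTION;   -- keeps the work of every user that was not rolled back
```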
Nishant (899 rep)
Oct 23, 2023, 03:07 PM • Last activity: Oct 26, 2023, 02:51 AM
-1 votes
1 answer
788 views
What is the safe way to update production data if data is found to be inconsistent?
I am in a small company and need to fix a bunch of data in production that is inconsistent. I have written the script to handle the fix. I understand that writing SQL to fix data is far riskier than working on fixes to normal front-end or back-end code. For front-end or back-end code, we can write thousands of tests to check the result automatically, and even if there is a small bug the effect might not be as dramatic. How do you normally fix data in production if a lot of the data is inconsistent? Are there any methods or strategies that can reduce the risk?
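One low-tech risk reducer is to wrap the fix in an explicit transaction, compare the affected row count against the expected number, spot-check a few rows, and only then commit; a minimal sketch (the table and predicate are hypothetical):

```sql
BEGIN TRANSACTION;

UPDATE dbo.Orders
SET Status = 'closed'
WHERE Status = 'clsed';              -- hypothetical inconsistency being fixed

SELECT @@ROWCOUNT AS rows_changed;   -- compare against the expected count

-- After spot-checking the result, run exactly one of:
-- COMMIT TRANSACTION;
-- ROLLBACK TRANSACTION;
```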
LittleFunny (99 rep)
Apr 2, 2020, 10:13 PM • Last activity: Aug 29, 2023, 09:14 AM
0 votes
2 answers
289 views
Delete all rows if input parameter is 0 otherwise delete only those entries which match the input param
I have a stored proc which receives an id as input. If the id received is 0, we need to delete all the rows; otherwise we need to delete only those rows which match the input id. I can do this by splitting my query using an if-else, but that's the catch here: we want to do it in a single query by using JOINs and CASE (maybe). Here is my table:

```sql
CREATE TABLE InfineonFiles
(
    infimax INTEGER NOT NULL,
    neonId INTEGER NOT NULL,
    eTime INTEGER NOT NULL,
    pTime INTEGER NOT NULL,
    cisId INTEGER NOT NULL
)
```

The input attribute is, say:

```sql
DECLARE @inpId INT = [ 0 OR POSITIVE INTEGER]
```

The deletion is like this:

```sql
IF @inpId = 0
    DELETE TOP (5000) AIA
    FROM InfineonFiles
    WHERE pTime < @timeUnitInSeconds
ELSE
    DELETE TOP (5000) AIA
    FROM InfineonFiles
    WHERE pTime < @timeUnitInSeconds
      AND cisId = @inpId
```

The above needs to be in a single query. How can I do that?
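A minimal sketch of the single-statement form (the trick is just an OR on the parameter; the AIA alias is dropped since no alias is defined in the FROM clause):

```sql
DELETE TOP (5000)
FROM InfineonFiles
WHERE pTime < @timeUnitInSeconds
  AND (@inpId = 0 OR cisId = @inpId);
```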
Himanshuman (197 rep)
Aug 11, 2023, 09:46 AM • Last activity: Aug 11, 2023, 11:40 AM
0 votes
0 answers
91 views
Search from text file containing string and replace row
My database table named VmIpAddress has the following 12 columns: id, hosting_id, server_id, vm_id, ip, mac_address, subnet_mask, gateway, cidr, trunks, tag, net.

A row might contain, for example:

```
"3471","3654","16",NULL,"137.74.189.22","02:00:00:55:a1:57","255.255.255.224","137.74.189.254","27",NULL,NULL,"net0"
```

I have a .txt file containing the following data:

```
137.74.189.22, 193.15.125.110, 02:00:00:20:a2:10
137.74.189.23, 193.15.125.111, 02:00:00:20:a2:11
137.74.189.24, 193.15.125.112, 02:00:00:20:a2:12
137.74.189.25, 193.15.125.113, 02:00:00:20:a2:13
```

Note that the $5 "ip" must be exactly what the text file contains — 137.74.189.22 in this case; there may also be 137.74.189.222, which I do not want changed. From this data, the table row containing 137.74.189.22 should be modified to the new IP address and MAC, and a few other columns ($3, $5, $6, $7, $8, $9) should be changed, so that it becomes:

```
"3471","3654","19",NULL,"193.15.125.110","02:00:00:20:a2:10","255.255.255.0","193.15.125.1","24",NULL,NULL,"net0"
```

- $3 = always 19
- $5 = first variable in the text file
- $6 = third variable in the text file
- $7 = always 255.255.255.0
- $8 = always 193.15.125.1
- $9 = always 24

OS: Linux, MySQL 5.7
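A sketch of one way to do this in MySQL 5.7, assuming the .txt file can be loaded with LOAD DATA (the temporary table name and file path are assumptions; following the example row above, the new IP is taken from the file's second field and the MAC from the third, matching on the old IP exactly):

```sql
CREATE TEMPORARY TABLE ip_map (
    old_ip  VARCHAR(15),
    new_ip  VARCHAR(15),
    new_mac VARCHAR(17)
);

LOAD DATA LOCAL INFILE '/tmp/ip_map.txt'
INTO TABLE ip_map
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

UPDATE VmIpAddress AS v
JOIN ip_map AS m ON v.ip = TRIM(m.old_ip)     -- exact match, so 137.74.189.222 is untouched
SET v.server_id   = 19,
    v.ip          = TRIM(m.new_ip),
    v.mac_address = TRIM(m.new_mac),
    v.subnet_mask = '255.255.255.0',
    v.gateway     = '193.15.125.1',
    v.cidr        = 24;
```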
Toodarday (113 rep)
Jan 25, 2023, 06:34 PM • Last activity: Jan 26, 2023, 05:14 AM
2 votes
2 answers
1623 views
CTE vs. temp table for batch deletes
Two-part question here. We have a large table (between 1-2 million rows) with very frequent DML operations on it. During low-volume periods, we have an agent job to remove older rows to keep the table's size in check. It uses the CTE below, which is causing lots of blocking when it runs:

```sql
;with agent_cte (queue_id) as
(
    select top (3000) queue_id
    from Table_A with (updlock, readpast)
    where state in ('success','error','reject','failed','deleted')
      and modify_timestamp <= DATEADD(dd,- 5,getdate())
    order by modify_timestamp asc
)
delete from agent_cte;
```

CTE Query Plan

I re-wrote this to use a temp table to speed up the query and reduce blocking, but I'm finding a large performance difference between using IN vs. EXISTS to determine which rows to delete.

IN version:

```sql
-- Create temp table
create table #Table_AToDelete (Queue_id uniqueidentifier, PRIMARY KEY(Queue_id));

-- grab top 3k queue_id's older than 5 days, insert into temp table
Insert into #Table_AToDelete
select top (3000) queue_id
from Table_A with (nolock)
where state in ('success','error','reject','failed','deleted')
  and modify_timestamp <= DATEADD(dd,- 5,getdate())

-- delete the rows from Table_A based on rows in temp table
delete from Table_A
where queue_id in (select queue_id from #Table_AToDelete)
```

Temp Table Plan

This version runs in 40-50 seconds; however, when I replace the last line of the delete statement with the line below, it is still running after 2 minutes, so I cancelled it:

```sql
where exists(select queue_id from #Table_AToDelete)
```

So, the questions:

1. I've seen temp tables used to help with blocking and performance before, but I don't fully understand WHY this would perform better than using a CTE? We do have an index on Queue_id.
2. Why is there a large performance difference between IN and EXISTS in the delete?

Any feedback on how to tune this to perform better or reduce blocking further? Some more notes:

- Both the CTE and temp table plans are available using Paste the Plan (links above).
- There are FKs to two other tables with cascading update and delete, and the CTE plan spends more time on the delete there, while the temp table version spends more of its time on the main table.
- Generally speaking, are there performance benefits from using a temp table like this vs. a CTE?
- I'm not allowed to post table schemas, so I apologize if the plans are not enough info. I'm also going to test using a view like in this article.
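Worth noting about the EXISTS timing: as written, the subquery has no correlation to Table_A, so the predicate is true for every row whenever the temp table is non-empty, and the delete targets the whole table. A correlated form would look like this (a sketch):

```sql
delete a
from Table_A as a
where exists (select 1
              from #Table_AToDelete as d
              where d.Queue_id = a.queue_id);
```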
DBA Greg14 (265 rep)
Nov 30, 2022, 02:42 PM • Last activity: Dec 1, 2022, 01:35 PM
2 votes
2 answers
176 views
Why do batch updates to a CQL list merge the data?
I have created the following table:

```
CONSISTENCY LOCAL_QUORUM;

drop keyspace if exists cycling;

CREATE KEYSPACE IF NOT EXISTS cycling
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 }
and durable_writes = true;

CREATE TABLE IF NOT EXISTS cycling.rider (
    rider_id int PRIMARY KEY,
    rider_name list<text>,
    rider_name2 frozen<list<text>>
);
```

Then I do the following sequence of batch operations:

```
CONSISTENCY LOCAL_QUORUM;

BEGIN BATCH
    UPDATE cycling.rider SET rider_name = ['a2'], rider_name2 = ['b2'] WHERE rider_id = 100;
    UPDATE cycling.rider SET rider_name = ['a3'], rider_name2 = ['b3'] WHERE rider_id = 100;
APPLY BATCH;
```

I get merged data in my list (rider_name):

```
select * from cycling.rider;

 rider_id | rider_name   | rider_name2
----------+--------------+-------------
      100 | ['a2', 'a3'] | ['b3']
```

Do you know why? I was expecting:

```
 rider_id | rider_name | rider_name2
----------+------------+-------------
      100 | ['a3']     | ['b3']
```
gudge (133 rep)
Nov 15, 2022, 09:31 PM • Last activity: Nov 21, 2022, 01:17 PM
2 votes
5 answers
18924 views
Delete from table in chunks
I want to clean up one of my tables, which has 4 million rows. So I prepared this procedure to clean it, but it's throwing an error; can anyone please help me fix it?

```sql
DELIMITER $$
DROP PROCEDURE IF EXISTS archive_table $$
create procedure archive_table ()
    set @min=(select min(dob) from test where dob < DATE_SUB(CURDATE(), INTERVAL 30 day));
    while @min is not null
    begin
        delete from test where dob = @min;
        commit;
        set @min=(select min(dob) from test where dob > @min and dob < DATE_SUB(CURDATE(), INTERVAL 30 day) );
    END WHILE;
END$$
```

**ERROR:**

```
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your
MySQL server version for the right syntax to use near 'while @min is not null begin delete from test
where dob = @min; commi' at line 1
```
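For reference, MySQL's loop construct inside a stored procedure is `WHILE ... DO ... END WHILE` wrapped in a `BEGIN ... END` compound body, which is what the 1064 error is pointing at; a sketch of the same chunked delete with valid syntax:

```sql
DELIMITER $$
DROP PROCEDURE IF EXISTS archive_table $$
CREATE PROCEDURE archive_table ()
BEGIN
    SET @min = (SELECT MIN(dob) FROM test
                WHERE dob < DATE_SUB(CURDATE(), INTERVAL 30 DAY));
    WHILE @min IS NOT NULL DO
        DELETE FROM test WHERE dob = @min;
        COMMIT;
        SET @min = (SELECT MIN(dob) FROM test
                    WHERE dob > @min AND dob < DATE_SUB(CURDATE(), INTERVAL 30 DAY));
    END WHILE;
END$$
DELIMITER ;
```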
TheDataGuy (1986 rep)
Oct 27, 2017, 04:55 PM • Last activity: Oct 30, 2022, 08:11 PM
0 votes
1 answer
465 views
PostgreSQL: does the WAL group commit affect synchronous replication performance?
I'm curious whether PostgreSQL's group commit mechanism (controlled by the commit_delay and commit_siblings parameters) only affects disk fsync calls, or whether it also allows multiple in-flight transactions to be batched concurrently in WAL synchronous replication. From [here](https://wiki.postgresql.org/wiki/Group_commit) it should affect synchronous replication ("The proposed implementation this page describes is **heavily based on the existing synchronous replication implementation**."). But [here](https://www.postgresql.org/docs/current/wal-configuration.html) it says: "No sleep will occur **if fsync is not enabled**", so it seems that the group commit mechanism only reduces local fsyncs and does not support batched confirmation of remote WAL synchronous replication?

--------UPDATE:

My understanding: in synchronous (strongly consistent) replication, each commit on the local node waits until the remote standby node returns an ack before continuing. So my question is: during this waiting time for the ack (at least one RTT - round trip time):

**[A]** will the transactions on other clients' concurrent connections continue to be replicated and committed to the standby node **without blocking**?

**[B]** Or does each transaction have to **wait** for the end of the previous transaction before initiating a commit to the remote standby node (a stop-and-wait protocol)?

As you can see, behaviour [A] is much like group commit, and [B] is like having no group commit optimization. Coupled with the description on the PostgreSQL wiki (it says "The group commit is heavily based on the existing synchronous replication implementation."), that's why I associate it with the group commit optimization.
ASBai (103 rep)
Sep 27, 2022, 10:48 PM • Last activity: Sep 28, 2022, 09:47 PM
0 votes
0 answers
59 views
Deleting large amounts of rows in SQL Server 15
I'm trying to execute a batch delete, but I'm having trouble because I'm not using the clustered index properly. The clustered index consists of three fields (int, int, datetime2). The table consists of four columns (PK [integer, integer, datetime2], decimal). What I have tried so far is a TOP(X) delete, which slows down a lot towards the end, and deleting by days (one batch is one day), which is also not a good solution. Below I have attached the "by day" deletion script.
DECLARE
    @ITERATION INT = 0
    , @TOTALROWS INT = 0
    , @MSG VARCHAR(500)
    , @STARTTIME DATETIME
    , @ENDTIME DATETIME
    , @StartValue DATETIME = (SELECT MIN(DateTimeValue) FROM TableX)
    , @EndValue date
    , @MaxValue date = '2022-04-12';

SET NOCOUNT ON;
SET DEADLOCK_PRIORITY LOW;

WHILE @StartValue = @StartValue AND DateTimeValue > ' + CAST(DATEDIFF(millisecond, @STARTTIME,@ENDTIME) AS VARCHAR);

    RAISERROR (@MSG, 0, 1) WITH NOWAIT;

    SET @StartValue = @EndValue;

END;
Is there any way to use the clustered index to my advantage here to speed up the deletion?
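Since the datetime2 column is the trailing key of the clustered index, a date-only predicate cannot seek on it. One common approach (a sketch; the index name and 50,000 batch size are assumptions to be tuned) is a supporting nonclustered index plus a TOP-based loop that keeps each transaction small:

```sql
-- Hypothetical supporting index; lets each batch seek instead of scanning
CREATE NONCLUSTERED INDEX IX_TableX_DateTimeValue
    ON TableX (DateTimeValue);

DECLARE @Rows int = 1;
WHILE @Rows > 0
BEGIN
    DELETE TOP (50000)
    FROM TableX
    WHERE DateTimeValue < '2022-04-12';

    SET @Rows = @@ROWCOUNT;

    IF @Rows > 0
        WAITFOR DELAY '00:00:01';   -- let other work through between batches
END;
```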
elitra (1 rep)
Jul 12, 2022, 03:13 PM • Last activity: Jul 12, 2022, 03:35 PM
1 vote
0 answers
668 views
Postgres batch insert with custom return value
I use Postgres 13 with Golang and the pgx/v4 package for batch inserting items. Now we're migrating from UUIDs to serial IDs. For one table with a parent ID, I need to generate a sort of mapping of parent IDs, to put them on the children's rows. I split this insert into 2 batches. In the first one, I do something like this:
batch.Queue(fmt.Sprintf("%s, %d", query, entity.TmpId), record.Values()...)
I think the generated output is similar to this:
insert into foo(txt) values ('a') returning id, 'key1' as externalKey;
After the insert, I can generate this map of entity.TmpId and serialId. The problem is performance: before, inserting 1.5k rows took ~2 sec. Now, inserting only the parent rows, which is fewer than 1.5k, takes ~1 min. I did a test returning only the id column, or id and some static value, and the performance is good, comparable to what we have at the moment. At the moment, I want to generate this map using the order of insert, although the documentation does not specify that this order is guaranteed. Or maybe you can suggest a better solution? Here I created a SQL fiddle.
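One alternative that avoids relying on row order is a data-modifying CTE that joins the returned rows back to the input values; a minimal sketch (it assumes txt is unique within a single batch so the join back is unambiguous):

```sql
WITH input (txt, external_key) AS (
    VALUES ('a', 'key1'),
           ('b', 'key2')
), ins AS (
    INSERT INTO foo (txt)
    SELECT txt FROM input
    RETURNING id, txt
)
SELECT ins.id, input.external_key
FROM ins
JOIN input USING (txt);
```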
Darii Petru (142 rep)
Jul 11, 2022, 09:20 AM
3 votes
2 answers
9426 views
Copy millions of rows to another table in batches mySQL
Table A is always getting updated (records being inserted or updated) and contains millions of records. I'd like to copy some of these records to a new table, Table B. Table A and Table B have exactly the same schema. How can I copy records from Table A to Table B? I don't want to consider the data that keeps getting updated in Table A; I only want to copy the data that was there when I first queried Table A. I'm trying to copy data in batches, so every time I query for a batch of 500 records from Table A and copy them to Table B. The next time, I query Table A to get the next 500 records using an offset. There is no guarantee that the new set of records is exactly the next batch of 500 records, since Table A is always getting updated. The task is to be able to fetch the batches sequentially and guarantee that we get exactly the next 500 records.
INSERT INTO Table B FROM SELECT * FROM Table A WHERE ...
doesn't work, because as I mentioned, Table A has a lot of data and running this query times out. It needs to be carried out in batches. Creating a temporary table would also require copying it in batches. I tried to use MySQL views, but they have the same problem: the view fetches data from the underlying table, so if the underlying table gets updated, the view fetches the updated data.
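One way to make the batches deterministic is keyset pagination on the primary key, bounded by the maximum key captured when the copy starts; a minimal sketch (the auto-increment column id and the table names TableA/TableB are assumptions):

```sql
-- Capture the upper bound once, so rows inserted after this point are ignored
SET @max_id  = (SELECT MAX(id) FROM TableA);
SET @last_id = 0;

-- Repeat, setting @last_id to the largest id copied so far, until 0 rows are inserted
INSERT INTO TableB
SELECT *
FROM TableA
WHERE id > @last_id
  AND id <= @max_id
ORDER BY id
LIMIT 500;
```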
BountyHunter (33 rep)
Nov 10, 2021, 09:29 AM • Last activity: Nov 11, 2021, 12:20 AM
0 votes
1 answer
5509 views
How can I create a mysql database (if it doesn't exist) or replace it (if it exists)?
My dump:

```
mysqldump --host=localhost --user=root --password=whatever --single-transaction --routines --triggers --log-error=error.txt --databases mydatabase > out.sql
```

My restore:

```
mysql --host=localhost --user=root --password=whatever -Dtargetdatabasename < out.sql
```

As I am restoring a single database, I need to make sure that if it already exists it is deleted, and if it does not exist I must create it (empty) in order to import my dump file without errors. How do I do this from the cmd console with batch? Thanks.

Update — solved:
mysql --host=localhost --user=root --password=whatever -e "DROP DATABASE IF EXISTS database_name";
mysql --host=localhost --user=root --password=whatever -e "CREATE DATABASE IF NOT EXISTS database_name";
If someone has a better idea, publish it so it can be selected as the correct answer; otherwise you can vote to close.
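For what it's worth, the two statements can also be issued in a single call (a sketch; the mysql client's -e option accepts semicolon-separated statements):

```
mysql --host=localhost --user=root --password=whatever -e "DROP DATABASE IF EXISTS database_name; CREATE DATABASE database_name;"
```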
acgbox (157 rep)
Aug 25, 2021, 11:04 PM • Last activity: Aug 26, 2021, 11:11 PM
0 votes
1 answer
920 views
How to assign GUID to batches of data based on row counts and tunable batch size
I need to assign batches of rows a common GUID id (for consumption by an external process). Below is a simplified setup describing what I'm looking for. BASE_TABLE represents one of the pre-existing tables in this scenario, much simplified obviously. Values in the "real" table should be considered random; they're not consecutive in real life (even the ID is not really an int):

```sql
CREATE TABLE BASE_TABLE
(
    ID int
    ,VALUE1 varchar(10)
    ,VALUE2 varchar(255)
)
```

TARGET_TABLE represents the "output" of the process I'm working on, which is just the input to something else that requires batches of data to have a uniqueIdentifier BATCHID for each batch. Batches in real life are ~1000 rows each.

```sql
CREATE TABLE TARGET_TABLE
(
    ID int
    ,VALUE1 varchar(10)
    ,VALUE2 varchar(255)
    ,BATCHID uniqueIdentifier
)
```

(Add some dummy data)

```sql
DECLARE @DATA1 int
set @DATA1=0
WHILE @DATA1<100
BEGIN
    INSERT INTO BASE_TABLE (ID,VALUE1,VALUE2)
    VALUES (@DATA1,'v1'+CONVERT(varchar,@DATA1), 'v2'+CONVERT(varchar,@DATA1))
    SET @DATA1=@DATA1+1
END
```

Example desired result, with batch size of 5:

|ID|VALUE1|VALUE2|BATCHID|
|--|------|------|------------------------------------|
|0 |v10   |v20   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|1 |v11   |v21   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|2 |v12   |v22   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|3 |v13   |v23   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|4 |v14   |v24   |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|5 |v15   |v25   |41122454-A743-4545-8F0C-D7B461E072AE|
|6 |v16   |v26   |41122454-A743-4545-8F0C-D7B461E072AE|
|7 |v17   |v27   |41122454-A743-4545-8F0C-D7B461E072AE|
|8 |v18   |v28   |41122454-A743-4545-8F0C-D7B461E072AE|
|9 |v19   |v29   |41122454-A743-4545-8F0C-D7B461E072AE|
|10|v110  |v210  |41122454-A743-4545-8F0C-D7B461E072AE|
|11|v111  |v211  |FBDE5513-C869-4F2D-AC4D-40CBEF4A2D48|

etc. The important result is that for each batch of size N there are N rows with the same BatchID, which must be a GUID. I'm pretty sure I could do what I want with a cursor, but I want to do something like this if possible:

```sql
select ID,VALUE1,VALUE2,
       NEWID() OVER(PARTITION BY ROW_NUMBER() OVER (ORDER BY ID)) AS BatchID
from BASE_TABLE
```

but it's not valid because NEWID() isn't an aggregate function.
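A set-based sketch of one way to get a single GUID per batch: bucket the rows with ROW_NUMBER, then generate one NEWID() per bucket in a separate step so every row in the bucket shares it (the batch size of 5 matches the example; scale it up as needed):

```sql
DECLARE @BatchSize int = 5;

;WITH numbered AS (
    SELECT ID, VALUE1, VALUE2,
           (ROW_NUMBER() OVER (ORDER BY ID) - 1) / @BatchSize AS batch_no
    FROM BASE_TABLE
),
batches AS (
    SELECT batch_no, NEWID() AS BATCHID
    FROM numbered
    GROUP BY batch_no        -- one GUID per bucket
)
INSERT INTO TARGET_TABLE (ID, VALUE1, VALUE2, BATCHID)
SELECT n.ID, n.VALUE1, n.VALUE2, b.BATCHID
FROM numbered AS n
JOIN batches AS b ON b.batch_no = n.batch_no;
```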
Gus (133 rep)
Mar 2, 2021, 06:19 PM • Last activity: Mar 2, 2021, 10:51 PM
7 votes
1 answer
2144 views
How to avoid Table Lock Escalation?
I've got a task to update 5 million rows in a production table without locking down the whole table for an extended time. So I used an approach that has helped me many times before: updating the top (N) rows at a time, with a 1-N second interval between chunks.

This time I started with updating the top (1000) rows at a time, monitoring the Extended Events session for lock_escalation events in the process. lock_escalation showed up during each update operation, so I started lowering the row count per chunk: 1000 -> 500 -> 200 -> 100 -> 50 rows, and so on down to 1.

Before (not with this table, and for delete operations, not update), lowering the row count to 200 or 100 helped to get rid of lock_escalation events. But this time, even with 1 row per update operation, table lock_escalation still shows up. The duration of each update operation is about the same regardless of whether it's 1 row or 1000 rows at a time.

How do I get rid of table lock escalations in my case? @@TRANCOUNT is zero.

Extended event: (screenshot)

Code:
set nocount on

declare 
	@ChunkSize				int = 1000,							-- count rows to remove in 1 chunk 
	@TimeBetweenChunks		char(8) = '00:00:01', 				-- interval between chunks
	
	@Start					datetime,
	@End					datetime,
	@Diff					int,
	
	@MessageText			varchar(500),
	
	@counter				int = 1,
	@RowCount				int = 1,
	@TotalRowsToUpdate		bigint,
	@TotalRowsLeft			bigint
	


-- total row count to update
set @TotalRowsToUpdate = (select count(*)
							from [Table1]
								join [Table2] on
									btid = tBtID
							where	btStatusID = 81)


set @TotalRowsLeft = @TotalRowsToUpdate
set @MessageText = 'Total Rows to Update = ' + cast(@TotalRowsLeft as varchar)
raiserror (@MessageText,0,1) with nowait
print ''



-- begin cycle
while @RowCount > 0 begin

	set @Start = getdate()

	-- update packages
	update top (@ChunkSize) bti
		set	btstatusid = 154,
			btType = 1
	from [Table1] bti
		join [Table2] on
			btid = tBtID
	where	btStatusID = 81
	

	set @RowCount = @@ROWCOUNT

	-- measure time
	set @End = getdate()
	set @Diff = datediff(ms,@Start,@End)

	set @TotalRowsLeft = @TotalRowsLeft - @RowCount
	set @MessageText = cast(@counter as varchar) + ' - Updated ' + cast(@RowCount as varchar) + ' rows in ' + cast(@Diff as varchar) + ' milliseconds - total ' + cast(@TotalRowsLeft as varchar) + ' rows left...'

	-- print progress message
	raiserror (@MessageText,0,1) with nowait


	set @counter += 1

	WAITFOR DELAY @TimeBetweenChunks

end
Plan: https://www.brentozar.com/pastetheplan/?id=SyozGWMLw
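If the escalation itself (rather than the number of rows touched per statement) is the problem, one knob worth knowing about is disabling lock escalation at the table level; a minimal sketch (weigh the memory cost of holding many individual row/page locks before leaving this in place):

```sql
ALTER TABLE [Table1] SET (LOCK_ESCALATION = DISABLE);
```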
Aleksey Vitsko (6195 rep)
Sep 30, 2020, 11:02 AM • Last activity: Oct 1, 2020, 03:43 PM