
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

1 vote
1 answer
516 views
What is the impact of DROP SCHEMA on concurrent read only workloads in PostgreSQL?
I have an ELT process that deals with a pretty small amount of data. I would like to have 1 schema called `prod` with the results of the previous ELT run and 1 schema called `dev` where I will perform the transformations of the current ELT run. Once the current ELT run is complete I would like to make the switch by running something along the lines of:

start transaction;
drop schema prod;
alter schema dev rename to prod;
end transaction;

All queries issued outside of the ELT process are read-only. How would MVCC handle this? What would happen to read-only queries that were running when the drop schema command is issued? What would happen to read-only queries issued after the drop schema command? What about after the end of the transaction?
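For context, a minimal sketch of the swap described above, assuming the prod schema still contains the previous run's objects (so the drop needs CASCADE); this only restates the question's statements and is not a recommendation:

begin;
drop schema prod cascade;          -- takes ACCESS EXCLUSIVE locks on every object being dropped
alter schema dev rename to prod;   -- dev becomes the new prod for queries that start after the commit
commit;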
foobar0100 (641 rep)
May 2, 2016, 03:19 AM • Last activity: Jul 28, 2025, 04:05 PM
2 votes
0 answers
63 views
How can I design an experiment to show the benefits of writing while under Snapshot isolation?
We've all read [the documentation for Snapshot isolation](https://learn.microsoft.com/en-us/sql/relational-databases/sql-server-transaction-locking-and-row-versioning-guide?view=sql-server-ver16#modify-data-without-optimized-locking) and know about Update Conflict Detection and when you should _theoretically_ use Snapshot isolation for writes. However, I have never found anyone who does their writes under Snapshot isolation. Paul White has [a post on how it can go wrong](https://www.sql.kiwi/2014/06/the-snapshot-isolation-level/), but I have never seen anyone discuss what it looks like when it goes _right_. I haven't found it in any textbooks, blogs, or production servers. **How can I design an experiment to show the benefits of writing while under Snapshot isolation?** I am particularly interested in comparing write performance. However, what should I vary to test when writing under Snapshot is a good idea? Furthermore, what is a fair comparison to it? Read Committed, RCSI, or something more extreme like Serializable? I am **not** asking about Read Committed Snapshot or [using Snapshot for reads](https://dba.stackexchange.com/questions/346376/why-not-use-snapshot-isolation-for-everything-read-only); they're both awesome.
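For reference, a minimal sketch of what a write under Snapshot isolation looks like in T-SQL; the database option must be enabled first, the table and column names are placeholders, and error 3960 is the standard update-conflict error:

ALTER DATABASE CURRENT SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
    UPDATE dbo.AccountBalance
       SET Balance = Balance - 10
     WHERE AccountId = 1;
    -- if another session changed and committed this row after our snapshot began,
    -- the statement fails with error 3960 (update conflict) and the transaction must be retried
COMMIT;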
J. Mini (1225 rep)
Jul 23, 2025, 06:48 PM • Last activity: Jul 26, 2025, 12:28 PM
0 votes
1 answer
148 views
Prevent failure in conditional insert in mysql database
The infrastructure of our system looks like this. An AWS lambda function receives requests such as (accountId, .....). It creates an entry in the MySQL database using a newly generated UUID as caseId. (caseId, accountId, ....). The insert is a conditional insert operation discussed in detail below. I am able to avoid the race condition by setting the transaction isolation level to SERIALIZABLE. However, the issue is that I do not have any control over how many concurrent requests will be successfully processed. For example, consider the following concurrent requests.
request  | accountId | field1 | ...  
 1         a1          value1   ....   true     --- create a new entry with caseId Idxxx
 2         a1          value2   ....   false    --- update existing entry with caseId Idxxx 
 3         a1          value3   ....   false    --- update existing entry with caseId Idxxx 
 4         a1          value4   ....   false    --- update existing entry with caseId Idxxx
With our current implementation we are getting CannotAcquireLockException. What are the ways in which I can avoid retry failures (CannotAcquireLockException)? The detailed table schema and condition are described below. The database is a MySQL database with the following table schema.
Table1: case table

|caseId(PK) | accountId | status |  .....

Table2: case reopen table

|caseId(FK)| casereopenId(PK)| caseReopenTime|

Table3: Alert table

Id (incrementing id) | alertId | accountId |
The lambda function tries to "create" a case in the database. The create wrapper generates a UUID for caseId. The goal is:
- check if an accountId already exists in the case table
- if it does, then
  - check if the status is OPEN
  - get the caseId for the accountId
  - check if the caseId is present in the case reopen table
- if the above condition is false, then add an entry into the case table

Thanks!
Swagatika (101 rep)
Feb 13, 2020, 01:10 AM • Last activity: Jul 17, 2025, 08:04 PM
16 votes
2 answers
4817 views
Is it possible for SQL statements to execute concurrently within a single session in SQL Server?
I have written a stored procedure which makes use of a temporary table. I know that in SQL Server, temporary tables are session-scoped. However, I have not been able to find definitive information on exactly what a session is capable of. In particular, if it is possible for this stored procedure to execute twice concurrently in a single session, a significantly higher isolation level is required for a transaction within that procedure due to the two executions now sharing a temporary table.
Trevor Giddings (275 rep)
Apr 18, 2019, 05:37 PM • Last activity: Jun 27, 2025, 01:57 PM
0 votes
2 answers
192 views
Mysql locking - How to acquire a lock in the DB for each userId
Hi, I have a J2EE server (clustered) that executes requests from multiple users. I would like each request of a given user to be handled like so:

{
  acquireLockForUser(userId); // i.e. insert a lock row into a dedicated table
  doSomeStuffInDbForUser(user);
  releaseLock(userId);        // i.e. delete the lock row from the dedicated table
}

This will make sure that each user's requests are handled by a single thread of my servers at a time. What's the way to achieve that? (Is it possible to insert into a table 'userLock' by id and lock it? How do I insert/fetch from this "lock" table?)
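One common way to get such a per-user mutex in MySQL is a server-side named lock; a minimal sketch, where the lock-name format and the 10-second timeout are arbitrary choices:

-- acquire: returns 1 when the lock is granted, 0 on timeout
SELECT GET_LOCK(CONCAT('user-', 123), 10);

-- ... do the per-user work here ...

-- release (the lock is also released automatically if the connection drops)
SELECT RELEASE_LOCK(CONCAT('user-', 123));

Because the lock lives in the database server, it serializes requests for the same user across all nodes of the cluster.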
Urbanleg (375 rep)
Mar 30, 2016, 10:01 AM • Last activity: Jun 21, 2025, 04:09 AM
1 vote
2 answers
345 views
Creating a clustered index on a view, minimising contention
I have a table of 100 billion rows with an identity bigint column that is the clustered primary key. I have a schema bound view on that table that is filtered down to the last 500 million rows or so. I want to create a clustered index on the view that incorporates the primary key and a few other columns. The table is highly transactional, so can't really be locked for more than a minute at a time. Will the creation of the index on the view cause contention directly on the underlying table during the creation? If so, is it possible for me to create the index on my schema bound view in an incremental manner (i.e. somehow pause part of the way through creation, to allow the table to catch up on its transaction backlog, then resume creation). I have to be careful with overall server contention as well; I've seen heavy operations against one table like this slow up my whole server before. Essentially, I'm looking for the most efficient way to create the index on my view to minimize contention. I'm using an indexed view because I'm using the HASHBYTES function to generate a hash of the row, then creating an index on that hash and the primary key of the table. This allows me to quickly compare 500 million rows of data with another large dataset for any changes in data. (I realize I probably could do this in a computed column on the table itself as well, but I figured there may be less contention if the data is a materialized copy as an indexed view.) The table is not partitioned.
J.D. (40893 rep)
Dec 16, 2019, 08:26 PM • Last activity: May 8, 2025, 08:25 AM
0 votes
1 answer
356 views
Transaction - Timestamp Ordering. When is aborted transaction restarted
(I don't know if this is the correct place to ask, since this is more of a theoretical question.) I have a question about restarting aborted transactions. I don't know **when / at which point** the restart of the transaction(s) should occur.

Example, using *Basic Timestamp Ordering*:

w11(x), r12(x), r11(z), r13(x), w13(z), r12(z), w13(x), w14(x)

Initial values are: read_ts(x) = write_ts(x) = read_ts(z) = write_ts(z) = 10

As we can see, transaction T12 is aborted because it is trying to read the value of z already written by T13 (T12 is older, so it can't read it). So in this case T12 is aborted and its timestamp is raised (to 15?). My question is: when does the aborted transaction restart? (1) After all the non-aborted transactions are done, or (2) *as soon as possible* (i.e. it restarts after the next operation, w13(x))?

In scenario 1, w14(x) would be done, and after that the aborted transaction 12 would start (as T15). In scenario 2 (*as soon as possible*), transaction 12 (as T15) is restarted immediately after the next operation, w13(x). After that, w14(x) is aborted (14 < 15) and restarted as T16.

What is the correct order in this case? What does the *restart* actually mean? When do we restart an aborted transaction? I would like to see a practical example.

Edit: changed the last operation to w14(x) instead of r14(x).
P_95 (1 rep)
Mar 5, 2017, 02:55 PM • Last activity: Apr 20, 2025, 02:02 AM
0 votes
1 answer
694 views
How to prevent double booking of a room in this hotel booking database design?
I have two tables for hotel booking. One is the Rooms table where I save the room type, hotelId and the total number of rooms that the hotel has for that type.

| HotelId | RoomTypeId | RoomType | TotalRooms |
| ------- | ---------- | -------- | ---------- |
| 100     | 2          | Premium  | 24         |
| 100     | 3          | SeaView  | 10         |

I have a Bookings table where I keep a record of all the reservations.

| BookinId | RoomTypeId | HotelId | RoomsBooked | start            | end              |
| -------- | ---------- | ------- | ----------- | ---------------- | ---------------- |
| 1001     | 2          | 100     | 3           | 2023-03-17 14:00 | 2023-03-20 11:00 |
| 1002     | 2          | 100     | 2           | 2023-03-17 14:00 | 2023-03-20 11:00 |
| 1003     | 3          | 100     | 1           | 2023-03-17 14:00 | 2023-03-20 11:00 |

When a user tries to book a room of a specific type (say Premium) in a hotel, I do the two things below:
1. Using a SQL query, get the sum of all the Premium rooms booked in the hotel from the Bookings table and subtract it from TotalRooms for the Premium rooms of that hotel in the Rooms table. This gives me the number of available rooms.
2. If available rooms > 0, I insert a new booking record into the Bookings table.

How can I prevent double booking in this scenario? For example, when there is only one Premium room left and two users try to book it for the same DateTime, step 1 may find that the number of available rooms is 1 for both users, and then step 2 may insert duplicate records into the Bookings table.
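One common fix, sketched below for a MySQL-style database using the tables above, is to lock the room-type row in Rooms inside the same transaction that re-checks availability and inserts the booking, so two bookings for the same type cannot interleave (start and end are quoted because they can clash with keywords):

START TRANSACTION;

-- serialize bookings for this hotel / room type
SELECT TotalRooms FROM Rooms
 WHERE HotelId = 100 AND RoomTypeId = 2
 FOR UPDATE;

-- re-check how many rooms overlap the requested period
SELECT COALESCE(SUM(RoomsBooked), 0) AS booked
  FROM Bookings
 WHERE HotelId = 100 AND RoomTypeId = 2
   AND `start` < '2023-03-20 11:00' AND `end` > '2023-03-17 14:00';

-- only if TotalRooms - booked >= the requested number of rooms:
INSERT INTO Bookings (RoomTypeId, HotelId, RoomsBooked, `start`, `end`)
VALUES (2, 100, 1, '2023-03-17 14:00', '2023-03-20 11:00');

COMMIT;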
bornfree (101 rep)
Mar 6, 2023, 03:33 PM • Last activity: Apr 19, 2025, 07:04 PM
1 vote
1 answer
1190 views
Postgres Repeatable Read vs Selecting rows that are changed by another ongoing transaction
Let's say I have a set of select statements that query important field values in a for-loop. The goal is to make sure that the rows are not updated by any other transaction, so that this set of selects doesn't result in data that is out of date. In theory, it seems that setting the transaction isolation level to repeatable read should solve the problem. In this case, we can begin the transaction with the first select statement and then reuse the same transaction in the loop to make sure that updates are blocked until this transaction is committed. Is there anything I am missing? Probably there are other ways to be sure that stale rows are not selected.

UPDATE: a bit more detail. I have a series of queries like

select name from some_table where id = $id_param

and this $id_param is set in a for-loop. I am worried, however, that this name field might be changed by another concurrent operation for some row, or even get deleted. This would result in a corrupted state for the final object. It seems, based on the comment below, that pessimistic locking could be the way to go, i.e. using ... FOR UPDATE, but I am not sure.
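A minimal sketch of that pessimistic variant in PostgreSQL, reusing the example query above; each locked row can no longer be updated or deleted by other transactions until this one ends:

BEGIN;

SELECT name FROM some_table WHERE id = $id_param FOR UPDATE;
-- ... repeat for the other ids visited by the loop ...

COMMIT;

FOR SHARE would also block concurrent UPDATE/DELETE of the selected rows while still allowing other readers to take the same lock.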
Don Draper (209 rep)
Sep 15, 2022, 04:35 PM • Last activity: Apr 18, 2025, 04:05 PM
1 vote
1 answer
101 views
How to know, when it is time to vacuum an sqlite database file?
SQLite, contrary to most SQL-compatible DB engines, works directly on files instead of over network sockets. It is also not very good at concurrency, and its delete operations are mostly logical (i.e. they do not free the space occupied by the deleted entities). As a result, an SQLite database needs to be VACUUMed regularly, at least one with regular modify or delete operations. VACUUM works by rebuilding the database file without the free space in it. Needless to say, it is absolutely non-concurrent with anything, making it a costly operation: we have downtime for the vacuum. *However, I think there should exist some indicator which compares the size of the database file to the size of the actual data in it.* Using this indicator, it could be decided whether we need a vacuum or not. Does it exist?
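Such a ratio can be computed from two built-in pragmas; a small sketch (the 20% threshold is just an example):

PRAGMA page_count;      -- total pages in the database file
PRAGMA freelist_count;  -- pages currently unused (on the free list)

-- free fraction = freelist_count / page_count;
-- e.g. run VACUUM when more than 20% of the file consists of free pages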
peterh (2137 rep)
Apr 11, 2025, 10:16 AM • Last activity: Apr 15, 2025, 11:40 AM
0 votes
1 answer
72 views
mysql/postgresql row lock in high concurrency scenarios
I have a simple transaction with isolation level READ COMMITTED. The table setup is simple:
create table example.parent_tbl(
id int auto_increment primary key
);

create table example.child_tbl(
id int auto_increment primary key,
parent_id int not null,
update_dt timestamp default current_timestamp ON UPDATE current_timestamp,

foreign key (parent_id) references parent_tbl(id),
unique (parent_id)
);
select 
p.*
from example.parent_tbl p
where p.id not in (select parent_id from example.child_tbl)
limit 1
for update
skip locked;
After that, I will insert a new record with parent_tbl.id into child_tbl. The parent ID column in child_tbl is unique. I have multiple processes running in parallel, and occasionally one of them hits an integrity exception because the same ID already exists in child_tbl.

# QUESTION

Why can a transaction select an ID that already appears in child_tbl? The logic seems trivial: either transaction A holds the row lock on parent_tbl before commit, in which case transaction B will skip the row, or transaction A has committed, in which case transaction B's where clause will filter out the row and skip it. But the fact tells me this is not the case!

The general query log shows the counterintuitive behaviour of this locking mechanism (server side log). There are two processes running in parallel, doing the same thing; in this case, they are threads 60 and 61 respectively. Two outcomes are observed.

1. Thread 61 inserted id 113 into the **child_tbl** and released the row lock. Thread 60 runs the query but fails to detect the row with id 113 just inserted by thread 61.
2. Thread 61 inserted id 113 into the **child_tbl** and, right before it commits and releases the row lock, thread 60 runs the query and acquires the same row lock on the **parent_tbl**.

# Additional

I have done the same tests with PostgreSQL and I hit the exception as well.
cccllll (3 rep)
Apr 14, 2025, 06:44 PM • Last activity: Apr 14, 2025, 07:55 PM
5 votes
2 answers
796 views
Which approach to concurrency safe upserts is best for table-valued input if MERGE is forbidden?
[This classic concurrency safety article](https://michaeljswart.com/2017/07/sql-server-upsert-patterns-and-antipatterns/) is clearly designed for only upserting one row at a time. In my situation, I have a table-valued input and I want to upsert each row in a concurrency safe way. I know that this isn't always possible, but I want to get as close as possible. MERGE seems like a natural solution to it, but I distrust it and truly am in a situation where it is [bug prone](https://michaeljswart.com/2021/08/what-to-avoid-if-you-want-to-use-merge/) . The remaining two approaches in Michael J. Swart's article are: 1) _Inside a Transaction With Lock Hints (Update More Common)_
CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
AS 
SET XACT_ABORT ON;
BEGIN TRAN
 
  UPDATE TOP (1) dbo.AccountDetails WITH (UPDLOCK, SERIALIZABLE)
     SET Etc = @Etc
   WHERE Email = @Email;
 
  IF (@@ROWCOUNT = 0)
  BEGIN      
      INSERT dbo.AccountDetails ( Email, Etc )
      VALUES ( @Email, @Etc );
  END 
COMMIT
2) _Inside a Transaction With Lock Hints (Insert More Common)_
CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
AS 
SET XACT_ABORT ON;
BEGIN TRAN
 
  INSERT dbo.AccountDetails ( Email, Etc )
  SELECT @Email, @Etc
  WHERE NOT EXISTS (
      SELECT *
      FROM dbo.AccountDetails WITH (UPDLOCK, SERIALIZABLE)
      WHERE Email = @Email
  )
 
  IF (@@ROWCOUNT = 0)
  BEGIN      
      UPDATE TOP (1) dbo.AccountDetails
      SET Etc = @Etc
      WHERE Email = @Email;
  END 
COMMIT
I could adapt either of these to use table variables (e.g. I suspect that the IF (@@ROWCOUNT = 0) check needs removing entirely), but **does the usage of a table-valued input make it obvious that we should prefer either the first or the second solution?** If not, then on what basis should the decision be made?
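For illustration only, a sketch of how the update-first pattern might be adapted to a table-valued input; the TVP type dbo.AccountDetailsList and the procedure name are assumptions, @Rows is assumed to contain one row per Email, and the @@ROWCOUNT check is indeed dropped because each statement now handles its own subset of rows:

CREATE PROCEDURE s_AccountDetails_UpsertMany ( @Rows dbo.AccountDetailsList READONLY )
AS
SET XACT_ABORT ON;
BEGIN TRAN

  -- update the incoming rows that already exist
  UPDATE ad
     SET ad.Etc = r.Etc
    FROM dbo.AccountDetails AS ad WITH (UPDLOCK, SERIALIZABLE)
    JOIN @Rows AS r
      ON r.Email = ad.Email;

  -- insert the incoming rows that do not
  INSERT dbo.AccountDetails ( Email, Etc )
  SELECT r.Email, r.Etc
    FROM @Rows AS r
   WHERE NOT EXISTS (
       SELECT *
         FROM dbo.AccountDetails AS ad WITH (UPDLOCK, SERIALIZABLE)
        WHERE ad.Email = r.Email
   );

COMMIT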
J. Mini (1225 rep)
Apr 11, 2025, 11:15 PM • Last activity: Apr 13, 2025, 10:35 AM
0 votes
1 answer
611 views
Is this atomic set and get in MariaDB / MySQL: `UPDATE t SET col = @my_var := col + 1 WHERE id = 123`?
I want to atomically update a single row and get its value. I discovered today that MariaDB / MySQL has this additional way of setting user variables using the assignment operator :=, described e.g. here. I wonder therefore if the following statement, followed by checking the value of @my_var, accomplishes what I want:

UPDATE t SET col = @my_var := col + 1 WHERE id = 123;

It seems it should, but I'd like to confirm with the pundits.

---

(I know there's also SELECT .. FOR UPDATE.)
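For completeness, a sketch of the full round trip with that approach; the read of @my_var has to happen on the same connection, since user variables are per-session:

UPDATE t SET col = @my_var := col + 1 WHERE id = 123;
SELECT @my_var;   -- the value this UPDATE wrote, unaffected by other sessions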
Jan Żankowski (203 rep)
May 27, 2021, 10:31 AM • Last activity: Mar 4, 2025, 01:04 AM
0 votes
0 answers
61 views
Chronological timestamps of transactions in a financial system
I have a high-volume, concurrent application with business transactions updating several tables via a series of DB calls over a network. While the DB ensures ACIDity, the transaction timestamps are not guaranteed to be chronological out of the box. For example, transaction A can start **before** transaction B but commit **after** it. Our business requires that financial ledgers follow strict chronological order; in this case, transaction A's timestamp(6) should be after transaction B's, because transaction A finished (committed) last. Naively, one could set the transaction time at the end of a transaction, as a final step before the commit, however due to the variability of network latency this is obviously not guaranteed to work. Commits can take unpredictably long.

**Question: Is serialization of transactions at the application/system level the only way to deal with this?**

**Proposal 1:** I can run transactions for different (business) accounts concurrently, but within a single business account I might have to serialize globally. Unfortunately some accounts are large, with transaction volumes at 4000/min rates, so even per-account serialization will add significant delays. If there is a better solution to this please let me know.

**Proposal 2:** Every app node generates a unique ID (GUID) prior to starting the transaction. This represents a transaction ID which does not carry time but accompanies every modification made as part of the transaction. As the last step of our transaction, we insert a row into a table like the one below, then commit:

INSERT INTO transaction_sequence(transaction_guid) VALUES ("GUID");

Table transaction_sequence:
transaction_guid varbinary(16) NOT NULL,
transaction_time datetime(6) NULL

Above, the row that we insert as the last step of a transaction won't have a timestamp either. Later, a separate backend process backfills **transaction_time** with timestamps* according to the transactions' relative order of commits *as determined from the binary log of the MariaDB/MySQL system files*. This solution is database-engine specific and therefore not ideal, but backfilling of timestamps will not inhibit concurrency in any way.

**Proposal 3:** Use the MariaDB-specific **transaction-precise history** extension on the **transaction_sequence** table, documented here. This extension will (hopefully) eliminate my need to process binary-log files, although it remains DB-engine specific.

*) The timestamps are interpolated with approximate transaction completion times and are guaranteed to be chronological.

My challenge is described by this paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/temporaltime.pdf
Slawomir (473 rep)
Feb 6, 2025, 03:57 PM • Last activity: Feb 6, 2025, 06:03 PM
0 votes
1 answer
425 views
Postgres Search Path between connections
I'm a noob to Postgres, and am setting up a multi-tenant database using one schema per tenant. To target each tenant I need to set search_path accordingly. I'm worried about accidentally executing something intended for one tenant on another tenant, if the wrong search path is picked up somehow. I will be executing Functions from .Net code (via NpgSql). The functions will reside in the public schema. Each function will have the search path set inside it, like so:
BEGIN
 PERFORM set_config('search_path', tenant, true);
...
END
Is there any danger, via concurrency or connection pooling, that the search paths for two calls might be shared, for instance if the 'true' parameter was forgotten or set to false? I really want to be sure that there's no danger of data from one tenant crossing over into another. Thanks in advance.
Paul Guz (13 rep)
Nov 21, 2019, 10:04 AM • Last activity: Dec 27, 2024, 06:04 PM
2 votes
0 answers
66 views
How to prevent anomalies in a tree data structure
I have two tables, node and node_tree:

CREATE TABLE node (
    id integer PRIMARY KEY
);
CREATE TABLE node_tree (
    node_id integer references node(id) UNIQUE,
    parent_id integer references node(id)
);

node_tree is an adjacency list. There is a limit on the height of the tree, so (1) the depth of the parent node must be checked before inserting a new node, (2) the sum of the height of the node being relocated and the depth of its new parent must not exceed the established threshold, and (3) to prevent infinite loops, it is essential to verify that the new parent is not among the descendants of the node being relocated. Using recursive CTEs, it is easy to apply any of those constraints. But with multiple concurrent transactions, anomalies can happen. I can think of the following solutions:

1. Locking the whole node_tree table, although I have read that locking tables is an anti-pattern and almost always the wrong thing to do.
2. Using advisory locks at the application level.
3. Using the serializable isolation level.
4. Row-level locking using FOR UPDATE.

I like the last one, but the problem with it is that you cannot use FOR UPDATE with recursive CTEs (trust me, I have tested), and it seems complex to me. Now my question is: what is the best option and why?
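A minimal sketch of option 2 in PostgreSQL, assuming every tree-modifying transaction agrees on a single lock key (42 here is arbitrary):

BEGIN;

-- every transaction that modifies node_tree takes this lock first,
-- so the recursive CTE checks and the subsequent writes cannot interleave
SELECT pg_advisory_xact_lock(42);

-- recursive CTE validation + INSERT/UPDATE of node_tree goes here

COMMIT;  -- the transaction-scoped advisory lock is released here

A per-subtree key (for example the root node's id) would allow more concurrency than one global key, at the cost of more careful key management.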
Dante (177 rep)
Nov 23, 2024, 10:47 PM
42 votes
5 answers
37615 views
Locking issue with concurrent DELETE / INSERT in PostgreSQL
This is pretty simple, but I'm baffled by what PG does (v9.0). We start with a simple table:

CREATE TABLE test (id INT PRIMARY KEY);

and a few rows:

INSERT INTO TEST VALUES (1);
INSERT INTO TEST VALUES (2);

Using my favorite JDBC query tool (ExecuteQuery), I connect two session windows to the db where this table lives. Both of them are transactional (i.e., auto-commit=false). Let's call them S1 and S2. The same bit of code for each:

1: DELETE FROM test WHERE id=1;
2: INSERT INTO test VALUES (1);
3: COMMIT;

Now, run this in slow motion, executing one statement at a time in the windows.

S1-1 runs (1 row deleted)
S2-1 runs (but is blocked since S1 has a write lock)
S1-2 runs (1 row inserted)
S1-3 runs, releasing the write lock
S2-1 runs, now that it can get the lock. But reports 0 rows deleted. HUH???
S2-2 runs, reports a unique key constraint violation

Now, this works fine in SQLServer. When S2 does the delete, it reports 1 row deleted. And then S2's insert works fine. I suspect that PostgreSQL is locking the index in the table where that row exists, whereas SQLServer locks the actual key value. Am I right? Can this be made to work?
DaveyBob (561 rep)
Oct 26, 2012, 04:58 PM • Last activity: Nov 10, 2024, 01:46 AM
0 votes
1 answer
547 views
Using SELECT ... FOR UPDATE as a way to lock a whole table
I know I can lock a whole table using LOCK TABLE, but I am just curious if I can get the same result by using the FOR UPDATE statement, assuming that all transactions that update that table select a specific row from that table using FOR UPDATE at the beginning of their transaction (or maybe having a separate table called locks with one unique column named lock_name and using SELECT lock_name FROM locks WHERE lock_name = 'foo' FOR UPDATE to acquire a lock before each transaction)? I am asking this question because I have to save a tree data structure in a database table using either an adjacency list or a closure table. When multiple users try to add a new node or relocate a sub-tree, I have to select some rows and do some calculations and then update the table based on those calculations and this operation is not possible unless I lock the whole table.
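A sketch of the lock-row idea from the question, in PostgreSQL syntax; the locks table, its single row and the name 'foo' are the question's own example, seeded once up front:

-- one-time setup: the dedicated lock table and its row
CREATE TABLE locks (lock_name text PRIMARY KEY);
INSERT INTO locks (lock_name) VALUES ('foo');

-- every transaction that rewrites the tree starts like this
BEGIN;
SELECT lock_name FROM locks WHERE lock_name = 'foo' FOR UPDATE;
-- tree reads, calculations and updates go here, one such transaction at a time
COMMIT;  -- releases the row lock

Plain readers that never take the lock are unaffected, which is the main difference from LOCK TABLE in a stricter mode.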
Dante (177 rep)
Nov 8, 2024, 07:22 AM • Last activity: Nov 9, 2024, 01:32 AM
0 votes
0 answers
32 views
Preventing Race Conditions with SERIALIZABLE Isolation in Supabase for High-Concurrency Updates
**Question:** I'm working with Supabase/PostgreSQL in a Next.js application and need to ensure that an is_processing flag is updated only once per user, even if multiple requests are sent in parallel. The goal is to prevent any duplicate operations by ensuring that once is_processing = true, additional requests won’t proceed until it’s reset.

**Approach So Far**

I initially created a PostgreSQL function using SERIALIZABLE isolation to handle concurrency, with the function attempting to set is_processing to true if it’s initially false. Here’s the function:
CREATE OR REPLACE FUNCTION serialized_update_processing_flag(user_id UUID) RETURNS BOOLEAN AS $$
DECLARE
    result BOOLEAN;
BEGIN
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

    UPDATE users
    SET is_processing = true
    WHERE id = user_id
      AND is_processing = false
    RETURNING true INTO result;

    RETURN result IS NOT NULL;
EXCEPTION
    WHEN serialization_failure THEN
        RETURN FALSE;
END;
$$ LANGUAGE plpgsql;
**Problem:** Running this function directly throws an error: "SET TRANSACTION ISOLATION LEVEL must be called before any query."

**Alternative Implementation in Next.js**

I moved the transaction management into my Next.js code, starting with SERIALIZABLE isolation at the application level. Here’s the revised approach:
const client = await pool.connect();

try {
  // Start a transaction with SERIALIZABLE isolation level
  await client.query('BEGIN');
  await client.query('SET TRANSACTION ISOLATION LEVEL SERIALIZABLE');

  // Call the RPC function
  const { rows } = await client.query(
    'SELECT serialized_update_processing_flag($1) AS success',
    [userId]
  );

  await client.query('COMMIT');

  // Check the result
  console.log('ROWWW', rows);
  return rows[0].success;
} catch (error) {
  await client.query('ROLLBACK');
  console.error(`Error in serialized transaction: ${error}`);
  throw error;
} finally {
  client.release();
}
**Issue with High-Concurrency Requests**

This approach has been somewhat effective, but when I simulate around 30 parallel requests, some of them still manage to bypass the SERIALIZABLE isolation level, resulting in multiple operations setting is_processing = true concurrently.

**Questions**

How can I reliably enforce strict SERIALIZABLE isolation in this setup to prevent any concurrent updates from slipping through? Are there alternative methods in PostgreSQL or Supabase for handling high-concurrency, single-update situations with atomicity?

I’d appreciate any suggestions or solutions to ensure consistent behavior and prevent duplicate operations, especially under high parallel loads. Thank you!
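As a side note, a sketch of setting the isolation level as part of BEGIN on the client side, which avoids the "must be called before any query" error (the UUID is a placeholder):

BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT serialized_update_processing_flag('00000000-0000-0000-0000-000000000000');
COMMIT;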
Obaid Aqeel (101 rep)
Nov 6, 2024, 02:59 PM
3 votes
1 answer
174 views
Make sure that only one transaction updates a row
I have a tasks table
CREATE TABLE tasks(
    id UUID PRIMARY KEY,
    status TEXT REFERENCES statuses, -- available statuses are 'NEW', 'PENDING', 'PROCESSING', 'COMPLETE'
    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NULL
);
I want to make sure that only single process will change status from PENDING to PROCESSING in case of multiple concurrent updates attempts, the rest should observe zero updated rows. Task status change is driven by processing external HTTP request, it is important to only single request result in success, the rest should detect change and return 409 Conflict, we don't have control over how many concurrent request we can get. Update statement is first thing that is done as part of bigger transaction, after update in case process observe that it updated tasks row (updated row count is 1), it later insert/update some data in other tables etc. I've tested following update query
UPDATE tasks 
  SET status = 'PROCESSING'
  WHERE id = '00000000-0000-0000-0000-000000000000' AND status = 'PENDING'
using Java JDBC, multiple threads, connections etc., and I observe that only a single process makes the update when using the default isolation level READ COMMITTED. When I was thinking about this problem I thought I would need either FOR UPDATE in the UPDATE statement or the SERIALIZABLE isolation level on the transaction, but in actual tests I wasn't able to catch an inconsistency in the case of multiple updates. My question: is it enough to filter by id and status to make sure only a single process will perform the update, or do I need additional actions to exclude unexpected behaviour in general?
sebarys (33 rep)
Oct 31, 2024, 03:05 PM • Last activity: Nov 1, 2024, 04:58 PM