Is snapshot isolation only potentially different from serializability if there are "cycles" between transactions with reads and writes?
I'm trying to build a better intuition for exactly which consistency anomalies snapshot isolation allows. The description on Wikipedia says:
> In databases, and transaction processing (transaction management), snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database (in practice it reads the last committed values that existed at the time it started), and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot.
> When the transaction concludes, it will successfully commit only if the values updated by the transaction have not been changed externally since the snapshot was taken. Such a write–write conflict will cause the transaction to abort.
So my reasoning goes:
- If two transactions have interleaved reads, that alone cannot violate serializability, because if nobody is changing the data then the end result will be the same regardless of the order in which the reads occur.
- If two transactions have interleaved writes to the same records, then snapshot isolation will abort and retry at least one of them until their writes no longer interleave, which rules that case out.
- So then it seems like the only way that snapshot isolation could allow serializability to be violated would be interleaving writes with reads.
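To make the write–write rule in the second bullet concrete, here is a minimal sketch of snapshot isolation with a "first committer wins" check. The `Store`, `Txn`, and `Conflict` names are hypothetical and invented for illustration; real engines keep multiple versions per record rather than copying the whole database, so treat this as a toy model only.

```python
class Conflict(Exception):
    """Raised when a concurrent transaction already committed a write to the same key."""


class Store:
    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}  # commit time of the last write per key
        self.clock = 0                           # number of commits so far

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.start = store.clock          # snapshot timestamp
        self.snapshot = dict(store.data)  # private copy of the committed state
        self.writes = {}

    def read(self, key):
        # A transaction sees its own writes, otherwise its snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort if any key we wrote was committed by someone else after our snapshot.
        for key in self.writes:
            if self.store.version.get(key, 0) > self.start:
                raise Conflict(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.clock
```

With this model, two concurrent transactions that both write `x` cannot both commit, while two that write different keys can, whatever they each read.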
So then I try to imagine a simple case where there are two transactions and three disjoint sets of records, kind of like a Venn diagram, but where there is only interleaving between tx 1's writes and tx 2's reads, not the other way around:
- Set of records A is written by tx 1
- Set of records B is written by tx 1 and read by tx 2
- Set of records C is written by tx 2
And it seems like in this situation it's still not possible for snapshot isolation to violate serializability, because I figure:
- If tx 2 is considered to start before tx 1 commits, it will only read the versions of records in B from before tx 1 changed them, and tx 2 will not change any records in B because it is only reading them, so the execution will be equivalent to the serial order [tx 2, tx 1].
- If not, then tx 2 is considered to start after tx 1 commits, in which case it seems obvious that the execution will be equivalent to the serial order [tx 1, tx 2].
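The first of the two cases above can be checked on a tiny example. This is only a sketch under assumptions: each set A, B, C is collapsed to a single record, the update rules inside `tx1` and `tx2` are made up, and "concurrent" means both transactions read the same initial snapshot and then both commit because their write sets are disjoint.

```python
def tx1(view):
    # Writes A and B (hypothetical update rule).
    return {"A": view["A"] + 1, "B": view["B"] + 1}


def tx2(view):
    # Reads B, writes C (hypothetical update rule).
    return {"C": view["B"] * 10}


def serial(initial, order):
    """Run transactions one after another, each seeing the previous one's writes."""
    state = dict(initial)
    for tx in order:
        state.update(tx(state))
    return state


def concurrent_under_snapshots(initial):
    """Both transactions read the same snapshot; disjoint write sets let both commit."""
    w1 = tx1(dict(initial))
    w2 = tx2(dict(initial))
    assert not (w1.keys() & w2.keys()), "write sets overlap, one would abort"
    state = dict(initial)
    state.update(w1)
    state.update(w2)
    return state


initial = {"A": 0, "B": 1, "C": 0}
outcome = concurrent_under_snapshots(initial)
matches_tx2_first = outcome == serial(initial, [tx2, tx1])
matches_tx1_first = outcome == serial(initial, [tx1, tx2])
```

Here `matches_tx2_first` is true and `matches_tx1_first` is false, which agrees with the claim that the concurrent run is equivalent to [tx 2, tx 1]. One example does not prove the general statement, of course.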
Since violating serializability with snapshot isolation still seems impossible in this case, it looks like snapshot isolation can only violate serializability if there exist _both_:
- Some records which are written by tx 1 and read by tx 2
- Some records which are written by tx 2 and read by tx 1
Or, alternatively, some more indirect loop, e.g. tx 1 writes A and reads B, tx 2 writes B and reads C, and tx 3 writes C and reads A.
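The "loop" condition described above can be written down as a cycle check on a graph with an edge from each transaction to every other transaction that writes something it reads. This is a sketch of the conjecture exactly as stated in this question, assuming all transactions run concurrently; it is not a complete serializability test, and `rw_edges` and `has_cycle` are hypothetical helper names.

```python
def rw_edges(txs):
    """txs maps a name to (read_set, write_set); returns reader -> set of writers."""
    edges = {name: set() for name in txs}
    for reader, (reads, _) in txs.items():
        for writer, (_, writes) in txs.items():
            if reader != writer and reads & writes:
                edges[reader].add(writer)
    return edges


def has_cycle(edges):
    """Depth-first search with three colours to detect a directed cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in edges}

    def visit(node):
        colour[node] = GREY
        for nxt in edges[node]:
            if colour[nxt] == GREY:
                return True  # back edge: we returned to a node on the current path
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in edges)


# One-directional case from earlier: tx 1 writes A and B, tx 2 reads B and writes C.
one_way = {"tx1": (set(), {"A", "B"}), "tx2": ({"B"}, {"C"})}

# Indirect loop: tx 1 writes A reads B, tx 2 writes B reads C, tx 3 writes C reads A.
three_way = {
    "tx1": ({"B"}, {"A"}),
    "tx2": ({"C"}, {"B"}),
    "tx3": ({"A"}, {"C"}),
}
```

`has_cycle(rw_edges(one_way))` is false and `has_cycle(rw_edges(three_way))` is true, matching the two situations described in prose.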
And it seems conspicuous that the common example I see of an anomaly which snapshot isolation allows is exactly such a case:
> As a concrete example, imagine V1 and V2 are two balances held by a single person, Phil. The bank will allow either V1 or V2 to run a deficit, provided the total held in both is never negative (i.e. V1 + V2 ≥ 0). Both balances are currently $100. Phil initiates two transactions concurrently, T1 withdrawing $200 from V1, and T2 withdrawing $200 from V2.
>
> If the database guaranteed serializable transactions, the simplest way of coding T1 is to deduct $200 from V1, and then verify that V1 + V2 ≥ 0 still holds, aborting if not. T2 similarly deducts $200 from V2 and then verifies V1 + V2 ≥ 0. Since the transactions must serialize, either T1 happens first, leaving V1 = −$100, V2 = $100, and preventing T2 from succeeding (since V1 + (V2 − $200) is now −$200), or T2 happens first and similarly prevents T1 from committing.
>
> If the database is under snapshot isolation(MVCC), however, T1 and T2 operate on private snapshots of the database: each deducts $200 from an account, and then verifies that the new total is zero, using the other account value that held when the snapshot was taken. Since neither update conflicts, both commit successfully, leaving V1 = V2 = −$100, and V1 + V2 = −$200.
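The quoted example can be replayed step by step. The sketch below assumes both transactions take their snapshot before either commits and models a snapshot as a plain dictionary copy; `run_write_skew` is a hypothetical name, and the balances and the V1 + V2 ≥ 0 rule come from the quote.

```python
def run_write_skew():
    committed = {"V1": 100, "V2": 100}

    # Both transactions start before either commits, so they see the same snapshot.
    snap1 = dict(committed)
    snap2 = dict(committed)

    # T1: withdraw 200 from V1, then check the invariant against its own snapshot.
    writes1 = {"V1": snap1["V1"] - 200}
    t1_ok = writes1["V1"] + snap1["V2"] >= 0

    # T2: withdraw 200 from V2, then check the invariant against its own snapshot.
    writes2 = {"V2": snap2["V2"] - 200}
    t2_ok = snap2["V1"] + writes2["V2"] >= 0

    # The write sets are disjoint, so there is no write-write conflict to abort on.
    write_conflict = bool(writes1.keys() & writes2.keys())

    if t1_ok and not write_conflict:
        committed.update(writes1)
    if t2_ok and not write_conflict:
        committed.update(writes2)
    return committed, t1_ok, t2_ok


final, t1_ok, t2_ok = run_write_skew()
```

Each transaction's own check passes (the total it sees is exactly 0), yet the committed state ends with V1 = V2 = −100, so the invariant is broken even though no write–write conflict occurred.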
So I wanted to ask: is my formulation and understanding of the situation correct, that snapshot isolation can only violate serializability if there are "loops" of this kind between the sets of records that transactions write and read? Or are there other ways for anomalies to occur?
Asked by Phoenix
(101 rep)
Mar 28, 2024, 08:41 PM
Last activity: Mar 29, 2024, 09:29 AM