A “Transaction” in SQL Server
The standard definition of a transaction states that “every query that runs in a SQL Server is in a transaction,” that means any query you run on a SQL Server is considered as being in a transaction. It could either be a simple SELECT query or any UPDATE or ALTER query.
- If you run a query without mentioning the BEGIN TRAN keyword then it would be considered an implicit transition.
- If you run a query that starts with BEGIN TRAN and ends with COMMIT or ROLLBACK, then it would be considered an explicit transaction.
Transaction Properties
A database management system (DBMS) is considered a relational database management system (RDBMS) if it follows the transactional properties, ACID.
- A: Atomicity
- C: Consistency
- I: Isolation
- D: Durability
The SQL Server takes care of the Atomicity, Consistency, and Durability of the system, and the user has to care about the Isolation property of the transaction. The meaning of each of these properties is described below, as it applies to a transaction.
Atomicity
Transaction work should be atomic, which means all the work is one unit. If the user performs a transition, either the transaction should complete and perform all the asked operations, or it should fail and don’t do anything. Atomicity deals with the transaction process and an RDBMS transaction does not leave the work incomplete.
Consistency
After the transaction is completed, the database should not be left in an inconsistent state, which means the data on which transaction is applied must be logically correct, according to the rules of the system.
Isolation
If two transactions are applied on a similar database, then both the transaction should be isolated from each other, and the user must see the result. It can also be defined as a transaction that should see the data only after or before the concurrent transaction process is completed, which means if a one transaction process is in between, the other transaction process should wait until the first transaction is completed.
For instance, if A performs a transaction process on data d1, and before the transaction process gets completed, B also performs another transaction process on the same data d1. Here, the isolation property will isolate the transaction process of A and B, and the transaction process of B will only start after the transaction process of A gets completed.
Durability
Even if the system fails, the transaction should be persistent, which means, if the system fails during a transaction process, the transaction should be dropped, too, without affecting the data.
Concurrency in SQL Server
Concurrency is a situation that arises in a database due to the transaction process. Concurrency occurs when two or more than two users are trying to access the same data or information. DBMS concurrency is considered a problem because accessing data simultaneously by two different users can lead to inconsistent results or invalid behaviour.
Concurrency Problem Types
The concurrency problem mostly arises when both the users try to write the same data, or when one is writing and the other is reading. Apart from this logic, there are some common types of concurrency problems:
- Dirty Reads
- Lost Updates
- Non-repeatable Reads
- Phantom Reads
Dirty Read
This problem occurs when another process reads the changed, but uncommitted data. For instance, if one process has changed data but not committed it yet, another process is able to read the same data. This leads to the inconsistent state for the reader.
Lost Updates
This problem occurs when two processes try to manipulate the same data simultaneously. This problem can lead to data loss, or the second process might overwrite the first processs change.
Non-repeatable Reads
This problem occurs when one process is reading the data, and another process is writing the data. In non-repeatable reads, the first process reading the value might get two different values, as the changed data is read a second time because the second process changes the data.
Phantom Reads
If two same queries executed by two users show different output, then it would be a Phantom Read problem. For instance, If user A select a query to read some data, at the same time the user B insert some new data but the user A only get able to read the old data at the first attempt, but when user A re-query the same statement then he/she gets a different set of data.
Solve Concurrency Problems
SQL Server provides 5 different levels of transaction isolation to overcome these Concurrency problems. These 5 isolation levels work on two major concurrency models:
- Pessimistic model - In the pessimistic model of managing concurrent data access, the readers can block writers, and the writers can block readers.
- Optimistic model - In the optimistic model of managing concurrent data access, the readers cannot block writers, and the writers cannot block readers, but the writer can block another writer.
Note that readers are users are performing the SELECT operations. Writers are users are performing INSERT, ALTER, UPDATE, S.E.T. operations.
Isolation Level
When we connect to a SQL server database, the application can submit queries to the database with one of five different isolation levels. These levels are:
- Read Uncommitted
- Read Committed
- Repeatable Read
- Serializable
- Snapshot
Out of these five isolation levels, Read Uncommitted, Read Committed, Repeatable Read, and Serializable come under the pessimistic concurrency model. Snapshot comes under the optimistic concurrency model. These levels are ordered in terms of the separation of work by two different processes, from minimal separation to maximal.
Let's look at each of these isolation levels and how they affect concurrency of operations.
Read Uncommitted
This is the first level of isolation, and it comes under the pessimistic model of concurrency. In Read Uncommitted, one transaction is allowed to read the data that is about to be changed by the commit of another process. Read Uncommitted allows the dirty read problem.
Read Committed
This is the second level of isolation and also falls under the pessimistic model of concurrency. In the Read Committed isolation level, we are only allowed to read data that is committed, which means this level eliminates the dirty read problem. In this level, if you are reading data then the concurrent transactions that can delete or write data, some work is blocked until other work is complete.
Repeatable Read
The Repeatable Read isolation level is similar to the Read Committed level and eliminates the Non-Repeatable Read problem. In this level, the transaction has to wait till another transaction's update or read query is complete. But if there is an insert transaction, it does not wait for anyone. This can lead to the Phantom Read problem.
Serializable
This is the highest level of isolation in the pessimistic model. By implementing this level of isolation, we can prevent the Phantom Read problem. In this level of isolation, we can ask any transaction to wait until the current transaction completes.
Snapshot
Snapshot follows the optimistic model of concurrency, and this level of isolation takes a snapshot of the current data and uses it as a copy for the different transactions. Here each transaction has its copy of data, so if a user tries to perform a transaction like an update or insert, it asks him to re-verify all the operation before the process gets started executing.
Conclusion
Concurrency occurs when two transactions operate currently on the same set of data. Concurrency can then lead to data inconsistency and abnormal behaviour for a transaction. RDBMS transactions have four properties, which are known as ACID, that ensure consistency of work. There are four major common concurrency problems that can be solved by using two models of concurrency: pessimistic and optimistic. We can complete work in SQL Server with any of these five types of isolation levels.