Isolation Levels(Read Committed, Repeatable Read, Serializable)

Rohit Sharma
Sep 10, 2021
8 min read

Hello EveryOne,

Hope you all are doing well!

So today I m here with another interesting topic which is Isolation level but before jumping directly to isolation levels let's discuss the origin of these levels.

As you know ACID (not the which is used in chemistry labs ;-) ) is the most powerful property of the RDMS lets quickly discuss each property. All transactions process these properties. You will find a lot of articles on this topic but let's redefine it in our own way.

(I m going to use PostgreSQL Database to explain further)

A - Atomicity
C- Consistency
I- Isolation
D - Durability

Let's discuss this jargon real quick,

(We gonna use all the further DB operations in transactions)

Atomicity

Atomicity simply means all success or all failure. That's all the atomicity is let's take a simple example to suppose we have one table users

																																users
																      																																																   																					  id(primary key) |  name varchar(50)

Now let's insert multiple values with our powerful INSERT query along with transactions so let's write the simple syntax

Let's quickly discuss the above statement we have written

begin transaction :-This is a statement to tell the database that "hey! buddy I m going to start the transaction please take care of the same"
insert into users(name) values ('rohit'),('sharma'):- This statement basically insert multiple values of one column in the table (Note:- till now no actually insertion happened done in the table the values will be inserted into table only when we run successfully commit operation)
commit:- This is a statement to tell the database that "hey! buddy I m done with the transaction please check all the operations I have done inside the transaction you can do the changes in the table. Thank you! for your help" (Well query never speaks to the database like this it's me speaking virtually ;-) ).

Simple! let's move forward, suppose we are inserting multiple names in the users table and in which 3 names syntax is fine but the 4th name syntax is wrong so what's the database gonna do will it insert 3 names successfully in the table and not insert 4th name which is having syntax error?

Let's run practically and check the same

So the above statement is the same as the above one only difference is here we have intentionally added syntax error even we run the commit statement the database checks whether the operation in the transaction is right or not. Here it finds it's wrong so even we run the commit operation it automatically rollback(revert the changes) the transaction.

To verify the same let's quickly run the select statement and check whether the statement is true or not. If we run our select query we see

(Never ever use (*) wildcard operator with select never ever)

We see the previous values that we have inserted in our first insert operation. Let's quickly recall what we have stated in our first statement "all success or all failure". that's it the atomic operation we tried to insert four values but due to one value error, all got in a failure state and no changes were done in the table.

Consistency

It simply means all the data which is getting stored in a table or database must follow some valid rules. So that we don't get any invalid values while retrieving the data from the database.

Let's see it practically,

So in the above operation, we can see at the second insert operation we try to violate the primary key rules (A primary key is a key which is used to identify the rows uniquely and it can never be duplicated and its clustered index. Let me know if you wanna learn about clustered index and non clustered index). We tried to duplicate the primary key by using the same value which already exists in the table so what about the 'hello' value did it inserted into the database? Let's check quickly by running our favorite select query.

If we see here we still get the old values that we have inserted into our first operation the value 'hello' gets discarded because we violated rules!

Isolation

It simply means each transaction must be unaware of another transaction. hmm, so what it means exactly let's see by creating two sessions of transaction and understand the same

So let's take our favorite table users . I m going to denote both transactions by different names please notice the same that will help you in understanding.

So let's quickly discuss the above weird-looking statements!

So here we have two transactions running parallel one is named 'transaction1' and another is named 'transaction2'.
In 'transaction1' we have to insert the value 'tx1'
In 'transaction2' we have to insert the value 'tx2'
If we see here both transaction blocks inserting the value simultaneously.
But when we are performing the select operation after the insertion operation in both the blocks we are not able to see each other inserted value
Meaning the 'transaction1' is unable to see the value inserted by 'transaction2' i.e;tx2
And the 'transaction2' is unable to see the value inserted by 'transaction1 i.eltx1
So yeah here we are this is the isolation of multiple transactions running concurrently but unaware of each other. It can be any operation like create, delete, read and update

So after so long journey, we are finally coming to the main topic of our discussion! Now we know what is isolation? but in RDBMS there are different types of isolation levels which features are to avoid read anomalies like dirty read, phantom read, nonrepeatable mode, and serialization anomaly or lost update(We gonna discuss these in future blogs).

READ COMMITTED
REPEATABLE READ
SERIALIZABLE

(Note:- We have one more additional level, READ UNCOMMITTED. In PostgreSQL READ UNCOMMITTED is treated as READ COMMITTED.)

Let's discuss all the above weird-looking jargon!.

1) READ COMMITTED

So this isolation level leads the transactions to avoid other transactions or SQL operations to read uncommitted values. You will be thinking about what is partially committed value or uncommitted value? Let me explain the same!

(Note:- We haven't set the isolation level here because in the PostgreSQL database the default isolation level is READ COMMITTED so we don't need to set the level for this)

Here we see there is one transaction going on and trying to insert a new value. We haven't committed the transaction yet! As we are in the middle of the transaction we are now performing the select operation in another session we can see the select query didn't return the uncommitted values we can only see the committed values which we have inserted at starting of the blog. So yeah that's it this is the read committed isolation level. It helps us to avoid dirty read problems i.e; (reading any uncommitted transaction value)

But the cons of this isolation level is that it reads the updated values of the same row which is updated by any other transaction while being in an uncommitted transaction. Let's discuss the same.

There are two transactions started let's consider tx1 the left side one and tx2 the right side one.
The tx1 transaction reading the value of row having id=1
The returned value is 'rohit'
Then in the tx2 transaction, we update the value of the row having id=1 to 'Noob' and committed the same
Then in tx1 again we are trying to read the value of row having id=1
But Oops! its value got updated in the second read
So yeah that's how it works it avoid the dirty read problem but generates the non-repeatable mode problem which is (getting the different value of the same column due to update done by any other transaction)

2) REPEATABLE READ

So this transaction isolation level avoids the non-repeatable mode problem. You will get the same value every time you try to read from the table till the end of the transaction. Let's see this practically!

We have two transactions lets consider tx1 the left transaction and tx2 the right transaction
The left transaction isolation level is repeatable read while the right transaction level is read committed
so in tx1, we perform the select operation on row value having id=1 we got the value 'Noob'
and now in tx2, we perform the update operation on row value having id=1
now we successfully committed the tx2 transaction
and again we are running the select operation multiple times in the tx1 transaction
We can see we are getting the same value of id=1 even its value got updated by the tx2 transaction.

3) SERIALIZABLE

This isolation level helps us to overcome the lost update problem i.e.; updates performs in one transaction get lost or overwritten by some other transaction when running concurrently. Let's discuss this practically

Here we have two transactions tx1 and tx2 running concurrently.
Both having an isolation level of READ COMMITTED.
In tx1 we are updating the row value having id=1
In tx2 we again updating the row value having id=1
both tx1 and tx2 are uncommitted transactions
first, we have executed the tx1 to update the row value 'Lost Update'
then we have executed the tx2 to update the row value 'Overwrite Update'
the tx2 will not execute immediately it will be pending state
because the PostgreSQL places a lock to prevent another update until the first transaction is finished (If you wanna read more about row locks and table locks do let me know in the comment section)
so once we commit the tx1 the lock will be released and without wasting any single millisecond the tx2 update operation will be executed and the value 'Overwrite Update' will be update
Now if the person it belongs to tx1 will be expecting value 'Lost Update' but due to conflict, the value gets updated by tx2 i.e; 'Overwrite Update'

So here comes the SERIALIZABLE isolation in the picture we can resolve this problem lets see how

Here we have two transactions tx1 and tx2 running concurrently.
The tx1 is having an isolation level of READ COMMITTED.
The tx2 is having an isolation level of SERIALIZABLE
both tx1 and tx2 are uncommitted transactions
first, we have executed the tx1 to update the row value 'i will be updated'
then we have executed the tx2 to update the row value 'i want to also get updated'
the tx2 will not execute immediately it will be pending state
because the PostgreSQL places a lock to prevent another update until the first transaction is finished (If you wanna read more about row locks and table locks do let me know in the comment section)
so once we commit the tx1 the lock will be released and without wasting any single millisecond the tx2 update operation will be executed.
But in this case, we have used SERIALIZABLE access level it won't allow the update operation and will throw an error stating "could not serialize access due to concurrent update"
Finally ! we have saved our row from concurrent update anomaly.

The cons of this isolation level are we need to face this such error so we need to add logic for transaction retry!

Durability

It simply means that whatever we store in our database will always be there until someone explicitly deletes the data or some hardware crashes occurs. Let's see one quick example of the same.

If we see where first we have executed the now() function (PostgreSQL inbuilt function) it outputs the current time
At that time we have executed the select query we got the value 'i will be updated'
now after some time again we have executed the now() function
If you see we have almost 8 minutes of difference between them
And again we have executed the select query we again got the value 'i will be updated'.
So yeah that's the durability property the data will persist even you come back after a year or 100 years(if you are alive ;-) ) the data will be as it is until and someone explicitly deletes the data or some hardware crashes occurs.

Finally, we have covered the points related to ACID and isolation levels. Please let me know in the comment section if anything I missed will love to read your feedback.

Stay Safe! Keep Learning!

Isolation Levels(Read Committed, Repeatable Read, Serializable)

Recent Posts

Comments