Database 2023-06-08

MySQL Transaction Isolation Levels: Preventing Dirty Reads, Phantom Reads, and More

Learn MySQL's four transaction isolation levels, the anomalies each prevents (dirty read, non-repeatable read, phantom read), and how to choose the right level.

Read in: ja
MySQL Transaction Isolation Levels: Preventing Dirty Reads, Phantom Reads, and More

Overview

This post summarizes transaction anomalies in MySQL. The MySQL version assumed is 8 series.

Test Environment

The environment used for testing is prepared with docker-compose. (Although it's just one container, so you don't necessarily need to use compose...)

.
├── docker-compose.yml
└── initdb.d
    └── 1_schema.sql

docker-compose.yml

version: '3'

services:
  mysql:
    image: mysql:8.0.33
    environment:
      MYSQL_ROOT_PASSWORD: password
      MYSQL_DATABASE: example
      TZ: "Asia/Tokyo"
    command: mysqld
    ports:
      - 3306:3306
    volumes:
      - ./initdb.d:/docker-entrypoint-initdb.d

1_schema.sql

CREATE DATABASE IF NOT EXISTS example;
DROP TABLE IF EXISTS `users`;
CREATE TABLE `users` (
  `id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `name` varchar(255) NOT NULL UNIQUE
) ENGINE = InnoDB DEFAULT CHARSET = utf8;

You can prepare a MySQL 8 series container with docker compose up.

Transaction Isolation Levels

MySQL's InnoDB provides four transaction isolation levels as defined by the ANSI/ISO SQL standard.

Isolation Level Dirty Read Inconsistent Read Lost Update Phantom Read
READ UNCOMMITTED
READ COMMITTED ×
REPEATABLE READ※1 × × ○※
SERIALIZABLE × × × ×

※1 REPEATABLE READ is the default in MySQL.

※2 Although marked as ○ above, MySQL is designed to prevent phantom reads in REPEATABLE READ.

The transaction isolation level ranges from READ UNCOMMITTED, the lowest, to SERIALIZABLE, the highest. The above table is ordered from lowest to highest. Generally, the higher the isolation, the lower the performance tends to be.

For more on transactions, see Transaction Overview.

Anomalies

Let's reproduce transaction anomalies in MySQL.

An anomaly refers to "unexpected results or inconsistencies arising from transaction isolation levels or processing order."

There are anomalies defined by ANSI SQL standards or ISO/IEC 9075, and there are various others besides those discussed here.

Inconsistent read is not defined by those standards. (I couldn't find where it is defined...)

Transactions are denoted as TX. Numbers are used to distinguish multiple transactions (e.g., TX1, TX2).

Dirty Read

Dirty read is a phenomenon where TX1 reads data from TX2 before TX2 commits.

Test

All sessions are conducted with READ UNCOMMITTED.

mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
  1. Start transactions in TX1 and TX2
// TX1
mysql> START TRANSACTION;
// TX2
mysql> START TRANSACTION;
  1. Add data in TX2
// TX2
mysql> INSERT INTO users(name) VALUES('foo');

Data is added in TX2, but not committed.

  1. Read data again in TX1
// TX1
mysql> SELECT * FROM users; // 1 row in set

TX1 reads data from TX2 before TX2 commits.

Inconsistent Read

Inconsistent read is a phenomenon where the data being read lacks consistency.

Refer to Various Anomalies#Inconsistent Read Anomaly.

I wasn't sure about the exact definition, so my understanding might be questionable...

Since it's about inconsistency after commit, inconsistent read seems like a higher concept than fuzzy read or phantom read?? But strictly, it should be different...

Test

All sessions are conducted with READ UNCOMMITTED.

mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
  1. Start transaction and read data in TX1
// TX1
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // Empty set
  1. Start transaction and add data in TX2
// TX2
mysql> START TRANSACTION;
mysql> INSERT INTO users(name) VALUES('foo');
mysql> COMMIT;
  1. Read data again in TX1
// TX1
mysql> SELECT * FROM users; // 1 row in set

The result differs from the initial read, confirming inconsistency due to TX2's actions.

Fuzzy Read (Non-repeatable Read)

Fuzzy read is a phenomenon where TX1 can reference data updated by another TX2.

All sessions are conducted with READ COMMITTED.

mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

Test

  1. Start transaction and read data in TX1
// TX1
mysql> INSERT INTO users(name) VALUES('foo'); // Initial data entry
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // 1 row in set

Initial data entry result.

+-----+------+
| id  | name |
+-----+------+
| 1   | foo  |
+-----+------+
  1. Start transaction and read data in TX2
// TX2
mysql> START TRANSACTION;
mysql> UPDATE users SET name = 'bar' WHERE id = 1;
mysql> COMMIT;
mysql> SELECT * FROM users; // 1 row in set

Update is complete.

+-----+------+
| id  | name |
+-----+------+
| 1   | bar  |
+-----+------+
  1. Read data again in TX1
// TX1
mysql> SELECT * FROM users; // 1 row in set

TX1's read result changes due to TX2's commit.

+-----+------+
| id  | name |
+-----+------+
| 1   | bar  |
+-----+------+

Phantom Read

Phantom read is a phenomenon where data read by TX1 changes if TX2 commits an addition or deletion. Fuzzy read involves updates, while phantom read involves additions or deletions.

Test

All sessions are conducted with READ COMMITTED.

mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
  1. Start transaction and read data in TX1
// TX1
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // Empty set
  1. Add data and commit in TX2
// TX2
mysql> START TRANSACTION;
mysql> INSERT INTO users(name) VALUES('foo');
mysql> COMMIT;
mysql> SELECT * FROM users;

Addition is complete.

+-----+------+
| id  | name |
+-----+------+
| 1   | foo  |
+-----+------+
  1. Read data again in TX1
// TX1
mysql> SELECT * FROM users; // 1 row in set

TX1's read result changes due to TX2's commit.

+-----+------+
| id  | name |
+-----+------+
| 1   | foo  |
+-----+------+

Lost Update

Lost update is a phenomenon where a conflict occurs when TX1 and TX2 update the same data, resulting in some updates being lost.

Test

All sessions are conducted with REPEATABLE READ.

mysql> SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  1. Start transaction and read data in TX1
// TX1
mysql> INSERT INTO users(name) VALUES('foo'); // Initial data entry
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // 1 row set
  1. Start transaction and read data in TX2
// TX2
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // 1 row set
  1. Update data in TX1 and TX2
// TX1
mysql> UPDATE users SET name = 'tx1' WHERE id = 1;

// TX2
mysql> UPDATE users SET name = 'tx2' WHERE id = 1;
  1. Commit TX1 and TX2
// TX1
mysql> COMMIT;
// TX2
mysql> COMMIT;
  1. Read data
mysql> SELECT * FROM users; 1 row set

TX1's commit is lost, and TX2's commit is reflected.

+-----+------+
| id  | name |
+-----+------+
| 1   | tx2  |
+-----+------+

Summary

The anomalies that occur vary depending on the transaction isolation level.

Anomalies are patterns where data reading and consistency change before and after commit.

To learn more about transaction anomalies, it might be better to refer to a book or something related to transactions.

References

Tags: Transaction MySQL
Share: 𝕏 Post Facebook Hatena
✏️ View source / Discuss on GitHub
☕ Support

If you enjoy this blog, consider supporting it. Every bit helps keep it running!


Related Articles