MySQL's Terrible ACID

MySQL was once the world’s most popular open-source relational database. But popularity doesn’t equal excellence, and popular things can have major issues. JEPSEN’s isolation-level evaluation of MySQL 8.0.34 blew the lid off this one: when it comes to correctness, a basic requirement for any respectable database product, MySQL is a spectacular mess.

MySQL’s documentation claims to implement Repeatable Read (RR) isolation, but the actual correctness guarantees are much weaker. Building on Hermitage’s research, JEPSEN further points out that MySQL’s RR isolation level is not actually repeatable read, and is neither atomic nor monotonic - it doesn’t even meet the basic level of Monotonic Atomic View (MAV).

Furthermore, MySQL’s Serializable (SR) isolation level, which could theoretically “avoid” these anomalies, is impractical for production use and isn’t recognized as best practice by either official documentation or the community. Even worse, under AWS RDS default configuration, MySQL SR doesn’t actually meet “serializable” requirements. Professor Li Haixiang’s analysis of MySQL consistency further reveals design flaws and issues with SR.

In summary, MySQL’s ACID properties have flaws and don’t match their documented promises - this can lead to severe correctness issues. While these problems can be worked around through explicit locking and other mechanisms, users should be fully aware of the trade-offs and risks: exercise caution when choosing MySQL for scenarios where correctness/consistency matters.


Why Does Correctness Matter?

Reliable systems need to handle all kinds of faults, and in the harsh reality of data systems, many things can go wrong. Ensuring data integrity without loss or corruption is a massive undertaking, prone to errors. Transactions solved this problem. They are one of the greatest abstractions in data processing and the crown jewel of the relational database.

The transaction abstraction reduces all possible outcomes to two scenarios: either COMMIT successfully or ROLLBACK completely. Having this “undo button” means programmers no longer need to worry about half-failed operations leaving the data in an inconsistent, wrecked state. Application error handling becomes much simpler because it doesn’t need to deal with partial failures. The guarantees a transaction provides are summarized in four letters: ACID.
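To make the “undo button” concrete, here is a minimal sketch using Python’s built-in sqlite3 module (the accounts table and the simulated crash are hypothetical, purely for illustration): a transfer debits one account, then fails before crediting the other, and the rollback erases the half-finished write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 100)")
conn.commit()

try:
    with conn:  # transaction scope: commits on success, rolls back on exception
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE id = 1")
        raise RuntimeError("crash before crediting account 2")
except RuntimeError:
    pass

# The half-finished transfer was rolled back: both balances are intact
balances = [r[0] for r in conn.execute("SELECT balance FROM accounts ORDER BY id")]
print(balances)  # [100, 100]
```

Without the transaction, account 1 would be left at 50 with nobody credited; with it, the application never has to reason about the partial state at all.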

acid.png

Atomicity lets you abort a transaction and discard all of its writes before committing, while Durability promises that once a transaction commits successfully, any written data won’t be lost even if hardware fails or the database crashes. Isolation ensures each transaction can pretend it’s the only one running on the entire database: the database guarantees that when multiple transactions commit, the result is the same as if they had run one after another serially, even though they may actually run concurrently. Atomicity and Isolation serve Consistency, which is really about application Correctness: the C in ACID is actually an application property rather than a transaction property, thrown in to make a nice acronym.

However, in practice, full Isolation is rare - users rarely use the so-called “Serializable” isolation level because it comes with significant performance overhead. Some popular databases like Oracle don’t even implement it - Oracle has an isolation level called “Serializable”, but it actually implements something called snapshot isolation, which is weaker than true serializability.

ansi-sql.png

RDBMSes offer different isolation levels, letting users trade off between performance and correctness. ANSI SQL-92 (poorly) standardized this trade-off, defining four isolation levels in terms of three types of concurrency anomalies: weaker isolation levels “theoretically” provide better performance but allow more types of anomalies that can affect application correctness.
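The SQL-92 matrix is small enough to write down directly. A sketch (the level and anomaly names follow the standard; the helper function is just for illustration):

```python
# The ANSI SQL-92 matrix: which of the three standard anomalies
# each isolation level permits (True = the anomaly may occur).
ANSI_LEVELS = {
    #                  dirty   non-repeatable  phantom
    "READ UNCOMMITTED": (True,  True,  True),
    "READ COMMITTED":   (False, True,  True),
    "REPEATABLE READ":  (False, False, True),
    "SERIALIZABLE":     (False, False, False),
}

def permitted_anomalies(level):
    names = ("dirty read", "non-repeatable read", "phantom")
    return [n for n, allowed in zip(names, ANSI_LEVELS[level]) if allowed]

print(permitted_anomalies("REPEATABLE READ"))  # ['phantom']
```

The trouble, as the rest of this article shows, is that real databases don’t necessarily honor this matrix: the label on the tin and the contents can differ.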

To ensure correctness, users can employ additional concurrency control mechanisms like explicit locking or SELECT FOR UPDATE, but this adds complexity and impacts system simplicity. For financial scenarios, correctness is crucial - accounting errors and reconciliation mismatches can have serious real-world consequences. However, for fast-and-loose internet scenarios, missing a few records may be acceptable - correctness often takes a back seat to performance. This laid the groundwork for the correctness issues in MySQL that rode the internet wave to popularity.


What Did Hermitage Tell Us?

Before diving into JEPSEN’s findings, let’s revisit the Hermitage project. Started in 2014 by Martin Kleppmann (author of the internet classic “DDIA”), it aimed to evaluate the correctness of mainstream relational databases. The project designed a series of concurrent transaction scenarios to assess whether databases actually deliver the isolation levels they claim.

hermitage.png

The Hermitage results table reveals two glaring issues (marked with red circles). One belongs to Oracle: its Serializable fails to prevent G2 anomalies, making it effectively Snapshot Isolation.

MySQL’s issues are even more striking: because its default Repeatable Read can’t prevent PMP / G-Single anomalies, Hermitage downgraded its actual level to Monotonic Atomic View.

mysql-level.png

It’s worth noting that ANSI SQL 92 isolation levels are a notoriously poor and widely criticized standard, defining only three anomalies to distinguish four isolation levels - but actual anomalies/isolation levels are far more numerous. The famous paper “A Critique of ANSI SQL Isolation Levels” proposed corrections, introduced several important new isolation levels, and provided their partial order of strength (left figure).

Under the new model, many databases’ “Read Committed” and “Repeatable Read” are actually the more practical “Monotonic Atomic View” and “Snapshot Isolation”. But MySQL is truly unique: in Hermitage’s evaluation, MySQL’s Repeatable Read falls far short of Snapshot Isolation, doesn’t meet ANSI 92 Repeatable Read standards, and actually provides Monotonic Atomic View level. JEPSEN’s research further reveals that MySQL Repeatable Read doesn’t even satisfy Monotonic Atomic View, barely stronger than Read Committed.


What New Issues Did JEPSEN Find?

JEPSEN is the most authoritative testing framework in distributed systems. It recently published its evaluation of MySQL’s latest release, 8.0.34. Readers should read the original report, but here’s the abstract:

MySQL is a popular relational database. We revisited Kleppmann’s 2014 Hermitage findings and confirmed that MySQL’s Repeatable Read isolation still exhibits G2-item, G-single, and lost update anomalies. Using Elle, our transactional consistency checker, we discovered that MySQL’s repeatable read isolation also violates internal consistency. Worse yet - it violates monotonic atomic view: a transaction can observe another transaction’s effects, then fail to observe those same effects in a subsequent attempt. As a bonus, we found that AWS RDS MySQL clusters frequently violate serializability requirements. This research was conducted independently, without compensation, and follows Jepsen research ethics.

MySQL 8.0.34’s RU, RC, and SR isolation levels match the ANSI standard’s descriptions, and under the default configuration (RR, with innodb_flush_log_at_trx_commit = 1), Durability isn’t an issue. The problems lie in MySQL’s default Repeatable Read isolation:

  1. Doesn’t satisfy ANSI SQL92 Repeatable Read (G2, WriteSkew)
  2. Doesn’t satisfy Snapshot Isolation (G-single, ReadSkew, LostUpdate)
  3. Doesn’t satisfy Cursor Stability (LostUpdate)
  4. Violates internal consistency (newly revealed by JEPSEN)
  5. Violates read monotonicity (newly revealed by JEPSEN)

mysql-level.png

Under MySQL RR, transactions exhibited phenomena violating internal consistency, monotonicity, and atomicity. This pushed its rating down to an undefined isolation level barely above RC.

JEPSEN’s tests revealed six anomalies. Skipping the issues already known from 2014, let’s focus on JEPSEN’s new findings, with some concrete examples.


Isolation Issues: Non-Repeatable Reads

In this test case (JEPSEN 2.3), there’s a simple people table with id as primary key, pre-populated with one row.

CREATE TABLE people (
  id     int PRIMARY KEY,
  name   text not null,
  gender text not null
);
INSERT INTO people (id, name, gender) VALUES (0, "moss", "enby");

Then concurrent read-write transactions run: each transaction first reads the name field of this row, updates the gender field, then reads the name field again, while other concurrent transactions keep modifying the row. Proper repeatable read means that within one transaction, both reads of name should return the same result.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;                               -- Start an RR transaction
SELECT name FROM people WHERE id = 0;            -- Returns "pebble"
UPDATE people SET gender = "femme" WHERE id = 0; -- Write refreshes the snapshot
SELECT name FROM people WHERE id = 0;            -- Returns "moss": name changed!
COMMIT;

But in the test results, 126 out of 9048 transactions showed internal consistency errors: despite running at Repeatable Read isolation, name actually changed between the two reads. This contradicts MySQL’s isolation-level documentation, which claims that “consistent reads within the same transaction read the snapshot established by the first read.” It also contradicts MySQL’s consistent-read documentation, which specifically states that InnoDB assigns the transaction a timepoint at its first read, and the effects of concurrent transactions should not appear in subsequent reads.

ANSI/Adya repeatable read essentially means: once a transaction observes a value, it can count on that value staying stable for the rest of the transaction. MySQL does the opposite: write requests are invitations for another transaction to sneak in and wreck the state you just read. This isolation design and behavior is incredibly stupid. But wait, there’s more - like monotonicity and atomicity issues.
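The difference between proper snapshot pinning and what JEPSEN observed can be sketched with a toy MVCC model (all class and method names here are hypothetical, purely for illustration; `refresh_on_write=True` mimics the anomalous behavior):

```python
# Toy MVCC model: committed versions live in a dict keyed by version number.
class Store:
    def __init__(self):
        self.version = 0
        self.committed = {0: {"name": "moss"}}

    def commit(self, row):  # some other transaction commits a new version
        self.version += 1
        self.committed[self.version] = row

class Txn:
    def __init__(self, store, refresh_on_write):
        self.store = store
        self.refresh_on_write = refresh_on_write
        self.snapshot = None

    def read_name(self):
        if self.snapshot is None:
            self.snapshot = self.store.version  # pin snapshot at first read
        return self.store.committed[self.snapshot]["name"]

    def write(self):
        if self.refresh_on_write:  # the anomalous behavior: writes re-pin
            self.snapshot = self.store.version

store = Store()
t = Txn(store, refresh_on_write=True)
first = t.read_name()             # "moss": snapshot pinned at version 0
store.commit({"name": "pebble"})  # concurrent transaction commits
t.write()                         # our own write refreshes the snapshot
second = t.read_name()            # "pebble": a non-repeatable read
print(first, second)
```

With `refresh_on_write=False`, both reads return "moss": that is what a snapshot pinned at first read — the behavior MySQL documents — would give.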


Atomicity Issues: Non-Monotonic Views

Kleppmann’s Hermitage rated MySQL repeatable read as Monotonic Atomic View (MAV). According to Bailis et al, monotonic atomic view ensures that once transaction T2 observes any results of transaction T1, T2 observes all of T1’s results.

If MySQL’s RR just took a new snapshot each time it executed a write query, it could still provide MAV isolation guarantees if snapshots were monotonic - that’s actually how PostgreSQL’s Read Committed isolation works.

However, in standard MySQL single-node deployments, that’s not the case: MySQL frequently violates monotonic atomic view under RR isolation. This JEPSEN example (2.4) demonstrates: there’s a mav table pre-populated with two records (id=0,1), both with initial value of 0.

CREATE TABLE mav (
  id      int PRIMARY KEY,
  `value` int not null,
  noop    int not null
);
INSERT INTO mav (id, `value`, noop) VALUES (0, 0, 0);
INSERT INTO mav (id, `value`, noop) VALUES (1, 0, 0);

The workload is mixed read-write transactions: write transactions increment the value field of both records in the same transaction; by transaction atomicity, other transactions observing these records should see value increasing in lock-step.

START TRANSACTION;
SELECT value FROM mav WHERE id = 0;    -- row 0 reads 0
UPDATE mav SET noop = 73 WHERE id = 1; -- write "invites" a fresh snapshot
SELECT value FROM mav WHERE id = 1;    -- row 1 reads new value 1, so row 0 should be 1 too
SELECT value FROM mav WHERE id = 0;    -- but row 0 still reads old value 0
COMMIT;

However, from this reading transaction’s view, it observed an “intermediate state”. The reading transaction first reads record 0’s value, then sets record 1’s noop to a random value (which, as the previous case showed, lets it see other transactions’ changes), then reads value from records 1 and 0. The result: reading record 1 gets the new value, while reading record 0 gets the old value, indicating serious flaws in both monotonicity and atomicity.

MySQL’s consistent read documentation extensively discusses snapshots, but this behavior doesn’t look anything like snapshots. Snapshot systems typically provide consistent, point-in-time views of database state. They’re usually atomic: either containing all of a transaction’s results or none. Even if MySQL somehow got a non-atomic snapshot of the write transaction’s intermediate state, it must see row 0’s new value before getting row 1’s new value. But that’s not what happened: this read transaction saw changes to row 1 but missed changes to row 0 - what kind of snapshot is this?

Therefore, MySQL’s Repeatable Read isolation is neither atomic nor monotonic. In this respect, it’s even worse than most databases’ Read Committed, which at least provides actual Monotonic Atomic View.
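The MAV property above can be checked mechanically. A toy checker sketch (assuming, as in the JEPSEN 2.4 workload, that write transaction n sets both rows’ value to n, so reads within one transaction should never go “backwards” from a newer write to an older one):

```python
def violates_mav(observed_reads):
    """observed_reads: sequence of (row_id, value) pairs in program order.

    Under monotonic atomic view, once a transaction has observed write
    transaction n's effect (value == n), every later read must reflect
    transaction n too; a smaller value means n's effect vanished again.
    """
    high_water = -1
    for _row, value in observed_reads:
        if value < high_water:  # saw txn n's effect, then failed to see it
            return True
        high_water = max(high_water, value)
    return False

# The JEPSEN 2.4 history: read row0=0, (write), read row1=1, read row0=0
print(violates_mav([(0, 0), (1, 1), (0, 0)]))  # True
```

This is the essence of what checkers like Elle do, generalized across many transactions and anomaly patterns rather than this single hand-rolled invariant.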

Another noteworthy issue: MySQL transactions can violate atomicity under default configuration. I raised this issue for industry discussion in an article two years ago. The MySQL community’s stance is that this is a configurable feature via sql_mode rather than a defect.

But this doesn’t change the fact: MySQL violates the principle of least surprise by allowing users to do such atomicity-breaking stupid things under default configuration. Similar issues exist with the replica_preserve_commit_order parameter.

mysql-atom.png


Serialization Issues: Useless and Terrible

Can Serializable prevent these concurrency anomalies? Theoretically yes, that’s what it’s designed for. But disturbingly, JEPSEN observed “Fractured Read-Like” anomalies (an example of G2) in AWS RDS clusters even under Serializable isolation. These anomalies should be prevented by RR and only appear in RC or lower levels.

Deeper investigation revealed this was related to MySQL’s replica_preserve_commit_order parameter: disabling it allows MySQL to achieve higher parallelism when replaying logs at the cost of correctness. With this option disabled, JEPSEN observed similar G-Single and G2-Item anomalies in local cluster SR isolation.

Serializable systems should guarantee transactions appear to execute in total order - not preserving this order on replicas is terrible. That’s why this parameter was disabled by default in the past (8.0.26 and below) but changed to enabled by default in MySQL 8.0.27 (2021-10-19). However, AWS RDS cluster parameter groups still use the old default “OFF” without proper documentation, hence these anomalies.
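For self-managed deployments, pinning the parameter explicitly removes any ambiguity about version- or vendor-specific defaults. A sketch of the relevant my.cnf fragment (assuming MySQL 8.0.26+, where the `replica_*` spelling is available):

```ini
[mysqld]
# Apply transactions on replicas in the same order they committed on the
# primary. Default ON since 8.0.27, but verify it on AWS RDS parameter
# groups, which have shipped with the old OFF default.
replica_preserve_commit_order = ON
```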

While this anomalous behavior can be avoided by enabling the parameter, using Serializable itself isn’t encouraged by MySQL official/community. The common view in the MySQL community is: avoid Serializable unless absolutely necessary; MySQL docs state: "SERIALIZABLE enforces stricter rules than REPEATABLE READ and is used mainly in special situations, like XA transactions and solving concurrency and deadlock issues."

Coincidentally, Professor Li Haixiang (former Tencent T14), who specializes in database consistency, evaluated the actual isolation levels of various databases including MySQL (InnoDB/8.0.20) in his “Third Generation Distributed Database” series, providing this more detailed “Consistency Bāxiān Diagram” (an “Eight Immortals” chart) from another perspective.

mysql-baxiantu.png

In the diagram, blue/green means anomalies are correctly avoided via rules or rollback; a yellow “A” marks an anomaly, and more yellow “A”s mean more correctness issues; a red “D” indicates performance-impacting deadlock detection being used to handle anomalies, and more red “D”s mean more performance issues.

Clearly, PostgreSQL SR, and the CockroachDB SR built on it, have the best correctness, followed by Oracle SR; they avoid concurrency anomalies mainly through mechanisms and rules, while MySQL’s correctness is painfully bad.

Professor Li analyzed this in detail in “MySQL Is Good For Nothing”: although MySQL’s Serializable can ensure correctness through extensive use of deadlock-detection algorithms, handling concurrency anomalies this way severely impacts the database’s performance and practical value.


The Trade-off Between Correctness and Performance

Professor Li raised a question in “Third Generation Distributed Databases: The Kicking Era”: how should we trade off between system correctness and performance?

The database world has some loops where habit has become second nature. For example, many databases default to Read Committed, and many people say “RC isolation is enough”! But why? Why set it to RC? Because they believe RC gives better database performance.

But as shown below, there’s a vicious cycle: users want better database performance, so developers set the application’s isolation level to RC. However, users — especially in the finance, insurance, securities, and telecom industries — also expect data correctness, so developers have to add SELECT FOR UPDATE locks to their SQL statements, which in turn severely degrades database performance. TPC-C and YCSB tests show that user-added locking causes severe performance degradation, while the performance overhead of strong isolation levels isn’t actually that bad.

circle.png

Using weak isolation levels seriously deviates from the original intent of the “transaction” abstraction - writing reliable transactions in lower isolation level databases is extremely complex, and the number and impact of weak isolation-related errors are widely underestimated[13]. Using weak isolation levels essentially kicks the responsibility for correctness & performance that should be guaranteed by the database to application developers.

The root of this weak-isolation habit may trace back to Oracle and MySQL. For example, Oracle has never provided true serializable isolation (its SR is actually Snapshot Isolation), even today, so it had to promote RC isolation as a good thing. Oracle was one of the most popular databases in the past, so later databases followed suit.

consistency.png

And the stereotype that weak isolation levels perform better might come from MySQL - SR implemented with extensive deadlock detection (marked red) does perform terribly. But this isn’t necessarily true for other DBMSes. For example, PostgreSQL’s Serializable Snapshot Isolation (SSI) algorithm introduced in 9.1 can provide full serializability with minimal performance overhead compared to Snapshot Isolation.

pgbench.png

Furthermore, hardware performance improvements and price collapse under Moore’s Law mean OLTP performance is no longer scarce - in today’s world where Twitter can run on a single server, over-provisioned abundant hardware performance costs pocket change. Compared to potential losses and mental overhead from data errors, worrying about performance overhead from serializable isolation is really making mountains out of molehills.

Times have changed, software and hardware progress has made “default serializable isolation, prioritize 100% correctness” truly feasible. Trading correctness for slight performance gains seems outdated even for fast-and-loose internet scenarios. New generation distributed databases like CockroachDB and FoundationDB have chosen to default to Serializable isolation.

Doing the right thing matters, and correctness shouldn’t be traded away. On this point, the open-source relational database giants MySQL and PostgreSQL chose opposite paths in their early implementations: MySQL pursued performance at the cost of correctness, while the academic PostgreSQL pursued correctness at the cost of performance. Riding the internet wave, MySQL took the lead in the first half thanks to its performance advantage. But now that performance is no longer the core consideration, correctness has become MySQL’s fatal weakness.

There are many ways to solve performance problems, even waiting for hardware performance exponential growth is a viable approach (like Paypal); but correctness issues often involve global architectural restructuring, impossible to fix overnight. Over the past decade, PostgreSQL stayed true while innovating, making great strides while ensuring best correctness, outperforming MySQL in many scenarios; while functionally crushing MySQL with ecosystem extensions introducing vector, JSON, GIS, time series, full-text search and other features.

sf2023.png

In StackOverflow’s 2023 global developer survey, PostgreSQL’s developer usage officially surpassed MySQL’s, making it the world’s most popular database. MySQL, meanwhile, with its mess of correctness issues and its struggle to deliver high performance once they’re worked around, should really think hard about its path forward.


References

[1] JEPSEN: https://jepsen.io/analyses/mysql-8.0.34

[2] Hermitage: https://github.com/ept/hermitage

[4] Jepsen Research Ethics: https://jepsen.io/ethics

[5] innodb_flush_log_at_trx_commit: https://dev.mysql.com/doc/refman/8.0/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

[6] Isolation Levels Documentation: https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html#isolevel_repeatable-read

[7] Consistent Read Documentation: https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-read.html

[9] Monotonic Atomic View/MAV: https://jepsen.io/consistency/models/monotonic-atomic-view

[10] Highly Available Transactions: Virtues and Limitations, Bailis et al: https://amplab.cs.berkeley.edu/wp-content/uploads/2013/10/hat-vldb2014.pdf

[12] replica_preserve_commit_order: https://dev.mysql.com/doc/refman/8.0/en/replication-options-replica.html#sysvar_replica_preserve_commit_order

[13] The number and impact of weak isolation-related errors are widely underestimated: https://dl.acm.org/doi/10.1145/3035918.3064037

[14] Testing PostgreSQL’s Parallel Performance: https://lchsk.com/benchmarking-concurrent-operations-in-postgresql

[15] Running Twitter on a Single Server: https://thume.ca/2023/01/02/one-machine-twitter/

Last modified 2025-04-14: add mysql blog (54bfc01)