I’ve been doing some work with PostgreSQL recently. I have a table that stores user ratings, with a UNIQUE constraint:
UNIQUE (author_id, talk_id)
Think of it as a key-value store, just using PostgreSQL. Each user can store only one rating per talk (the latest one). It didn’t make sense for me to use a dedicated key-value database, since the application was already using PostgreSQL.
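For concreteness, the table might look something like this — a sketch, where everything except the UNIQUE constraint is an assumption on my part:

```sql
-- Hypothetical ratings table; only the UNIQUE constraint
-- is taken from the actual schema.
CREATE TABLE ratings (
    author_id integer NOT NULL,
    talk_id   integer NOT NULL,
    rating    integer NOT NULL,
    UNIQUE (author_id, talk_id)
);
```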
Now, how do we store ratings so that the UNIQUE constraint is satisfied? Easy! We do a DELETE and then an INSERT, all inside a transaction. The row is effectively replaced. Nice and easy, right?
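In SQL, that replace-the-row approach looks something like this (column names and values are illustrative, matching the sketch above):

```sql
BEGIN;
DELETE FROM ratings WHERE author_id = 1 AND talk_id = 2;
INSERT INTO ratings (author_id, talk_id, rating) VALUES (1, 2, 5);
COMMIT;
```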
Well, not really. When testing, PostgreSQL promptly complained:
Error 23505 / duplicate key value violates unique constraint “ratings_author_id_key” has occurred.
How is that possible, I thought — we were supposed to be in a transaction after all!
Well, it turns out the default transaction isolation level in PostgreSQL is “Read Committed”, not “Serializable” as the SQL standard mandates. So it is entirely possible that in the middle of our transaction someone else inserts a row with the same key, and our INSERT then violates the UNIQUE constraint.
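Concretely, here is one interleaving that reproduces the error — say the user has no rating yet, so both DELETEs match nothing (values are made up, assuming the schema sketched earlier):

```sql
--  Session A                                Session B
--  BEGIN;                                   BEGIN;
--  DELETE FROM ratings                      DELETE FROM ratings
--    WHERE author_id = 1                      WHERE author_id = 1
--      AND talk_id = 2;                         AND talk_id = 2;
--  -- (0 rows deleted)                      -- (0 rows deleted)
--                                           INSERT INTO ratings
--                                             (author_id, talk_id, rating)
--                                             VALUES (1, 2, 4);
--                                           COMMIT;
--  INSERT INTO ratings
--    (author_id, talk_id, rating)
--    VALUES (1, 2, 5);
--  -- ERROR: duplicate key value violates
--  -- unique constraint (SQLSTATE 23505)
```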
The solution is either to use a MERGE statement, which PostgreSQL does not support at the time of writing, or to set the transaction isolation level to SERIALIZABLE. The latter, apart from its performance implications, means you will get aborted transactions:
Error 40001 / could not serialize access due to concurrent update has occurred.
…so you have to deal with those in your application and retry. A major pain in the rear.
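The retry itself is mundane but easy to get wrong. Here is a minimal sketch of it; SerializationFailure is a made-up stand-in for whatever your driver raises on SQLSTATE 40001 (psycopg2, for instance, exposes psycopg2.errors.SerializationFailure), and nothing here is a drop-in implementation:

```python
class SerializationFailure(Exception):
    """Stand-in for the driver's SQLSTATE 40001 exception."""

def run_with_retry(txn, max_attempts=5):
    """Run txn() -- one whole transaction, e.g. BEGIN ISOLATION LEVEL
    SERIALIZABLE; DELETE ...; INSERT ...; COMMIT; -- and rerun it when
    PostgreSQL aborts it with a serialization failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # the whole transaction was rolled back, so rerunning is safe
```

Note that the entire transaction has to be rerun, not just the failing statement, which is why txn() wraps the whole BEGIN-to-COMMIT sequence.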
The problem is that most people assume they can simply do a DELETE and then an INSERT inside a transaction, and since the default isolation level in PostgreSQL is not SERIALIZABLE, they introduce a race condition. I would venture a guess that there are hundreds of applications online with a race condition of this kind. I’ve seen a Drupal bug report mentioning a “small race condition” of this kind. A “small race condition” is like being “slightly pregnant”, if you ask me.