One of the great new features of SQL Server 2005 is the snapshot isolation level. But exactly how safe is that feature? Can you still guarantee your data integrity if you use snapshot isolation level?
With most forms of data integrity, this is not an issue. But with referential integrity, it might be – after all, checking referential integrity usually requires the database to read data in other tables than the one being updated. Since readers and writers are supposed not to block each other if you use snapshot isolation, it’s easy to imagine a scenario where two concurrent data modifications try to make changes that collide with each other, yet both are allowed because of the snapshot isolation level.
Since this is quite a broad subject, I’ll write at least two blog entries about it. In this first instalment, I’ll look into how SQL Server handles this potential thread to integrity for the most common form of referential integrity: the foreign key constraint. Since testing is the only way to find out, I used the following code to create a test database with two tables and one foreign key constraint.
USE master;
go
IF EXISTS (SELECT * FROM sys.databases WHERE name = ‘SnapshotTest’)
BEGIN;
DROP DATABASE SnapshotTest;
END;
go
CREATE DATABASE SnapshotTest;
go
ALTER DATABASE SnapshotTest
SET ALLOW_SNAPSHOT_ISOLATION ON;
go
USE SnapshotTest;
go
CREATE TABLE Customers
(CustID int NOT NULL PRIMARY KEY,
CustName varchar(40) NOT NULL
);
CREATE TABLE Orders
(OrderID char(7) NOT NULL PRIMARY KEY,
OrderType char(1) CHECK (OrderType IN (‘A’, ‘B’)),
CustID int NOT NULL REFERENCES Customers (CustID)
);
INSERT INTO Customers (CustID, CustName)
VALUES (1, ‘First test customer’);
INSERT INTO Customers (CustID, CustName)
VALUES (2, ‘Second test customer’);
go
With these two tables all set up, it’s time to start some testing. Let’s see if we are able to add an order for our only customer in one connection and at the same time remove that customer in another connection. Here’s the SQL for connection 1.
USE SnapshotTest;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
— Check to see that the customer exists
SELECT *
FROM Customers
WHERE CustID = 1;
— Insert an order for the customer
INSERT INTO Orders (OrderID, OrderType, CustID)
VALUES (‘Order01’, ‘A’, 1);
— Twiddle thumbs for 10 seconds before commiting
WAITFOR DELAY ‘0:00:10’;
COMMIT TRANSACTION;
go
— Check results
SELECT * FROM Customers;
SELECT * FROM Orders;
go
And here’s the SQL for connection 2.
USE SnapshotTest;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
— Check to see that the customer has no orders
SELECT *
FROM Orders
WHERE CustID = 1;
— Remove the customer
DELETE Customers
WHERE CustID = 1;
— Twiddle thumbs for 10 seconds before commiting
WAITFOR DELAY ‘0:00:10’;
COMMIT TRANSACTION;
go
— Check results
SELECT * FROM Customers;
SELECT * FROM Orders;
go
If I first start the SQL for connection 1, then (using less than 10 seconds) switch to connection 2 and start that SQL, the results are both encouraging and disencouraging at the same time. The SELECT statement at the beginning of the transaction is not blocked by the INSERT from the other transaction and shows the stale data, as expected in the snapshot isolation level. However, the DELETE statement (that, under the covers, uses the exact same read operation to check the foreign key constraint), is blocked.
The good news is that this means that even with snapshot isolation, it is impossible to violate a foreign key constraint. Considering the importance and value of the data that is sitting in our databases and the enormous costs involved with cleaning up bad data in databases that fail to guard integrity, this is Very Good News indeed.
But there’s bad news as well. Keeping our data integrity safe does come at a price. It means that the entire commercial blurb about how snapshot isolation improves concurrency because readers and writers no longer block each other has to be taken with a grain of salt. Apparently, writers do block readers if those readers are tasked with checking a foreign key constraint. And that’s not limited to situations that might lead to a violation of referential integrity – change the SQL for connection 2 to remove the second test customer instead of the first and you’ll see two transactions that should be able to execute simultaneously, yet still are blocking each other. Using snapshot isolation might not yield the concurrency gain you are hoping for!
Another bad thing is the choice of error messages. If you run the SQL above, you’ll get this error message from the second connection once the first connection has committed the changes:
Msg 3960, Level 16, State 1, Line 2
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table ‘dbo.Orders’ directly or indirectly in database ‘SnapshotTest’ to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.
If you change the SQL in the second connection to delete the second test customer, or if you change the SQL in the first transaction to ROLLBACK rather than COMMIT the changes, the second connection will still be blocked, but after the first connection finishes, the second connection continues, and you’ll never see any warning or error message to explain why this connection was blocked. If you ever get called in to investigate slowness in a database that uses snapshot isolation, would you consider that updates to different tables might be blocking each other? Until performing the tests above, I would have started looking elsewhere! My suggestion to Microsoft would be to change the error messages – in both cases, a warning message stating that snapshot isolation has temporarily been put out of effect should be given as soon as the second connection is blocked. Once the first connection ends, the second connection should either receive the normal foreign key constraint violation error, or it should continue without further messages.
To wrap it up, we can conclude that Microsoft’s SQL Server team has been smart enough to make sure that snapshot isolation won’t allow violation of foreign key constraints, but at the price of temporarily disabling snapshot isolation and thereby reducing concurrency. And unfortunately, they have failed to make trouble-shooting easier by causing SQL Server to send a warning message to the client if this happens.
That concludes the first part of this series. In the next part, I’ll be looking at some other, more obscure sorts of referential integrity in conjunction with snapshot isolation. Feel free to experiment with the sample code above while waiting – and if you see anything worth mentioning, be sure to post a comment!
5 Comments. Leave new
Very nice. I hadn’t noticed this one myself. I had warned against using snapshot for writers anyhow, though this makes it a bit more safe anyhow. If you are using triggers for any data validation, it wouldn’t be so smart.
Hi Louis,
Thanks for your comment. 🙂
You’re right about the trigger not being so smart – but the person writing the trigger can be. I’ve covered that in parts two and three (but I bet you’ve already found them).
Hi Hugo,
I wish I’d come across this entry, it answers a question that I’ve just blogged about myself – I wondered why the FK look up still blocked with READ_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION and reading your entry it makes complete sense.
My particular problem was to do with when the table is a heap you don’t block, when its clustered you do block which is the behaviour I go through.
Good work Hugo!!
Tony.
Extremely late comment, but I’m curious. You said: "writers do block readers if those readers are tasked with checking a foreign key constraint."
If they’re tasked with checking a foreign key constraint, (connection two in your example) doesn’t that make them a writer? In my head Readers are connections that only do SELECTs and writers are connections that do INSERT/UPDATE/DELETE.
This blocking problem is not due to snapshot isolation. In all isolation levels, the "parent" end of an FK constraints must be stabilized for the duration of the DML statement.