With many applications,
clients need to fetch the data to browse through it, make modifications
to one or more rows, and then post the changes back to the database in
SQL Server. These human-speed operations are slow in comparison to
machine-speed operations, and the time lag between the fetch and post
might be significant. (Consider a user who goes to lunch after
retrieving the data.)
For these applications, you would not want to use normal locking schemes such as SERIALIZABLE or HOLDLOCK
to lock the data so that it can’t be changed between the time the user
retrieves it and the time he or she applies any updates. Doing so would
violate one of the key rules for minimizing locking contention and
deadlocks: do not allow user interaction within
transactions. You would also lose all control over the duration of the
transaction. In a multiuser OLTP environment, holding shared locks
indefinitely could significantly reduce concurrency and overall
application performance because other processes would be blocked on those locks.
On the other hand, if the
locks are not held on the rows being read, another process could update a
row between the time it was initially read and when the update is
posted. When the first process applies the update, it would overwrite
the changes made by the other process, resulting in a lost update.
So how do you implement such an application? How do
you allow users to retrieve information without holding locks on the
data and still ensure that lost updates do not occur?
Optimistic locking is a technique used in situations
in which the reading and the modification of data are widely separated in
time. Optimistic locking helps a client avoid overwriting another
client’s changes to a row without holding locks in the database.
One approach for implementing optimistic locking is to use the rowversion data type. Another approach is to take advantage of the optimistic concurrency features of snapshot isolation.
Optimistic Locking Using the rowversion Data Type
SQL Server 2008 provides a special data type called rowversion that can be used for optimistic locking purposes within applications. The purpose of the rowversion data type is to serve as a version number in optimistic locking schemes. SQL Server automatically generates the value for a rowversion column whenever a row that contains a column of this type is inserted or updated. The rowversion
data type is an 8-byte binary data type, and other than guaranteeing
that the value is unique within the database and monotonically increasing, the value is not
meaningful; you cannot look at the individual bytes and make any sense
of them.
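The behavior described above can be observed with a short script. This is a minimal sketch: the table rowversion_demo and its column names are hypothetical and are not part of the examples that follow.

```sql
-- Hypothetical demo table; all names here are illustrative only.
CREATE TABLE rowversion_demo
(
    id         int         NOT NULL PRIMARY KEY,
    data_value varchar(10) NOT NULL,
    versioncol rowversion  -- maintained by SQL Server; never assigned explicitly
);
GO

INSERT rowversion_demo (id, data_value) VALUES (1, 'foo');

-- versioncol holds the value generated by the INSERT
SELECT id, data_value, versioncol FROM rowversion_demo;

UPDATE rowversion_demo SET data_value = 'bar' WHERE id = 1;

-- versioncol now holds a new, larger value generated by the UPDATE
SELECT id, data_value, versioncol FROM rowversion_demo;

-- @@DBTS returns the last rowversion value used in the current database
SELECT @@DBTS AS last_used_rowversion;
GO

DROP TABLE rowversion_demo;
GO
```

The actual binary values depend on the database's prior activity, so only the fact that the value changes between the two SELECT statements is significant.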
Note
In previous versions of SQL Server, the rowversion data type was also referred to as the timestamp data type. While this data type synonym still exists in SQL Server 2008, it has been deprecated and the rowversion data type name should be used instead to ensure future compatibility.
In an application that uses optimistic locking, the
client reads one or more records from the table, being sure to retrieve
the primary key and current value of the rowversion column for
each row, along with any other desired data columns. Because the query
is not run within a transaction, any locks acquired for the SELECT
are released after the data has been read. At some later time, when the
client wants to update a row, it must ensure that no other client has
changed the same row in the intervening time. The UPDATE statement must include a WHERE clause that compares the rowversion value retrieved by the original query with the current rowversion value for the record in the database. If the rowversion
values match (that is, if the value that was read is the same as the
value currently in the database), no changes to that row have occurred
since it was originally retrieved. Therefore, the change attempted by
the application can proceed. If the rowversion value in the client application does not
match the value in the database, that particular row has been changed
since the original retrieval of the record, so the state of
the row that the application is attempting to modify is not the same as
the row that currently exists in the database. In that case, the
update should not be allowed to take place, to avoid the lost
update problem.
To ensure that the client application does not
overwrite the changes made by another process, the client needs to
prepare the T-SQL UPDATE statement in a special way, using the rowversion column as a versioning marker. The following pseudocode represents the general structure of such an update:
UPDATE theTable
SET theChangedColumns = theirNewValues
WHERE primaryKeyColumns = theirOldValues
AND rowversion = itsOldValue
Because the WHERE clause includes the primary key, the UPDATE
can apply only to exactly one row or to no rows; it cannot apply to
more than one row because the primary key is unique. The second part of
the WHERE clause provides the optimistic “locking.” If another client has updated the row, the rowversion no longer has its old value (remember that the server changes the rowversion value automatically with each update), and the WHERE
clause does not match any rows. The client needs to check whether any
rows were updated. If the number of rows affected by the update
statement is zero, the row has been modified since it was originally
retrieved. The application can then choose to reread the data or do
whatever recovery it deems appropriate. This approach has one problem:
how does the application know whether it didn’t match the row because
the rowversion was changed, because the primary key had changed, or because the row had been deleted altogether?
In SQL Server 2000, there was an undocumented tsequal() function (which was documented in prior releases) that could be used in a WHERE clause to compare the rowversion value retrieved by the client application with the rowversion value in the database. If the rowversion
values matched, the update would proceed. If not, the update would
fail, with error message 532, to indicate that the row had been
modified. Unfortunately, this function is no longer provided in SQL
Server 2005 and later releases. Any attempt to use it now results in a
syntax error. As an alternative, you can programmatically check whether
the update modified any rows, and if not, you can check whether the row
still exists and return the appropriate message. Listing 1 provides an example of a stored procedure that implements this strategy.
Listing 1. An Example of a Procedure for Optimistic Locking
create proc optimistic_update
    @id int,                     -- provide the primary key for the record
    @data_field_1 varchar(10),   -- provide the data value to be updated
    @rowversion rowversion       -- pass in the rowversion value retrieved with
                                 -- the initial data retrieval
as
    -- Attempt to modify the record; the rowversion comparison lets the
    -- update succeed only if the row is unchanged since it was read
    update data_table
       set data_field_1 = @data_field_1
     where id = @id
       and versioncol = @rowversion
    -- Check to see if no rows updated
    if @@ROWCOUNT = 0
    begin
        if exists (select * from data_table where id = @id)
        -- The row exists but the rowversions don't match
        begin
            -- Severity 16 (rather than 10) so that the client receives
            -- an error instead of an informational message
            raiserror ('The row with id "%d" has been updated since it was read',
                       16, 1, @id)
            return -101
        end
        else -- the row has been deleted
        begin
            raiserror ('The row with id "%d" has been deleted since it was read',
                       16, 2, @id)
            return -102
        end
    end
    else
        print 'Data Updated'
    return 0
Using
this approach, if the update doesn’t modify any rows, the application
receives an error message and a return code that indicate whether the update
didn’t take place because the rowversion value didn’t match or because the row was deleted. If the row is found and the rowversion values match, the update proceeds normally.
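A calling sequence for the procedure might look like the following sketch. It assumes that data_table has an id column and a rowversion column named versioncol, as Listing 1 does; the variable names and the messages printed are illustrative only.

```sql
DECLARE @rv binary(8),   -- holds the rowversion value read with the data
        @rc int;         -- holds the procedure's return code

-- Step 1: read the row outside any transaction, so no locks remain held
SELECT @rv = versioncol
FROM data_table
WHERE id = 1;

-- ... user think time passes here; other sessions are free to change the row ...

-- Step 2: attempt the update, passing back the rowversion read in step 1
EXEC @rc = optimistic_update
     @id = 1,
     @data_field_1 = 'bar',
     @rowversion = @rv;

-- Step 3: react to the return codes defined in Listing 1
IF @rc = 0
    PRINT 'Update applied.';
ELSE IF @rc = -101
    PRINT 'Conflict: reread the row and let the user reapply the change.';
ELSE IF @rc = -102
    PRINT 'The row no longer exists.';
```

In a real application, steps 1 and 2 would normally be separate round trips from the client, with the rowversion value kept in client memory between them.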
Optimistic Locking with Snapshot Isolation
SQL Server 2008’s Snapshot Isolation mode provides
another mechanism for implementing optimistic locking through its
automatic row versioning. If a process reads data within a transaction
running at the Snapshot Isolation level, no shared locks are acquired or held
on the data row. Instead, the process reads the version of
the row that was committed at the start of the transaction. Because no locks are held, the read
doesn’t lead to blocking, and another process can modify the data after
it has been read. If another process does modify a data row read by the
first process, a new version of the row is generated. If the original
process then attempts to update that data row, SQL Server automatically
prevents the lost update problem by checking the row version. In this
case, because the row version is different, SQL Server prevents the
original process from modifying the data row and raises
the following error message:
Msg 3960, Level 16, State 4, Line 2
Snapshot isolation transaction aborted due to update conflict. You cannot use
snapshot isolation to access table 'dbo.data_table' directly or indirectly in
database 'bigpubs2008' to update, delete, or insert the row that has been modified
or deleted by another transaction. Retry the transaction or change the isolation
level for the update/delete statement.
To see how this works, you can create the following table:
use bigpubs2008
go
--The first statement is used to disable any previously created
--DDL triggers in the database which would prevent creating a new table.
DISABLE TRIGGER ALL ON DATABASE
go
create table data_table
    (id int identity,
     data_field_1 varchar(10),
     timestamp timestamp)
go
insert data_table (data_field_1) values ('foo')
go
Next, you need to ensure that bigpubs2008 is configured to allow snapshot isolation:
ALTER DATABASE bigpubs2008 SET ALLOW_SNAPSHOT_ISOLATION ON
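Before continuing, it can be worth confirming that the setting took effect. A minimal check, assuming the database name used in this example, queries the sys.databases catalog view:

```sql
-- snapshot_isolation_state_desc reports ON once ALLOW_SNAPSHOT_ISOLATION
-- is fully enabled; it can briefly report IN_TRANSITION_TO_ON while
-- existing transactions complete.
SELECT name,
       snapshot_isolation_state_desc,
       is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'bigpubs2008';
```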
In one user session, you execute the following SQL statements:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
go
begin tran
select * from data_table
go
id          data_field_1 timestamp
----------- ------------ ------------------
1           foo          0x0000000000000BC4
Now, in another user session, you execute the following UPDATE statement:
update data_table set data_field_1 = 'bar'
where id = 1
Then you go back to the original session and attempt the following update:
update data_table set data_field_1 = 'fubar'
where id = 1
go
Msg 3960, Level 16, State 4, Line 2
Snapshot isolation transaction aborted due to update conflict. You cannot use
snapshot isolation to access table 'dbo.data_table' directly or indirectly in
database 'bigpubs2008' to update, delete, or insert the row that has been modified
or deleted by another transaction. Retry the transaction or change the isolation
level for the update/delete statement.
Note that for the first process to hold on to the row version, the SELECT and UPDATE
statements must be run in the same transaction. When the transaction is
committed or rolled back, the row version acquired by the SELECT statement is released. However, because a SELECT statement run at the Snapshot Isolation level does not acquire or hold any locks, keeping the transaction open avoids the blocking problems that would normally be encountered by using HOLDLOCK
or the Serializable isolation level. Because no locks were held on
the data row, the other process was allowed to update the row after it
was retrieved, generating a new version of the row. The automatic row
versioning provided by SQL Server’s Snapshot Isolation mode prevented
the first process from overwriting the update performed by the second
process, thereby preventing a lost update.
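An application using this approach needs to be prepared for the update conflict. One way to handle it is to wrap the transaction in TRY...CATCH and test for error 3960, as in the following sketch. The messages and the decision about what to do after a conflict are illustrative assumptions; producing the conflict itself still requires a second session that updates the row between the SELECT and the UPDATE.

```sql
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRY
    BEGIN TRAN;

    -- Read the row; no shared locks are acquired or held
    SELECT id, data_field_1 FROM data_table WHERE id = 1;

    -- ... application work happens here ...

    -- Fails with error 3960 if another transaction has changed the row
    UPDATE data_table SET data_field_1 = 'fubar' WHERE id = 1;

    COMMIT TRAN;
END TRY
BEGIN CATCH
    -- Roll back if a transaction is still open in any state
    IF XACT_STATE() <> 0
        ROLLBACK TRAN;

    IF ERROR_NUMBER() = 3960
        -- Update conflict: the application can reread the data and retry
        PRINT 'Update conflict detected; reread the row and retry.';
    ELSE
    BEGIN
        -- Pass any other error back to the caller
        DECLARE @msg nvarchar(2048);
        SET @msg = ERROR_MESSAGE();
        RAISERROR ('%s', 16, 1, @msg);
    END
END CATCH;
```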
Caution
Locking contention is prevented in the preceding example only because the transaction performed only a SELECT before attempting the UPDATE. A SELECT
run at the Snapshot Isolation level reads the committed version of
the row that belongs to its snapshot and does not acquire or hold locks on the actual data row.
However, once the process performs any modification on a
data row, the update or exclusive locks acquired are held until the
end of the transaction, which could lead to locking contention,
especially if user interaction is allowed within the transaction after
the update or exclusive locks are acquired.
Because of the overhead
incurred by snapshot isolation and the cost of having to roll back
update conflicts, you should consider using Snapshot Isolation mode only
to provide optimistic locking for systems where there is little
concurrent updating of the same resource so that it is unlikely that
your transactions have to be rolled back because of an update conflict.
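To judge whether a system fits this profile, the row versioning activity can be monitored. The following sketch reads a few counters from the sys.dm_os_performance_counters view. The counter names shown are assumed to match those exposed by the Transactions performance object on your instance and should be verified there.

```sql
-- The LIKE on object_name allows for both default and named instances.
-- The LIKE on the conflict counter also returns its base counter; a ratio
-- counter's raw value has to be divided by its base value to be meaningful.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%:Transactions%'
  AND (counter_name LIKE 'Update conflict ratio%'
       OR counter_name IN ('Version Store Size (KB)',
                           'Snapshot Transactions'));
```

A growing version store or a conflict ratio that is regularly above zero suggests that concurrent updates to the same rows are common enough that the rowversion approach may be the better fit.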