5. Deadlock diagnosis
Troubleshooting
deadlocks isn't the nicest (or easiest) task for a DBA; however, SQL
Profiler makes it much simpler. In demonstrating this, consider listing 1,
which contains a simple T-SQL batch file. When a second instance of
this batch file is executed within 10 seconds of the first (from a
separate connection/query window), a deadlock will occur, with the
second query chosen as the deadlock victim and killed by SQL Server.
Example 1. T-SQL code to generate a deadlock
-- T-SQL to simulate a deadlock. Run this from 2 separate query windows -- The 2nd will deadlock if executed within 10 seconds of the 1st
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE -- hold share lock till end
BEGIN TRANSACTION SELECT * FROM HumanResources.Department
WAITFOR DELAY '00:00:10.000' -- delay for 2nd transaction to start
UPDATE HumanResources.Department SET Name = 'Production' WHERE DepartmentID=7
COMMIT TRANSACTION
|
What's
happening here is that the shared lock on the HumanResources.Department
table is being held until the end of the transaction, courtesy of the
Serializable isolation level chosen at the start of the batch. When the
update command runs later in the transaction, the locks need
to be upgraded to an exclusive lock. However, in the meantime, another
instance of this batch has run, which also has a shared lock on the
HumanResources.Department table and also needs to upgrade its locks.
Both transactions will wait on each other forever. SQL Server detects
this situation as a deadlock and chooses to kill one of the
transactions to enable the other to complete.
When
creating a Profiler trace, one of the events we can capture is the
deadlock graph, found under the Locks category when the Show All Events
check box is checked. We saw this check box earlier in figure 2.
When we select the deadlock event, an additional tab becomes available
in the Trace Properties window, which enables the deadlock graph(s) to
be saved to a file.
With
this event selected and the trace running, executing the above code
from two separate connections will generate a deadlock graph event and
a deadlock results file. Opening the file in SQL Server Management
Studio (by double-clicking it) will reveal the graph shown in figure 8.
You'll
notice in the above image the blue X through the left-hand query. This
was the chosen deadlock victim. Hovering the mouse over the query will
reveal the query text. The item in the middle of the graph represents
the resource that the two queries deadlocked over.
The
ability to easily capture deadlock information using Profiler makes the
process of identifying the application or database code causing the
deadlocks significantly easier.
Not only can Profiler detect deadlocks, it can also be used to detect blocked processes.
6. Blocked process report
When
a process attempts to lock a resource that's currently locked by
another process, it waits for the lock to be released, and it's known
for the duration of the wait as a blocked process.
Unlike deadlocks, blocked processes are perfectly normal; however,
depending on the frequency and duration of the waits, blocked processes
can significantly contribute to poor application performance. Consider
the code in listing 2.
Example 2. T-SQL code to generate a block
-- T-SQL to simulate lengthly locking period BEGIN TRANSACTION SELECT * FROM HumanResources.Department(xlock) WAITFOR DELAY '00:00:50.000' — hold lock for 50 seconds COMMIT TRANSACTION
|
What
we're doing here is simulating a long-running transaction that holds
locks for the duration of the transaction, in this case an exclusive
lock on the department table for 50 seconds. Similar to our deadlock
example earlier, we'll run this batch from two separate connections.
With the first execution holding an exclusive lock on the department
table, the second execution will be blocked, waiting for the first to
complete.
In helping to diagnose blocked
processes such as the one that we're simulating here, we can set the
Blocked Process Threshold server configuration setting. As shown listing 3,
this setting takes a parameter representing the number of seconds a
process can be blocked before an alert is raised. Let's configure this
value to 20 seconds.
Example 3. Setting the Blocked Process Threshold
-- Set the Blocked Process Threshold to 20 seconds EXEC sp_configure 'show advanced options', 1 GO RECONFIGURE GO
EXEC sp_configure 'blocked process threshold', 20 GO RECONFIGURE GO
|
With
this setting in place, we can create a SQL Server Profiler trace that
profiles the Blocked Process Report event (found in the Errors and
Warnings category). After running the T-SQL batch from two separate
connections, our Profiler trace will capture the Blocked Process Report
event, as shown in figure 9.
The XML contained within the report provides details on both the
blocking and blocked processes, including the T-SQL, client
application, and login name.
A
common characteristic of poorly designed database applications is
frequent and lengthy process blocking, largely due to inappropriate
locking levels and transaction length. In combination with the Blocked
Process Threshold setting, SQL Server Profiler's Blocked Process Report
event offers an effective and easily implemented method of detecting
and analyzing such blocking events.
In
closing our coverage of SQL Server Profiler, let's look at how we can
enhance the collected events with Performance Monitor data.
7. Correlating traces with performance logs
Correlating
different perspectives of the same event almost always leads to a
deeper understanding of any situation. In SQL Server monitoring, one of
the opportunities we have for doing that is importing Performance
Monitor data into a Profiler trace, as shown in figure 10.
After
opening a saved Profiler trace file (or table), you can choose the File
> Import Performance Data menu option to select a Performance
Monitor log file to import. You can then select the appropriate
counters from the log for inclusion.
Once
the log file is imported, clicking a particular traced command in the
top half of the window will move the red bar to the appropriate spot in
the Performance Monitor log (and vice versa). In the example shown in figure 10,
you can see a spike in CPU usage just over halfway through the
displayed time frame. Clicking on the start of the spike will select
the appropriate command in the trace results, with the full command
shown at the bottom of the screen. In this example, you can see the
cause of the CPU spike: a cross join between two tables with a random
ordering clause.
In addition to
enhancing our Profiler traces, Performance Monitor data is invaluable
in gaining a deeper understanding of SQL Server activity, as well as
enabling baseline analysis, both of which we'll cover next.