After looking at memory usage at the
operating-system level, you are going to want to take a look at what is
happening with SQL Server’s internal memory usage, which you can do
using the query shown in Listing 26.
LISTING 26: SQL Server memory information
-- SQL Server Process Address space info
--(shows whether locked pages is enabled, among other things)
SELECT physical_memory_in_use_kb, locked_page_allocations_kb,
page_fault_count, memory_utilization_percentage,
available_commit_limit_kb, process_physical_memory_low,
process_virtual_memory_low
FROM sys.dm_os_process_memory WITH (NOLOCK) OPTION (RECOMPILE);
-- You want to see 0 for process_physical_memory_low
-- You want to see 0 for process_virtual_memory_low
-- This indicates that you are not under internal memory pressure
This query tells you how much memory is actually
being used by the SQL Server Database Engine. This information is more
reliable than what is displayed in Windows Task Manager. It also tells
you whether this SQL Server instance is using locked pages in memory.
Finally, it indicates whether the SQL Server process is signaling that
it is low on physical or virtual memory.
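If you simply want a yes/no answer on locked pages, the locked_page_allocations_kb column from the same DMV can be turned into one. This is just a convenience sketch built on the query above, not one of the numbered listings:
-- Convenience sketch: a non-zero locked_page_allocations_kb value
-- means locked pages in memory is in effect for this instance
SELECT CASE WHEN locked_page_allocations_kb > 0
            THEN N'Locked pages in memory is being used'
            ELSE N'Locked pages in memory is NOT being used'
       END AS [Locked Pages Status]
FROM sys.dm_os_process_memory WITH (NOLOCK) OPTION (RECOMPILE);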
One classic way of measuring whether SQL Server
is under internal memory pressure is to look at its Page Life
Expectancy (PLE) (See Chapter 3, “Understanding Memory”), which you can
do using the query shown in Listing 27.
LISTING 27: Page Life Expectancy information
-- Page Life Expectancy (PLE) value for default instance
SELECT cntr_value AS [Page Life Expectancy]
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE [object_name] LIKE N'%Buffer Manager%' -- Handles named instances
AND counter_name = N'Page life expectancy' OPTION (RECOMPILE);
-- PLE is one way to measure memory pressure.
-- Higher PLE is better. Watch the trend, not the absolute value.
This query returns the current Page Life
Expectancy (PLE) value, in seconds, for the default instance of SQL
Server. PLE is a measurement of how long SQL Server expects a data page to stay in the SQL Server buffer pool before it is flushed or evicted. Higher PLE
values are better than lower PLE values. You should develop an
awareness of the normal range of PLE values for your more important SQL
Server instances. That will help you identify a current PLE that is
abnormally high or low.
Microsoft has a long-standing recommendation of
300 as a threshold for acceptable PLE, which is often debated in the
SQL Server community. One thing that everyone does agree on, though, is
that a PLE value of less than 300 is quite bad. Modern database servers
with high amounts of physical memory typically have much higher PLE
values than 300. Instead of focusing on the current PLE value, watch
the trend over time.
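On NUMA hardware, it can also be useful to watch PLE for each NUMA node rather than only the instance-wide value, because one node can come under memory pressure while the others look healthy. The Buffer Node counters in sys.dm_os_performance_counters expose this; here is a minimal sketch that follows the same pattern as Listing 27:
-- Sketch: Page Life Expectancy (PLE) value for each NUMA node
SELECT instance_name AS [NUMA Node], cntr_value AS [PLE (seconds)]
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE [object_name] LIKE N'%Buffer Node%' -- Handles named instances
AND counter_name = N'Page life expectancy' OPTION (RECOMPILE);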
After looking at Page Life Expectancy, you are going to want to look at Memory Grants Outstanding, using the query shown in Listing 28.
LISTING 28: Memory Grants Outstanding information
-- Memory Grants Outstanding value for default instance
SELECT cntr_value AS [Memory Grants Outstanding]
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE [object_name] LIKE N'%Memory Manager%' -- Handles named instances
AND counter_name = N'Memory Grants Outstanding' OPTION (RECOMPILE);
-- Memory Grants Outstanding above zero
-- for a sustained period is a secondary indicator of memory pressure
This query returns the current value for Memory
Grants Outstanding for the default instance of SQL Server. Memory
Grants Outstanding is the total number of processes within SQL Server
that have successfully acquired a workspace memory grant (refer to
Chapter 3). You want this value to be zero if at all possible. Any
sustained value above zero is a secondary indicator of memory pressure
due to queries that are using memory for sorting and hashing. After
looking at Memory Grants Outstanding, you should also look at Memory
Grants Pending (which is a much more important indicator of memory
pressure), by using the query shown in Listing 29.
LISTING 29: Memory Grants Pending information
-- Memory Grants Pending value for default instance
SELECT cntr_value AS [Memory Grants Pending]
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE [object_name] LIKE N'%Memory Manager%' -- Handles named instances
AND counter_name = N'Memory Grants Pending' OPTION (RECOMPILE);
-- Memory Grants Pending above zero
-- for a sustained period is an extremely strong indicator of memory pressure
This query returns the current value for Memory Grants Pending for the default instance of SQL Server. Memory Grants Pending is the total number of processes within SQL Server that are waiting for a workspace memory grant. You want this value to be zero if at all possible. Any sustained value above zero is an extremely strong indicator of memory pressure.
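When you do see sustained pending grants, the sys.dm_exec_query_memory_grants DMV shows exactly which requests are waiting (a NULL grant_time) and which are currently holding grants. This drill-down is a sketch, not one of the numbered listings:
-- Sketch: current memory grant requests; grant_time IS NULL means
-- the request is still waiting for its workspace memory grant
SELECT mg.session_id, mg.requested_memory_kb, mg.granted_memory_kb,
mg.grant_time, t.[text] AS [QueryText]
FROM sys.dm_exec_query_memory_grants AS mg WITH (NOLOCK)
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS t
ORDER BY mg.requested_memory_kb DESC OPTION (RECOMPILE);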
Especially if you see any signs of internal memory pressure from the preceding queries, take a closer look at the overall memory usage in SQL Server by running the query shown in Listing 30.
LISTING 30: Memory clerk information
-- Memory Clerk Usage for instance
-- Look for high value for CACHESTORE_SQLCP (Ad-hoc query plans)
SELECT TOP(10) [type] AS [Memory Clerk Type],
SUM(pages_kb) AS [SPA Mem, Kb]
FROM sys.dm_os_memory_clerks WITH (NOLOCK)
GROUP BY [type]
ORDER BY SUM(pages_kb) DESC OPTION (RECOMPILE);
-- CACHESTORE_SQLCP SQL Plans
-- These are cached SQL statements or batches that
-- aren’t in stored procedures, functions and triggers
--
-- CACHESTORE_OBJCP Object Plans
-- These are compiled plans for
-- stored procedures, functions and triggers
--
-- CACHESTORE_PHDR Algebrizer Trees
-- An algebrizer tree is the parsed SQL text
-- that resolves the table and column names
This query gives you a good idea of what (besides
the buffer cache) is using large amounts of memory in SQL Server. One
key item to look out for is high values for CACHESTORE_SQLCP,
which is the memory clerk for ad hoc query plans. It is quite common to
see this memory clerk using several gigabytes of memory to cache ad hoc
query plans.
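Before looking at individual statements, a quick aggregate can tell you how much plan cache the single-use, ad hoc plans are consuming in total. This sketch uses the same filters that Listing 31 will use:
-- Sketch: total count and size of single-use, ad-hoc plans in the plan cache
SELECT COUNT(*) AS [Single-Use Plan Count],
SUM(CAST(size_in_bytes AS bigint))/1024 AS [Total Size (KB)]
FROM sys.dm_exec_cached_plans WITH (NOLOCK)
WHERE cacheobjtype = N'Compiled Plan'
AND objtype = N'Adhoc'
AND usecounts = 1 OPTION (RECOMPILE);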
If you see a lot of memory being used by the CACHESTORE_SQLCP
memory clerk, you can determine whether you have many single-use, ad
hoc query plans using a lot of memory in your procedure cache by
running the query shown in Listing 31.
LISTING 31: Single-use ad-hoc queries
-- Find single-use, ad-hoc queries that are bloating the plan cache
SELECT TOP(20) [text] AS [QueryText], cp.size_in_bytes
FROM sys.dm_exec_cached_plans AS cp WITH (NOLOCK)
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE cp.cacheobjtype = N'Compiled Plan'
AND cp.objtype = N'Adhoc'
AND cp.usecounts = 1
ORDER BY cp.size_in_bytes DESC OPTION (RECOMPILE);
-- Gives you the text and size of single-use ad-hoc queries that
-- waste space in the plan cache
-- Enabling 'optimize for ad hoc workloads' for the instance
-- can help (SQL Server 2008 and above only)
-- Enabling forced parameterization for the database can help, but test first!
This query returns the query text and size of
your largest (in terms of memory usage) single-use, ad hoc query plans
wasting space in your plan cache. If you see a lot of single-use, ad
hoc query plans in your plan cache, you should consider enabling the
instance-level optimize for ad hoc workloads setting (also see Chapter
3). This setting enables SQL Server 2008 and later to store a much
smaller version of the ad hoc execution plan in the plan cache the
first time that plan is executed. This can reduce the amount of memory
that is wasted on single-use, ad hoc query plans that will most likely never be reused. Conversely, enabling this setting sometimes means that a larger number of these small plan stubs accumulate in the plan cache (because many more of them fit in the same amount of memory as a few full-size plans), so you may not see as much memory savings as you anticipated.
Even so, we don’t see any good reason
not to enable this setting on all SQL Server 2008 and later instances.
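Turning the setting on is a simple instance-level change with sp_configure, shown in the following sketch (it is an advanced option, so show advanced options must be enabled first; as with any configuration change, test it outside production first):
-- Sketch: enable the optimize for ad hoc workloads setting
-- (instance-wide; SQL Server 2008 and later)
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'optimize for ad hoc workloads', 1;
RECONFIGURE;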
When I talked to one of the developers at Microsoft who worked on this
feature a couple of years ago, the only downside to this setting that
she could see was a scenario in which you had several identical ad hoc
query plans that would be executed between two and ten times, in which
case you would take a small hit the second time the plan was executed.
That seems like an edge case to me.
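The comments in Listing 31 also mention forced parameterization as a database-level alternative. It is a much blunter instrument, because it changes how every query in the database is parameterized, so test it carefully before relying on it. A minimal sketch, using a placeholder database name:
-- Sketch: enable forced parameterization for one database
-- (YourDatabase is a placeholder; test carefully before production use)
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION FORCED;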