This section describes how SQL Server executes the
query plan. First, however, it is useful to step back a little and look
at the larger picture — namely, how the SQL Server architecture changed
with SQL Server 2005 and the introduction of SQLOS.
1. SQLOS
SQL Server 2005 underwent a major
change in the underlying architecture with the introduction of SQLOS.
This component provides basic services to the other SQL Server
components, such as the Relational Engine and the Storage Engine. This
architecture is illustrated in the diagram shown in Figure 1.
The two SQLOS services that matter most here are scheduling, which is
the main focus of this section, and memory management, which is
relevant because it hosts the procedure cache, where your query plans
are stored. SQLOS also provides many other services that are not
relevant to the current discussion.
SQLOS implements a hierarchy of system objects that provide the framework for scheduling. Figure 2
shows the basic hierarchy of these objects — from the parent node,
SQLOS, down to the workers, tasks, and OS threads where the work is
actually performed.
The starting point for scheduling and memory allocation is the memory node.
Memory Nodes
The SQLOS memory node is a logical
container for memory associated with a node, which is a collection of
CPUs with shared memory. This can be either a “real” memory node, if
the server has a NUMA architecture, or an artificial grouping that you
created as a “soft” NUMA configuration.
Along with the memory nodes created to model the
physical hardware of the server, there is always one additional memory
node used by the dedicated administrator connection (DAC). This ensures
that some resources are always available to service the DAC, even when
all other system resources are being used.
On an eight-processor SMP system without soft
NUMA, there is one memory node for general server use, and one for the
DAC. This is illustrated in Figure 3.
On an eight-processor NUMA system with two nodes
of four cores, there would be two memory nodes for general use, and a
third for the DAC. This is illustrated in Figure 4.
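The two layouts just described can be modeled in a few lines of Python. This is only an illustrative sketch, not how SQLOS builds its nodes: the `MemoryNode` class and `build_memory_nodes` function are hypothetical names, the even split of CPUs across nodes is an assumption, and the DAC node ID of 64 is taken from the sample DMV output shown later in this section.

```python
from dataclasses import dataclass, field

DAC_NODE_ID = 64  # assumption: ID seen in this section's sample DMV output


@dataclass
class MemoryNode:
    node_id: int
    cpus: list = field(default_factory=list)  # logical CPU indexes in this node
    is_dac: bool = False


def build_memory_nodes(total_cpus, numa_nodes=1):
    """Split CPUs evenly across general-use nodes, then append the DAC node."""
    if numa_nodes < 1 or total_cpus % numa_nodes != 0:
        raise ValueError("this sketch assumes CPUs divide evenly across nodes")
    per_node = total_cpus // numa_nodes
    nodes = [
        MemoryNode(n, list(range(n * per_node, (n + 1) * per_node)))
        for n in range(numa_nodes)
    ]
    # There is always one extra node reserved for the DAC.
    nodes.append(MemoryNode(DAC_NODE_ID, [], is_dac=True))
    return nodes


smp = build_memory_nodes(8)      # eight-processor SMP, no soft NUMA
numa = build_memory_nodes(8, 2)  # eight-processor NUMA, two nodes of four
print(len(smp), len(numa))       # prints "2 3"
```

In both cases the count of memory nodes is the number of general-use nodes plus one for the DAC.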
You can view the layout of memory nodes on your server by querying the DMV sys.dm_os_memory_nodes. It is more useful, however, to join it to sys.dm_os_nodes so that the node_state_desc column is included, as the following query does. Note that the join is between node_id in sys.dm_os_nodes and memory_node_id in sys.dm_os_memory_nodes. Because sys.dm_os_nodes has its own memory_node_id column, that column name appears twice in the output:
-- One row per SQLOS node: state, CPU affinity, and reserved address space
select c.node_id, c.memory_node_id, m.memory_node_id, c.node_state_desc
     , c.cpu_affinity_mask, m.virtual_address_space_reserved_kb
from sys.dm_os_nodes as c
inner join sys.dm_os_memory_nodes as m
    on c.node_id = m.memory_node_id;
Here is the output from the preceding query when run on a 16-way SMP server:
NODE_ID  MEMORY_NODE_ID  MEMORY_NODE_ID  NODE_STATE_DESC  CPU_AFFINITY_MASK  VIRTUAL_ADDRESS_SPACE_RESERVED_KB
0        0               0               ONLINE           65535              67544440
64       0               64              ONLINE DAC       0                  2560
In this case, Node 0 has reserved nearly all of the 64GB of memory on
this server, whereas Node 64, which is set aside for the DAC, has just
2.5MB reserved.
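As a quick check of these figures, the following Python sketch converts the reserved values from the output (reported in KB) and counts the bits set in the affinity mask. It assumes binary units (1MB = 1,024KB) and treats the mask as a bitmap with one bit per CPU. The variable names are illustrative only.

```python
KB_PER_MB = 1024
KB_PER_GB = 1024 * 1024

# Values copied from the 16-way SMP output above.
node0_reserved_kb = 67544440
dac_reserved_kb = 2560
node0_affinity_mask = 65535

node0_gb = round(node0_reserved_kb / KB_PER_GB, 1)
dac_mb = dac_reserved_kb / KB_PER_MB
cpu_count = bin(node0_affinity_mask).count("1")  # number of bits set in the mask

print(node0_gb)   # 64.4
print(dac_mb)     # 2.5
print(cpu_count)  # 16
```

The 16 set bits in the mask line up with the 16 processors on this server.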
Following is the output from this query on a
192-processor NUMA system. The server is structured as eight NUMA
nodes. Each NUMA node has four sockets, and each socket has six cores
(using Intel Xeon hexa-core processors), resulting in 24 cores per NUMA
node:
NODE_ID  MEMORY_NODE_ID  MEMORY_NODE_ID  NODE_STATE_DESC  CPU_AFFINITY_MASK  VIRTUAL_ADDRESS_SPACE_RESERVED_KB
0        0               0               ONLINE           16777215           268416
1        1               1               ONLINE           16777215           248827056
2        2               2               ONLINE           16777215           22464
3        3               3               ONLINE           16777215           8256
4        4               4               ONLINE           281474959933440    11136
5        5               5               ONLINE           281474959933440    4672
6        6               6               ONLINE           281474959933440    4672
7        7               7               ONLINE           281474959933440    5120
64       0               64              ONLINE DAC       0                  2864
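The decimal affinity masks in this output are easier to read once decoded into bit positions. The following Python sketch does only that, treating each mask as a bitmap with one bit per CPU. The `describe_mask` helper is a hypothetical name, and the sketch does not try to explain why several nodes report the same mask value.

```python
def describe_mask(mask):
    """Return (lowest set bit, highest set bit, number of set bits), or None."""
    if mask == 0:
        return None
    lowest = (mask & -mask).bit_length() - 1  # isolate the lowest set bit
    highest = mask.bit_length() - 1
    return lowest, highest, bin(mask).count("1")


# Mask values copied from the 192-processor output above.
masks = {
    "nodes 0-3": 16777215,
    "nodes 4-7": 281474959933440,
    "DAC node 64": 0,
}

for label, mask in masks.items():
    print(label, describe_mask(mask))
# nodes 0-3 (0, 23, 24)
# nodes 4-7 (24, 47, 24)
# DAC node 64 None
```

Each general-use node reports 24 set bits, which matches the 24 cores per NUMA node described earlier, and eight such nodes give the 192 processors in total.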