Remotely Running PerfMon
Like many server management tools, an
instance of PerfMon can be connected to a remote server for remote
monitoring. This avoids the need to connect via Remote Desktop and may
reduce the overhead of monitoring on the target server.
To run PerfMon against a remote server, when
adding counters, specify the target server name, replacing <Local
computer> in the “Select counters from computer” drop-down box (see Figure 7).
In order to use PerfMon remotely, you’ll need to be a Local
Administrator on the target server, and the remote registry service
should be running.
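When a counter is added from a remote computer, it is identified by a counter path that begins with the server name. The following minimal sketch builds such a path. The helper name `counter_path` and the server name `SQLPROD01` are hypothetical, chosen only for illustration; the path layout follows the standard Windows `\\Computer\Object(Instance)\Counter` form.

```python
def counter_path(obj, counter, instance=None, computer=None):
    """Build a Windows performance counter path.

    computer=None produces a local path; otherwise the path is
    prefixed with \\\\<computer>. All names here are illustrative.
    """
    path = ""
    if computer:
        path += "\\\\" + computer      # two leading backslashes, then the server name
    path += "\\" + obj
    if instance is not None:
        path += "(" + instance + ")"   # instance goes in parentheses after the object
    path += "\\" + counter
    return path


# SQLPROD01 is a hypothetical server name.
local_path = counter_path("Processor", "% Processor Time", "_Total")
remote_path = counter_path("Processor", "% Processor Time", "_Total", "SQLPROD01")
print(local_path)
print(remote_path)
```

The only difference between the two paths is the server prefix, which corresponds to replacing <Local computer> in the drop-down box.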
Factors to Consider When Running PerfMon
Monitoring servers adds overhead, but it
can be necessary. All data capture tools impose some cost on the target
server. Our challenge is to resolve an incident (often performance
related) while minimizing the overhead. When monitoring, you should
consider performance implications with a view to reducing overhead and
minimizing two main risks:
- Making problems worse
- Affecting data capture
PerfMon counters are themselves updated
by the application, even when not consumed by PerfMon. Therefore, any
performance overhead with PerfMon is usually encountered only when
polling (or sampling) these counters and when writing the samples to
disk if a collector has been set up.
The overhead of using PerfMon to monitor normal
servers with regular workload is typically minimal. Performance becomes
a discussion point when monitoring servers operating in time-sensitive
environments (e.g., trading or reservation platforms) or with servers
suffering acute performance problems — those in which the monitoring
overhead could tip the server over the edge.
Because reading PerfMon counters is the only real
overhead of concern, you should consider network time and disk activity
during monitoring. If you can perceive performance degradation when
running PerfMon, you can quickly and easily stop logging and measure
any performance improvement.
NOTE One
of the challenges with many performance problems is that you must
obtain a PerfMon log to identify the cause of the problem. Without a
log, engineers and managers can observe poor application performance
and hypothesize about potential causes and remedies, but performance
data is needed in order to diagnose the problem and take remedial action.
Frequently,
you just have to accept the risk and overhead of running PerfMon
because there simply is no better way to obtain performance data that
will help solve a problem.
The Impact of Running PerfMon
PerfMon is a lightweight tool and its
impact on any given server is partly related to how PerfMon is
configured, but it is also dependent on the workload of that server
while PerfMon is running. To illustrate this scenario, consider two
servers: Server A is suffering under heavy workload with 99% CPU
utilization and poor disk performance, while server B currently runs
with 20% CPU and good disk response times. In this case, it’s likely
that the impact to server A is greater because PerfMon could consume 1%
or 2% of the available CPU capacity, whereas that same amount added by
PerfMon to server B will have a negligible impact.
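The comparison between the two servers can be made concrete with a little arithmetic. The sketch below expresses the monitoring cost as a share of each server's remaining CPU headroom. The function name `headroom_consumed` is hypothetical, and the 1% monitoring cost is the illustrative figure from the text, not a measured value.

```python
def headroom_consumed(cpu_busy_pct, monitor_cost_pct):
    """Return the monitoring cost as a percentage of the CPU headroom left.

    cpu_busy_pct:     current CPU utilization (0-100)
    monitor_cost_pct: CPU assumed to be consumed by monitoring (illustrative)
    """
    headroom = 100.0 - cpu_busy_pct
    if headroom <= 0:
        return float("inf")            # no headroom left at all
    return monitor_cost_pct * 100.0 / headroom


server_a = headroom_consumed(99.0, 1.0)   # server A: 99% busy
server_b = headroom_consumed(20.0, 1.0)   # server B: 20% busy
print(f"Server A: {server_a:.2f}% of remaining headroom")
print(f"Server B: {server_b:.2f}% of remaining headroom")
```

Under these assumptions, the same 1% cost uses all of server A's remaining headroom but only 1.25% of server B's.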
Many organizations attempt to reduce the risk and
impact to systems by monitoring during periods of low activity (e.g.,
during lunch or late afternoon), when user volumes and activity are
typically lower, but this is usually the worst idea! It is essential to
capture data while the problem is happening (typically when concurrency
is at its peak), not on either side of the problem. Additionally, the
worse the problem, the easier it is to spot. Problems are often
accentuated by user activity, so monitoring at peak times, when
problems are both more likely to occur and more severe, gives you the
best chance of capturing a log that contains them.
There are three key factors to consider when
determining the impact of PerfMon: sample interval, number of counters,
and disk performance. The following sections take a brief look at each.
Sample Interval
The sample interval controls the
frequency with which PerfMon polls counters to read their values. The
more often PerfMon samples, the greater the impact to the server and
the more log data generated. The default is 15 seconds, which is
usually fine when tracing for a few hours only; when tracing over
longer periods, increasing the sample interval (that is, sampling less
often) reduces both the overhead of PerfMon and the size of the file
generated.
Consider a situation in which you have a busy
system with a high workload but very short transactions — sampling
every 60 seconds could miss many of these very short transactions. The
sample interval can affect the shape of the data, so always be aware of
it and the overall monitoring window when reviewing performance logs,
especially when looking at min, max, and average values. Take into
account system activity and usage patterns to ensure that the log is
representative of typical workload.
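The effect of the sample interval on the shape of the data can be illustrated with a small simulation. The sketch below builds a synthetic per-second CPU series with a few five-second spikes and then summarizes it at different sample intervals. All values (baseline, spike height, spike times) are invented for illustration and do not come from a real PerfMon log.

```python
def build_series(duration_s, baseline, spike_value, spike_starts, spike_len_s):
    """Return one value per second: a flat baseline with short spikes."""
    series = [baseline] * duration_s
    for start in spike_starts:
        for t in range(start, min(start + spike_len_s, duration_s)):
            series[t] = spike_value
    return series


def sample(series, interval_s):
    """Keep only the values a poller would see at the given interval."""
    return series[::interval_s]


def summarize(values):
    """Return (min, max, average) for a list of samples."""
    return min(values), max(values), sum(values) / len(values)


# Ten minutes at 20% CPU with four hypothetical 5-second spikes to 95%.
per_second = build_series(600, 20.0, 95.0, [30, 100, 255, 400], 5)

for interval in (1, 15, 60):
    lo, hi, avg = summarize(sample(per_second, interval))
    print(f"{interval:>2}s interval: min={lo:.1f} max={hi:.1f} avg={avg:.2f}")
```

With this synthetic data, the 60-second samples happen to fall between every spike, so the reported maximum stays at the baseline, while 15-second sampling catches two of the four spikes.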
Number of Counters
The number of counters is a consideration with similar impact to
sample interval: more counters result in a higher cost to sample and
store those counter values. Most instance counters have a _TOTAL
counter, which is a total of the individual counter instances combined.
In some cases, such as for disk counters, this total is of limited use,
as usually the details about each disk (instance) counter are required
to identify disk performance problems. The total can hide problems,
because an average might look healthy; but a very busy disk could be
masked by several other disks with little activity.
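A short example shows how a combined figure can mask one busy disk. The per-disk read latencies and the 20 ms threshold below are hypothetical, and the simple mean is only a stand-in for a combined value; it is not a statement of how PerfMon computes any particular total.

```python
def combined_average(per_instance):
    """Simple mean across all instances (illustrative stand-in for a total)."""
    return sum(per_instance.values()) / len(per_instance)


def instances_over(per_instance, threshold):
    """Return the names of instances whose value exceeds the threshold."""
    return sorted(name for name, value in per_instance.items() if value > threshold)


# Hypothetical average read latency per disk, in milliseconds.
read_latency_ms = {"C:": 4.0, "D:": 5.0, "E:": 48.0, "F:": 3.0}
threshold_ms = 20.0   # assumed threshold, chosen only for this example

average = combined_average(read_latency_ms)
busy = instances_over(read_latency_ms, threshold_ms)
print(f"Combined average: {average:.1f} ms")
print(f"Disks over {threshold_ms:.0f} ms: {busy}")
```

Here the combined average is 15 ms, which is under the assumed threshold, yet disk E: is well over it. Only the per-instance values reveal the problem.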
Disk Performance
When capturing performance data using
Data Collector Sets, consider where the log files will be stored. The
objective is to minimize the impact to SQL Server; log performance data
to a file on disk (not a database); and, where available, use a disk
that will not contend with any databases — i.e., avoid any disks where
data or log files are stored.
PerfMon logs grow in a linear and predictable
pattern (unlike SQL Profiler trace files, which are workload
dependent); for example, sampling 100 counters every 15 seconds for 5
minutes might create a 2MB PerfMon log file, so it would be reasonable
to estimate that logging 100 counters for six hours would generate a
144MB log file. Generally, I try to avoid capturing data to a system
drive, as the implications of filling that drive are much greater than
when logging to a nonsystem drive.
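Because log growth is linear, a short reference capture can be scaled up to estimate the size of a longer one. The sketch below does that scaling. The function name and parameters are hypothetical, and the assumption that size also scales linearly with the number of counters and inversely with the sample interval is a simplification (it ignores any fixed file overhead).

```python
def estimate_log_size_mb(ref_size_mb, ref_counters, ref_interval_s, ref_duration_s,
                         counters, interval_s, duration_s):
    """Scale a reference capture linearly to a planned capture.

    Assumes size is proportional to (counters x number of samples).
    """
    ref_samples = ref_duration_s / ref_interval_s
    mb_per_counter_sample = ref_size_mb / (ref_counters * ref_samples)
    planned_samples = duration_s / interval_s
    return mb_per_counter_sample * counters * planned_samples


# Reference capture from the text: 100 counters, 15-second interval,
# 5 minutes, roughly 2MB.
six_hours = estimate_log_size_mb(2.0, 100, 15, 5 * 60,
                                 counters=100, interval_s=15, duration_s=6 * 3600)
print(f"Estimated size for six hours: {six_hours:.0f}MB")

# Same capture with a 60-second interval instead of 15 seconds.
six_hours_60s = estimate_log_size_mb(2.0, 100, 15, 5 * 60,
                                     counters=100, interval_s=60, duration_s=6 * 3600)
print(f"Estimated size at 60-second interval: {six_hours_60s:.0f}MB")
```

An estimate like this can be compared with the free space on the target drive before starting a long capture.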
Servers Suffering Very Poor Performance
When capturing PerfMon logs on servers
with acute performance problems, run PerfMon as cautiously as possible
to reduce the impact while still harvesting performance data. Here are
some guidelines:
- Run PerfMon remotely.
- Reduce the sampling frequency (that is, increase the sample interval).
- Include as few counters as possible.
- Log to disk.
Common PerfMon Problems
You may sometimes encounter problems
with PerfMon itself — specifically, counters could be missing, they
might not be displayed correctly, or there could be problems connecting
to servers remotely. This section contains a brief summary of some
common issues and how to resolve them.
Using PerfMon on 64-bit Systems Using WOW
When running x64 Windows with x86 SQL
Server, you’re using Windows on Windows (WOW), which means x64 Windows
is emulating an x86 environment to host x86 SQL Server. If you’re using
x64 Windows and x64 SQL Server, this section isn’t relevant to you.
When PerfMon runs on an x64 host, none of the
counters pertaining to x86 applications are available because the x64
PerfMon cannot load x86 counters. You can overcome this by launching
the x86 version of the Microsoft Management Console (MMC) with the
PerfMon snap-in. Run the following to launch the PerfMon x86 from an
x64 Windows computer:
mmc /32 perfmon.msc
NOTE If
you’re running SQL Server in a Windows on Windows (WOW) mode (i.e.,
x86 SQL Server on x64 Windows), you’ll be unable to run PerfMon
remotely from other x64 machines. The remote Registry service on the
target is an x64 process, and the x86 counters are visible only to x86
processes.
Remote Monitoring Fails
If you’re trying to monitor remote
servers without any success, the most likely cause is permissions
problems. Try the following troubleshooting tips:
- Ensure that the account is local administrator on the target server.
- Confirm NetBIOS access to the target server.
- Ensure that the remote Registry service is running on the target server.
- Ensure that no local security policy or Active Directory group policy is restricting access.
SQL Server Counters Are Missing
When you open PerfMon, you might find
that there are no SQL Server counters available in the counter list.
This problem occurs more often on clustered instances. If counters are
missing, check the SQL Server Error Log and the Windows Event
Application log to determine whether any errors are logged regarding
the failed counters. If there are no errors in either log, you can
unload the counters as follows:
unlodctr mssqlserver
Once the counters have been unloaded, verify the path to sqlctr.ini and use the following command to reload the counters:
lodctr "C:\Program Files\Microsoft SQL Server\MSSQL10.1\MSSQL\Binn\sqlctr.ini"
As with any change, test the process on
a nonproduction server to gain confidence in the process (even if there
is no problem on the test server, you can still test the commands).
After reloading the counters, if they still aren’t listed, use the
following process to rebuild them.
Counters Are Missing or Numbers Appear Instead of Names
If the list contains numbers instead of counter
names when you attempt to add performance counters, the counters
could have been corrupted by a process incorrectly
modifying the Registry.