The ULS logs can help you
figure out what happened after an issue has already occurred, but the
Health Analyzer is a trustworthy (albeit sometimes overzealous)
sidekick that wants to keep you out of trouble in the first place.
Whether you realize it or not, you’ve probably already seen the Health
Analyzer at work if you’ve ever opened Central Admin and seen a yellow
or red bar warning that serious or critical issues have been detected
and require attention. Central Admin just doesn’t seem right without
those bars at the top. Figure 1
shows your old friend. However, Health Analyzer is capable of doing
much more than just reporting on issues; in some cases it can even fix
them for you!
If you are familiar with Health Analyzer in
SharePoint 2010, then you will notice that it remains basically
unchanged for 2013. The Health Analyzer’s role is to run a whole series
of predefined checks against various portions of the farm, looking for
potential problems before they become serious enough to affect it.
These checks can run the gamut from service account permissions checks
to available server disk space to database index fragmentation.
SharePoint 2013 increases the number of out-of-the-box rules to 68
(from SharePoint 2010’s 52), and spending a few moments looking through
the list of checks and how they are done is certainly a worthwhile use
of time, as it will familiarize you with what Health Analyzer does and
does not check, and how often those checks are run. To view the list of
configured Health Analyzer rules, go to the Monitoring section of
Central Admin, and under the Health Analyzer section, you will see
Review Rule Definitions.
The rules are divided into sections based on the
type of check that is done, and include a short description of the
rule, its run schedule, enabled/disabled status, and whether or not it
is permitted to attempt an automatic repair of the issue. Figure 2
shows some of the rule definitions under the Security heading. While
technically this is just another SharePoint list to which you can add
new items, simply typing up a description of a new rule and adding it to
this list doesn’t make a functional rule. This list is really just a
configuration front end for a series of timer jobs that actually
execute the check. Without the corresponding timer job, these list
items have no effect whatsoever. To get a feel for the issues the
Health Analyzer is watching for, you can page through the list of job
definitions.
To view the current list of triggered warnings,
from Central Admin select Monitoring ⇒ Health Analyzer ⇒ Review
Problems and Solutions. Alternatively, if any rules have triggered
warnings, just click the link on the red or yellow Health Analyzer bar
at the top of Central Admin. Here you
will see a list of the outstanding problems, along with the last time
that the check was run and found to still be failing. Clicking the item
description for a rule brings up more details about the failure, and in
many cases the Explanation field will indicate precisely what caused
the check to fail. Figure 3 shows the description for the rule that checks for an outbound SMTP server.
Some of the Health Analyzer rules have
unrealistically high standards, which do not necessarily apply to all
scenarios. For instance, one of the rules triggers a warning if a
SharePoint server’s free disk space drops below an amount equal to five
times the server’s total RAM. This is a well-intentioned rule, but it
doesn’t take into account high-powered SharePoint Web Front End
servers, which could easily be equipped with 32GB of RAM or more; at
32GB, the rule expects 160GB of free space on the drive. Thus,
it is necessary to take Health Analyzer’s warnings with a grain of
salt, and evaluate any warning against the usage scenario of the farm.
If you determine that the rule does not apply to your farm, you can
disable that rule from the Review Rule Definitions page. You should
look at the Health Analyzer the same way you look at the GPS in your
car. It’s just one of the things you take into consideration when
making decisions. Your GPS can be helpful, but it can also tell you to
turn right when there is no road to turn right onto. Just as you don’t
blindly follow your GPS off a cliff, you shouldn’t blindly follow the
Health Analyzer when it tells you to shrink your databases.
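To see why the disk space rule gets noisy on servers with a lot of RAM, the arithmetic can be sketched in PowerShell. This is only an illustration of the rule as described above (free space below five times total RAM); the function name Test-FreeSpaceBelowRamMultiple is hypothetical, and it is not how SharePoint itself implements the check.

```powershell
# Hypothetical helper that mirrors the rule as described in the text:
# warn when free disk space is less than 5 x total RAM.
# This is NOT SharePoint's own code, just the arithmetic.
function Test-FreeSpaceBelowRamMultiple {
    param(
        [double]$FreeSpaceGB,      # free space on the drive, in GB
        [double]$TotalRamGB,       # total physical RAM, in GB
        [double]$Multiplier = 5    # multiplier taken from the rule description
    )
    # Returns $true when the drive would trip the warning.
    return ($FreeSpaceGB -lt ($Multiplier * $TotalRamGB))
}

# A Web Front End with 32GB of RAM and 120GB free: threshold is 160GB.
Test-FreeSpaceBelowRamMultiple -FreeSpaceGB 120 -TotalRamGB 32   # True

# The same drive on a server with 8GB of RAM: threshold is 40GB.
Test-FreeSpaceBelowRamMultiple -FreeSpaceGB 120 -TotalRamGB 8    # False
```

The same 120GB of free space is perfectly healthy on the smaller server and a "problem" on the larger one, which is the kind of judgment call the rule cannot make for you.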
Some failures can be fixed automatically by
Health Analyzer; and in fact, some are configured to do so by default.
For instance, if it is detected that databases being used by SharePoint
have excessively fragmented indexes, then Health Analyzer will launch a
re-indexing stored procedure inside the database. Other alerts, such
as, “Outbound e-mail has not been configured,” can’t be automatically
fixed by Health Analyzer, as the “fix” requires information that it
does not have — namely, the mail server host name and the sender and
reply-to addresses. In addition, some alerts, such as, “Drive is out of
space,” shouldn’t be fixed automatically by SharePoint, unless it will
call your storage vendor and order more drives. Moreover, you don’t
want it to free up space by deleting all the Justin Bieber MP3s that
you have hidden on your SharePoint server.
Unlike the ULS, control over the Health Analyzer
with PowerShell is quite limited; there is a grand total of three
cmdlets related to Health Analyzer, and they don’t really do a whole
lot. You can use the cmdlet Get-SPHealthAnalysisRule to retrieve the
full list of Health Analyzer rules, or you can provide the optional
-Identity parameter to retrieve a single rule by its name or GUID.
However, the only information returned is the rule’s system name, GUID,
enabled/disabled status, category, and summary. Notably absent are the
options for Scope, Server, and Repair Automatically. In addition, while
it may appear that you can set the Enabled property of a Health
Analyzer Rule, you cannot. You can try setting it to either $true or
$false, but even if you call the Update() method, the new setting won’t
stick. To change the Enabled state of a Health Analyzer Rule, you have
to use one of the other two cmdlets: Enable-SPHealthAnalysisRule and
Disable-SPHealthAnalysisRule, which are self-explanatory.
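As a minimal sketch of Get-SPHealthAnalysisRule in use, the following lists the rules and then retrieves one by name. It assumes you are on a SharePoint 2013 server; the Add-PSSnapin line is only needed in a plain PowerShell console, not in the SharePoint Management Shell. The property names used for formatting (Name, Id, Enabled, Category, Summary) follow the fields described above, and the sorting and column choices are just one way to present them.

```powershell
# Load the SharePoint cmdlets if this is not the SharePoint Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# List every Health Analyzer rule, grouped loosely by category.
Get-SPHealthAnalysisRule |
    Sort-Object Category, Name |
    Format-Table Name, Enabled, Category -AutoSize

# Retrieve a single rule by its system name with the -Identity parameter.
Get-SPHealthAnalysisRule -Identity AppServerDrivesAreNearlyFullWarning |
    Format-List Name, Id, Enabled, Category, Summary
```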
Earlier you learned about the Health Analyzer
Rule that triggers an alert if the free space on any of your SharePoint
server drives drops below five times the amount of RAM in the system.
You also learned that in systems with a lot of RAM, that alert is not
practical. You can use Get-SPHealthAnalysisRule to view a list of all rules and their names. Then you can use Disable-SPHealthAnalysisRule to disable that rule and get that monkey off your back:
Get-SPHealthAnalysisRule AppServerDrivesAreNearlyFullWarning | Disable-SPHealthAnalysisRule
PowerShell will ask you to confirm that
you really want to disable this rule. You do. Now that unhelpful alert
will no longer sully your Central Admin pages.
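If you are scripting this change, or later decide you want the rule back, a sketch along the following lines may help. It assumes the SharePoint cmdlets are already loaded and relies on the standard -Confirm common parameter to suppress the confirmation prompt; check the result on your own farm rather than taking the output for granted.

```powershell
# Disable the rule without the interactive confirmation prompt.
Disable-SPHealthAnalysisRule -Identity AppServerDrivesAreNearlyFullWarning -Confirm:$false

# Verify the rule's current state.
Get-SPHealthAnalysisRule -Identity AppServerDrivesAreNearlyFullWarning |
    Select-Object Name, Enabled

# Turn the rule back on if you change your mind.
Enable-SPHealthAnalysisRule -Identity AppServerDrivesAreNearlyFullWarning
```

Suppressing the prompt is convenient in a build script, but leaving it in place for ad hoc use gives you one last chance to catch a mistyped rule name.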