To diagnose and resolve startup
problems, you need to understand the sequence of events that occurs
after you press the power button on a computer. When you press the power
button, the following happens:
-
The firmware interface performs system configuration, also known as power-on self-test (POST).
-
The firmware interface performs setup of the computer, also known as initialization of the computer.
-
The firmware interface passes control to the operating system loader, also known as the boot manager.
-
The boot manager starts the boot loader. The boot loader uses the
firmware interface boot services to complete operating system boot and
load the operating system. Loading the operating system involves:
-
Loading (but not running) the operating system kernel. Normally, Ntoskrnl.exe.
-
Loading (but not running) the hardware abstraction layer (HAL). Normally, Hal.dll.
-
Loading the HKEY_LOCAL_MACHINE\SYSTEM registry hive into memory (from %SystemRoot%\System32\Config\System).
-
Scanning the
HKEY_LOCAL_MACHINE\SYSTEM\Services key for device drivers and then
loading (but not initializing) the drivers that are configured for the
boot class into memory. Drivers are also services (which means both
device drivers and system services are prepared).
-
Enabling memory paging.
-
The boot loader passes control to the operating system kernel.
-
The kernel and the HAL initialize the Windows executive, which in
turn processes the configuration information stored in the
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet hive and then starts device
drivers and system services.
-
The kernel starts the Session Manager (Smss.exe), which in turn:
-
Initializes the system environment by creating system environment variables.
-
Starts the Win32 subsystem (Csrss.exe). Here, Windows switches the display output from text mode to graphics mode.
-
Starts the Windows Logon Manager (Winlogon.exe), which in turn starts
the Service Control Manager (Services.exe) and the Local Security
Authority (Lsass.exe) and waits for a user to log on.
-
Creates additional paging files that are required.
-
As necessary, performs delayed renaming of in-use files that were updated in the previous session.
-
The Windows Logon Manager waits for a user to log on. The logon user
interface and the default credential provider collect the user name and
password and pass this information to the Local Security Authority for
authentication.
-
The Windows Logon Manager runs Userinit.exe and the File Explorer
shell. Userinit.exe initializes the user environment by creating user
environment variables, running startup programs, and performing other essential tasks.
This sequence of events is for a cold start of a computer from power
on through logon. The sequence of events varies if the computer is
resuming from sleep, standby, or hibernation. The sequence of events
also varies if you are starting an operating system other than Windows
or a Windows operating system other than Windows Vista or later.
Note
REAL WORLD With Windows on ARM (WOA), the
sequence of events is similar but not identical. Here, UEFI
provides the services necessary for loading the operating system.
Windows Boot Manager initializes the operating system by starting the
Windows Boot Loader, which in turn starts the operating system by using
information in the BCD store. The boot loader passes control to the
operating system kernel. The kernel and the HAL initialize the Windows
executive. Information needed to configure WOA is stored in tables,
which the operating system reads to configure WOA. To
load device drivers and continue boot, the Windows executive initializes
the simple peripheral buses (a series of low-power serial buses) and
then the device drivers that support connections to those buses. The
kernel can then start the Session Manager, which in turn brings up the
rest of the system.
Sometimes you can identify the source of a startup problem by pinpointing where the startup process breaks. Table 1
lists the various startup phases and provides a possible cause of
problems in each phase. The phase numbers are meant only to aid in the
subsequent discussion.
Table 1. Troubleshooting Startup
| PHASE | PHASE TITLE | POSSIBLE CAUSE OF PROBLEM |
|---|---|---|
| 1 | System configuration, power-on self-test | Hardware failure or missing device |
| 2 | Setup, initial startup | Firmware configuration, the disk subsystem, or the file system |
| 3 | Operating system loader, boot manager | BCD data, improper operating system selection for loading, or invalid boot loader |
| 4 | Kernel, HAL, Windows executive | Driver or service configuration or service dependencies |
| 5 | Session Manager | Graphics display mode, system environment, or component configuration |
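If you script your troubleshooting notes, Table 1 can be kept as a simple lookup. The following is a minimal sketch in Python; the phase titles and causes are copied from the table, while the names STARTUP_PHASES and describe_phase are hypothetical and used only for illustration.

```python
# Table 1 expressed as a dictionary keyed by phase number.
# Each value is (phase title, possible cause of problem).
STARTUP_PHASES = {
    1: ("System configuration, power-on self-test",
        "Hardware failure or missing device"),
    2: ("Setup, initial startup",
        "Firmware configuration, the disk subsystem, or the file system"),
    3: ("Operating system loader, boot manager",
        "BCD data, improper operating system selection for loading, "
        "or invalid boot loader"),
    4: ("Kernel, HAL, Windows executive",
        "Driver or service configuration or service dependencies"),
    5: ("Session Manager",
        "Graphics display mode, system environment, or component configuration"),
}


def describe_phase(phase):
    """Return a one-line summary for a phase number from Table 1."""
    title, cause = STARTUP_PHASES[phase]
    return f"Phase {phase} ({title}): possible cause: {cause}"


summary = describe_phase(3)
```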
Troubleshooting Startup Phase 1
When you power on a computer from a cold state, system configuration (power-on self-test) occurs first. During this phase, the firmware
performs initial checks of hardware, verifies that required devices are
present, and reads the system configuration settings from nonvolatile
memory on the motherboard. Although nonvolatile memory could be
Electrically Erasable Programmable Read-Only Memory (EEPROM), flash,
or battery-backed RAM, it is more typically flash memory, which retains
its contents even after you shut down and unplug the computer.
After the motherboard firmware performs its tests and reads its
settings, add-on devices that have their own firmware, such as video
cards and host controller cards, perform their tests and load their
settings. If startup fails in this phase, the computer likely has a
hardware failure. A required device, such as a keyboard, mouse, or hard
disk, could also be missing. In most cases, the firmware interface
displays an error message that indicates the problem. If video isn’t
working, the firmware interface might indicate the problem by emitting a
series of beeps.
You can resolve a problem with a keyboard, mouse, or display by
checking the device’s connection to the computer. If another device is
causing a problem, you might be able to resolve the problem by changing
the device configuration in the firmware interface, or you might need to replace the device.
Troubleshooting Startup Phase 2
Once system configuration is complete, the computer enters the setup, or initial startup, phase. Firmware
interface settings determine the devices the computer uses to start the
operating system. The boot order and the boot enabled or disabled state
of each device affect startup. As discussed previously, the computer
tries to boot using the device listed first. If that fails, the computer
tries the second boot device, and so on. If none of the configured
devices are bootable, you’ll see an error similar to the following:
Non-system disk or disk error
Replace and press any key when ready to continue
Here, you’ll want to check the boot order and be sure it is set
correctly. If you are trying to boot from DVD media, check that the
media is present and that DVD booting is enabled. If you are trying to
boot from a hard disk, make sure booting from a hard disk is enabled and
listed prior to any USB or other removable media you’ve inserted. If
you’ve recently installed a hard disk, power off and unplug the
computer, and then verify that all cables are connected correctly and
that any jumpers are configured correctly.
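The fall-through behavior described above (try the first device, then the next, and report an error only when every configured device fails) can be modeled in a few lines. This is a simplified sketch, not how any particular firmware is implemented; the function name pick_boot_device, the device labels, and the is_bootable callback are all assumptions made for illustration.

```python
from typing import Callable, Mapping, Optional, Sequence


def pick_boot_device(boot_order: Sequence[str],
                     enabled: Mapping[str, bool],
                     is_bootable: Callable[[str], bool]) -> Optional[str]:
    """Return the first enabled, bootable device in boot_order, or None."""
    for device in boot_order:
        if not enabled.get(device, True):
            continue  # booting from this device is disabled in firmware
        if is_bootable(device):
            return device
    # Nothing was bootable: this is when you would see an error such as
    # "Non-system disk or disk error".
    return None


# Hypothetical configuration: DVD boot is enabled but no media is present,
# USB boot is disabled, and the hard disk holds a bootable system.
order = ["DVD", "USB", "Hard disk"]
enabled = {"DVD": True, "USB": False, "Hard disk": True}
has_boot_media = {"USB", "Hard disk"}

chosen = pick_boot_device(order, enabled, lambda d: d in has_boot_media)
nothing = pick_boot_device(order, enabled, lambda d: False)
```

In this model, chosen is the hard disk: the DVD drive has no bootable media and USB is skipped because it is disabled, even though a bootable USB device is attached.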
Because configuring boot options in firmware isn’t necessarily
intuitive, I’ll provide examples from a cross-section of computers by
various vendors. On an HP notebook computer, the boot settings are found
on the Boot Options and Boot Order submenus on the System Configuration
page. The Boot Options submenu has these options:
-
F10 And F12 Delay (sec)
Sets the amount of
time for the user to press F10 or F12 at startup. On this laptop, F10
and F12 access boot options and advanced boot options, respectively.
-
DVD Boot
Enables or disables DVD boot during startup.
-
Floppy Boot
Enables or disables floppy boot during startup.
-
Internal Network Adapter Boot
Enables or disables network booting during startup.
Use the Up and Down Arrow keys to select an option, and then press Enter to view and set the option.
On the Boot Order submenu, the boot order is listed as the following:
-
USB Floppy
-
ATAPI CD/DVD ROM Drive
-
Notebook Hard Drive
-
USB Diskette On Key
-
USB Hard Drive
-
Network Adapter (only if Internal Network Adapter Boot is enabled)
Here, you use the Up and Down Arrow keys to select a device, and then
press F5 or F6 to move the device up or down in the list. It is
important to note that this computer (like many newer computers)
distinguishes between USB flash keys (referred to as USB diskettes on keys) and USB drives (referred to as USB hard drives). Computer users won’t really see a difference between the two.
On a Dell Inspiron laptop, you manage boot settings on the Boot page. The boot order is listed as:
-
Hard disk
-
USB hard disk
-
CD/DVD
-
USB CD/DVD
-
USB Floppy
-
Network
You use the Up and Down Arrow keys to navigate the boot priority
list. Press Enter to select a priority level for editing and then to
select the device that should have that priority. Select Disabled to
temporarily disable that boot priority level.
More desktop computers are being shipped with hardware redundant
array of independent disks (RAID) controller cards. On a Dell computer I
have, the SATA Operation option of the Drives submenu is used to enable
or disable the hardware RAID controller card. Typically, RAID
controller cards for desktop computers support RAID 0 and RAID 1. RAID 0
offers no data protection and simply stripes a logical disk volume
across multiple physical disks. RAID 1 offers data protection by
mirroring the disks. When disks are mirrored, two physical disks appear
as one disk, and each disk holds an identical copy of the data.
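To see why RAID 0 offers no protection while RAID 1 does, it helps to look at where the data blocks end up. The following sketch is a toy model only; real controllers work with fixed-size stripes and handle parity, caching, and rebuilds, none of which is shown here. The function names are hypothetical.

```python
def raid0_layout(blocks, disk_count):
    """Distribute blocks round-robin across disks; no block is duplicated."""
    disks = [[] for _ in range(disk_count)]
    for index, block in enumerate(blocks):
        disks[index % disk_count].append(block)
    return disks


def raid1_layout(blocks, disk_count=2):
    """Mirror the blocks: every disk holds a complete copy."""
    return [list(blocks) for _ in range(disk_count)]


blocks = ["A", "B", "C", "D"]
spread = raid0_layout(blocks, 2)   # [['A', 'C'], ['B', 'D']]
mirrored = raid1_layout(blocks)    # [['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D']]
```

With the RAID 0 layout, losing either disk loses half of the blocks. With the RAID 1 layout, either disk alone still holds all of the data.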
Note
REAL WORLD A computer with a
hardware RAID controller may not boot if one of the drives required for
RAID operations is removed from the computer without first disabling the
hardware RAID. If the remaining drive is bootable, disable RAID in BIOS
and then restart the computer to enable booting of the operating
system.
Troubleshooting Startup Phase 3
After setup, the firmware interface passes control to the boot manager. The boot manager in turn starts the boot loader.
On computers using BIOS, the computer reads information from the
master boot record (MBR), which normally is the first sector of data on
the disk. The MBR contains boot instructions and a partition table that
identifies disk partitions. The active partition, also known as the boot partition,
has boot code in its first sector of data as well. The data provides
information about the file system on the partition and enables the firmware
to locate and start the Bootmgr stub program in the root directory of
the boot partition. Bootmgr switches the process into 32-bit or 64-bit
protected mode from real mode and loads the 32-bit or 64-bit Windows
Boot Manager as appropriate (found within the stub file itself). Windows
Boot Manager locates and starts the Windows Boot Loader (Winload).
Problems can occur if the active boot partition does not exist or if
any boot sector data is missing or corrupt. Errors you might see
include:
Error loading operating system
and
Invalid partition table
In many cases, you can restore proper operations by using the Startup Repair tool.
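If you have a raw copy of a disk's first sector, you can check for the two conditions just described: a valid boot sector and an active partition. The sketch below assumes the classic MBR layout (a four-entry partition table at byte offset 446, 16 bytes per entry, and the 0x55 0xAA signature in the last two bytes). The function names are hypothetical, and the sample sector is synthetic rather than read from a real disk.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # the table follows the boot code
ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR
ACTIVE_FLAG = 0x80


def find_active_partition(sector):
    """Return details of the active partition in an MBR sector, or None."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("expected a full 512-byte sector")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("boot signature 0x55AA is missing")
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        status, partition_type = entry[0], entry[4]
        # Bytes 8-15: starting LBA and sector count, both little-endian.
        first_lba, sector_count = struct.unpack_from("<II", entry, 8)
        if status == ACTIVE_FLAG and partition_type != 0x00:
            return {"index": index, "type": partition_type,
                    "first_lba": first_lba, "sectors": sector_count}
    return None


def build_sample_mbr():
    """Build a synthetic MBR with one active partition (for illustration)."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack("<B3sB3sII", ACTIVE_FLAG, b"\x00" * 3,
                        0x07, b"\x00" * 3, 2048, 204800)
    sector[PARTITION_TABLE_OFFSET:PARTITION_TABLE_OFFSET + ENTRY_SIZE] = entry
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


active = find_active_partition(build_sample_mbr())
```

A result of None corresponds to the case in which no active partition exists, and a ValueError corresponds to missing or corrupt boot sector data.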
In contrast, computers using EFI have a built-in boot manager. When
you install Windows, Windows adds an entry to the EFI boot manager
called Windows Boot Manager, which
points to the boot manager’s executable file on the EFI system partition
(\Efi\Microsoft\Boot\Bootmgfw.efi). The boot manager then passes
control to the Windows Boot Loader.
Problems can occur if you install a different operating system or
change the EFI boot manager settings. In many cases, you’ll be able to
restore proper operations by using the Startup Repair tool or by changing EFI boot manager settings.
Troubleshooting Startup Phase 4
The boot loader uses the firmware interface boot services to complete operating system boot. The boot loader loads the operating system kernel (Ntoskrnl.exe) and then loads the hardware abstraction layer (HAL), Hal.dll. Next, the boot loader loads the HKEY_LOCAL_MACHINE\SYSTEM
registry hive into memory (from %SystemRoot%\System32\Config\System),
and then it scans the HKEY_LOCAL_MACHINE\SYSTEM\Services key for device
drivers. The boot loader scans this registry hive to find drivers that
are configured for the boot class and loads them into memory.
Once the boot loader passes control to the operating system kernel,
the kernel and the HAL initialize the Windows executive, which in turn
processes the configuration information stored in the
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet hive and then starts device
drivers and system services. Drivers and services are started according
to their start-type value. This value is stored in the Start entry under
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Name, where Name is
the name of the device or service. Valid values are 0 (identifies a
boot driver), 1 (identifies a system driver), 2 (identifies an auto-load
driver or service), 3 (identifies a load-on-demand driver or service), 4
(identifies a disabled and not-started driver or service), and 5
(identifies a delayed-start service). Drivers are started in the order
boot, system, auto load, load on demand, and delayed start.
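The start-type values and the start order given above can be expressed as a small sorting routine. This is a sketch that follows the values exactly as listed in the text; the driver and service names are made up, and the function name startup_sequence is hypothetical.

```python
# Start values as listed in the text.
START_TYPES = {
    0: "boot",
    1: "system",
    2: "auto load",
    3: "load on demand",
    4: "disabled",
    5: "delayed start",
}

# Order in which the text says drivers are started; disabled (4) never starts.
START_ORDER = [0, 1, 2, 3, 5]


def startup_sequence(entries):
    """Return names sorted by start order, skipping disabled entries.

    entries maps a driver or service name to its Start value.
    """
    rank = {value: position for position, value in enumerate(START_ORDER)}
    runnable = [(rank[start], name)
                for name, start in entries.items() if start in rank]
    return [name for _, name in sorted(runnable)]


# Hypothetical entries, not taken from a real system.
entries = {
    "delayed_svc": 5,
    "auto_svc": 2,
    "boot_drv": 0,
    "disabled_drv": 4,
    "demand_svc": 3,
    "sys_drv": 1,
}
sequence = startup_sequence(entries)
```

Here sequence places the boot driver first and the delayed-start service last, and the disabled driver does not appear at all.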
Most problems in this phase have to do with invalid driver and
service configurations. Some drivers and services are dependent on other
components and services. If the components or services that they depend
on are not available or are not configured properly, this also can cause
startup problems.
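Dependency problems cascade: if one service cannot start, anything that depends on it cannot start either. The following sketch models that reasoning with plain dictionaries. It is not how the Service Control Manager is implemented; the service names, the find_blocked function, and the data shapes are assumptions for illustration, and the value 4 for a disabled entry comes from the start-type list above.

```python
DISABLED = 4  # Start value for a disabled driver or service


def find_blocked(start_values, depends_on):
    """Map each service that cannot start to its unmet dependencies.

    start_values maps a name to its Start value; depends_on maps a name to
    the names it requires. A dependency is unmet if it is missing, disabled,
    or itself blocked.
    """
    blocked = {}
    changed = True
    while changed:  # repeat until no new blocked services are found
        changed = False
        for name, start in start_values.items():
            if start == DISABLED or name in blocked:
                continue
            unmet = [dep for dep in depends_on.get(name, [])
                     if dep not in start_values
                     or start_values[dep] == DISABLED
                     or dep in blocked]
            if unmet:
                blocked[name] = unmet
                changed = True
    return blocked


# Hypothetical configuration: net_core is disabled and missing_bus
# is not installed at all.
start_values = {"net_core": 4, "net_client": 2, "file_share": 2,
                "printer": 2, "clock": 2}
depends_on = {"net_client": ["net_core"],
              "file_share": ["net_client"],
              "printer": ["missing_bus"]}

blocked = find_blocked(start_values, depends_on)
```

In this example, file_share is blocked even though its own configuration is fine, because the service it depends on cannot start.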
During startup,
subkeys of HKEY_LOCAL_MACHINE\SYSTEM are used to configure devices and
services. The Select subkey has several values used in this regard:
-
The Current value is a pointer to the ControlSet subkey containing
the current configuration definitions for all devices and services.
-
The Default value is a pointer to the ControlSet subkey containing the configuration definition the computer uses at the next startup, provided that no error occurs and that you don’t use an alternate configuration.
-
The Failed value is a pointer to the ControlSet subkey containing a configuration definition that failed to load Windows.
-
The LastKnownGood value is a pointer to the ControlSet subkey
containing the configuration definition that was used for the last
successful logon.
During normal startup, the computer uses the Default control set.
Generally, if no error has occurred during startup or you haven’t
selected the last known good configuration, the Default, Current, and
LastKnownGood values all point to the same ControlSet subkey, such as
ControlSet001. If startup fails and you access the last known good
configuration by using the Advanced Boot options, the Failed entry is
updated to point to the configuration definition that failed to load. If
startup succeeds and you haven’t accessed the last known good
configuration, the LastKnownGood value is updated to point to the
current configuration definition.
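The way the Select values move can be easier to follow as a small state model. The sketch below uses a plain dictionary in place of the registry and does not read or write any real keys. The ControlSet001-style naming follows the example in the text. The function names are hypothetical, and setting Current to the last known good set when you boot with that option is my assumption, because the text states only that Failed is updated.

```python
def control_set_key(number):
    """Format a Select value as a subkey name, for example ControlSet001."""
    return f"ControlSet{number:03d}"


def after_successful_logon(select):
    """LastKnownGood is updated to point to the current configuration."""
    updated = dict(select)
    updated["LastKnownGood"] = select["Current"]
    return updated


def after_last_known_good_boot(select):
    """Startup failed and you chose the last known good configuration."""
    updated = dict(select)
    updated["Failed"] = select["Default"]         # the set that failed to load
    updated["Current"] = select["LastKnownGood"]  # assumption, see lead-in
    return updated


# Hypothetical starting point: control set 2 is new and unproven, and
# control set 1 was used for the last successful logon.
select = {"Current": 2, "Default": 2, "Failed": 0, "LastKnownGood": 1}

good = after_successful_logon(select)
recovered = after_last_known_good_boot(select)
failed_key = control_set_key(recovered["Failed"])
```

After a successful logon, good["LastKnownGood"] is 2. After a failed startup followed by a last known good boot, failed_key is ControlSet002.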
Troubleshooting Startup Phase 5
During the final phase of startup, the kernel starts the Session
Manager (Smss.exe). The Session Manager initializes the system
environment by creating system environment variables and starting the
Win32 subsystem (Csrss.exe).
This is the point at which Windows switches from the text presentation
mode used initially to a graphics presentation mode. Generally, if the
display adapter is broken or not properly seated, the computer won’t
display in either text or graphics mode, but if the display adapter is
configured improperly, you’ll often notice this when the computer
switches to graphics mode.
The display is only one of several components that might first
present problems during this late phase of startup. If startup fails
during this phase, you can identify problem components by using boot
logging. If the computer has a Stop error in this phase, use the
information provided by the Stop message to help you identify the
problem component.
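When boot logging is enabled, you can scan the resulting log for drivers that were not loaded. The sketch below assumes that the log marks such entries with a line prefix of either "Did not load driver" or "BOOTLOG_NOT_LOADED"; verify this against a log from your own system, because the exact format can differ between Windows versions. The sample lines and driver file names are invented for illustration.

```python
# Assumed prefixes for entries that failed to load (verify on your system).
NOT_LOADED_PREFIXES = ("Did not load driver ", "BOOTLOG_NOT_LOADED ")


def drivers_not_loaded(log_lines):
    """Return the unique driver paths flagged as not loaded, in log order."""
    missing = []
    for line in log_lines:
        text = line.strip()
        for prefix in NOT_LOADED_PREFIXES:
            if text.startswith(prefix):
                path = text[len(prefix):].strip()
                if path and path not in missing:
                    missing.append(path)
    return missing


# Invented sample; the driver file names are placeholders.
sample_log = [
    r"Loaded driver \SystemRoot\system32\ntoskrnl.exe",
    r"Loaded driver \SystemRoot\system32\hal.dll",
    r"Did not load driver \SystemRoot\System32\drivers\example_a.sys",
    r"BOOTLOG_NOT_LOADED \SystemRoot\System32\drivers\example_b.sys",
    r"Did not load driver \SystemRoot\System32\drivers\example_a.sys",
]

suspects = drivers_not_loaded(sample_log)
```

A driver that appears in this list is not necessarily faulty, because some drivers are skipped by design, but the list gives you a short set of candidates to compare against a log from a successful startup.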
The Session Manager starts the Windows Logon Manager (Winlogon.exe), which in turn starts the Service Control Manager (Services.exe) and the Local
Security Authority (Lsass.exe) and waits for a user to log on. When a
user logs on, the Windows Logon Manager runs Userinit.exe and the File
Explorer shell. Userinit.exe initializes the user environment by
creating user environment variables, running startup programs, and
performing other essential tasks. The File Explorer shell provides the
desktop, taskbar, and menu system.
If you encounter startup problems during or after logon, the problem is likely due to a misconfigured service or startup application.