Unless you're a DBA, you'd probably define
a database backup as a complete copy of a database at a given point in
time. While that's one type of database backup, there are many others. Consider a multi-terabyte database that's used 24/7:
How long does the backup take, and what impact does it have on users?
Where are the backups stored, and what is the media cost?
How much of the database changes each day?
If the database failed partway through the day, how much data would be lost if the only recovery point was the previous night's backup?
In considering these questions, particularly for large databases with high transaction rates, we soon realize that simplistic backup strategies limited to full nightly backups are insufficient for a number of reasons, not the least of which is the potential for data loss. Let's consider the different types of backups in SQL Server.
1. Full backup
Full backups are the simplest, best-understood type of database backup. Like standard file backups (documents, spreadsheets, and so forth), a full backup is a complete copy of the database at a given time. But unlike with a normal file backup, you can't back up a database by simply backing up the underlying .mdf and .ldf files.
One of the classic mistakes made by organizations without appropriate DBA knowledge is using a backup program to back up all files on a database server, based on the assumption that the inclusion of the underlying database files (.mdf and .ldf) in the backup will be sufficient for a restore scenario. Not only will this backup strategy be unsuccessful, but those who use such an approach usually fail to realize that fact until they try to perform a restore.
For a database backup to be valid, you must use the BACKUP DATABASE
command or one of its GUI equivalents.
-- Full Backup to Disk
BACKUP DATABASE [AdventureWorks2008]
TO DISK = N'G:\SQL Backup\AdventureWorks.bak'
WITH INIT
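Short of a full test restore, one way to gain some confidence in a backup file is RESTORE VERIFYONLY. The following is a minimal sketch that reuses the path from the example above. It checks that the backup set is complete and readable, but it doesn't verify the structure of the data inside it, so it's not a substitute for periodically performing a real restore.

```sql
-- Check that the backup set is complete and readable
-- (nothing is restored by this command)
RESTORE VERIFYONLY
FROM DISK = N'G:\SQL Backup\AdventureWorks.bak';
```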
You can perform backups in SQL Server while the database is in use and is being modified by users. Such backups are known as online backups. In order for the resultant backup to be restored as a transactionally consistent database, SQL Server includes part of the transaction log in the full database backup. Before we cover the transaction log in more detail, let's consider an example of a full backup that's executed against a database that's being actively modified.
Figure 1 shows a hypothetical example of a transaction that starts and completes during a full backup, and modifies a page after
the backup process has read it from disk. In order for the backup to be
transactionally consistent, how will the backup process ensure this
modified page is included in the backup file? In answering this
question, let's walk through the backup step by step. The step numbers
in the following list correspond to the steps in figure 1.
1. When the backup commences, a checkpoint is issued that flushes dirty buffer cache pages to disk.
2. After the checkpoint completes, the backup process begins reading pages from the database for inclusion in the backup file(s), including page X.
3. Transaction A begins.
4. Transaction A modifies page X. The backup has already included page X in the backup file, so this page is now out of date in the backup file.
5. Transaction B begins, but won't complete until after the backup finishes. At the point of backup completion, this transaction is the oldest active (uncommitted/incomplete) transaction.
6. Transaction A completes successfully.
7. The backup completes reading pages from the database.
8. As described shortly, the backup process includes part of the transaction log in the backup.
If the full backup process didn't include any of the transaction log, the restore would produce a database that wasn't transactionally consistent. Transaction A's committed changes to page X wouldn't be in the restored database, and because transaction B hasn't completed, its changes would have to be rolled back. By including parts of the transaction log, the restore process is able to roll forward committed changes and roll back uncommitted changes as appropriate.
In our example, once SQL Server completes reading database pages at step 7, it will include all entries in the transaction log since the oldest of the following log sequence numbers (LSNs):
The checkpoint (step 1 in our example)
The oldest active transaction (step 5)
The LSN of the last replicated transaction (not applicable in our example)
In our example, transaction log entries since step 1 will be included because that's the oldest of these items. However, consider a case where a transaction starts before the backup begins and is still active at the end of the backup. In such a case, the LSN of that transaction will be used as the start point.
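To see which range of log records a particular full backup actually carries, you can inspect the backup header. The sketch below assumes the backup file from the earlier example, and that the backup history in msdb hasn't been purged. The FirstLSN, LastLSN, and CheckpointLSN columns (first_lsn, last_lsn, and checkpoint_lsn in msdb) show where the included portion of the log starts and ends.

```sql
-- List the backup sets in the file; the result set includes
-- FirstLSN, LastLSN, and CheckpointLSN for each backup set
RESTORE HEADERONLY
FROM DISK = N'G:\SQL Backup\AdventureWorks.bak';

-- The same LSN details are recorded in the msdb backup history
SELECT database_name, type, backup_start_date, backup_finish_date,
       first_lsn, last_lsn, checkpoint_lsn
FROM msdb.dbo.backupset
WHERE database_name = N'AdventureWorks2008'
ORDER BY backup_start_date DESC;
```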
It's important to point out here that even though parts of the transaction log are included in a full backup, this doesn't constitute a transaction log backup.
Another classic mistake made by inexperienced SQL Server DBAs is never
performing transaction log backups because they think a full backup
will take care of it. A database in the full recovery model (discussed shortly) will maintain entries in the transaction log until it's backed up. If explicit transaction log backups are never performed, the transaction log will keep growing until it fills the disk. It's not unusual to see a 2GB database with a 200GB transaction log!
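As a rough way to spot this problem, the following sketch checks each database's recovery model, shows how full each transaction log is, and then takes an explicit transaction log backup. The log backup file name is hypothetical. A log_reuse_wait_desc value of LOG_BACKUP indicates that log space can't be reused until a transaction log backup is taken.

```sql
-- Recovery model of each database, and what (if anything)
-- is preventing its log space from being reused
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases;

-- Size and percentage used of every transaction log on the instance
DBCC SQLPERF(LOGSPACE);

-- Explicit transaction log backup (hypothetical file name);
-- requires that a full backup of the database has already been taken
BACKUP LOG [AdventureWorks2008]
TO DISK = N'G:\SQL Backup\AdventureWorks-Log.trn'
WITH INIT;
```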
Finally, when a full backup is restored as shown in our next example, changes since the full backup are lost. In later examples, we'll look at combining a full backup with differential and transaction log backups to restore changes made after the full backup was taken.
-- Restore from Disk
RESTORE DATABASE [AdventureWorks2008]
FROM DISK = N'G:\SQL Backup\AdventureWorks.bak'
WITH REPLACE
To reduce the user impact and storage costs of nightly full backups, we can use differential backups.
Backing up a database to multiple files can lead to a significant reduction in backup time, particularly for large databases. When you use the T-SQL BACKUP DATABASE command, the DISK = clause can be repeated multiple times (separated by commas), once for each backup file, as per this example:
-- Full Backup to Multiple Files
BACKUP DATABASE [AdventureWorks2008]
TO
DISK = N'G:\SQL Backup\AdventureWorks_1.bak'
, DISK = N'G:\SQL Backup\AdventureWorks_2.bak'
, DISK = N'G:\SQL Backup\AdventureWorks_3.bak'
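The same comma-separated DISK = syntax applies when restoring from a multi-file backup. The following sketch assumes the three files written by the example above. Every file in the set must be listed, because each one holds only part of the backup.

```sql
-- Restore from a backup written across three files;
-- all files in the set must be supplied
RESTORE DATABASE [AdventureWorks2008]
FROM
  DISK = N'G:\SQL Backup\AdventureWorks_1.bak'
, DISK = N'G:\SQL Backup\AdventureWorks_2.bak'
, DISK = N'G:\SQL Backup\AdventureWorks_3.bak'
WITH REPLACE;
```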