Ease of Recovery
I often quote an old coworker of mine, “No one cares if you can back up—only if you
can restore.” Ease of recovery and speed of recovery are often overlooked when evaluating
backup products. Many small factors can make restores either very easy or
impossible:
- Platform independence
-
This is a very important factor. Some products have gone to the trouble to
ensure that every volume made by every version of their software can be read by
every other version, no matter what platform it was made on. If this has been done,
and a client is destroyed, its data still can be recovered, even if the replacement
machine is not of the same type. If volumes are not platform-independent, an
administrator might need to keep a functional machine of each operating system
version just for restores. Having true platform independence also makes doing regular restores much
easier.
- Parallel restore
-
This can be a very nice feature. When restoring a large directory or filesystem,
the backups for that filesystem may be spread out over several volumes. Some
products are able to read all these volumes at once, actually making the restore
faster than the backup. When investigating this possibility, you also should find
out if these volumes can be loaded in any order.
- User restores
-
Some environments have sophisticated users who like to be able to do their own
restores. If this is true in your environment, this feature will come in handy.
Those who do not want users doing their own restores will want to know whether this
feature can be disabled.
- Relocated restores
-
This is very important. You sometimes need to restore a
file to a location other than the one it was backed up from. The new location may be a
different host or a different directory. Some products do not allow this.
- Bare-metal restores
-
A bare-metal restore is restoring a system from scratch, without even having a
functioning operating system. This is sort of the Holy Grail of backups: the ability to
restore an entire system from nothing. Several products now offer bare-metal
recovery for one or more platforms.
- Multiple versions
-
This is also very important. A lot of backup products not only track the most
recent version of a file that was backed up but also track all versions of the file
that are on backup volumes. Sometimes it’s necessary to restore a file to the way it
looked four days ago.
- Tracking deleted files
-
This one surprises many people. Suppose there is a filesystem that changes quite
a bit. New files are added and deleted every day. (A good example of this would be
where Oracle’s archived redo logs go. Hundreds of files may be added and deleted
every day.) When asking the backup software to restore this filesystem, you might
expect it to restore the filesystem the way it looked yesterday. Unfortunately, many
products will instead restore all
files that were ever located in that directory! It takes extra effort on
the part of the software to “notice” that a file has been deleted and to
not restore it unless told to do so. Failure to track deleted
files can make restoring some filesystems very difficult.
- Overwriting options
-
Has a user ever called and said that he blew away half his home directory
because he typed rm -r * by accident? This user doesn’t want the program
to blow away everything in his directory by restoring on top of it. A good way to
protect against that is to tell the backup software to “restore everything in here,
except those files that are newer than what we have on backup.” There are a number
of other overwriting options, such as unconditional overwrite, prompt before
overwrite, and don’t overwrite the same exact file.
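
The multiple-versions, deleted-file, and overwrite behaviors described in the list above can be sketched in a few lines. The following Python is a minimal illustration, not how any particular product works. The IndexEntry record, its fields, and both function names are hypothetical, and the sketch assumes the index stores a deletion marker whenever a backup notices that a file is gone.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class IndexEntry:
    """One record in a hypothetical backup index."""
    path: str
    backup_day: int         # day the backup ran (a plain integer for simplicity)
    mtime: float            # modification time of the file when it was backed up
    deleted: bool = False   # True if this backup noticed that the file was gone


def files_as_of(index: Iterable[IndexEntry], day: int) -> Dict[str, IndexEntry]:
    """Pick, per path, the newest entry at or before `day`.

    Paths whose newest entry is a deletion marker are left out, so the
    result matches what the filesystem looked like on that day.
    """
    latest: Dict[str, IndexEntry] = {}
    for entry in index:
        if entry.backup_day > day:
            continue  # this version was made after the requested day
        current = latest.get(entry.path)
        if current is None or entry.backup_day > current.backup_day:
            latest[entry.path] = entry
    return {path: e for path, e in latest.items() if not e.deleted}


def should_restore(entry: IndexEntry, on_disk_mtime: Optional[float]) -> bool:
    """Overwrite policy: skip files that are newer on disk than on backup."""
    if on_disk_mtime is None:   # the file is missing on disk, so restore it
        return True
    return on_disk_mtime <= entry.mtime


# Illustrative index: log1 is deleted on day 2, log3 is created on day 2.
index = [
    IndexEntry("/arch/log1", 1, 100.0),
    IndexEntry("/arch/log2", 1, 110.0),
    IndexEntry("/arch/log1", 2, 100.0, deleted=True),
    IndexEntry("/arch/log3", 2, 200.0),
]
print(sorted(files_as_of(index, 1)))  # ['/arch/log1', '/arch/log2']
print(sorted(files_as_of(index, 2)))  # ['/arch/log2', '/arch/log3']
```

A product that does not track deletions would behave as if the deleted flag were never set, and a restore for day 2 would bring back all three files.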
Robustness
Horrible things happen to backup systems. Systems reboot and get powered off. Backup
drives hang, libraries jam, and networks die. The “robustness” of a product can be
measured by how well it deals with these sorts of problems. Can the product reroute
backups from a failed backup drive to a good one? Will it even notice that a backup drive
or process is hung? Is it able to recover from a client rebooting while the client is
being backed up? Will this client rebooting in the middle of its backup corrupt the
index?
These are very important considerations. Things will go wrong, and when they do, you
want to know the backups are still OK. If the backup product is able to reroute around
failures, retry open files, and restart failed backups, the worst that should happen to
you is a really long report when you come in the next morning.
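
The reroute-and-retry behavior described above might look something like the following sketch. Everything here is an assumption made for illustration: the DriveError exception, the write callback, and the drive names are hypothetical and stand in for whatever a real product uses internally. Detecting a hung drive would also need a timeout, which this sketch leaves out.

```python
from typing import Callable, Iterable, List, Tuple


class DriveError(Exception):
    """Raised by the hypothetical write callback when a drive fails."""


def backup_with_reroute(
    client: str,
    drives: Iterable[str],
    write: Callable[[str, str], None],
    retries_per_drive: int = 2,
) -> Tuple[str, List[str]]:
    """Try each drive in turn and return (drive that worked, failure log)."""
    failures: List[str] = []
    for drive in drives:
        for attempt in range(1, retries_per_drive + 1):
            try:
                write(client, drive)
                return drive, failures
            except DriveError as err:
                # Record the failure so it shows up in the morning report.
                failures.append(
                    f"{client}: {drive} attempt {attempt} failed: {err}"
                )
    raise RuntimeError(f"all drives failed for {client}: {failures}")


def demo_write(client: str, drive: str) -> None:
    """Stand-in for a real backup job: drive0 always fails."""
    if drive == "drive0":
        raise DriveError("drive not responding")


used, report = backup_with_reroute("host1", ["drive0", "drive1"], demo_write)
print(used)         # drive1
print(len(report))  # 2
```

The backup still completes on the second drive, and the only visible consequence is a longer report.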
Automation
There are some very nice tape and optical libraries out there now. They have bar
codes, automatic cleaning,
hot-swappable power supplies, field-replaceable drives, and more. How well does the
product take advantage of these things? One of the greatest features of a modern
library is its support for bar-coded volumes. Put 20 volumes in the library, and then tell the backup
software to read the bar code and electronically label the volumes according to what the
bar code says. That sure beats swapping 20 volumes in and out of backup drives for a
half-hour while they get labeled!
Volume Verification
Another often-ignored area of data protection software is its ability to verify its
own backups. There are plenty of horror stories out there about people who did backups for
months or years assuming that they were working just fine. However, when they went to read
the backup volumes, the backup software told them that it couldn’t read them. The only way
to ensure that this never happens to you is to run regular verification tests against your
media. There are several different types of verification:
- Reading part of the volume and comparing it
-
There is at least one major vendor that works this way. If you turn on media
verification, it forwards to the end of the volume and reads a file or two. It
compares those files against what it believes should be there. This is obviously the
lowest level of verification.
- Comparing table of contents to index
-
This is a step up from the first type of verification. It is the equivalent of
doing a tar tvf. It does not verify the contents of the file;
it verifies only that the backup software can read the header of the file.
- Comparing contents of backup against contents of filesystem
-
This type of verification is common in low-end PC backup software. Basically,
the backup software looks at its backup of a particular filesystem, then compares
its contents against the actual contents of the filesystem. Some software packages
that do this will automatically back up any files that are different from what’s on
the backup or that do not exist on the backup. This type of verification is very
difficult, because most systems are changing constantly.
- Comparing checksum to index
-
Some backup software products record a checksum for each file that they back up.
They then are able to read the backup volume and compare the checksum of the file on
the volume with the checksum that is recorded in the index for that file. This makes
sure that the file on the backup volume will be readable when the time comes.
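
The last type of verification, comparing checksums to the index, can be sketched as follows. This is only an illustration under stated assumptions: the volume is modeled as a dictionary of readable streams, the index as a dictionary of recorded checksums, and SHA-256 is an arbitrary choice. Real products use their own volume formats and checksum algorithms.

```python
import hashlib
import io
from typing import BinaryIO, Dict, List


def checksum(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Read the stream in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_volume(volume: Dict[str, BinaryIO], index: Dict[str, str]) -> List[str]:
    """Return the paths that are missing from the volume or fail the checksum."""
    bad: List[str] = []
    for path, expected in index.items():
        stream = volume.get(path)
        if stream is None or checksum(stream) != expected:
            bad.append(path)
    return bad


# Checksums recorded in the (hypothetical) index at backup time.
index = {
    "/etc/hosts": hashlib.sha256(b"127.0.0.1 localhost\n").hexdigest(),
    "/etc/motd": hashlib.sha256(b"welcome\n").hexdigest(),
}
# What is read back from the volume later; /etc/motd has been damaged.
volume = {
    "/etc/hosts": io.BytesIO(b"127.0.0.1 localhost\n"),
    "/etc/motd": io.BytesIO(b"welc\x00me\n"),
}
print(verify_volume(volume, index))  # ['/etc/motd']
```

Unlike a table-of-contents check, this reads every byte of every file, which is why it takes longer and why it catches damage that a header check would miss.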
Cost
The pricing aspect of backup software is too complex to cover in detail here, but
suffice it to say that there are a number of factors that may be included in the total
price, depending on which vendor you buy from:
- The number of clients that you want to back up
- The number of backup drives you wish to use
- What type of backup drives you want to use (high-speed devices often cost
more)
- The number of libraries and the number of drives and slots that they have
- The size of the systems (in CPU power)
- The speed of backup that you need
- The number of database servers you have
- The number of different types of databases that you have
- The number of other special-treatment clients (e.g., MVS, Back Office) that
require special interfaces
- The type of support you expect (24/7, 8/5, etc.)