In the field of information
technology, backup refers to the copying of data so that
these additional copies may be restored after a data loss
event. Backups are useful primarily for two purposes: to restore a
computer to an operational state following a disaster (called
disaster recovery) and to restore small numbers of files after they
have been accidentally deleted or corrupted. Backups differ from
archives in the sense that archives are the primary copy of data and
backups are a secondary copy of data. Backup systems differ from
fault-tolerant systems in the sense that backup systems assume that
a fault will cause a data loss event and fault-tolerant
systems assume a fault will not. Backups are typically the
last line of defence against data loss, and consequently the
least granular and the least convenient to use.
Since a backup system contains at least one
copy of all data worth saving, the data storage requirements are considerable.
Organizing this storage space and managing the backup process is a complicated
undertaking.
Storage, the base of a backup system
Data repository models
Any backup strategy starts with a concept of a
data repository. The backup data needs to be stored somehow and probably should
be organized to a degree. It can be as simple as a sheet of paper with a list of
all backup tapes and the dates they were written or a more sophisticated setup
with a computerized index, catalogue, or relational database. Different
repository models have different advantages. This is closely related to choosing
a backup rotation scheme.
- Unstructured
- An unstructured repository may simply be a
stack of floppy disks or CD-R media with minimal information about what was
backed up and when. This is the easiest to implement, but probably the least
likely to achieve a high level of recoverability.
- Full + Incrementals
- A Full + Incremental repository aims to
make storing several copies of the source data more feasible. At first, a
full backup (of all files) is taken. After that an incremental
backup (of only the files that have changed since the previous backup) can
be taken. Restoring whole systems to a certain point in time would require
locating the full backup taken previous to that time and all the incremental
backups taken between that full backup and the particular point in time to
which the system is supposed to be restored. This model offers a high level
of assurance that data can be restored, and it can be used with removable
media such as tapes and optical disks. The downside is dealing with a long
series of incrementals and the high storage requirements.
- Mirror + Reverse Incrementals
- A Mirror + Reverse Incrementals repository
is similar to a Full + Incrementals repository. The difference is instead of
an aging full backup followed by a series of incrementals, this model offers
a mirror that reflects the system state as of the last backup and a history
of reverse incrementals. One benefit of this is that it only requires an
initial full backup. Each incremental backup is immediately applied to the
mirror, and the files it replaces are moved to a reverse incremental. This
model is not well suited to removable media, since every backup must be done
in comparison to the mirror.
- Continuous data protection
- This model takes it a step further and
instead of scheduling periodic backups, the system immediately logs every
change on the host system.
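The restore procedure described above for a Full + Incrementals repository can be sketched as follows. This is only an illustration: the `Backup` record, the `restore_chain` function and the catalogue dates are hypothetical, and a real catalogue would also track media and file lists.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    taken_at: datetime
    kind: str  # "full" or "incremental" (hypothetical labels)


def restore_chain(backups, target):
    """Return the backups needed to restore to `target`, oldest first."""
    eligible = sorted(
        (b for b in backups if b.taken_at <= target),
        key=lambda b: b.taken_at,
    )
    # Positions of all full backups taken at or before the target time.
    full_positions = [i for i, b in enumerate(eligible) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup exists at or before the target time")
    # The latest such full backup plus every incremental taken after it.
    return eligible[full_positions[-1]:]


catalogue = [
    Backup(datetime(2006, 1, 1), "full"),
    Backup(datetime(2006, 1, 2), "incremental"),
    Backup(datetime(2006, 1, 3), "incremental"),
    Backup(datetime(2006, 1, 8), "full"),
    Backup(datetime(2006, 1, 9), "incremental"),
]
chain = restore_chain(catalogue, datetime(2006, 1, 3, 12))
```

Here `chain` holds the full backup of 1 January followed by the two incrementals of 2 and 3 January. The longer the interval between full backups, the longer this chain becomes.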
Storage media
Regardless of the repository model that is
used, the data has to be stored on some data storage medium somewhere.
- Magnetic tape
- Magnetic tape has long been the most
commonly used medium for bulk data storage, backup, archiving, and
interchange. Tape has typically had an order of magnitude better
capacity/price ratio when compared to hard disk, but recently the ratios for
tape and hard disk have become a lot closer. There are myriad formats, many
of which are proprietary or specific to certain markets like mainframes or a
particular brand of personal computers. Tape is a sequential access medium,
so even though access times may be poor, the rate of continuously writing or
reading data can actually be very fast. Some new tape drives are even faster
than modern hard disks.
- Hard disk
- The capacity/price ratio of hard disk has
been rapidly improving for many years. This is making it more competitive
with magnetic tape as a bulk storage medium. The main advantages of hard
disk storage are the high capacity and low access times.
- Optical disk
- A CD-R can be used as a backup device. One
advantage of CDs is that they can hold 650 MiB of data on a 12 cm (4.75")
reflective optical disc. (This is equivalent to 12,000 images or 200,000
pages of text.) They can also be restored on any machine with a CD-ROM
drive. CDs may all look the same, but there are different file formats for
different applications. Another common format is DVD-R. Many optical disk
formats are WORM type, which makes them useful for archival purposes since
the data can't be changed.
- Floppy disk
- During the period 1975–95, most
personal/home computer users associated backup mostly with copying floppy
disks. The low data capacity of floppy disks makes them an unpopular choice
in 2006.
- Solid state storage
- Also known as flash memory, thumb drives,
USB keys, compact flash, smart media, memory stick, Secure Digital cards,
etc., these devices are relatively costly for their low capacity, but offer
excellent portability and ease-of-use.
- Remote backup
- As broadband internet access becomes more
widespread, network and remote backup/online backups are gaining in
popularity. Backing up online via the internet to a remote location can
eliminate some worst-case scenarios, such as someone's study burning down to
the ground along with the computer, its hard drive, and any on-site backup
disks. Several companies now offer encrypted, secure, synchronised backup
services. A drawback to this type of backup is that the speed of an internet
connection is usually substantially slower than the speed of local data
storage devices, so this can be a roadblock for people with large amounts of
data. It also carries the risk of losing control over personal or sensitive
data.
Managing the data repository
Regardless of the data repository model or data
storage media used for backups, a balance needs to be struck between
accessibility, security and cost.
- On-line
- On-line storage (sometimes called
secondary storage) is typically the most accessible type of data storage. A
good example would be a large disk array. This type of storage is very
convenient and speedy, but is relatively expensive and is typically located
in close proximity to the systems being backed up. This proximity is a
problem in the case of a disaster.
- Near-line
- Near-line storage (sometimes called
tertiary storage) is typically less accessible and less expensive than
on-line storage, but still useful for backup data storage. A good example
would be a tape library. A mechanical device is usually involved in moving
media units from storage into a drive where the data can be read or written.
- Off-line
- Off-line storage is similar to near-line,
except it requires human interaction to make storage media available. This
can be as simple as storing backup tapes in a file cabinet.
- Off-site vault
- To protect against a disaster or other
site-specific problem, many people choose to send backup media to an
off-site vault. The vault can be as simple as the System Administrator's
home office or as sophisticated as a disaster hardened, temperature
controlled, high security bunker that has facilities for backup media
storage. An example off-site vault vendor is Iron Mountain.
- Data Recovery Center
- In the event of a disaster, the data on
backup media will not be sufficient to recover. Computer systems onto which
the data can be restored and proper networks are necessary too. Some
organizations have their own data recovery centers that are equipped for
this scenario. Other organizations contract this out to a third-party
recovery center. An example third-party recovery center is SunGard.
Selection, access, and manipulation of data
Approaches to backing up files
Deciding what to back up at any given time is a
harder process than it seems. If too much redundant data is backed up, the data
repository will fill up too quickly. If not enough data is backed up, critical
information can get lost. The key concept is to back up only files that have
changed.
- Copying files
- Just copy the files in question somewhere.
- File system dump
- Copy the file system that holds the files
in question somewhere. This usually involves unmounting the file system and
running a program like
dump. This is also known as a raw partition backup. This type of
backup has the possibility of running faster than a backup that simply
copies files. A feature of some dump software is the ability to restore
specific files from the dump image.
- Identification of changes
- Some file systems have an archive bit for
each file that says it was recently changed. Some backup software looks at
the date of the file and compares it with the last backup, to determine
whether the file was changed.
- Block Level Incremental
- A more sophisticated method of backing up
changes to files is to back up only the blocks within the file that changed.
This requires a higher level of integration between the file system and the
backup software.
- Versioning file system
- A versioning file system keeps track of
all changes to a file and makes those changes accessible to the user. This
is a form of backup that is integrated into the computing environment.
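The date-comparison approach described under "Identification of changes" might look like the sketch below. The function name `files_changed_since` is hypothetical, and the sketch assumes that the time of the last backup is available as seconds since the epoch.

```python
import os


def files_changed_since(root, last_backup_time):
    """Yield paths under `root` whose modification time is newer than
    `last_backup_time` (seconds since the epoch)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_mtime > last_backup_time:
                    yield path
            except OSError:
                # The file vanished or is unreadable; a real tool would log this.
                continue
```

Relying on modification times alone has limits: a file whose timestamp has been reset, or a file that was deleted or moved, is not detected by this comparison.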
Approaches to backing up live data
If a computer system is in use while it is
being backed up, the possibility of files being open for reading or writing is
real. If a file is open, the contents on disk may not correctly represent what
the owner of the file intends. This is especially true for database files of all
kinds.
- Snapshot - copy-on-write
- A snapshot is an instantaneous function of
some file systems that presents a copy of the file system as if it were
frozen at a specific point in time. Closing all files, taking a snapshot,
then reopening the files and running the backup on the snapshot is an
effective way to work around this problem.
- Open file backup - file locking
- Many backup software packages feature the
ability to back up open files. Some simply check for openness and try again
later.
- Hot database backup
- Some database management systems offer a
means to generate a backup image of the database while it is online and
useable ("hot"). This usually includes a consistent image of the data files
at a certain point in time plus a log of changes made while the procedure is
running.
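As one concrete example of an online backup facility, SQLite exposes a backup API that Python's standard `sqlite3` module makes available as `Connection.backup`. The sketch below uses in-memory databases and a made-up `notes` table purely for illustration. It shows the database remaining usable after being copied, but it does not demonstrate the change log that larger database systems keep during the procedure.

```python
import sqlite3

# A live database with some data (in memory here, purely for illustration).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
source.execute("INSERT INTO notes (body) VALUES ('first'), ('second')")
source.commit()

# Copy the database while the source connection stays open and usable.
target = sqlite3.connect(":memory:")
source.backup(target)

# The source keeps accepting writes after the backup has completed.
source.execute("INSERT INTO notes (body) VALUES ('third')")
source.commit()

backed_up_rows = target.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
live_rows = source.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
```

The backup copy reflects the state at the time of the copy (two rows), while the live database has since moved on (three rows).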
Backing up non-file data
Not all information stored on the computer is
stored in files. Accurately recovering a complete system from scratch requires
keeping track of this non-file data too.
- System description
- System specifications are needed to
procure an exact replacement after a disaster.
- File metadata
- Each file's permissions, owner, group,
ACLs, and any other metadata need to be backed up for a restore to properly
recreate the original environment.
- Partition layout
- The layout of the original disk, as well
as partition tables and file system settings, is needed to properly recreate
the original system.
- Boot sector
- The boot sector can sometimes be recreated
more easily than saving it. Still, it usually isn't a normal file and the
system won't boot without it.
- Deleted files
- How does one back up the fact that a file
once existed (and could be restored from backups) but is now deleted from
the system and shouldn't be part of any potential restore?
- Moved files
- How does one back up the fact that a file
has moved?
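For the file metadata item above, a minimal sketch of what a backup tool might record alongside each file's contents is given below. The function name `capture_metadata` and the dictionary layout are hypothetical. ACLs and extended attributes are not covered, because reading them is platform specific.

```python
import os
import stat


def capture_metadata(path):
    """Return the ownership and permission details a restore would need."""
    info = os.lstat(path)  # lstat describes a symlink itself, not its target
    return {
        "path": path,
        "mode": stat.S_IMODE(info.st_mode),  # permission bits only
        "uid": info.st_uid,
        "gid": info.st_gid,
        "mtime": info.st_mtime,
    }
```

On restore, the saved values could be reapplied with calls such as `os.chmod` and `os.chown`, the latter usually requiring elevated privileges.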
Manipulating the backed up data
It is frequently useful to manipulate the
backed up data to optimize the backup process. These manipulations can improve
backup speed, restore speed, data security, and media usage.
- Compression
- Data compression can be very useful for
fitting the maximum amount of source data onto a limited amount of backup
storage media. Compression is frequently performed by tape drives
transparently.
- De-duplication
- When multiple similar systems are backed
up to the same destination storage device, there exists the potential for
much redundancy within the backed up data. If 20 Windows workstations were
backed up to the same data repository, they might share a common set of
system files. The data repository really only needs to store one copy of
those files to be able to restore any one of those workstations. This
technique can be applied at the file level or even on raw blocks of data,
potentially resulting in a massive reduction in required storage space.
- Duplication
- Sometimes backup jobs are duplicated to a
second set of storage media. This can be done to rearrange the backup images
to optimize restore speed, to have a second copy for safe keeping in a
different location or on a different storage medium.
- Encryption
- High capacity removable storage media such
as backup tapes present a data security risk if they are lost. Encrypting
the data on these media can mitigate this problem, but presents new
problems. First, encryption is a CPU-intensive process that can slow down
backup speeds. Second, once data has been encrypted, it cannot be
effectively compressed, so any compression has to be applied before
encryption. Third, the security of the encrypted backups is only
as effective as the security of the key management policy.
- Staging
- Sometimes backup jobs are copied to a
staging disk before being copied to tape. This can be useful if there is a
problem matching the speed of the final destination device with the source
system as is frequently faced in network-based backup systems.
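File-level de-duplication, as described above for the 20 workstations, can be illustrated with a toy content-addressed store. The `DedupStore` class, the host names and the file paths are all hypothetical. A real repository would keep its data on disk and would guard against hash collisions more carefully.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical contents are kept once."""

    def __init__(self):
        self.blobs = {}    # digest -> file contents
        self.catalog = {}  # (host, path) -> digest

    def put(self, host, path, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # store only if not seen before
        self.catalog[(host, path)] = digest

    def get(self, host, path):
        return self.blobs[self.catalog[(host, path)]]


store = DedupStore()
for n in range(20):
    host = f"ws{n:02d}"
    # The same system file on every workstation, plus one unique file each.
    store.put(host, "C:/Windows/system.dll", b"shared system file")
    store.put(host, "C:/Users/me/notes.txt", f"notes from {host}".encode())
```

The catalogue lists 40 files, but only 21 distinct blobs are stored: one shared system file and 20 unique documents. Any workstation can still be restored in full through `get`.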
Managing the backup process
It is important to understand that backup is a
process. As long as new data is being created and changes are being made,
backups will need to be updated. Individuals and organizations with anything
from one computer to thousands (or even millions) of computer systems all have
requirements for protecting data. While the scale is different, the objectives
and limitations are essentially the same. Likewise, those who perform backups
need to know to what extent they were successful, regardless of scale.
Objectives
- Recovery Point Objective (RPO)
- The point in time that the restarted
infrastructure will reflect. Essentially, this is the roll-back that will be
experienced as a result of the recovery. The most desirable RPO would be the
point just prior to the data loss event. Making a more recent recovery point
achievable requires increasing the frequency of synchronization between the
source data and the backup repository.
- Recovery Time Objective (RTO)
- The amount of time elapsed between
disaster and recovery.
- Data security
- In addition to preserving access to data
for its owners, data must be restricted from unauthorized access. Backups
must be performed in a manner that does not compromise the original owner's
undertaking.
Limitations
An effective backup scheme will take into
consideration the limitations of the situation. All backup schemes have some
impact on the system being backed up. If this impact is significant, the backup
needs to be time-limited to a convenient backup window or alternate means of
protecting data need to be employed. These alternate means tend to be more
expensive. All types of storage media have a finite capacity with a real cost.
Matching the correct amount of storage capacity (over time) with the backup
needs is an important part of the design of a backup scheme. Likewise, limited
network bandwidth comes into play with distributed backup systems.
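Whether a backup fits into its window can be estimated with simple arithmetic. The sketch below uses hypothetical figures (500 GB of data, 50 MB/s or 10 MB/s of sustained throughput, an 8-hour window) and ignores compression, start-up overhead and contention, all of which matter in practice.

```python
def backup_duration_hours(data_bytes, throughput_bytes_per_second):
    """Estimated time to move `data_bytes` at a sustained throughput."""
    return data_bytes / throughput_bytes_per_second / 3600


def fits_in_window(data_bytes, throughput_bytes_per_second, window_hours):
    """True if the estimated duration does not exceed the backup window."""
    duration = backup_duration_hours(data_bytes, throughput_bytes_per_second)
    return duration <= window_hours


# Hypothetical figures: 500 GB of data and an 8-hour window.
local_ok = fits_in_window(500e9, 50e6, 8)   # about 2.8 hours at 50 MB/s
remote_ok = fits_in_window(500e9, 10e6, 8)  # about 13.9 hours at 10 MB/s
```

With these figures the faster link fits comfortably, while the slower link overruns the window, which is the kind of situation where an incremental scheme or an alternate means of protection becomes necessary.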
Implementation
Meeting the defined objectives in the face of
the above limitations can be a difficult task. The tools and concepts below can
make that task more achievable.
- Scheduling
- Using a job scheduler can greatly improve
the reliability and consistency of backups. Many backup software packages
include this functionality.
- Authentication
- Over the course of regular operations, the
user accounts and/or system agents that perform the backups need to be
authenticated at some level. The power to copy all data off of or onto a
system requires unrestricted access. Using an authentication mechanism is a
good way to prevent the backup scheme from being used for unauthorized
activity.
- Chain of trust
- Removable storage media are physical items
and must only be handled by trusted individuals. Establishing a chain of
trusted individuals (and vendors) is critical to defining the security of
the data.
Measuring the process
To ensure that the backup scheme is working as
expected, the process needs to include monitoring key factors and maintaining
historical data.
- Backup validation
- (also known as "Backup Success
Validation") The process by which owners of data can get information
regarding how their data was backed up. This same process is also used to
prove compliance to regulatory bodies outside of the organization; for
example, a biotech company might be required to show "proof" to the Food and
Drug Administration (FDA) that its test result data are backed up
properly. Terrorism, data complexity, data value and increasing dependence
upon ever-growing volumes of data all contribute to the anxiety around and
dependence upon successful backups. For that reason, many organizations rely
on third-party or "independent" solutions to test, validate, optimize and
charge for their backup operations (backup reporting). Some modern
backup-to-disk software has built-in validation capabilities.
- Reporting
- In larger configurations, reports are
useful for monitoring media usage, device status, errors, vault coordination
and other information about the backup process.
- Logging
- In addition to the history of computer
generated reports, activity and change logs are useful for understanding the
backup system better.
- Verification
- Many backup programs make use of checksums
or hashes. These offer several advantages. First, they allow data integrity
to be verified without reference to the original file: if the file as stored
on the backup medium has the same checksum as the saved value, then it is
very probably correct. Second, some backup programs can use checksums to
avoid making redundant copies of files, to improve backup speed. This is
particularly useful for the de-duplication process.
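The checksum verification described above can be sketched with Python's standard `hashlib` module. SHA-256 is used here as an assumption; the function names are hypothetical, and backup programs differ in which checksum or hash they use.

```python
import hashlib


def file_checksum(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path, saved_checksum):
    """True if the stored copy still matches the checksum saved at backup time."""
    return file_checksum(path) == saved_checksum
```

The checksum would be computed and saved when the backup is written. A later call to `verify_backup` then checks the stored copy without needing access to the original file.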
Lore
Advice
- The more important the data that are
stored in the computer the greater is the need for backing up these data.
- A backup is only as useful as its
associated restore strategy.
- Storing the copy near the original is
unwise, since many disasters such as fire, flood and electrical surges are
likely to cause damage to the backup at the same time.
- Automated backup should be considered, as
manual backups are affected by human error.
Events
- The September 11, 2001 attacks on the
World Trade Center presented many organizations with unprecedented disaster
recovery scenarios, due to their scope.
- A few years earlier, during a fire at the
headquarters of Credit Lyonnais, a major bank in Paris, system
administrators ran into the burning building to rescue backup tapes because
they didn't have offsite copies.
- Privacy Rights Clearinghouse has
documented:
- 9 instances of stolen or lost backup
tapes (among major organizations) in 2005. Affected organizations
included Bank of America, Ameritrade, Citigroup, and Time Warner.
- 7 instances of stolen or lost backup
tapes (among major organizations) in 2006.
Glossary of backup terms
- Backup policy
- An organization's procedures and rules for
ensuring that adequate amounts and types of backups are made, including
suitably frequent testing of the process for restoring the original
production system from the backup copies.
- Backup rotation scheme
- A method for effectively backing up data
where multiple media are systematically moved from storage to usage in the
backup process and back to storage. There are several different schemes.
Each takes a different approach to balance the need for a long retention
period with frequently backing up changes. Some schemes are more complicated
than others.
- Backup software
- Computer software applications that are
used for performing the backing up of data, i.e., the systematic generation
of backup copies.
- Backup window
- The period of time that a system is
available to perform a backup procedure. Backup procedures can have
detrimental effects to system and network performance, sometimes requiring
the primary use of the system to be suspended. These effects can be
mitigated by arranging a backup window with the users or owners of the
system.
- Copy backup
- Term for full backup used by Windows
Server 2003.
- Cumulative incremental backup
- Term for a differential backup used by
NetBackup.
- Daily backup
- Term for incremental backup used by
Windows Server 2003.
- Data salvage
- The process of recovering data from
storage devices when the normal operational methods are impossible. This
process is typically performed by specialists in controlled environments
with special tools. For example, a crashed hard disk may still have data on
it even though it doesn't work properly. A data salvage specialist might be
able to recover much of the original data by opening it up in a clean room
and tinkering with the internal parts.
- Differential backup
- A cumulative backup of all changes made
since the last full backup. The advantage to this is the quicker recovery
time, requiring only a full backup and the latest differential backup to
restore the system. The disadvantage is that for each day elapsed since the
last full backup, more data needs to be backed up, especially if a majority
of the data has been changed.
- Differential incremental backup
- Term for an incremental backup used by
NetBackup.
- Disaster recovery
- The process of recovering after a business
disaster and restoring or recreating data. One of the main purposes of
creating backups is to facilitate a successful disaster recovery. For
maximum effectiveness, this process should be planned in advance and
audited.
- Disk image
- A method of backing up a whole disk or
file system in a single image. Since the underlying data structures are what
is actually backed up, this method does not allow for file level control
over what is selected for backup or restore.
- FlashBackup
- Term for raw partition backup used
by NetBackup Advanced Client. In NBAC, support is limited to the VxFS (Veritas),
ufs (Solaris), Online JFS (HP-UX), and NTFS (Windows) file system types.
Similar to the UNIX utility
dump.
- Full backup
- A backup of all (selected) files on the
system. In contrast to a drive image, this does not include the file
allocation tables, partition structure and boot sectors.
- Incremental backup
- A backup that only contains the files that
have changed since the most recent backup (either full or incremental). The
advantage of this is quicker backup times, as only changed files need to be
saved. The disadvantage is longer recovery times, as the latest full backup,
and all incremental backups up to the date of data loss need to be restored.
- Media spanning
- Sometimes a backup job is larger than a
single destination storage medium. In this case, the job must be broken up
into fragments that can be distributed across multiple storage media.
- Multiplexing
- The practice of combining multiple backup
data streams into a single stream that can be written to a single storage
device. For example, backing up 4 PCs to a single tape drive at once.
- Multi-streaming
- The practice of creating multiple
backup data streams from a single system to multiple storage devices. For
example, backing up a single database to 4 tape drives at once.
- Normal backup
- Term for full backup used by Windows
Server 2003.
- Near store
- Provisionally backing up data to a local
staging backup device, possibly for later archival backup to a remote store
device.
- Open file backup
- Term for the ability to back up a file
while it is in use by another application.
- Remote store
- Backing up data to an offsite permanent
backup facility, either directly from the live data source or else from an
intermediate near store device.
- Restore time
- The amount of time required to bring a
desired data set back from the backup media.
- Retention time
- The amount of time in which a given set of
data will remain available for restore. Some backup products rely on daily
copies of data and measure retention in terms of days. Others retain a
number of copies of data changes regardless of the amount of time.
- Synthetic backup
- Term used by NetBackup for a restorable
backup image that is synthesized on the backup server from a previous full
backup and all the incremental backups since then. It is equivalent to what
a full backup would be if it were taken at the time of the last incremental
backup.
- Tape library
- A storage device which contains tape
drives, slots to hold tape cartridges, a barcode reader to identify tape
cartridges and an automated method for physically moving tapes within the
device. These devices can store immense amounts of data.
- True image restore
- Term used by NetBackup for the collection
of file deletion and file movement records so that an accurate restore can
be performed. For instance, consider a system that has a directory with 5
documents in it on Friday. On Saturday, the system gets a full backup that
includes those 5 documents. On Monday, the owner of those documents deletes
2 of them and updates 1 of the 3 remaining. That updated document gets
backed up as part of the Monday night incremental backup. On Tuesday
afternoon the system crashes. If we perform a normal restore of the full
backup from Saturday and the incremental backup from Monday to the fresh
system, we will have restored the 2 documents that were intentionally
deleted. True image restore keeps track of the deletions with each
incremental backup and prevents the deleted files from being inappropriately
restored.
- Virtual Tape Library (VTL)
- A storage device that appears to be a tape
library to backup software, but actually stores data by some other means. A
VTL can be configured as a temporary storage location before data is
actually sent to real tapes or it can be the final storage location itself.
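The five-document scenario in the true image restore entry above can be worked through in code. This is a sketch of the idea only, not NetBackup's implementation: the `restore` function and the `changed`/`present` record layout are hypothetical, and recording the set of names present at each incremental is just one way to represent deletions.

```python
def restore(full, incrementals, true_image=False):
    """Rebuild {name: contents} from a full backup plus incrementals.

    Each incremental is a dict with "changed" ({name: contents}) and
    "present" (the set of names that existed when it was taken).
    """
    files = dict(full)
    for inc in incrementals:
        files.update(inc["changed"])
        if true_image:
            # Drop anything that no longer existed at this backup.
            files = {n: c for n, c in files.items() if n in inc["present"]}
    return files


# Saturday: full backup of five documents.
saturday_full = {f"doc{i}": "v1" for i in range(1, 6)}
# Monday: doc1 and doc2 were deleted and doc3 was updated.
monday_incremental = {
    "changed": {"doc3": "v2"},
    "present": {"doc3", "doc4", "doc5"},
}

normal = restore(saturday_full, [monday_incremental])
accurate = restore(saturday_full, [monday_incremental], true_image=True)
```

The normal restore brings back all five documents, including the two that were intentionally deleted, whereas the true image restore yields only the three documents that existed on Monday night, with the updated version of `doc3`.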