Chapter 11

Backups

Like most of those who study history, he (Napoleon III) learned from the mistakes of the past how to make new ones.

A.J.P. Taylor.

Introduction

This is THE MOST IMPORTANT responsibility of the System Administrator. Backups MUST be made of all the data on the system.  It is inevitable that equipment will fail and that users will "accidentally" delete files.  There should be a safety net so that important information can be recovered.

It isn't just users who accidentally delete files

A friend of mine who was once the administrator of a UNIX machine (and shall remain nameless, but is now a respected Academic at CQU) committed one of the great no-no's of UNIX Administration. 

Early on in his career he was carefully removing numerous old files for some obscure reason when he entered commands resembling the following (he was logged in as root when doing this).

cd / usr/user/panea     notice the mistake
rm -r *  

The first command contained a typing mistake (the extra space) that meant that instead of being in the directory /usr/user/panea he was now in the / directory.  The second command says delete everything in the current directory and any directories below it.  Result: a great many files removed.

The moral of this story is that everyone makes mistakes.  Root users, normal users, hardware and software all make mistakes, break down or have faults.  This means you must keep backups of any system.
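
One way to make this kind of slip less destructive is to name the directory explicitly in the deletion, rather than relying on whatever the current directory happens to be.  The following is only a minimal sketch: the function name safe_clean is hypothetical (it is not a standard command), and files whose names begin with a dot are not removed.

```shell
# safe_clean DIR: remove the contents of DIR, but refuse an empty name or /.
# (safe_clean is a hypothetical helper written for this example.)
safe_clean() {
    target=$1
    case "$target" in
        ""|/)
            echo "safe_clean: refusing to clean '$target'" >&2
            return 1
            ;;
    esac
    if [ ! -d "$target" ]; then
        echo "safe_clean: $target is not a directory" >&2
        return 1
    fi
    # The directory is named explicitly, so a failed or mistyped cd cannot
    # redirect the deletion to some other directory.
    rm -r -- "$target"/*
}
```

With the mistyped arguments from the story (safe_clean / usr/user/panea) the function sees / as its first argument and refuses to do anything.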

Characteristics of a good backup strategy

Backup strategies change from site to site.  What works on one machine may not be possible on another.  There is no standard backup strategy.  There are however a number of characteristics that need to be considered including

§         ease of use,

§         time efficiency,

§         ease of restoring files,

§         ability to verify backups,

§         tolerance of faulty media, and

§         portability to a range of machines.

Ease of use

If backups are easy to use, you will use them. AUTOMATE!!  It should be as easy as placing a tape in a drive, typing a command and waiting for it to complete.  In fact you probably shouldn't have to enter the command; it should be run automatically.
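
As a rough illustration of "one command", the sketch below wraps the tar command (introduced later in this chapter) in a small shell function that names the archive after the date and records the result in a log file.  The function name run_backup, the archive naming scheme and the log file name are all assumptions made for this example, not part of any standard tool.

```shell
# run_backup SRC DESTDIR: archive SRC into DESTDIR under a dated name and
# append a one-line result to DESTDIR/backup.log.
# (run_backup and the file names it uses are assumptions for this sketch.)
run_backup() {
    src=$1
    destdir=$2
    archive="$destdir/backup-$(date +%Y%m%d).tar"
    # -C changes directory first so the archive holds relative path names.
    if tar -cf "$archive" -C "$(dirname "$src")" "$(basename "$src")"; then
        echo "$(date) OK $archive" >> "$destdir/backup.log"
    else
        echo "$(date) FAILED $src" >> "$destdir/backup.log"
        return 1
    fi
}
```

A scheduler such as cron could then call run_backup with the same two arguments every night, so that nobody has to remember the details.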

When backups are too much work

At many large computing sites operators are employed to perform low-level tasks like looking after backups.  Looking after backups generally involves obtaining a blank tape, labelling it, placing it in the tape drive and then storing it away.

A true story told by an experienced Systems Administrator concerns an operator who thought backups took too long to perform.  To solve this problem the operator decided that backups finished much more quickly if you didn't bother putting the tape in the tape drive.  You just labelled the blank tape and placed it in storage. 

Quite alright as long as you don't want to retrieve anything from the backups.

Time efficiency

Obtain a balance to minimise the amount of operator, real and CPU time taken to carry out the backup and to restore files.  The typical tradeoff is that a quick backup implies a longer time to restore files.  Keep in mind that you will in general perform more backups than restores.

On some large sites, particular backup strategies fail because there aren’t enough hours in a day.  Backups scheduled to occur every 24 hours fail because the previous backup still hasn't finished.  This obviously occurs at sites which have large disks.

Ease of restoring files

The reason for doing backups is so you can get information back.  You will have to be able to restore information ranging from a single file to an entire file system.  You need to know on which media the required file is and you need to be able to get to it quickly.

This means that you will need to maintain a table of contents and label media carefully.


Ability to verify backups

YOU MUST VERIFY YOUR BACKUPS.  The safest method is, once the backup is complete, to read the information back from the media and compare it with the information stored on the disk.  If it isn’t the same then the backup is not correct.

Well that is a nice theory but it rarely works in practice.  This method is only valid if the information on the disk hasn't changed since the backup started. This means the file system cannot be used by users while a backup is being performed or during the verification.  Keeping a file system unused for this amount of time is not often an option.

Other quicker methods include

§         restoring a random selection of files from the start, middle and end of the backup,
If these particular files are retrieved correctly the assumption is that all of the files are valid.

§         creating a table of contents during the backup, then afterwards reading the contents of the tape and comparing the two.

These methods also do not always work.  Under some conditions and with some commands the two methods will not guarantee that your backup is correct.
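
The table of contents method can be sketched as follows, using tar (covered later in this chapter) and an ordinary file in place of a tape.  The directory and file names are made up for the example, and, as noted above, a matching table of contents only shows that the file names made it onto the media, not that their contents are intact.

```shell
work=$(mktemp -d)
mkdir -p "$work/data/sub"
echo one > "$work/data/a.txt"
echo two > "$work/data/sub/b.txt"

# Record what should be in the archive (files only), relative to $work.
(cd "$work" && find data -type f | sort) > "$work/expected.toc"

# Create the archive, then read its table of contents back from the "media".
(cd "$work" && tar -cf backup.tar data)
tar -tf "$work/backup.tar" | grep -v '/$' | sort > "$work/actual.toc"

# Compare the two tables of contents.
if cmp -s "$work/expected.toc" "$work/actual.toc"; then
    toc_status=match
else
    toc_status=differ
fi
echo "table of contents: $toc_status"
```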

Tolerance of faulty media

A backup strategy should be able to handle

§         faults in the media, and

§         physical dangers.

There are situations where it is important that

§         there exist at least two copies of full backups of a system, and

§         that at least one set should be stored at another site.

Consider the following situation. 

A site has one set of full backups stored on tapes.  They are currently performing another full backup of the system onto the same tapes.  What happens if the backup system is happily churning away, gets about halfway and crashes (the power goes off, the tape drive fails etc)?  This could result in both the tape and the disk drive being corrupted.  Always maintain duplicate copies of full backups.

An example of the importance of storing backups off site was the Pauls ice-cream factory in Brisbane.  The factory is located right on the riverbank and during the early 1970's Brisbane suffered problems caused by a major flood.  The Pauls’ computer room was in the basement of their factory and was completely washed out.  All the backups were kept in the computer room.

 

Portability to a range of platforms

There may be situations where the data stored on backups must be retrieved onto a different type of machine.  The ability for backups to be portable to different types of machine is often an important characteristic. 

For example:

The computer currently being used by a company is the last in its line.  The manufacturer is bankrupt and no one else uses the machine.  Due to unforeseen circumstances the machine burns to the ground.  The Systems Administrator has recent backups available and they contain essential data for this business.  How are the backups to be used to reconstruct the system?

Considerations for a backup strategy

Apart from the above characteristics, factors that may affect the type of backup strategy implemented will include

§         the available commands
The characteristics of the available commands limit what can be done.

§         available hardware
The capacity of the backup media to be used also limits how backups are performed.  In particular how much information can the media hold?

§         maximum expected size of file systems
The amount of information required to be backed up and whether or not the combination of the available software and hardware can handle it.  A suggestion is that individual file systems should never contain more information than can fit easily onto the backup media.

§         importance of the data
The more important the data is, the more important that it be backed up regularly and safely.

§         level of data modification
The more data being created and modified, the more often it should be backed up.  For example the directories /bin and /usr/bin will hardly ever change so they rarely need backing up.  On the other hand directories under /home are likely to change drastically every day.

The components of backups

There are basically three components to a backup strategy.  The

§         scheduler
Decides when the backup is performed.

§         transport, and
The command that moves the backup from the disks to the backup media.

§         media
The actual physical device on which the backup is stored.

Scheduler

The scheduler is the component that decides when backups should be performed and how much should be backed up.  The scheduler could be the root user or a program, usually cron (discussed in a later chapter).

The amount of information that the scheduler backs up can have the following categories

§         full backups,
All the information on the entire system is backed up.  This is the safest type but also the most expensive in machine and operator time and the amount of media required.

§         partial backups, or
Only the busier and more important file systems are backed up.  One example of a partial backup might include configuration files (like /etc/passwd), user home directories and the mail and news spool directories.  The reasoning is that these files change the most and are the most important to keep track of.  In most instances this can still take substantial resources to perform.

§         incremental backups.
Only those files that have been modified since the last backup are backed up.  This method requires fewer resources but a large number of incremental backups makes it more difficult to locate the version of a particular file you may desire.
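
One simple way to select files for an incremental backup is to keep a time stamp file and ask find for everything modified more recently than it.  The sketch below is only an illustration: the directory layout and the names last-backup and this-backup are assumptions, and the unquoted $changed means it will not cope with file names containing spaces.

```shell
work=$(mktemp -d)
mkdir "$work/home"

# One file last changed before the previous backup, one changed after it.
echo old > "$work/home/old.txt"
touch -t 199912310000 "$work/home/old.txt"
touch -t 200001010000 "$work/last-backup"   # time of the previous backup
echo new > "$work/home/new.txt"

# Note when this backup starts, so files changed while it runs are
# picked up by the next incremental backup.
touch "$work/this-backup"

# Incremental selection: only files modified more recently than the stamp.
changed=$(cd "$work" && find home -type f -newer last-backup)
echo "changed since last backup: $changed"

# Archive just those files, then move the stamp forward for next time.
(cd "$work" && tar -cf incr.tar $changed)
mv "$work/this-backup" "$work/last-backup"
```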

Transport

The transport is a program that is responsible for placing the backed-up data onto the media.  There are quite a number of different programs that can be used as transports.  Some of the standard UNIX transport programs are examined later in this chapter.

There are two basic mechanisms that are used by transport programs to obtain the information from the disk

§         image, and

§         through the file system.

Image transports

An image transport program bypasses the file system and reads the information straight off the disk using the raw device file.  To do this, the transport program needs to understand how the information is structured on the disk.  This means that image transport programs are linked very closely to particular file systems, since different file systems structure information differently.

Once read off the disk, the data is written byte by byte from disk onto tape.  This generally means that backups are quicker than with the "file by file" method.  However restoration of individual files generally takes much more time. 

Transport programs that use this method include dd, volcopy and dump.

File by file

Commands performing backups using this method use the system calls provided by the operating system to read the information.  Since almost any UNIX system uses the same system calls, a transport program that uses the file by file method (and the data it saves) is more portable.

File by file backups generally take more time but it is generally easier to restore individual files.  Commands that use this method include tar and cpio.

Media

Backups are usually made to tape based media.  There are different types of tape.  Tape media can differ in

§         physical size and shape, and

§         amount of information that can be stored.
From 100MB up to 8GB.

Some types of media are also more reliable and efficient than others.  The most common type of backup media used today is the 4 millimetre DAT tape.

Reading

Under the Resource Materials section for Week 6 on the 85321 Web site/CD-ROM you will find a pointer to the USAIL resources on backups.  This includes a pointer to discussion about the different type of media which are available.

Commands

As with most things, the different versions of UNIX provide a plethora of commands that could possibly act as the transport in a backup system.  The following table provides a summary of the characteristics of the more common programs that are used for this purpose.


Command         Availability          Characteristics
dump/restore    BSD systems           image backup, allows multiple volumes,
                                      not included on most AT&T systems
tar             almost all systems    file by file, most versions do not support
                                      multiple volumes, intolerant of errors
cpio            AT&T systems          file by file, can support multiple volumes
                                      (some versions don't)

Table 11.1.
The Different Backup Commands.

There are a number of other public domain and commercial backup utilities available which are not listed here.


dump and restore

A favourite amongst many Systems Administrators, dump is used to perform backups and restore is used to retrieve information from the backups.

These programs are of BSD UNIX origin and have not made the jump across to SysV systems.  Most SysV systems do not come with dump and restore.  The main reason is that since dump and restore bypass the file system, they must know how the particular file system is structured.  So you can't simply recompile a version of dump from one machine on another (unless they use the same file system structure).

Many recent versions of systems based on SVR4 (the latest version of System V UNIX) come with versions of dump and restore.

dump on Linux

There is a version of dump for Linux.  However, you may not have it installed on your system.  RedHat 5.0 includes an RPM package which includes dump.  If your system doesn't have dump and restore installed you should install them now.  RedHat provides a couple of tools to install these packages: rpm and glint.  glint is the GUI tool for managing packages.  Refer to the RedHat documentation for more details on using these tools.

You will find the dump package under the Utilities/System folder.  Before you can install the dump package you will have to install the rmt package.

dump

The command line format for dump is

dump [ options [ arguments ] ] file system

dump [ options [ arguments ] ] filename

Arguments must appear after all options, in the same order as the options that require them.

dump is generally used to back up an entire partition (file system).  If given a list of filenames, dump will back up the individual files.

dump works on the concept of levels (it uses 10 levels, numbered 0 to 9).  A dump level of 0 means that all files will be backed up.  A dump level of 1 to 9 means that all files that have changed since the last dump of a lower level will be backed up.  Table 11.2 shows the arguments for dump.


 


Options           Purpose
0-9               dump level
a archive-file    archive-file will be a table of contents of the archive
f dump-file       specify the file (usually a device file) to write the dump to;
                  a - specifies standard output
u                 update the dump record (/etc/dumpdates)
v                 after writing each volume, rewind the tape and verify; the file
                  system must not be used during the dump or the verification

Table 11.2.
Arguments for dump

There are other options.  Refer to the man page for the system for more information. 

For example:

dump 0dsbfu 54000 6000 126 /dev/rst2 /usr

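This performs a full backup of the /usr file system onto a 2.3 Gig 8mm tape connected to device rst2.  The numbers are the arguments to the d (density), s (tape size) and b (blocking factor) options, in that order; they are special information about the tape drive the backup is being written on.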
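
The level scheme also decides which dumps have to be restored, and in what order, to rebuild a file system: a dump is only still needed if no later dump of the same or a lower level has been made.  The function below is a small sketch of that reasoning; needed_dumps is a hypothetical name and it works on level numbers only, not on real dump archives.

```shell
# needed_dumps LEVEL...: given the levels of the dumps taken so far, oldest
# first, print the levels of the dumps needed for a full restore, in the
# order they should be restored.
# (needed_dumps is a hypothetical helper written for this example.)
needed_dumps() {
    needed=""
    for level in "$@"; do
        kept=""
        for old in $needed; do
            # An earlier dump is superseded by a later dump of the same
            # or a lower level, so only keep strictly lower levels.
            if [ "$old" -lt "$level" ]; then
                kept="$kept $old"
            fi
        done
        needed="$kept $level"
    done
    echo $needed
}

needed_dumps 0 3 2 5 4    # prints: 0 2 4
```

In the example the level 3 dump is superseded by the later level 2 dump, and the level 5 dump by the later level 4 dump, so only three of the five dumps need to be read.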

The restore command

The purpose of the restore command is to extract files archived using the dump command.  restore provides the ability to extract single individual files, directories and their contents and even an entire file system.

restore -irRtx [ modifiers ] [ filenames ]

The restore command has an interactive mode where commands like ls etc can be used to search through the backup.


Arguments    Purpose
i            interactive; directory information is read from the tape, after
             which you can browse through the directory hierarchy and select
             files to be extracted
r            restore the entire tape; should only be used to restore an entire
             file system or to restore an incremental tape after a full level 0
             restore
t            table of contents; if no filename is provided, the root directory
             is listed including all subdirectories (unless the h modifier is
             in effect)
x            extract named files; if a directory is specified, it and all its
             sub-directories are extracted

Table 11.3.
Arguments for the restore Command.

 

 

Modifiers         Purpose
a archive-file    use an archive file to search for a file's location.  Convert
                  contents of the dump tape to the new file system format
d                 turn on debugging
h                 prevent hierarchical restoration of sub-directories
v                 verbose mode
f dump-file       specify dump-file to use, - refers to standard input
s n               skip to the nth dump file on the tape

Table 11.4.
Argument modifiers for the restore Command.

Using dump and restore without a tape

Not many of you will have tape drives or similar backup media connected to your Linux machine.  However, it is important that you experiment with the dump and restore commands to gain an understanding of how they work.  This section offers a little kludge which will allow you to use these commands without a tape drive.  The method relies on the fact that UNIX accesses devices through files.

Our practice file system

For all our experimentation with the commands in this chapter we are going to work with a practice file system.  Practising backups with hard-drive partitions is not going to be all that efficient as they will almost certainly be very large.  Instead we are going to work with a floppy drive.

The first step then is to format a floppy with the ext2 file system.  By now you should know how to do this.  Here's what I did to format a floppy and put some material on it.

[root@beldin]# /sbin/mke2fs /dev/fd0
mke2fs 1.10, 24-Apr-97 for EXT2 FS 0.5b, 95/08/09
Linux ext2 filesystem format
Filesystem label=
360 inodes, 1440 blocks
72 blocks (5.00%) reserved for the super user
First data block=1
Block size=1024 (log=0)
Fragment size=1024 (log=0)
1 block group
8192 blocks per group, 8192 fragments per group
360 inodes per group

Writing inode tables: done
Writing superblocks and filesystem accounting information: done
[root@beldin]# mount -t ext2 /dev/fd0 /mnt/floppy
[root@beldin]# cp /etc/passwd /etc/issue /etc/group /var/log/messages /mnt/floppy
[root@beldin dump-0.3]#

Doing a level 0 dump

So I've copied some important stuff to this disk.  Let's assume I want to do a level 0 dump of the /mnt/floppy file system.  How do I do it?

[root@beldin]# /sbin/dump 0f /tmp/backup /mnt/floppy
  DUMP: Date of this level 0 dump: Sun Jan 25 15:05:11 1998
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/fd0 (/mnt/floppy) to /tmp/backup
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 42 tape blocks on 0.00 tape(s).
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: DUMP: 29 tape blocks on 1 volumes(s)
  DUMP: Closing /tmp/backup
  DUMP: DUMP IS DONE

The arguments to the dump command are

§         0
This tells dump I wish to perform a level 0 dump of the file system.

§         f
This is telling dump that I will tell it the name of the file that it should write the backup to.

§         /tmp/backup
This is the name of the file I want the backup to go to.  Normally, this would be the device file for a tape drive or other backup device.  However, since I don't have one I'm telling it a normal file.

§         /mnt/floppy
This is the file system I want to backup.

What this means is that I have now created a file, /tmp/backup, which contains a level 0 dump of the floppy.

[root@beldin]# ls -l /tmp/backup
-rw-rw-r--   1 root     tty         20480 Jan 25 15:05 /tmp/backup

Restoring the backup

Now that we have a dump archive to work with, we can try using the restore command to retrieve files.

[root@beldin dump-0.3]# /sbin/restore -if /tmp/backup
restore > ?
Available commands are:
        ls [arg] - list directory
        cd arg - change directory
        pwd - print current directory
        add [arg] - add `arg' to list of files to be extracted
        delete [arg] - delete `arg' from list of files to be extracted
        extract - extract requested files
        setmodes - set modes of requested directories
        quit - immediately exit program
        what - list dump header information
        verbose - toggle verbose flag (useful with ``ls'')
        help or `?' - print this list
If no `arg' is supplied, the current directory is used
restore > ls
.:
group       issue       lost+found/ messages    passwd

restore > add passwd
restore > extract
You have not read any tapes yet.
Unless you know which volume your file(s) are on you should start
with the last volume and work towards towards the first.
 Specify next volume #: 1
Mount tape volume 1
Enter ``none'' if there are no more tapes
otherwise enter tape name (default: /tmp/backup)
set owner/mode for '.'? [yn] y
restore > quit
[root@beldin]# ls -l passwd
-rw-r--r--   1 root     root          787 Jan 25 15:00 passwd

Alternative

Rather than back up to a normal file on the hard-drive you could choose to back up files directly to a floppy drive (i.e. use /dev/fd0 rather than /tmp/backup).  One problem with this alternative is that you are limited to 1.44MB.  According to the "known bugs document" distributed with Linux dump, it does not yet support multiple volumes.

Exercises

11.1      Do a level 0 dump of a portion of your home directory.  Examine the file /etc/dumpdates.  How has it changed?

11.2      Use restore to retrieve some individual files from the backup and also to retrieve the entire backup.

The tar command

tar is a general purpose command used for archiving files.  It takes multiple files and directories and combines them into one large file.  By default the resulting file is written to a default device (usually a tape drive).  However the resulting file can be placed onto a disk drive.

tar -function[modifier] device [files]

The purpose and values for function and modifier are shown in Tables 11.5 through 11.7.

When using tar, each individual file stored in the final archive is preceded by a header of 512 bytes.  The end of the file is also padded so that it finishes on a 512-byte block boundary, which wastes half a block (256 bytes) on average.  For this reason, every file added into the tape archive carries on average an extra .75Kb of overhead. 
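
The arithmetic can be checked with a few lines of shell.  This is a sketch under the assumptions just stated (a 512-byte header and data padded to a multiple of 512 bytes); the function name tar_entry_size is made up for the example, and the end-of-archive marker that tar also writes is ignored.

```shell
# tar_entry_size BYTES: space taken in the archive by one file of BYTES
# bytes, assuming a 512-byte header and data padded up to a multiple of 512.
# (tar_entry_size is a hypothetical helper written for this example.)
tar_entry_size() {
    size=$1
    blocks=$(( (size + 511) / 512 ))    # data blocks, rounded up
    echo $(( 512 + blocks * 512 ))      # header block plus data blocks
}

tar_entry_size 1      # prints 1024
tar_entry_size 512    # prints 1024
tar_entry_size 513    # prints 1536
```

A one byte file therefore costs 1024 bytes on tape, which is why archives of many tiny files are much larger than the sum of the file sizes.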


 

Arguments    Purpose
function     A single letter specifying what should be done, values listed in
             Table 11.6
modifier     Letters that modify the action of the specified function, values
             listed in Table 11.7
files        The names of the files and directories to be restored or archived.
             If it is a directory then EVERYTHING in that directory is restored
             or archived

Table 11.5.
Arguments to tar.

 

Function    Purpose
c           create a new tape, do not write after last file
r           replace, the named files are written onto the end of the tape
t           table, information about specified files is listed, similar in
            output to the command ls -l, if no files specified all files listed
u *         update, named files are added to the tape if they are not already
            there or they have been modified since being previously written
x           extract, named files restored from the tape, if the named file
            matches a directory all the contents are extracted recursively

*  the u function can be very slow
Table 11.6.
Values of the function argument for tar.

 

Modifier    Purpose
v           verbose, tar reports what it is doing and to what
w           tar prints the action to be taken, the name of the file and waits
            for user confirmation
f           file, causes the device parameter to be treated as a file
m           modify, tells tar not to restore the modification times as they
            were archived but instead to use the time of extraction
o           ownership, use the UID and GID of the user running tar not those
            stored on the tape

Table 11.7.
Values of the modifier argument for tar.

If the f modifier is used it must be the last modifier used.  Also tar is an example of a UNIX command where the - character is not required to specify modifiers.

For example:

tar -xvf temp.tar     or     tar xvf temp.tar

extracts all the contents of the tar file temp.tar

tar -xf temp.tar hello.dat       

extracts the file hello.dat from the tar file temp.tar

tar -cvf /dev/rmt0 /home

archives all the contents of the /home directory onto tape, overwriting whatever is there
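
If you don't have a tape drive, the same functions can be tried out safely on an ordinary file.  The following sketch builds a small throw-away directory, archives it, lists the archive and then restores a single "lost" file; the directory and file names are invented for the example.

```shell
work=$(mktemp -d)
mkdir -p "$work/project/notes"
echo "important data" > "$work/project/notes/todo.txt"

# c: create an archive of the directory (f: write it to the named file).
(cd "$work" && tar -cf project.tar project)

# t: list the table of contents of the archive.
tar -tf "$work/project.tar"

# Simulate an accident, then x: extract just the lost file.
cp "$work/project/notes/todo.txt" "$work/original.txt"
rm "$work/project/notes/todo.txt"
(cd "$work" && tar -xf project.tar project/notes/todo.txt)

# The restored file should be identical to the one that was deleted.
cmp "$work/original.txt" "$work/project/notes/todo.txt" && echo "restored file matches"
```

Because the archive was created with relative path names, the file is restored relative to the directory tar is run from.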

Exercises

11.3      Create a file called temp.dat under a directory tmp that is within your home directory.  Use tar to create an archive containing the contents of your home directory.

11.4      Delete the $HOME/tmp/temp.dat created in the previous question.  Extract the copy of the file that is stored in the tape archive (the term tape archive is used to refer to a file created by tar) created in the previous question.

The dd command

The man page for dd lists its purpose as being "copy and convert data".  Basically dd takes input from one source and sends it to a different destination.  The source and destination can be device files for disk and tape drives, or normal files.

The basic format of dd is

dd [option=value ...]

Table 11.8 lists some of the different options available.

Option         Purpose
if=name        input file name (default is standard input)
of=name        output file name (default is standard output)
ibs=num        the input block size in num bytes (default is 512)
obs=num        the output block size in num bytes (default is 512)
bs=num         set both input and output block size
skip=num       skip num input records before starting to copy
files=num      copy num files before stopping (used when input is from
               magnetic tape)
conv=ascii     convert EBCDIC to ASCII
conv=ebcdic    convert ASCII to EBCDIC
conv=lcase     make all letters lowercase
conv=ucase     make all letters uppercase
conv=swab      swap every pair of bytes

Table 11.8.
Options for dd.


For example:

dd if=/dev/hda1 of=/dev/rmt4

with all the default settings copy the contents of hda1 (the first partition on the first disk) to the tape drive for the system
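
dd can be tried out without touching a real disk or tape by using ordinary files as both source and destination.  In the sketch below the file partition.img simply stands in for a disk partition (an assumption made for the example), and count=, which limits the number of blocks copied, is a standard dd option that is not listed in Table 11.8.

```shell
work=$(mktemp -d)

# Build a small stand-in for a disk partition: 64 blocks of 1024 bytes,
# followed by a little recognisable data.
dd if=/dev/zero of="$work/partition.img" bs=1024 count=64 2>/dev/null
echo "some data" >> "$work/partition.img"

# Image copy: dd reads the source block by block without caring what
# the blocks contain.
dd if="$work/partition.img" of="$work/copy.img" bs=1024 2>/dev/null
cmp "$work/partition.img" "$work/copy.img" && echo "image copy is identical"

# conv= transforms the data on its way from input to output.
upper=$(printf 'hello backup' | dd conv=ucase 2>/dev/null)
echo "$upper"
```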

Exercises

11.5      Use dd to copy the contents of a floppy disk to a single file to be stored under your home directory.  Then copy it to another disk.

The mt command

The usual media used in backups is magnetic tape.  Magnetic tape is a sequential medium.  That means that to access a particular file you must pass over all the tape containing files that come before the file you want.  The mt command is used to send commands to a magnetic tape drive that control the location of the read/write head of the drive.

mt [-f tapename] command [count]

Arguments    Purpose
tapename     raw device name of the tape device
command      one of the commands specified in Table 11.10.  Not all commands
             are recognised by all tape drives.
count        number of times to carry out command

Table 11.9.
Parameters for the mt Command.

 

Commands     Action
fsf          move forward the number of files specified by the count argument
asf          move forward to file number count
rewind       rewind the tape
retension    wind the tape out to the end and then rewind
erase        erase the entire tape
offline      eject the tape

Table 11.10.
Commands Possible using the mt Command.

For example:

mt -f /dev/nrst0 asf 3

moves to file number 3 on the tape (files are numbered from 0, so this is the fourth file)

mt -f /dev/nrst0 rewind
mt -f /dev/nrst0 fsf 3

same as the first command

The mt command can be used to put multiple dump/tar archive files onto the one tape.  Each time dump/tar is used, one file is written to the tape.  The mt command can be used to move the read/write head of the tape drive to the end of that file, at which time dump/tar can be used to add another file.

For example:

mt -f /dev/rmt/4 rewind

rewinds the tape drive to the start of the tape

tar -cvf /dev/rmt/4 /home/jonesd

backs up my home directory, after this command the tape will be automatically rewound

mt -f /dev/rmt/4 asf 1

moves the read/write head forward to the end of the first file

tar -cvf /dev/rmt/4a /home/thorleym

backs up the home directory of thorleym onto the end of the tape

There are now two tar files on the tape, the first containing all the files and directories from the directory /home/jonesd and the second containing all the files and directories from the directory /home/thorleym.

Compression programs

Compression programs are sometimes used in conjunction with transport programs to reduce the size of backups.  This is not always a good idea.  Adding compression to a backup adds extra complexity to the backup and as such increases the chances of something going wrong.

compress

compress is the standard UNIX compression program and is found on every UNIX machine (well, I don't know of one that doesn't have it).  The basic format of the compress command is

compress filename

The file with the name filename will be replaced with a file with the same name but with an extension of .Z added, and that is smaller than the original (it has been compressed).

A compressed file is uncompressed using the uncompress command or the -d switch of compress.      

uncompress filename   or   compress -d filename

For example:

bash$ ls -l ext349*
-rw-r----- 1 jonesd      17340 Jul 16 14:28 ext349
bash$ compress ext349
bash$ ls -l ext349*
-rw-r----- 1 jonesd       5572 Jul 16 14:28 ext349.Z
bash$ uncompress ext349
bash$ ls -l ext349*
-rw-r----- 1 jonesd      17340 Jul 16 14:28 ext349

gzip

gzip is a newer addition to the UNIX compression family.  It works in basically the same way as compress but uses a different (and better) compression algorithm.  It uses an extension of .gz and the program to uncompress a gzip archive is gunzip.

For example:

bash$ gzip ext349
bash$ ls -l ext349*
-rw-r----- 1 jonesd    4029 Jul 16 14:28 ext349.gz
bash$ gunzip ext349
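
A compression program is usually combined with a transport program by joining the two with a pipe, so that the uncompressed archive never has to be stored.  The sketch below does this with tar and gzip on a throw-away directory; the file names are invented for the example.

```shell
work=$(mktemp -d)
mkdir "$work/data"
echo "payload" > "$work/data/file.txt"

# Write the archive to standard output (f -) and compress it on its way
# to the disk.
(cd "$work" && tar -cf - data | gzip > backup.tar.gz)

# gzip -t checks the integrity of the compressed file without extracting it.
gzip -t "$work/backup.tar.gz" && echo "compressed archive is intact"

# Uncompress to standard output and list the archive that comes out.
listing=$(gunzip -c "$work/backup.tar.gz" | tar -tf -)
echo "$listing"
```

The extra gzip -t step is one small way of offsetting the added risk mentioned at the start of this section: a damaged compressed file is much harder to salvage than a damaged uncompressed archive.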

Exercises

11.6      Modify your solution to exercise 11.5 so that instead of writing the contents of your floppy straight to a file on your hard disk it first compresses the file using either compress or gzip and then saves to a file.

Conclusions

In this chapter you have

§         been introduced to the components of a backup strategy: scheduler, transport, and media

§         been shown some of the UNIX commands that can be used as the transport in a backup strategy

§         examined some of the characteristics of a good backup strategy and some of the factors that affect a backup strategy

Review questions

11.1.    Design a backup strategy for your system.  List the components of your backup strategy and explain how these components affect your backup strategy. 

 

11.2.    Explain the terms media, scheduler and transport.

 

11.3.    Outline the difference between file by file and image transport programs.