Amanda and Backup at CUBINlab
Contents
- Introduction
- Configuration
- Day to Day Operation
- Off-site storage of backups
- Making New Tapes
- How to Restore Files with Amanda
- Odd Behaviour
Introduction
As a basic policy, Amanda is provided to workstation users as a service, with each workstation user being responsible for the correct installation and configuration of Amanda. This responsibility can be delegated with the usual caveats. Backup of essential computing infrastructure is the responsibility of your friendly non-system admin.
This document provides a number of examples specific to the operation of the Amanda tape backup system as used in CUBINlab. Detailed information on the commands can be found in the man pages for the Amanda applications:
amadmin (8) - administrative interface to control Amanda backups
amanda (8) - Advanced Maryland Automatic Network Disk Archiver
amcheck (8) - Amanda pre-run self-check
amcleanup (8) - runs the Amanda cleanup process after a failure
amdump (8) - backs up all disks in an Amanda configuration
amflush (8) - flushes Amanda backup files from holding disk to tape
amlabel (8) - labels an Amanda tape
amrestore (8) - extract files from an Amanda tape

A list of the machines currently backed up follows:
- Network core: emu
- Resident Workstations: froggy, koala, sugar-glider, wallaby, wombat
- Distributed Simulator Cluster: einstein
Configuration
CUBINlab has a single backup set defined: emu. All configuration files relating to this backup set are located in /usr/local/etc/amanda/emu on emu.cubinlab.ee.unimelb.edu.au.
Amanda is installed so that it uses the user backup on the server emu for most of its operations. A sendmail alias entry redirects operator's email to notify those responsible for the operation of the backup system of significant events, failures, and summaries of backup operations.
To add or remove email addresses, edit the file /usr/local/etc/amanda/emu/amanda.conf:
org "CUBIN lab" # your organization name for reports
mailto "root,nhohn,darryl,jpap" # space separated list of operators at your site
dumpuser "amanda" # the user to run dumps under
inparallel 8 # maximum dumpers that will run in parallel
netusage 8000 Kbps # maximum net bandwidth for Amanda, in KB per sec was 800
To add or remove disks from the backup set edit the file /usr/local/etc/amanda/emu/disklist:
# sample Amanda2 disklist file, derived from CS.UMD.EDU's disklist
#
# If your configuration is called, say, "csd2", then this file normally goes
# in /etc/amanda/csd2/disklist.
#
# File format is:
#
# hostname diskdev dumptype [spindle [interface]]
#
# where the dumptypes are defined by you in amanda.conf.
# At our site, root partitions have a different dumptype because they
# are of lower priority; they don't contain user data, and don't change
# much from the department prototype. In a crunch, they can be left for
# last or skipped.
# EMU server
emu da0s1a comp-root # /
emu da1s1e comp-user # /home
#fw
fw wd0s1a comp-root # /
fw wd0s1f comp-user # /usr
fw wd0s1e comp-user # /var
fw wd0s1d comp-user # /var/spool
# desk top workstations. Mainly Linux boxes.
# darryl
#sugar-glider hda7 comp-root # /
#sugar-glider hda5 comp-root # /boot
#sugar-glider hda6 comp-root-hard # /usr
sugar-glider hda1 comp-user # /home
# ... etc ...
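When editing the disklist it can help to see which entries are actually active for a host, since commented-out lines look very similar to live ones. The following is a minimal sketch, not part of Amanda: the function name active_disks is hypothetical, and it assumes the "hostname diskdev dumptype" layout described in the file's own header comments.

```shell
# Print "diskdev dumptype" for each active (uncommented) disklist entry
# of the given host. active_disks is a hypothetical helper, not an Amanda tool.
active_disks() {
    host=$1
    # Default path is the disklist location named in this document.
    file=${2:-/usr/local/etc/amanda/emu/disklist}
    awk -v host="$host" '
        /^[ \t]*#/ { next }                       # skip comment lines
        NF >= 3 && $1 == host { print $2, $3 }    # hostname diskdev dumptype
    ' "$file"
}
```

For example, active_disks sugar-glider lists only the partitions of sugar-glider that are not commented out.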
To add a new machine it is necessary to install the Amanda software on that system; see the file /usr/local/share/doc/amanda/INSTALL for details.
Day to Day Operation
Amanda typically operates quietly so long as the tapes are changed each weekday. In case of problems the Amanda server, emu, e-mails the user backup and this mail is forwarded to the maintainers of the backups.
The crontab file controls when two Amanda programs are run: amcheck and amdump. This file can be altered by using the command crontab -e backup as root:
> crontab -e backup
# backup user crontab file
# min hour daymth month daywk command
0 16 * * 1-5 /usr/local/sbin/amcheck -m emu
45 2 * * * /usr/local/sbin/amdump emu

amcheck causes mail warning about failures to be sent to the people in operator's alias entry file. amdump performs the dumps at some quiet time in the early morning.
The current policy is that the tape not be changed on Saturday and Sunday
nights, leaving incremental dumps on the holding disk,
/scratch/amanda. This should then be flushed on Monday.
amcheck may fail because of a media or hardware error. In this case a media error occurred (check the console to determine the nature of the error):
> sudo -u amanda /usr/local/sbin/amcheck emu
Amanda Tape Server Host Check
-----------------------------
/scratch/amanda: 5043828 KB disk space available, that's plenty.
amcheck: slot 1: reading label: I/O error
ERROR: label VOL19 or new tape not found in rack.
(expecting tape VOL19 or a new tape)
NOTE: skipping tape-writeable test.
Server check took 27.231 seconds.
Amanda Backup Client Hosts Check
--------------------------------
Client check: 4 hosts checked in 0.753 seconds, 0 problems found.
(brought to you by Amanda 2.2.6)

In this case we will attempt to relabel the tape and, failing that, replace and label the tape. Relabeling did not fix the problem; however, the replacement tape, after labeling, responded to amcheck as follows:
> sudo -u amanda /usr/local/sbin/amcheck emu
Amanda Tape Server Host Check
-----------------------------
/scratch/amanda: 5043828 KB disk space available, that's plenty.
amcheck: slot 1: date X label VOL19 (exact label match)
NOTE: skipping tape-writeable test.
Tape VOL19 label ok.
Server check took 18.995 seconds.
Amanda Backup Client Hosts Check
--------------------------------
Client check: 4 hosts checked in 0.482 seconds, 0 problems found.
(brought to you by Amanda 2.2.6)
If a dump fails, Amanda may store the data in a work area. This data can later be dumped to tape using the amflush command. A typical example of this problem follows. It was caused by having a bad tape in the drive when the system was to dump.
From backup Fri Jun 6 02:52:08 2000
To: backup
Subject: EMU AMANDA MAIL REPORT FOR June 6, 2000

*** A TAPE ERROR OCCURRED: [label VOL19 or new tape not found in rack].
*** PERFORMED ALL DUMPS AS INCREMENTAL DUMPS TO HOLDING DISK.
THESE DUMPS WERE TO DISK. Flush them onto tape VOL19 or a new tape.
Tonight's dumps should go onto tape VOL20 or a new tape.

FAILURE AND STRANGE DUMP SUMMARY:
emu c0t9d0s0 lev 2 FAILED [can't dump no-hold disk in degraded mode]

STATISTICS:
....

The fix in this case is to insert the correct tape and use the amflush program:
> sudo -u amanda /usr/local/sbin/amflush emu
Scanning /scratch/amanda...
20000606: found non-empty Amanda directory.
Flushing dumps in 20000606 using tape changer "chg-generic".
Expecting tape VOL19 or a new tape. (The last dumps were to tape VOL18)
Are you sure you want to do this?y
Running in background, you can log off now.
You'll get mail when amflush is finished.

If there is more than one dump in the holding area the operator is prompted to select which directory is to be flushed.
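Before running amflush it can be useful to see which dumps are waiting on the holding disk. The sketch below is not an Amanda command: pending_flushes is a hypothetical helper, and it assumes that holding-disk directories are named by eight-digit date, as the 20000606 directory in the amflush output above suggests.

```shell
# List date-named, non-empty directories on the holding disk.
# pending_flushes is a hypothetical helper; the YYYYMMDD naming is an
# assumption based on the amflush output shown in this document.
pending_flushes() {
    holding=${1:-/scratch/amanda}
    for dir in "$holding"/*; do
        [ -d "$dir" ] || continue
        name=${dir##*/}
        case "$name" in
            [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;   # looks like a dump date
            *) continue ;;
        esac
        # Report the directory only if it holds at least one entry.
        if [ -n "$(ls -A "$dir")" ]; then
            echo "$name"
        fi
    done
}
```

Running pending_flushes with no argument checks /scratch/amanda. More than one line of output means amflush will ask which directory to flush.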
If an operator wants to manually perform an Amanda operation, they must typically become the Amanda user, in EMU's case backup. The sudo program can be used for this purpose. For example, an approved user can run amcheck to see if the right tape is in the drive:
> sudo -u amanda /usr/local/sbin/amcheck emu
Amanda Tape Server Host Check
-----------------------------
/scratch/amanda: 5109417 KB disk space available, that's plenty.
amcheck: slot 1: date 20000605 label VOL18 (active tape)
ERROR: label VOL19 or new tape not found in rack.
(expecting tape VOL19 or a new tape)
NOTE: skipping tape-writeable test.
Server check took 3.413 seconds.
Amanda Backup Client Hosts Check
--------------------------------
Client check: 4 hosts checked in 3.134 seconds, 0 problems found.
(brought to you by Amanda 2.2.6)
or amflush to clear any spooled dumps.
> sudo -u amanda /usr/local/sbin/amflush emu
Making New Tapes
Before a tape can be used in the Amanda system it must be labeled using the amlabel program.
> sudo -u amanda /usr/local/sbin/amlabel emu VOL00
labeling tape in slot 1 (/dev/nsa0):
rewinding, writing label VOL00, writing end marker, done.
NOTE: Tape labels should be unique unless they replace a failed tape in the tape cycle. Furthermore, labels must follow the regular expression specified in the configuration file. When replacing an old tape with a new one with the same label, it might be necessary to remove the old label from the tape list by editing /usr/local/etc/amanda/emu/tapelist.
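The two checks in the note above can be sketched in shell before running amlabel. This is only an illustration under stated assumptions: the pattern ^VOL[0-9][0-9]$ is a guess based on the labels seen in this document (the real pattern is whatever amanda.conf specifies), the tapelist is assumed to hold the label in its second column, and both function names are hypothetical.

```shell
# Assumed label pattern; the real one is set in amanda.conf.
LABEL_REGEX='^VOL[0-9][0-9]$'

# Succeeds if the label matches the assumed pattern.
label_is_valid() {
    printf '%s\n' "$1" | grep -Eq "$LABEL_REGEX"
}

# Succeeds if the label already appears in the tapelist.
# Assumes one tape per line with the label in the second column.
label_in_tapelist() {
    label=$1
    tapelist=${2:-/usr/local/etc/amanda/emu/tapelist}
    awk -v label="$label" '$2 == label { found = 1 } END { exit !found }' "$tapelist"
}
```

If label_in_tapelist succeeds for a label about to be reused on a new tape, the old entry is probably the one that needs removing from the tapelist.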
Off-site Storage of Tapes
Amanda currently uses a cycle of 20 tapes. Ten tapes are kept in the CUBIN area, and 10 are kept in the Faculty of Engineering office, in Old Engineering. They are kept by Brian Shirriffs: phone x44325, mobile 0409 186 2182
How to Restore Files with Amanda
Example
This tutorial describes how to recover backups made using tar.
Backups on sugar-glider still use dump, and should
use the restore procedure at amanda-sugar-glider.php.
This example is based on recovering the contents of the directory
Locate the restore directory and change to it
> cd /scratch/restore
Run amrecover as root, and select the files to be recovered:
> unlimit filesize
> sudo amrecover emu
Amanda will respond with something like
AMRECOVER Version 2.4.4p2. Contacting server on emu.cubinlab.ee.unimelb.edu.au ...
220 emu AMANDA index server (2.4.4p2) ready.
200 Access OK
Setting restore date to today (2005-01-13)
200 Working date set to 2005-01-13.
Scanning /scratch/amanda...
200 Config set to emu.
501 Host emu.cubinlab.ee.unimelb.edu.au is not in your disklist.
Trying host emu.cubinlab.ee.unimelb.edu.au ...
501 Host emu.cubinlab.ee.unimelb.edu.au is not in your disklist.
Trying host emu ...
200 Dump host set to emu.
Trying disk /home ...
Can't determine disk and mount point from $CWD '/scratch/restore'
amrecover>
To allow full backups to be performed without overflowing the 20GB tape, /home on emu is broken into groups of directories: abc, def, ghi, jkl, mno, pqr, stuv, wxyz. Tell amrecover which one your directory is in:
amrecover> setdisk /home/jkl
200 Disk set to /home/jkl.
If you want to recover files as they were at a particular date, use the setdate YYYY-MM-DD command. To recover files as they were on 12 January, 2005, type
amrecover> setdate 2005-01-12
200 Working date set to 2005-01-12.
Note that backups are done in the early morning (from 2am), so to get the files as they were when you went home one night, you would specify the following day.
Change to the desired directory:
amrecover> cd l/lha
/home/jkl/l/lha
If you like, you can use the ls command to view the contents of the directory:
amrecover> ls
<snip>
2005-01-11 rsrch/
2005-01-10 sendmail.cf
2005-01-10 sendmail.mc
2005-01-10 submit.cf
2005-01-10 submit.mc
2005-01-10 talk-schedule.html
2005-01-11 tex/
2005-01-11 tmp/
2005-01-10 usage
The output may be piped through less, in which case you have to press "q" to get back to the amrecover> prompt.
Unlike the unix ls command, the amrecover ls command doesn't take any arguments. You have to cd to the directory you want to list.
The dates refer to the dates of the most recent dump which affects the specified directory. Notice that all directories are included in the highest level dump, even if none of their contents has changed.
The command pwd tells you the current directory.
Select which files/directories to extract:
amrecover> add tmp
Added dir /l/lha/tmp at date 2005-01-11
Added dir /l/lha/tmp at date 2005-01-10
Amrecover leads you through the actual restore when you extract the files:
amrecover> extract
Extracting files using tape drive /dev/nsa0 on host emu.cubinlab.ee.unimelb.edu.au.
The following tapes are needed: VOL15 VOL16
Restoring files into directory /scratch/restore
Continue [?/Y/n]? y
Extracting files using tape drive /dev/nsa0 on host emu.cubinlab.ee.unimelb.edu.au.
Load tape VOL15 now
Continue [?/Y/n/s/t]?
Write protect the tapes that you are going to restore from. (Open the white sliding tab to the right of the tape label.)
Insert the tapes as prompted, and respond "y" to all the "continue?" prompts.
If for any reason there is an error, you may have to rewind the tape, either by ejecting and reinserting it, or by
> mt rewind
Note that amrecover will read the entire dump, even once it has finished restoring the directories you asked for. Don't press ^C, since that will abort amrecover totally.
To exit amrecover, type "quit":
amrecover> quit
200 Good bye.
>
Make tapes writeable again
Get the user to shift the files to the desired location, or tar up the files and untar them in the desired location.
Warning: if you mv the files as root you will destroy the files' permissions.
Remove directory created in step 9.
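Two of the steps above involve small lookups that are easy to get wrong: choosing the /home group for setdisk, and choosing the date for setdate when you want files as they were at the end of a working day. The following sketch shows both. The function names are hypothetical, the first-letter rule is inferred from the example path /home/jkl/l/lha rather than stated by the configuration, and the date arithmetic assumes GNU date (its -d option is not available in the same form on every system).

```shell
# Map a home directory name to its disk group on emu, e.g. lha -> /home/jkl.
# Grouping by first letter is an assumption inferred from /home/jkl/l/lha.
home_group() {
    first=$(printf '%s' "$1" | cut -c1)
    [ -n "$first" ] || return 1
    for group in abc def ghi jkl mno pqr stuv wxyz; do
        case "$group" in
            *"$first"*) echo "/home/$group"; return 0 ;;
        esac
    done
    return 1
}

# Date to give setdate for files as they were on the evening of $1
# (YYYY-MM-DD): backups run in the early morning, so use the next day.
# Assumes GNU date.
setdate_for_evening() {
    date -u -d "$1 + 1 day" +%Y-%m-%d
}
```

For the example in this section, home_group lha gives the argument for setdisk, and setdate_for_evening 2005-01-12 gives the date to use for files as they stood when the user went home on 12 January.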
Odd Behaviour
Shared Memory Problem
Backups and flushes can fail due to the absence of shared memory. This problem appears as a failure of flush to send mail. The unix commands ipcs and ipcrm are useful for determining the number of shared memory items and removing unwanted ones. Note that both the number and size of shared memory blocks are restricted.
A typical situation is as follows:
FAILURE AND STRANGE DUMP SUMMARY:
taper: FATAL shmget: Cannot allocate memory
This error message means that there is no shared memory available for the backup. Rebooting emu is supposed to cure the problem but is not a solution for obvious reasons... Another option is to do the following:
ipcs -m
and then
ipcrm -m ID
where ID is a shared memory item ID obtained from the above command ipcs -m.
First dump of a new partition
When a partition is backed up for the first time, the backup has to be done on a tape, not on the holding disk; otherwise Amanda gives the following error message:
FAILURE AND STRANGE DUMP SUMMARY:
einstein hda3 lev 0 FAILED [can't switch to incremental dump]
einstein hda1 lev 0 FAILED [can't switch to incremental dump]
There is in fact nothing wrong apart from a missing tape.
AMRECOVER
EOF, check amidxtaped.debug file.
amrecover: short block 0 bytes
You might need to rewind the tape.
AMANDA uses the non-rewinding version of the tape device, so that you can append data after amdump/amflush. amrecover DOES NOT REWIND THE TAPE FOR YOU, because it has no way of knowing what part of the tape you want to restore from. A simple mt -f /dev/nsa0 rewind before amrecover is a good idea.
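The rewind-then-recover habit can be wrapped in a small function so the rewind is never forgotten. This is a sketch only: rewind_and_recover, run, and the DRY_RUN switch are hypothetical names introduced here, and the tape device and the sudo amrecover emu invocation are taken from the examples in this document.

```shell
# Tape device used in this document; override TAPE for another drive.
TAPE=${TAPE:-/dev/nsa0}

# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Rewind the tape, then start amrecover only if the rewind succeeded.
rewind_and_recover() {
    config=${1:-emu}
    run mt -f "$TAPE" rewind && run sudo amrecover "$config"
}
```

Calling DRY_RUN=1 rewind_and_recover emu prints the two commands without touching the drive, which is a safe way to check the device name first.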