CTS 2322 (Unix/Linux Administration II) Project
Backups and Archives


Due: by the start of class on the date shown on the syllabus

The purpose of this project is to practice creating a complete backup policy using a real scenario.  You will also explore several commonly used backup tools.

You may work in small groups on this project, as long as the names of all group members are included.  Note each group member must submit an identical copy of the project.

Part I — Define a Backup Policy

Design a backup policy / strategy for YborStudent.hccfl.edu.  You need to take into account the available backup hardware; however, you can recommend new hardware if you feel it is worthwhile.  (In that case, the exact model name, model number, and price, obtained from the Internet, are required.)

You can contact HCC's OIT for any information you may need.  See also HCC's backup policies and Obtaining Services from OIT (which contains HCC's SLA policy).  Any information you get must be credited; that is, you must say who told you what, and when.  In addition, you can use your YborStudent account to run various commands to examine the system, such as mount, df, du, etc.

See Backups and Archives for background information.  Your backup policy must be very specific and detailed, and include the following information:

  1. Which volumes/directories should be backed up?
  2. For each volume/directory to be backed up, should you create backups or archives?  What software tools (e.g., tar, dump, or some specific commercial software) will be used?
  3. Is there an existing SLA that must be taken into account for YborStudent?  If so, what is it?  If not, propose a reasonable SLA for recovery times.
  4. Is there an existing procedure at HCC for a student to request recovery of an accidentally deleted file?  If so, what is the procedure?  If there isn't one, what would you suggest?
  5. For each volume/directory to be backed up or archived (hereafter simply referred to as backups), define the backup strategy and the schedule to be used.  (e.g., monthly full with daily incrementals.)  Make sure your backup cycles can be completed in a reasonable time.  If necessary, define a staggered schedule (e.g., if you have a monthly backup cycle defined, then what backups get made when).
  6. Define your backup retention policy (how long to retain each type of backup, for each volume/directory that you backup).
  7. What backup media and drives will be used?
  8. Where will backup media be stored?
  9. What is the media replacement policy (a.k.a. media rotation schedule)?
  10. What is the estimated budget for the first year for media and hardware?  Note you must include in your budget the prices of all media, software, hardware, and other expenses required.  If you discover HCC already has a backup infrastructure you plan to use, what equipment and media are in use and what would it cost to replace that with similar equipment today?

Please briefly justify your choices and decisions.
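
When checking that a proposed cycle fits on the available media (items 5, 6, and 10), the arithmetic for a monthly full backup with daily incrementals can be sketched as follows.  Every figure here (data size, daily change rate, retention, media capacity) is a hypothetical placeholder, not a measurement from YborStudent; substitute the values you obtain from df, du, or OIT.

```shell
#!/bin/sh
# Hypothetical inputs: replace with figures measured on the real system.
full_gib=200          # size of one full backup, in GiB
daily_change_pct=2    # percent of the data that changes per day (assumed)
days_per_cycle=30     # one full backup, then daily incrementals
cycles_retained=3     # how many monthly cycles are kept
media_gib=1500        # usable capacity of one tape or disk, in GiB

# Each incremental holds roughly one day's worth of changes.
incr_gib=$(( full_gib * daily_change_pct / 100 ))
# One cycle = 1 full + (days_per_cycle - 1) incrementals.
cycle_gib=$(( full_gib + incr_gib * (days_per_cycle - 1) ))
# Total storage needed to honor the retention policy.
total_gib=$(( cycle_gib * cycles_retained ))
# Round up to whole media units.
media_needed=$(( (total_gib + media_gib - 1) / media_gib ))

echo "incremental: ${incr_gib} GiB, one cycle: ${cycle_gib} GiB"
echo "retained: ${total_gib} GiB on ${media_needed} media unit(s)"
```

The same calculation, repeated per volume, gives the media counts that feed the first-year budget.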

Part II — Making Backups and Archives

For this part you will backup /etc and /home from your assigned classroom computer, after you have completed your post-install setup tasks.  You will use a variety of tools and techniques for this including using LVM snap-shots.  (For more information please see the LVM Guide.)

Since we lack backup hardware you should keep the backup archives in /tmp.  Note you can always send the backup files as email attachments to yourself and then burn them to a CD.  (If your classroom computer includes a CD burner you can use it.) 
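
Before writing anything to /tmp, it helps to compare the space a directory uses with the space the destination has free.  The following is a minimal sketch of that comparison, not a required method; the function names used_kib and free_kib are made up for this illustration, and the default directories are only examples.

```shell
#!/bin/sh
# used_kib DIR: space used under DIR in KiB (-s = one total, -k = KiB units).
# Unreadable subdirectories are skipped quietly, so run as root for exact figures.
used_kib() { du -sk "$1" 2>/dev/null | awk '{ print $1 }'; }

# free_kib DIR: space available on the filesystem holding DIR, in KiB.
# -P forces one POSIX-format line per filesystem; column 4 is "Available".
free_kib() { df -Pk "$1" | awk 'NR == 2 { print $4 }'; }

src=${1:-/etc}     # directory to be archived (default is only an example)
dest=${2:-/tmp}    # where the archive would be written

need=$(used_kib "$src")
have=$(free_kib "$dest")
echo "$src uses ${need} KiB; $dest has ${have} KiB free"

# The uncompressed size is a conservative upper bound for a compressed archive.
if [ "$need" -lt "$have" ]; then
    echo "an archive of $src should fit in $dest"
else
    echo "not enough room in $dest for an archive of $src"
fi
```

Remember that the project calls for several archives at once, so the per-directory figures must be added together before comparing against the free space.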

Perform the following steps and answer the following questions:

  1. Be sure that /tmp has enough free space to hold the backups!  /tmp may be part of your root partition, and it would be a very bad thing to run that out of space.  To estimate the size of the backups, calculate the size (used space) of the /home and /etc directories.  What command(s) did you use for this?  What are the sizes?  Note that compressed archives will take about 30% less space.  However, you will need to store one complete backup of /home and two backups of /etc.  What is the estimate of the total space required for your backups?
  2. What would happen if the root partition (“/”) ran out of space?  It pays to create a separate partition or volume for /tmp.  If you have sufficient RAM you could use tmpfs, a RAM disk designed for /tmp.

    Check now to see if your /tmp is already a separate volume.  What is the exact command line you ran to check this?  If you have such a volume, verify the free space available on that volume is sufficient for your backups.

    If you don't have a separate volume for /tmp already, you have two choices: save the backups there anyway if the root volume has sufficient space, or create a new LVM volume formatted with ext4 for /tmp that is big enough for your needs.  Note the volume holding /tmp must be a bit larger than just the size needed for backups, as other files are stored in there (and the filesystem itself uses an overhead of around 10%).

    Follow these directions, only if you need to create a new volume for your /tmp:

    1. Examine the space available on your drive, both free space (if any) as well as volumes that are using only a small amount of their allocated space.  You will not only need space for the new /tmp volume, but also for a backup snap-shot volume, about 20% of the size of the /home volume.  How much free disk space do you need altogether?
    2. If you don't have sufficient free space available on your volume group, then you must shrink one (or more) existing logical volumes to free up enough space for the new volumes.  (This is not possible with XFS!  With XFS, you need to copy all the files someplace, delete the old volume, create a new, smaller one, then restore all the old files.)  Which volumes will you need to shrink, and what will be the new sizes of those volumes?

      Note, resizing filesystems is a dangerous activity even for ext4.  If you need to shrink (for example) /home, before doing so it pays to try to make a tar archive of just /home as long as you have sufficient space for that anywhere.  Then, you can proceed to shrink that volume to make room for all your backups.

    3. For each logical volume you must shrink, you must first shrink the filesystem within.  For ext4 (or ext2 or ext3) filesystems, that is done with the resize2fs command.  Next you can shrink the logical volume with lvreduce.

      Here is a sample of reducing the /home volume (/dev/VolGroup00/LogVol02) by 1 gigabyte, from 10 GiB to 9 GiB:

      df -h                                      # show used and available space
      lvdisplay | less                           # show lv sizes
      vgdisplay                                  # show vg free space
      umount /dev/VolGroup00/LogVol02            # volume for /home
      fsck -f /dev/VolGroup00/LogVol02           # required before resizing
      resize2fs -p /dev/VolGroup00/LogVol02 9G   # shrink filesystem to 9G
      lvreduce -L -1G /dev/VolGroup00/LogVol02   # shrink logical volume by 1G
      mount /home
      vgdisplay                                  # Note available space

      What are the exact command lines you used for this?

    4. Now that the volume group has sufficient free space for the new /tmp volume, create a new logical volume and then format it as ext4.  Here is an example of creating a 1 gigabyte volume for /tmp, using the name LogVol04:
      vgdisplay                                  # Note available space
      lvcreate -n LogVol04 -L 1G VolGroup00
      mkfs -t ext4 /dev/VolGroup00/LogVol04
      vi /etc/fstab                              # add entry for /tmp
        # Move old /tmp contents to new /tmp partition.  This will
        # likely require a reboot before the new /tmp files get used.
      mv /tmp /oldtmp
      mkdir /tmp
      mount /tmp
      chmod 1777 /tmp                            # after mounting, so the mode applies to the new filesystem
      cp -a /oldtmp/. /tmp/                      # copies hidden files too; safe with odd filenames
        # Do this after the next reboot:
      rm -ri /oldtmp
  3. What is the exact find command which will find the names of all files and directories in /etc that have been modified in the past 24 hours?  Make sure the list of names is depth-first (that is, the contents of a directory before the directory itself).  What is the purpose of the “-print0” find option, and why should you use it in a “production-quality” script?  What are the matching options for tar, pax, and cpio, to make an archive of the files found by find?
  4. Backup /home using dump.  Save the backup in a file in /tmp, provided you have sufficient space for that.  (And check first!)  Note that the utility name varies with the type of filesystem used.  For XFS, it is called xfsdump.  Use a “level 0” dump.  Since we don't want to shut the system down to single user mode (or unmount /home), we will make an LVM snapshot volume of /home and then use dump on that.  The snap-shot volume typically needs about 15%–20% of the original filesystem's size.  How large will you make the /home snap-shot volume?

    Here is an example of creating, using, then removing an LVM snap-shot of /home (in this example /dev/VolGroup00/LogVol01):

    lvcreate --size 100m --snapshot --name home-snap /dev/VolGroup00/LogVol01
    mkdir /mnt/home-snap
    mount -t auto -o ro /dev/VolGroup00/home-snap /mnt/home-snap
    dump ... /mnt/home-snap  # read the man page for options to use
    umount /dev/VolGroup00/home-snap
    lvremove -v /dev/VolGroup00/home-snap
    rmdir /mnt/home-snap

    What are the exact command lines you used?

  5. Completely backup /etc using a compressed tar archive.  What are the exact command line(s) you used for this?

    Verify the archive creation was successful, by viewing the (table of) contents of the archive (at least part of it).  Does the archive contain absolute or relative pathnames of files?

  6. Completely backup /etc using a cpio archive.  Now compress the resulting archive using whichever compression method you used for the tar archive in the previous step (gzip, bzip2, or xz).  What are the exact command line(s) you used for this step?

    Verify the archive creation was successful, by viewing the (table of) contents of the archive (at least part of it).  (Hint: use zcat to pipe the compressed archive into cpio.)  Does the archive contain absolute or relative pathnames of files?

  7. Which archive is smaller, the tar or the cpio archive of /etc, and by how much?  (Compare compressed to compressed, or uncompressed to uncompressed archives.)  Do you think the difference is significant?
  8. For each of the two archives, compute the MD5 checksum and store it in a file /tmp/name-of-archive.md5.
  9. Copy the tar archive backup of /etc, along with the matching MD5 checksum file, to your home directory on the YborStudent.hccfl.edu server using the scp command.  What are the exact command line(s) you used?  Before running this command, check your quota on YborStudent and make sure you won't go over your hard limit!
  10. Copy the cpio archive backup of /etc, along with the matching MD5 checksum file, to your home directory on the YborStudent.hccfl.edu server using the rsync command.  What are the exact command line(s) you used?  (Note: by default modern rsync uses a secure SSH tunnel, the same as scp.)  Before running this command, check your quota on YborStudent and make sure you won't go over your hard limit!

    Describe briefly what the following command does, and list the meaning of each option used in your own words:

     rsync -HavRuzc /var/www/html/ example.com:/var/www/html/
  11. Now log into YborStudent and verify the integrity of the backup copies, using the MD5 checksum files.  What are the exact command line(s) to do that, and what were the results?
  12. Still logged on YborStudent, extract the file “/etc/group” to your home directory from the tar archive.  Then extract the file “/etc/hosts” to your home directory from the cpio archive.  What are the exact command lines needed for this?  Be careful not to try to extract the absolute pathname or you will attempt to over-write /etc/group.  (Don't worry, you don't have permission to do that!)

    When done, delete both archives.
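
The archive, checksum, verify, and extract cycle from the steps above can be rehearsed safely on a scratch directory before touching real data.  The sketch below is one possible way to do that with tar and md5sum; the scratch tree and its two stand-in files are invented for the illustration, and the commands you use on /etc may need different options.

```shell
#!/bin/sh
# Work in a scratch directory so nothing under /etc or /tmp is touched.
scratch=$(mktemp -d)
mkdir -p "$scratch/src/etc" "$scratch/restore"
printf 'root:x:0:\n' > "$scratch/src/etc/group"          # stand-in files
printf '127.0.0.1 localhost\n' > "$scratch/src/etc/hosts"

# -C changes directory first, so members are stored with relative
# names (etc/...) rather than absolute ones.
tar -czf "$scratch/etc.tar.gz" -C "$scratch/src" etc

# List the table of contents to confirm the archive is readable.
tar -tzf "$scratch/etc.tar.gz"

# Record a checksum next to the archive, then verify it.
( cd "$scratch" && md5sum etc.tar.gz > etc.tar.gz.md5 && md5sum -c etc.tar.gz.md5 )

# Extract a single member, by its relative name, into another directory.
tar -xzf "$scratch/etc.tar.gz" -C "$scratch/restore" etc/group

ls -l "$scratch/restore/etc"
# Remove the scratch tree when finished:  rm -rf "$scratch"
```

Because the member names are relative, the extracted copy lands under the chosen directory and cannot overwrite the live file.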

(Optional) Part III — Using pax and star

The pax backup utility is a POSIX standard tool available on all Unix/Linux platforms.  (Although more popular, tar, cpio, and rsync are not standard tools.)  This tool is based on tar and cpio and can read and write a variety of archive formats.

POSIX has in recent years defined new formats which in theory can backup ACLs and extended attributes or “EAs”, such as used for SELinux.  The “ustar” format is technically known as the IEEE/Posix1003/IEC-9945-1-1988 Standard Data Interchange format.  The successor to ustar is the POSIX-1003.1-2001 Standard Data Interchange format, commonly called the “pax” archive format.

The star archive tool created by Jörg Schilling is a nonstandard but widely used tool similar to tar, but able to handle a wider range of archive types than even the pax tool.  Taking advantage of the extensible nature of the ustar format, Jörg has defined the “exustar” format, which the star tool can read and write and which does backup ACLs and other file meta-data.  star may be the only archiving tool for Fedora which does this (pax should, but may not depending on the version).  However, it won't use this format by default!  You must use the correct option(s) to force star to use exustar format.

For this optional part of the assignment, you should make a full archive of your home directory using star, and then repeat with pax.  To ensure you have sufficient space for these archives, you may need to remove the other archives and backups from /tmp first.  To make this interesting, add some ACLs (with the command setfacl) and some extended attributes (with the command setfattr) to one or two files first.

After the archives are made and verified, delete the file(s) you modified with ACLs.  Finally, attempt to restore the files from the archive, including all attributes (date/time, modes, ACLs, and EAs).  Also see if you can recover the exustar archive made with star by using the pax tool.  What were the exact command lines you used?  Were the restore attempts successful?

After reading the man pages (really!), try something similar to the commands used in this star and pax session typescript.


A good site to check for backup hardware is NewEgg.com.

Be careful not to exceed your disk quota while doing this assignment!  If necessary, use the man (and / or info) pages to see how to exclude some directories from the archives when you create them, and exclude one or more of the larger directories.  (This won't affect the learning benefit from this assignment, but of course in real life you do need to backup everything.)  /etc/gconf/ and a few other directories in /etc can be quite large.  You can use du, sort, and head (or tail, depending on your sort options) to find the largest few directories to omit.
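
As an illustration of the du, sort, and head approach, the sketch below builds a small scratch tree, finds its largest subdirectory, and leaves that subdirectory out of a tar archive.  The tree and file names are made up for the example; the --exclude option shown is the one documented for GNU tar, so check the man page if your tar differs.

```shell
#!/bin/sh
scratch=$(mktemp -d)
mkdir -p "$scratch/etc/gconf" "$scratch/etc/small"
# Make one subdirectory much larger than the other (64 KiB of random data).
head -c 65536 /dev/urandom > "$scratch/etc/gconf/big"
printf 'x\n' > "$scratch/etc/small/file"

# du -sk prints one KiB total per argument; sort -rn puts the
# largest first; head keeps only the top entry.
largest=$(du -sk "$scratch"/etc/*/ | sort -rn | head -n 1 | awk '{ print $2 }')
echo "largest subdirectory: $largest"

# Leave the large directory out of the archive.  --exclude must come
# before the names it applies to.
tar -czf "$scratch/etc-small.tar.gz" -C "$scratch" --exclude='etc/gconf' etc
tar -tzf "$scratch/etc-small.tar.gz"
# Remove the scratch tree when finished:  rm -rf "$scratch"
```

With sort -n instead of sort -rn the largest entries come last, in which case tail replaces head.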

If necessary, copy the tar archive first, then delete it before copying over the cpio archive.

A fancy shell script to backup /etc can be found at Backup-etc.sh.  Try it using the “-v” option.

ACLs and EAs only work for some filesystem types (including ext* types), and then only if the appropriate mount options are used.  Look at the mount options for /home on YborStudent to see an example.
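
One way to inspect the mount options for a filesystem is to read the option field from the kernel's mount table.  The following is a rough sketch under the assumption of a Linux system with /proc/mounts; the helper names mount_opts and has_opt and the sample table line are hypothetical, and the sample is not taken from YborStudent.  Note that recent ext4 filesystems may enable acl and user_xattr by default without listing them here, in which case tune2fs -l on the device shows the default mount options.

```shell
#!/bin/sh
# mount_opts MOUNTPOINT [TABLE]: print the option field for MOUNTPOINT.
# TABLE defaults to /proc/mounts; the second argument exists only so
# the sketch can be tried against a sample file.
mount_opts() {
    awk -v m="$1" '$2 == m { o = $4 } END { if (o != "") print o }' "${2:-/proc/mounts}"
}

# has_opt OPTION MOUNTPOINT [TABLE]: succeed if OPTION is in the
# comma-separated option list for MOUNTPOINT.
has_opt() {
    case ",$(mount_opts "$2" "$3")," in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical sample line in the same layout as /proc/mounts.
sample=$(mktemp)
printf '%s\n' '/dev/mapper/vg-home /home ext4 rw,acl,user_xattr 0 0' > "$sample"

mount_opts /home "$sample"
if has_opt acl /home "$sample"; then
    echo "acl is listed for /home"
fi
```

On a live system, calling mount_opts /home with no second argument reads the real mount table instead of the sample file.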

Remember that man pages are not tutorials.  You should search online for tutorials on these commands, or for answers to specific questions (for example, cpio extract file with absolute pathname to current directory).  Remember that finding answers to such technical questions is a required skill of any system administrator.

To Be Turned In:

A copy of your YborStudent backup policy, and a copy of your journal pages showing the steps you have taken and the answers to the questions for this assignment.  Don't turn in your whole journal; you will need to add to it every day in class!  In fact, it is common to keep the journal as a text file on the system (with a paper backup of course).  Use Canvas and submit to the project's drop-box.  Please see your syllabus for more information about submitting projects.