CTS 2301C (Unix/Linux Administration I) Project
Backups and Archives

 

Due: by the start of class on the date shown on the syllabus

The purpose of this project is to practice creating a complete backup policy using a real scenario.  You will also explore several commonly used backup tools.

You may work in small groups on this project, as long as the names of all group members are included.  Note that each group member must submit an identical copy of the project.

Part I — Define a Backup Policy

Design a backup policy / strategy for YborStudent.hccfl.edu.  You need to take into account the available backup hardware; however, you can recommend new hardware if you feel it is worthwhile.  (In that case the exact model name and number and the prices, obtained from the Internet, are required.)

You can contact HCC's OIT for any information you may need.  (See also HCC's backup policies, and Obtaining Services from OIT, which is HCC's SLA policy.)  Any information you get must be credited; that is, you must say who told you what, and when.  In addition, you can use your YborStudent account to run various commands to examine the system, such as mount, df, du, etc.

See Backups and Archives for background information.  Your backup policy must be very specific and detailed, and include the following information:

  1. Which partitions/directories should be backed up?
  2. For each partition/directory to be backed up, should you create backups or archives?  What software tools (e.g., tar, dump, or some specific commercial software) will be used?
  3. Is there an existing SLA that must be taken into account for YborStudent?  If so what is it?  If not propose a reasonable SLA for recovery times.
  4. Is there an existing procedure at HCC for a student to request recovery of an accidentally deleted file?  If so what is the procedure?  If there isn't one, what would you suggest?
  5. For each partition/directory to be backed up or archived (hereafter simply referred to as backups), define the backup strategy and the schedule to be used (e.g., monthly full with daily incrementals).  Make sure your backup cycles can be completed in a reasonable time.  If necessary, define a staggered schedule (e.g., if you have a monthly backup cycle defined, then what backups get made when).
  6. Define your backup retention policy (how long to retain each type of backup, for each partition/directory that you back up).
  7. What backup media and drives will be used?
  8. Where will backup media be stored?
  9. What is the media replacement policy (a.k.a. media rotation schedule)?
  10. What is the estimated budget for the first year for media and hardware?  Note you must include in your budget the prices of all media, software, hardware, and other expenses required.  If you discover HCC already has a backup infrastructure you plan to use, what equipment and media are in use, and what would it cost to replace that with similar equipment today?
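
To make item 5 concrete, here is a minimal sketch of a staggered monthly cycle in shell.  The filesystem list, the day assignments, and the function names full_day and backup_level are all hypothetical, chosen only for illustration; substitute the partitions and cycle from your own policy.  The sketch maps each filesystem to the day of the month on which it gets its full (level 0) backup, and reports level 1 (everything changed since the last level 0) on every other day.

```shell
#!/bin/sh
# Sketch of a staggered monthly cycle (all names and days are hypothetical).

# full_day FS: print the day of the month on which FS gets its full backup.
full_day() {
    case "$1" in
        /home) echo 1 ;;
        /etc)  echo 2 ;;
        /var)  echo 3 ;;
        *)     echo 4 ;;   # anything else shares day 4
    esac
}

# backup_level FS DAY: print 0 (full) on the filesystem's full-backup day,
# otherwise 1 (changes since the last level 0).
backup_level() {
    if [ "$2" -eq "$(full_day "$1")" ]; then
        echo 0
    else
        echo 1
    fi
}

# Print today's plan.  Strip the leading zero that date +%d adds.
today=$(date +%d)
today=${today#0}
for fs in /home /etc /var; do
    echo "$fs: level $(backup_level "$fs" "$today")"
done
```

Spreading the full backups over different days keeps any single night's backup window short, which is the point of a staggered schedule.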

Part II — Making Backups and Archives

For this part you will back up /etc and /home on your assigned classroom computer, after you have completed your post-install setup tasks.  You will use a variety of tools and techniques for this, including LVM snap-shots.  (For more information please see the LVM Guide.)

Since we lack backup hardware you should keep the backup archives in /tmp.  Note you can always send the backup files as email attachments to yourself and then burn them to a CD.  (If your classroom computer includes a CD burner you can use it.)

Perform the following steps and answer the following questions:

  1. Be sure that /tmp has enough free space to hold the backups!  For now, /tmp is part of your root partition and it would be a very bad thing to run that out of space.  To estimate the size of the backups, calculate the size of the /home and /etc directories.  What command(s) did you use for this?  What are the sizes?  Note that compressed archives will take about 30% less space.  However, you will need to store one complete backup of /home and two backups of /etc.  What is the estimate of the total space required for your backups?
  2. What would happen if the root partition (/) ran out of space?  It pays to create a separate partition or volume for /tmp.  If you had sufficient RAM you could use tmpfs, a RAM disk designed for /tmp.  Since we don't have that option we will create a new LVM volume formatted with ext4 to hold /tmp.  Note you will need to make the new partition a bit larger than just the size needed for backups, as other files are stored in there (and the filesystem itself uses an overhead of around 10%).
    1. Examine the space available on your drive, both free space (if any) as well as volumes that are using only a small amount of their allocated space.  You will not only need space for the new /tmp volume, but also for a backup snap-shot volume, about 20% of the size of the /home volume.  How much free disk space do you need altogether?
    2. If you don't have sufficient free space available on your volume group then you must shrink one (or more) existing logical volumes to free up enough space for the new volumes.  Which volumes will you need to shrink, and what will be the new sizes of those volumes?
    3. For each logical volume you must shrink, you must first shrink the filesystem within.  For ext4 (or ext2 or ext3) filesystems that is done with the resize2fs command.  Next you can shrink the logical volume with lvreduce.

      Here is a sample of reducing the /home volume (/dev/VolGroup00/LogVol02) by 1 gigabyte, assuming it is currently 10 gigabytes:

      df -h                                      # show used and available space
      lvdisplay | less                           # show lv sizes
      vgdisplay                                  # show vg free space
      umount /dev/VolGroup00/LogVol02            # volume for /home
      fsck -f /dev/VolGroup00/LogVol02           # required before resizing
      resize2fs -p /dev/VolGroup00/LogVol02 9G   # shrink filesystem to 9G; resize2fs
                                                 # takes the new total size, not -1G
      lvreduce -L -1G /dev/VolGroup00/LogVol02   # shrink logical volume by 1G
      mount /home
      vgdisplay                                  # Note available space
      

      What are the exact commands you used for this?

    4. Now that the volume group has sufficient free space for the new /tmp volume, create a new logical volume and then format it as ext4.  Here is an example of creating a 1 gigabyte volume for /tmp, using the name LogVol04:
      vgdisplay                                  # Note available space
      lvcreate -n LogVol04 -L 1G VolGroup00
      mkfs -t ext4 /dev/VolGroup00/LogVol04
      vi /etc/fstab                              # add entry for /tmp
      
        # Move old /tmp contents to new /tmp partition.  This will
        # likely require a reboot before the new /tmp files get used.
      
      mv /tmp /oldtmp
      mkdir /tmp
      chmod 1777 /tmp
      mount /tmp
      cp -a /oldtmp/. /tmp/                      # copies dot files too; safe for names with spaces
      
        # Do this after the next reboot:
      rm -ri /oldtmp
      
  3. What is the exact find command which will find the names of all files and directories in /etc that have been modified in the past 24 hours?  Make sure the list of names is depth-first (that is, the contents of a directory before the directory itself).  What is the purpose of the -print0 find option, and why should you use it in a production-quality script?  What are the matching options for tar and cpio, to make an archive of the files found by find?
  4. Back up /home using dump.  Use a level 0 dump.  Since we don't want to shut the system down to single user mode (or unmount /home), we will make an LVM snapshot volume of /home and then use dump on that.  The snap-shot volume typically needs about 15%–20% of the original filesystem's size.  How large will you make the /home snap-shot volume?

    Here is an example of creating, using, then removing an LVM snap-shot of /home (in this example /dev/VolGroup00/LogVol02, as in the earlier resize sample):

    lvcreate --size 100m --snapshot --name home-snap /dev/VolGroup00/LogVol02
    mkdir /mnt/home-snap
    mount -t auto -o ro /dev/VolGroup00/home-snap /mnt/home-snap
    dump ... /mnt/home-snap  # read the man page for options to use
    umount /dev/VolGroup00/home-snap
    lvremove -v /dev/VolGroup00/home-snap
    rmdir /mnt/home-snap
    

    What are the exact commands you used?

  5. Completely back up /etc using a compressed tar archive.  What are the exact command(s) you used for this?
  6. Completely back up /etc using a cpio archive.  Now compress the resulting archive using either gzip or bzip2, whichever method you used for the tar archive in the previous step.  What are the exact commands you used for this step?
  7. Which archive is smaller, the tar or the cpio archive of /etc, and by how much?  Do you think the difference is significant?
  8. For each of the two archives, compute the MD5 checksum and store it in a file /tmp/name-of-archive.md5.
  9. Copy the tar archive backup of /etc, along with the matching MD5 checksum file, to your home directory on the YborStudent.hccfl.edu server using the scp command.  What are the exact command(s) you used?  Before running this command, check your quota on YborStudent and make sure you won't go over your hard limit!
  10. Copy the cpio archive backup of /etc, along with the matching MD5 checksum file, to your home directory on the YborStudent.hccfl.edu server using the rsync command.  What are the exact command(s) you used?  (Note that by default a modern rsync uses a secure SSH tunnel, the same as scp.)  Before running this command, check your quota on YborStudent and make sure you won't go over your hard limit!

    Describe briefly what the following command does, and list the meaning of each option used in your own words:
     rsync -HavRuzc /var/www/html/ example.com:/var/www/html/
    
  11. Now log into YborStudent and verify the integrity of the backup copies, using the MD5 checksum files.  What are the exact commands to do that, and what are the results?
  12. While still logged in to YborStudent, extract the file /etc/group to your home directory from the tar archive.  Then extract the file /etc/hosts to your home directory from the cpio archive.  What are the exact commands needed for this?  Be careful not to try to extract the absolute pathname or you will attempt to overwrite /etc/group.  (Don't worry, you don't have permission to do that!)

    When done, delete both archives.
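
For step 1, the space estimate can be sketched as simple shell arithmetic.  The function names dir_kb and estimate_kb are hypothetical, and the 30% saving is only the rough figure quoted in step 1, not a measurement.  The sketch adds one copy of /home to two copies of /etc and scales the total to 70%.

```shell
#!/bin/sh
# dir_kb DIR: print the disk usage of DIR in kilobytes
# (du -s = one summary line, -k = 1 KB units; unreadable files are skipped).
dir_kb() {
    du -sk "$1" 2>/dev/null | awk '{ print $1 }'
}

# estimate_kb HOME_KB ETC_KB: one backup of /home plus two of /etc,
# reduced by the assumed 30% compression saving.
estimate_kb() {
    echo $(( ($1 + 2 * $2) * 70 / 100 ))
}

home_kb=$(dir_kb /home)
etc_kb=$(dir_kb /etc)
echo "Estimated space needed: $(estimate_kb "${home_kb:-0}" "${etc_kb:-0}") KB"
df -k /tmp    # compare against the Available column
```

Run it as root if you want du to count files that an ordinary user cannot read; otherwise the figures will be low.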

(Optional) Part III — Using pax and star

The pax backup utility is a POSIX standard tool available on all Unix/Linux platforms.  (Although more popular, tar, cpio, and rsync are not standard tools.)  This tool is based on tar and cpio and can read and write a variety of archive formats.

POSIX has in recent years defined new formats which in theory can back up ACLs and extended attributes or EAs, such as used for SELinux.  The ustar format is technically known as the IEEE/Posix1003/IEC-9945-1-1988 Standard Data Interchange format.  The successor to ustar is the POSIX-1003.1-2001 Standard Data Interchange format, commonly called the pax archive format.

The star archive tool created by Jörg Schilling is a nonstandard but widely used tool similar to tar but able to handle a wider range of archive types than even the pax tool.  Taking advantage of the extensible nature of the ustar format, Jörg has defined the exustar format, which the star tool can read/write and which does back up ACLs.  star is currently (2010) the only archiving tool for Fedora which does this.  However, it won't use this format by default!  You must use the correct option(s) to force star to use the exustar format.

For this optional part of the assignment you should make a full archive of your home directory using star and then repeat with pax.  To make this interesting, add some ACLs and some extended attributes to some files first.  After the archive is made, delete those files.  Finally attempt to restore the files from the archive, including all attributes (date/time, modes, ACLs, and EAs).  Also see if you can recover the exustar archive made with star by using the pax tool.  What were the exact commands you used?  Were the restore attempts successful?

After reading the man pages (really!) try something similar to the commands used in this star and pax session typescript.

Hints:

A good site to check for backup hardware is NewEgg.com.

Be careful not to exceed your disk quota while doing this assignment!  If necessary use the man (and / or info) pages to see how to exclude some directories from the archives when you create them.  (This won't affect the learning benefit from this assignment, but of course in real life you do need to back up everything.)  /etc/gconf/ and a few other directories in /etc can be quite large.  You can use du, sort, and head (or tail, depending on your sort options) to find the largest few directories to omit.
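
The du, sort, and head pipeline mentioned above might look like the following sketch.  The function name largest_dirs is hypothetical, and the -mindepth and -maxdepth options assume GNU or BSD find (they are not in POSIX).  Sizes are in kilobytes, and errors from unreadable directories are discarded.

```shell
#!/bin/sh
# largest_dirs DIR [N]: list the N largest immediate subdirectories of DIR
# (default 5), biggest first, one "size-in-KB<TAB>path" line each.
largest_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -exec du -sk {} + 2>/dev/null |
        sort -rn |
        head -n "${2:-5}"
}

largest_dirs /etc 5
```

Using sort -rn with head puts the biggest directories at the top; sort -n with tail would show the same directories in the opposite order.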

If necessary copy the tar archive first, then delete it before copying over the cpio archive.

A fancy shell script to back up /etc can be found at Backup-etc.sh.  Try it using the -v option.
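
For comparison, here is a much smaller sketch of the archive, checksum, verify, and extract cycle from Part II.  It is not taken from Backup-etc.sh.  It works on a throwaway directory rather than /etc, so it is safe to run as an ordinary user; the directory layout and file contents are invented for illustration, the copy to YborStudent is left out, and only tar is shown (not cpio).  It assumes GNU tar and the md5sum command from coreutils.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)             # scratch area; nothing outside it is touched
mkdir -p "$work/src/etc" "$work/restore"
printf 'root:x:0:\n' > "$work/src/etc/group"
printf '127.0.0.1 localhost\n' > "$work/src/etc/hosts"

# Archive with relative member names (-C changes directory first), so a
# later extract cannot write to an absolute path such as /etc/group.
tar -czf "$work/etc.tar.gz" -C "$work/src" etc

# Record the checksum next to the archive, then verify it.
( cd "$work" && md5sum etc.tar.gz > etc.tar.gz.md5 )
( cd "$work" && md5sum -c etc.tar.gz.md5 )

# Extract a single member into a different directory and compare it
# with the original.
tar -xzf "$work/etc.tar.gz" -C "$work/restore" etc/group
cmp "$work/src/etc/group" "$work/restore/etc/group" && echo "restore OK"
```

The scratch directory is left in place so you can inspect the results; remove it with rm -rf "$work" when done.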

ACLs and EAs only work for some filesystem types (including ext* types) and then only if the appropriate mount options are used.  Look at the mount options for /home on YborStudent to see an example.
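
One way to look at the mount options is sketched below.  The function names mount_opts and has_opt are hypothetical, and reading /proc/mounts is Linux-specific.  The sketch prints the option list for a mount point and tests whether a given option (here acl and user_xattr, the usual ext* option names) appears in it.

```shell
#!/bin/sh
# mount_opts MOUNTPOINT [FILE]: print the comma-separated mount options for
# MOUNTPOINT, read from FILE (default /proc/mounts, which is Linux-specific).
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4 }' "${2:-/proc/mounts}"
}

# has_opt OPTION MOUNTPOINT [FILE]: succeed if OPTION is among the options.
has_opt() {
    mount_opts "$2" "${3:-}" | tr ',' '\n' | grep -qx -- "$1"
}

for opt in acl user_xattr; do
    if has_opt "$opt" /home; then
        echo "/home is mounted with $opt"
    else
        echo "/home does not list $opt"
    fi
done
```

A missing option is not conclusive: on some newer systems these features may be enabled by default and therefore not listed, so treat the output as a starting point and confirm with the mount man page for your filesystem.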

To Be Turned In:

A copy of your YborStudent backup policy, and a copy of your journal pages showing the steps you have taken and the answers to the questions for this assignment.  Don't turn in your whole journal; you will need to add to it every day in class!  In fact, it is common to keep the journal as a text file on the system (with a paper backup, of course).  You can submit by email (preferred).  Please see your syllabus for more information about submitting projects.