Using Temporary Files in Unix/Linux Shell Scripts

 

Written 4/2006 by Wayne Pollock, Tampa Florida USA

Adapted from: linuxcommand.org/writing_shell_scripts.php, and the Linux mktemp(1) man page.

Overview of the Problem:

Take a look at the following script:


#!/bin/sh
# A shell script to be run in the background.  swatch checks every
# 10 seconds and reports on who logs in and who logs out.

new=swatch-new
old=swatch-old

who >$old         # Initialize the old list.

while :           # Do forever
do
  who >$new
  diff $old $new
  mv $new $old
  sleep 10
done |awk '/>/ {$1 = "Just logged in:   "; print}
           /</ {$1 = "Just logged out:  "; print}'

This script watches the output of who and reports logins and logouts.  One problem with this script is that when it terminates, it leaves behind the temporary files in the current directory (assuming that directory is writable, or the script fails).  A simple fix is to put temp files in the user's home directory:

new=$HOME/swatch-new
old=$HOME/swatch-old

It is a Unix tradition to use a directory called /tmp (or /var/tmp or even /dev/shm) to hold temporary files used by programs.  This prevents a user's home directory from getting full of old useless files.

Since everyone may write files into the /tmp directory, naturally there are some security concerns.  One feature of /tmp is that it has the text (or sticky) bit set, which means only a file's owner (or root) can delete or rename that file in the directory.  However, if an evil hacker can predict your filenames, they can create that file first (a denial of service).  They can also read your files, if those were created using a default umask value of 022.  Even if you immediately change the permissions, there is a window of opportunity to access the file (sometimes called a race condition).
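The effect of the umask on a newly created file can be checked directly.  The following is only a sketch: it assumes a mktemp that supports -d, and the scratch directory and file names are made up for the demonstration.

```shell
# Assumption: "mktemp -d" is available.  All names here are hypothetical.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/umask-demo.XXXXXXXXXX")

# Each subshell gets its own umask, so the caller's setting is untouched.
( umask 022; : > "$scratch/default" )
( umask 077; : > "$scratch/private" )

# Keep only the ten permission characters from the long listing.
default_mode=$(ls -l "$scratch/default" | cut -c1-10)
private_mode=$(ls -l "$scratch/private" | cut -c1-10)
echo "umask 022 gives $default_mode"
echo "umask 077 gives $private_mode"

rm -rf "$scratch"
```

With a umask of 022 the file is readable by everyone; with 077 it is created private from the start, so there is no window in which others can open it.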

Predictable file names also allow an attacker to create symbolic links to other files that the attacker wants you to overwrite:

evil-user$ ln -s /home/you/important-file /tmp/swatch-new
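The attack can be reproduced harmlessly inside a private scratch directory.  This is a sketch only; the directory and file names are hypothetical, and mktemp -d is assumed to be available.

```shell
# Everything happens inside a throwaway directory; nothing real is at risk.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/symlink-demo.XXXXXXXXXX")
victim=$scratch/important-file
predictable=$scratch/swatch-new

echo 'precious data' > "$victim"

# What the attacker does: plant a link under the predictable name.
ln -s "$victim" "$predictable"

# What the naive script does: write to its "temporary file".
echo 'scratch output' > "$predictable"

# The redirection followed the link and overwrote the victim file.
victim_now=$(cat "$victim")
echo "important-file now contains: $victim_now"

rm -rf "$scratch"
```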

For these reasons temporary file names should not be predictable.  A good temporary file name will help you figure out what script or program created the file but still not be predictable.  One approach is to make a temporary directory using a good naming scheme including a random number as part of the name.  While this does allow one to guarantee that a temporary file will not be subverted, it still allows the denial of service attack.

The preferred technique is to write temporary files in a local directory such as $HOME/tmp (a tmp subdirectory in the user's home directory), which should only be accessible to the owner.  (Or even create a securely named temporary directory within a user's personal temporary directory.)  That still leaves the problem of accumulating files.  (At least /tmp is cleaned out periodically by the system!)  One technique is to use .logout (for shells that support it) to add a command to remove temporary files.  You could do the same thing using a crontab job, with a command something like this:

find "${TMPDIR:-/tmp}" -type f -mtime +29 -exec /bin/rm -f '{}' \;
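A private per-user directory of the kind described here could be set up along these lines.  The function name make_private_tmp is made up for this sketch; it takes the directory as an optional argument and defaults to $HOME/tmp.

```shell
# make_private_tmp [DIR]: create DIR (default: $HOME/tmp) and restrict
# it to its owner.  Hypothetical helper, not a standard utility.
make_private_tmp() {
    dir=${1:-$HOME/tmp}
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1    # rwx for the owner, nothing for others
    printf '%s\n' "$dir"
}

# Typical use near the top of a script:
#   TMPDIR=$(make_private_tmp) && export TMPDIR
```

Running chmod after mkdir -p also tightens a directory that already existed with looser permissions.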

Creating Good Temporary File Names:

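The $TMPDIR environment variable, when set, names the directory to use for temporary files (/tmp, ~/tmp, or a similar directory, depending on availability).  It is not set on every system, so scripts commonly write ${TMPDIR:-/tmp}, which falls back to /tmp.  This should be used for the location of the temporary file.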

It is often a good idea to create a temporary directory (inside of $TMPDIR) if there is a utility that does this securely; then you can add as many temporary files of any name to that directory without worry.

It is a good idea to use the name of the program and the process ID (PID) as part of the file name.  Use the $$ shell variable as part of the file name.  This identifies which process is responsible for the file, and is useful as a debugging aid.  The PID never changes for the life of a shell script, and helps make unique names.  (Imagine running a script in two windows at the same time.)

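To make the name harder to predict, append a random number to the file name.  Use the $RANDOM shell variable (provided by shells such as bash and ksh, but not required by POSIX) as part of the name.  Using these techniques, file names are easily identifiable and much less predictable, although $RANDOM only ranges from 0 to 32767.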

The following lines of code build the temporary file names using these guidelines (the files themselves are created when first written to):

new=${TMPDIR:-/tmp}/swatch-new-$$.$RANDOM
old=${TMPDIR:-/tmp}/swatch-old-$$.$RANDOM

Using mktemp:

mktemp is a non-standard (not part of POSIX) utility provided to allow shell scripts to safely create temporary files.  It should be used if available.  This command guarantees the generated filename is not currently in use, will put the file in an appropriate tmp directory, and will create the file read-write for the owner only.  This makes using mktemp much safer than the alternative of:

touch $TMPDIR/program-name-$$.$RANDOM

The -t option causes the file to be created in a temporary directory: /tmp by default, but you can override that by setting the TMPDIR environment variable (say to ~/tmp).  The -t behavior is the default if you don't supply any other command line arguments.

If you don't care about the name of the file then just use:

TMP=`mktemp`

or:

TMP=$(mktemp)
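Because mktemp can fail (for example, when the directory is not writable), it is worth checking its exit status before using the name.  A minimal sketch, with variable names chosen just for the illustration:

```shell
# Only continue if mktemp really created a file.
if tmpfile=$(mktemp); then
    # Keep only the ten permission characters from the long listing.
    tmp_mode=$(ls -l "$tmpfile" | cut -c1-10)
    echo "created $tmpfile with mode $tmp_mode"
    rm -f "$tmpfile"
else
    echo "mktemp failed; not continuing" >&2
fi
```

The listing should show the file as readable and writable by the owner only.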

You can control the name by using a template, with a leading part of the file name specified by you and a trailing part picked randomly:

TMP=`mktemp -t scriptname.$$.XXXXXXXXXX`

(Note that if you supply a template, you must use the -t option or the file will be created in the current directory.)  Each capital letter X is replaced with a random character (not just a digit, making this more secure than using $RANDOM).  Ten X's should be good enough.
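Many mktemp implementations also accept a -d option, which creates a private directory instead of a file; that matches the earlier advice about putting several temporary files in one securely created directory.  The sketch below assumes -d is supported and passes the template as a full path, which avoids relying on -t.  The file names new and old are borrowed from the swatch script purely for illustration.

```shell
# Create one private directory and keep all scratch files inside it.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/swatch.$$.XXXXXXXXXX") || workdir=

if [ -n "$workdir" ]; then
    workdir_mode=$(ls -ld "$workdir" | cut -c1-10)
    new=$workdir/new      # simple names are safe inside a private directory
    old=$workdir/old
    date > "$old"
    date > "$new"
    echo "scratch files live in $workdir (mode $workdir_mode)"
    rm -rf "$workdir"     # a single command removes everything
fi
```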

Unix systems may not have any mktemp utility as it is not part of POSIX.  There is a C function you can easily use, or this standard m4 macro:

echo 'mkstemp(fileXXXXXX)' | m4

(This works on Linux too, but isn't as useful as mktemp.)  The only other way I have found to do this same task using only POSIX standard utilities is:

#!/bin/sh
# mktemp.sh - A portable and compliant mktemp replacement.
# version 0.2 by Wayne Pollock, Tampa Florida USA, 2007
# Usage: mktemp.sh [ base_file_name ]
#
# TODO: Make production quality: add options and parsing,
#       help, better error handling.
#
# (See also the POSIX definition of pathchk utility, the
# "rationale" section, for examples and a discussion of this.)

set -C  # turn on noclobber shell option

# The only standard utility that provides random numbers:

rand()
{
    awk 'BEGIN {srand();printf "%d\n", (rand() * 10^8);}'
}

umask 177

NAME="$1"
NAME="${NAME:=tmp}"

while :
do
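   TMP=${TMPDIR:-/tmp}/$NAME-$$.$(rand)
   # Run the redirection in a subshell: a failed redirection on the
   # special built-in ":" may otherwise terminate the whole script.
   (: > "$TMP") 2>/dev/null && break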
done

printf "%s\n" "$TMP"
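Since mktemp should be used when available, a script can wrap both approaches in one function.  The following is a sketch only: safe_temp is a hypothetical name, the fallback follows the noclobber idea of the script above, and the retry limit of 100 is an arbitrary choice.

```shell
# safe_temp [PREFIX]: print the name of a newly created temporary file.
safe_temp() {
    prefix=${1:-tmp}
    dir=${TMPDIR:-/tmp}

    # Prefer the real utility when it is installed.
    if command -v mktemp >/dev/null 2>&1; then
        mktemp "$dir/$prefix.XXXXXXXXXX"
        return
    fi

    # Fallback: noclobber makes ">" fail if the file already exists.
    # The subshell keeps umask and noclobber from leaking to the caller.
    (
        umask 177
        set -C
        n=0
        while [ "$n" -lt 100 ]; do
            r=$(awk 'BEGIN { srand(); printf "%d\n", rand() * 10^8 }')
            name=$dir/$prefix-$$.$r.$n
            if (: > "$name") 2>/dev/null; then
                printf '%s\n' "$name"
                exit 0
            fi
            n=$((n + 1))
        done
        exit 1
    )
}
```

A caller would use it as tmp=$(safe_temp myscript) || exit 1, and remains responsible for removing the file afterwards.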

Cleaning Up Temporary Files:

Good practice would dictate that a script deletes any temporary file(s) it creates when the script terminates.  This cannot be done just by adding the following to the end of a script:

rm $TMP

This would seem to solve the problem, but what happens when the user types control-C to end the script?  The script will terminate at that point and the rm command never gets run!  You need a way to intercept various signals, to execute appropriate cleanup code (in this case to delete the temporary files).

Signal Handling Using trap:

The trap shell built-in command allows you to execute a command when a signal is received by your script.  It works like this:

trap command list-of-signals

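If the command is a single dash (-), the listed signals go back to their default action.  If the command is an empty string (""), the listed signals are simply ignored.  Many shells also reset the listed signals to their default action when the command is missing altogether and the first argument is a signal number.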

The signals that a script commonly needs to handle include SIGHUP (1), SIGINT (2), SIGQUIT (3), and SIGTERM (15).  The list can use either names or numbers; POSIX specifies the names without the SIG prefix (HUP, INT, QUIT, TERM), and bash accepts them with or without it.  (See kill -l and man 7 signal for more information on signals.)
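The three forms of the trap command can be tried out safely with SIGUSR1.  In this sketch the demonstration runs in a child sh process, fed from a here-document, so that $$ refers to that child and the signals cannot disturb the calling shell.  The variable name trap_demo is made up for the example.

```shell
# Run the demonstration in a child shell and capture what it prints.
trap_demo=$(sh <<'EOF'
trap 'echo "handler ran"' USR1   # run a command when SIGUSR1 arrives
kill -s USR1 $$

trap '' USR1                     # empty string: ignore SIGUSR1
kill -s USR1 $$
echo "ignored, still running"

trap - USR1                      # single dash: back to the default action
echo "default restored"
EOF
)
printf '%s\n' "$trap_demo"
```

Sending SIGUSR1 once more after the final trap command would terminate the child shell, since that is the signal's default action.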

The swatch script might handle signals like this:


#!/bin/sh
# A shell script to be run in the background.  swatch checks every
# 10 seconds and reports on who logs in and who logs out.

new=$(mktemp -t swatch-new-$$.XXXXXXXXXX)
old=$(mktemp -t swatch-old-$$.XXXXXXXXXX)

trap 'rm -f "$old" "$new"; exit 1' 1 2 3 15

who >$old         # Initialize the old list.

while :           # Do forever
do
  who >$new
  diff $old $new
  mv $new $old
  sleep 10
done |awk '/>/ {$1 = "Just logged in:   "; print}
           /</ {$1 = "Just logged out:  "; print}'

The trap command runs its cleanup command, removing both temporary files, if any of the listed signals is received.

trap will only accept a single string containing the command to be performed when a signal is received.  You can use ; to separate multiple commands in the one string to get more complex behavior, but that gets ugly if you have more than two or three short commands.  A better way is to create a function (a good name is clean_up) that is called when you want to perform any actions at the end of a script:


#!/bin/sh

TEMP=$(mktemp -t foo-$$.XXXXXXXXXX)

clean_up() {
	# Perform program exit housekeeping
	rm -f "$TEMP"
	exit
}
trap clean_up 1 2 3 15

... rest of script goes here...

clean_up

One remaining problem is that a script may exit at several points in the middle of the script (say if an error occurs).  Having a single call to clean_up at the bottom may not work.  A pseudo-signal EXIT or 0 (zero) can be used with trap, to have the command executed on exit for any reason.

The re-written script then looks like this:


#!/bin/sh

TEMP=$(mktemp -t foo-$$.XXXXXXXXXX)

clean_up() {
	# Perform program exit housekeeping
	rm -f "$TEMP"
	trap 0  # reset to default action, so clean_up is not run twice
	exit
}
trap clean_up 0 1 2 3 15

... rest of script goes here...
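The behavior of the EXIT pseudo-signal can be checked in isolation.  In this sketch a subshell stands in for the script, so the early exit does not end the surrounding shell; the names scratch, status, and leftover are made up for the example, and mktemp is assumed to be available.

```shell
scratch=$(mktemp "${TMPDIR:-/tmp}/exit-demo.XXXXXXXXXX")

(
    trap 'rm -f "$scratch"' EXIT    # EXIT may also be written as 0
    false || exit 3                 # simulated failure part-way through
    echo "never reached"
)
status=$?

if [ -e "$scratch" ]; then leftover=yes; else leftover=no; fi
echo "exit status $status, temp file left behind: $leftover"
# prints: exit status 3, temp file left behind: no
```

Because the handler here does not call exit itself, the original exit status (3) is passed through unchanged.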

Using trap without any arguments (or just --) produces a list of the modified traps set, in a way that can be reused.  For example:

orig_traps=$(trap)  # save current traps
trap ...
...
eval "$orig_traps"  # restore traps

This might be useful in a shell function, so that when you leave the function you restore the traps the way they were.