CTS 1106 (Intro to Unix/Linux)
Shell Scripting Tutorial and Study Guide

 

©2009 by Wayne Pollock, Tampa Florida USA

Table Of Contents

  1. Overview
    1. Why Use Shell Scripts?
    2. Creating Simple Shell Scripts
    3. Script Writing Process
    4. Your First Script
  2. Additional Shell Features Useful In Scripts
    1. Sourcing Scripts
    2. Comments
    3. She-bang (shell-bang)
    4. Positional Parameters
    5. Command Substitution
    6. The Colon Command
    7. Simple Arithmetic
    8. Exit Status
    9. The if Statement
    10. The test Statement
    11. Debugging Shell Scripts
  3. Summary

Overview

A shell script is a plain text file that contains one or more commands.  The advantage to using a script is that complex or lengthy commands can be entered once, ahead of time, and then later run just by entering the shell script name.  A person only needs to spend the effort of solving some problem once, and can use the script to solve similar problems any time they come up in the future.

Why Use Shell Scripts?

Shell scripts have many uses.  The login scripts are just shell scripts that are run automatically whenever you log in.  You can add any commands you wish to these files.  Unix systems use scripts to initialize the system when it boots up, and other scripts to start and stop services such as a web server or print server.  The ability to read and modify shell scripts is therefore a skill needed by system administrators (SAs).  Developers and others can use scripts to initialize databases, deploy applications, automate testing, and produce reports.  SAs can also use scripts to automate deployment, patching, log file rotation, account creation, system monitoring, etc.

While there are alternative methods to perform all these tasks, the Unix approach has always been to provide a lot of tools that can be combined in scripts to perform nearly any task needed.  So rather than learn the details of dozens of GUI tools, a person can make a similar effort to learn basic scripting, and be able to do any task needed without special tools.  This approach gives you more power as you're not limited to the functions provided by those GUI tools.

Creating Simple Shell Scripts

Anything you can do at the keyboard is legal in a script.  When you run a shell script you really start up a regular shell, with its input redirected from the file rather than the keyboard.  Indeed, the shell doesn't know or care if the commands it reads come from the keyboard or a file.  If you can enter commands at the command line, you can create a script.

There are exactly two differences between commands in a script and interactively entering commands: There is no history mechanism (the “!” history character isn't special), and there are no prompts.

Some students can be intimidated by shell scripts; creating them is “programming” in some sense.  However you do not need to be a programmer to create your own scripts.  Remember the commands you put into a script file are exactly the same as the ones you would enter at the keyboard.  It is true that the shell contains many powerful features that are useful in complex scripts.  But you don't need to use them at this stage of your career.

Script Writing Process

It is often the case that you will need to “write a script to do task XYZ”.  So where do you start?

  1. Start by solving the task at the keyboard.  That is, forget for the moment you need to write a script, and just figure out how to do the task at the command line.  Normally this is the hard part!
  2. Only once you have figured out the solution should you start creating the script.  This is the easy part.  You simply put the commands that you entered at the command line into a file with vi or some other editor.  This file will become the script.

    Note you don't have to retype all the commands you used.  If they are on the screen you can copy and paste into the editor.  Or you can use the fc command to redirect a part of your shell command history into a file.

  3. Once the commands are in the file, you can improve the script by using some of the extra shell features.  For example it would be a good idea to add some comments to the script.  Comments are just text in the file that the shell will ignore.  However they are visible to humans who need to read (or modify) your script sometime in the future, long after you've forgotten what all those command line options and arguments mean!
  4. Finally, change the permissions and move the file to a directory listed on PATH, to make the script more convenient to run.

Always do these four steps!  There can be a great temptation to skip step one and begin by editing a script in vi.  Don't do it!  It is much easier and faster to solve the task “at the keyboard”.

Your First Script

Let's start simply with a script that does the following:

Create a script that will display a calendar for the current month, followed by the current date.

This one is simple to solve at the keyboard (the commands entered follow the “$” prompt):

[auser@localhost ~]$ cal
     March 2008
Su Mo Tu We Th Fr Sa
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31
[auser@localhost ~]$ date
Thu Mar 27 18:57:37 EDT 2008

Now on to step two.  Create a file “caldate” (or some other name if you prefer) containing two lines, the two commands used at the command line to solve the task.

Time to test the new script!  To run, start up a new shell with its input redirected from the script file:  “bash < caldate”.  If the script doesn't work, most likely you have a typo.  The error messages you get should help pin-point the problem.  After modifying the file in the editor, try it again.  Repeat until it works the way you expect.
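As a rough sketch of step two and the test run, the file can also be created without an editor.  The scratch directory is an assumption made so that nothing in your home directory is touched, and sh is used in place of bash only for portability; bash reads the file the same way:

```shell
# Work in a scratch directory (an assumed location, not part of the task).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Step two: put the two commands that solved the task into a file.
cat > caldate <<'EOF'
cal
date
EOF

# Test it by starting a new shell with its input redirected from the file.
sh < caldate
```

If a typo sneaks into the file, the error message from the new shell names the offending command, which is usually enough to find it.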

Note that scripts need read permission so the shell can read them.  (Because scripts must be readable to be run, there is little security for scripts; anyone can view or copy any script that they can run.  Compiled programs don't need to be readable, and are thus inherently more secure than scripts, which are interpreted (read) every time they are run.)

An easier way to run the script is by using “bash caldate”.  This works the same way as before (the shell reads commands from the script file).

Of course what we would like is to be able to run the script just by typing the script name.  If you try to run your script by just entering “caldate” at the shell prompt, it will fail.  One reason is that the current directory (“.”) isn't listed on PATH.

To run any command (script or compiled program) that resides in a directory not listed on PATH, you need to use a pathname.  When you enter just a filename, the shell tries to find it in the directories listed on PATH if it isn't a built-in command.  The simplest pathname to use for a script in the current directory is “./caldate”.

The final step for your masterpiece is to put it into a directory listed on PATH so it can be run conveniently.  Only system administrators can add files to the system standard directories such as /bin, and you shouldn't put a script like this in those directories anyway.  Instead you should create a sub-directory in your home directory to hold scripts.  This directory name can then be added to PATH.  (Don't add “.” to your PATH.  That may seem convenient but can cause security problems!)

Most systems already list the directory $HOME/bin on PATH for you.  So all you need to do is to create this directory and move your script into it.  (If this directory isn't listed on PATH, you can always edit your login script to include the change.)

Move caldate to ~/bin.  Now, no matter what your current directory, entering “caldate” at the shell prompt should find your script!  However if you try this, although your script is found it won't be run.  This is because the system checks for execute permission before doing anything else.  So the system doesn't realize yet this is a script, and that it should instead run a shell with its input redirected.

To fix this you need to add execute permission for yourself to the script.  “chmod +x ~/bin/caldate” should do it.  Note that just because you added execute permission doesn't mean this isn't still just a text file of commands.  Marking a script as executable does not convert it to machine code (which is all the CPU can understand).  When the system tries to run your script, it will realize it isn't a compiled program but a script.  The system will then fire up a shell (the specific shell started depends on your system, currently Linux uses bash).
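The last two steps can be sketched as follows.  A scratch directory stands in for ~/bin, and the script name “hello-script” is made up for illustration, so nothing here changes your real setup:

```shell
# A scratch directory stands in for ~/bin in this sketch.
bindir=$(mktemp -d)

# A tiny stand-in script (hypothetical name), still just a text file.
printf '%s\n' 'echo "hello from a script"' > "$bindir/hello-script"

# Put the directory on PATH, for this shell session only.
PATH="$bindir:$PATH"

# Add execute permission so the system will agree to run it.
chmod +x "$bindir/hello-script"

# Now the bare name is enough, whatever the current directory is.
hello-script    # prints: hello from a script
```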

Remember all this convenience just hides the fact that you are really running “bash <caldate”.  The kernel has a default shell used when running scripts; on Linux it is /bin/bash.

Scripts are never executed; they are read by some interpreter such as bash, perl, python, or ruby.  Only read permission is required on scripts; adding execute permission makes them convenient to run, but is not required.

Never name a script the same as a shell built-in command (such as “test”), or you will only be able to run it by providing a pathname.

Now that you have mastered the skills and steps needed to create and run shell scripts, try a more complex script:

Create a script called “nusers” that displays the number of users currently logged into the system.

# Displays the number of users currently logged
# into this system.
#
# Written by Wayne Pollock, Tampa Florida USA
#
# Version history:
# 7/15/2008 - initial version
#             (not "production quality"!)

who | wc -l

Additional Shell Features Useful In Scripts

While useful scripts can be made with a series of simple commands and pipelines, the shell contains a large number of additional features to make scripts more powerful and useful.  Originally different shells used different features for this.  The C shell was popular to use interactively but a nightmare for most people wanting to create complex scripts.  Although all shells have useful scripting features, it pays to use only the subset of your shell features that are part of the POSIX standard.  Doing so will make your scripts portable (they will run on any system) and easier for others to read.  Have a very compelling reason before using non-POSIX features in any script you create.

Sourcing Scripts

When running shell scripts a new shell process is started.  So what happens when the script contains statements that modify the environment?  The changes only affect the environment of the shell process reading the script.  When that shell exits at the end of the script, the environment changes seem to disappear; they were never made in the login shell, so its environment is unchanged.

What is the environment?  In general it is where both people and processes (running programs) live and work.  The environment is a collection of settings (in RAM) that a program can examine and change.

What settings form the environment?  The current umask value, the current working directory, the list of open files (including any redirections), the current signal handling settings (e.g., trap), and any variables with the export property set.  (Also process resource limits (ulimit -a), the user’s UID and GID, security information, and other OS-specific values; these are not specified by POSIX.)

Normally this is a good thing.  You can write a script that sets environment variables, changes settings such as PATH, umask, etc., and all those changes won't affect the environment of the user running the script.  However there are times when you want the commands in the script file to affect the login shell's environment.  An example is the login scripts.  These would be useless if the changes they made had no effect on your login shell!
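Here is a small sketch of that isolation.  The script name “change-env” and the variable “greeting” are invented for the example; the point is only that the calling shell's directory and variable survive the run:

```shell
scratch=$(mktemp -d)

# A stand-in script (hypothetical name) that changes the working
# directory and sets a variable.
cat > "$scratch/change-env" <<'EOF'
cd /
greeting="set inside the script"
EOF

greeting="set in the calling shell"
before=$(pwd)

# Run normally: a new shell reads the file, makes its changes, then exits.
sh "$scratch/change-env"

after=$(pwd)        # the same directory as before
echo "$greeting"    # prints: set in the calling shell
```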

A feature of the shell that supports this is the dot command, “.”.  (Many shells also allow “source” as a synonym, since a dot can be hard to see.)  When the shell command “. script” is run, the current shell will read the commands from the file script.  No new shell is started.  When the end of the script is reached the shell continues as normal.  This is called sourcing a script.  (The script can be run with arguments as normal:  . script arg1 arg2...)

You can see an example in the default ~/.bash_profile file.  Bash runs this login script for the login shell only, not for any sub-shells.  The file ~/.bashrc is run for all shells started except the login shell.  To have the ~/.bashrc commands run for the login shell as well, that file is sourced from the login script.

Create a script to change the prompt that contains the single command “PS1='new prompt'”.  When you run this script normally your prompt won't change.  But if you source this script, you should see your prompt change.  (Don't worry, the next time you log in you will get your standard prompt back.)
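The exercise might look something like this when tried in a scratch directory.  The file name “newprompt” is made up, and PS1 is given a starting value here only so the before and after values can be compared:

```shell
scratch=$(mktemp -d)

# The one-line script from the exercise (the file name is an assumption).
printf '%s\n' "PS1='new prompt'" > "$scratch/newprompt"

PS1='old prompt'

# Run normally: the change is made in a child shell and is lost.
sh "$scratch/newprompt"
after_run=$PS1          # still: old prompt

# Source it: the current shell reads the commands itself.
. "$scratch/newprompt"
after_source=$PS1       # now: new prompt
```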

Sourcing scripts is also how the system administration scripts read files of settings (configuration files).  On a Red Hat like system you can see some of these in the /etc/sysconfig directory.  These are sourced by the scripts in the /etc/init.d/ directory.  (This explains the syntax of most configuration files; they're just shell scripts!)

Comments

A comment is simply a line of text that the shell ignores, but has meaning to humans reading the script.  Every well-written script will have at least some comments.  Sometimes comments are required by your organization and contain copyright and other legalese.

Your scripts should include some comments near the top of the file that include a brief statement of the purpose of the script (often the name alone isn't much of a clue), the author's name, and the date the script was written.  Additional comments can be placed throughout the script, clarifying the intent of complex and arcane commands and options.

Keep in mind the audience for comments: yourself in the future and others responsible for script maintenance, long after you've forgotten the details of your script.  They are not generally intended for the users of the script.

Sometimes (not in our class) scripts can be so complicated that they are developed in stages.  In such cases it can pay to write some comments before the script is developed fully, to express the overall design of the script.

Comments are easy to create.  Any word starting with an unquoted “#” character starts a comment, and the rest of that line is ignored.  A comment can be an entire line, or added at the end of a command.

As with all shell features, comments can be entered at the command line as well.  This is normally pointless, but useful for learning purposes.  Try to guess the output of the following, then try it to see if you are right:

echo "ABC #DEF" G#H #IJK; echo 123 # 456

Don't be afraid to experiment until you understand when a “#” begins a comment and when it is treated as an ordinary character by the shell.  Then go back to the scripts you've written and add the required comments.  (That is, required for this class, but a good idea in general.)  Also feel free to read the comments on the system scripts such as /etc/bashrc.  (You probably won't fully understand these scripts yet, but take a look anyway.)

She-bang (shell-bang)

Not every shell script is a Bash script.  Yet by default all scripts are read by bash on Linux (or some other shell on other Unix systems).  So what happens when you use a C shell or Z shell feature in your script that isn't supported in Bash?  You can minimize such problems by sticking to POSIX-only features.  But sometimes these extra features are very useful and hard to avoid.

In addition not all scripts are shell scripts.  There are many popular scripting languages in use, including Perl, Ruby, Python, and others.  Try this experiment:  Create a script named (for example) “say-hi” with the single command “print "hello\n";” and try to run it.  You should get an error message because this is not a shell script at all, but a Perl script.  Instead of running the “convenient” way, try the first way you learned to run a script:

perl < say-hi

And it should run fine.  The same would be true for a Z shell script or a C shell script, or for any other type of script.  You can just start the interpreter (the program that reads a script) with its input redirected from a script file.  But we're back at an inconvenient method to run scripts.  What is needed is a way to tell the operating system which interpreter (bash, zsh, perl, etc.) to start when the user attempts to run the script.  In other words, we don't want to rely on the system default shell.

Windows systems use predefined file extensions to identify the type of files, including script files.  However Unix systems don't pay any attention to the file name or extensions, so naming your scripts caldate.sh or say-hi.pl won't change which interpreter is used.

But adding “#!pathname” as the first line in a script file (and no leading spaces either) does tell a Unix or Linux system which interpreter to use.  When the kernel attempts to run a script, it realizes it is a script and not machine code, starts up the appropriate interpreter (identified by the pathname), and passes it the script as an argument.  The system checks the first line only for this special comment.  (You did notice the “#”, right?)  This line is known as a she-bang.

The “!” character is technically called an exclamation point, but that's quite a mouthful and many old-school hackers call this character a “bang”.  Since this comment tells the system what shell to use it was named the “shell-bang” line, which was shortened to “she-bang”.

Different systems have slightly different rules for using the she-bang line.  (This mechanism is not a POSIX feature!)  This line may need a space after the bang or such spaces may be forbidden.  The maximum length of this line varies from 16 or 32 bytes on legacy systems, to 80, 127, or more on modern systems.  It depends on which Unix system you use.  For Linux, spaces here are optional.  These rules should be safe to follow on all modern systems:

When running a script with a she-bang line, the system starts the interpreter with the following arguments (in order):  the (optional) argument listed on the she-bang line, the name of the script, and any arguments supplied on the command line.

For example, suppose you run a shell script as follows:

myscript one two

and myscript has a she-bang line like this:

#!/usr/bin/perl -Tw

Then the kernel will build and run this command line:

/usr/bin/perl -Tw myscript one two

Anything else may result in strange error messages!  Especially avoid trailing spaces on this line; they can be hard to see but will still cause trouble on most systems.  DOS line endings will also cause an error.  This is because everything after the interpreter pathname (and any following blanks) up to a newline is treated as a single word (as if you had used single quotes), i.e. as a single argument even if it contains white-space characters.
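One way to see the argument list for yourself is to name echo as the “interpreter”, so that it simply prints whatever the system hands it.  This is only a sketch; it assumes /bin/echo exists on your system, and the word “she-bang-arg” is an arbitrary stand-in for a real option such as -Tw:

```shell
scratch=$(mktemp -d)

# A hypothetical script whose "interpreter" is echo, which prints its
# arguments instead of reading the file.
cat > "$scratch/myscript" <<'EOF'
#!/bin/echo she-bang-arg
This line is never interpreted; echo ignores the file's contents.
EOF
chmod +x "$scratch/myscript"

# The system runs: /bin/echo she-bang-arg <pathname of myscript> one two
shown=$("$scratch/myscript" one two)
echo "$shown"
```

The output lists the she-bang argument first, then the pathname used to run the script, then “one” and “two”, in the order described above.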

Nearly every Unix (and Linux) system today installs a (mostly) POSIX compliant shell with the pathname /bin/sh, so that is the most common interpreter listed in a she-bang comment.  (That shell is usually Bourne shell compatible.)  In addition, the argument “-” is often used as a security measure to ensure the shell won't try to read any additional arguments.

Unfortunately there is no common location for any other shell or interpreter.  Interpreters such as perl or zsh may be found in /bin, /usr/bin, /usr/local/bin, /opt/bin, or other directories.  Since you must list an absolute (complete) pathname and PATH isn't used, the wrong location is a common problem.  (I generally add symlinks to perl and other interpreters in several commonly used directories on my servers, so no matter what pathname is listed in the she-bang line, it will just work.)

A trick is to use “#!/usr/bin/env perl” (or whatever interpreter you want) to run a Perl script regardless of where perl is found.  The env command is used to run another command, and it will use PATH to find it.  However, no additional arguments can be used in this case.  And some systems may install /bin/env instead.
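A minimal sketch of the env trick follows.  It uses sh rather than perl as the interpreter, only because sh is certain to be installed, and it assumes env lives at /usr/bin/env:

```shell
scratch=$(mktemp -d)

# env searches PATH for "sh", so the script needn't know where sh lives.
cat > "$scratch/say-hi" <<'EOF'
#!/usr/bin/env sh
echo "hello"
EOF
chmod +x "$scratch/say-hi"

greeting=$("$scratch/say-hi")
echo "$greeting"    # prints: hello
```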

Example She-bang Lines

She-bang Line                    Notes
#!/bin/sh -                      Legal: most commonly used
#!/bin/bash -                    Legal
#!/usr/bin/perl -T -w            Illegal: has two arguments
#!/usr/bin/perl -Tw              Legal: only has a single argument (of two options)
#!zsh                            Illegal: no pathname used
  #!/bin/sh                      Illegal: line doesn't start with “#!”
#! /bin/sh -                     Legal on Linux and some other systems
#!/bin/sh # the She-bang Line    Illegal: only one word is permitted as an argument

Positional Parameters

Shell scripts are useful when you have a complex task that you may need to do again in the future.  Often however some of the details change — a different filename must be processed, a different user name, or a different hostname.  Rather than edit the script for each minor change it is possible to craft a more general script to which you can pass arguments such as filenames and usernames.

Here's a simple illustration:  Suppose you create a script called “greet” that can be used to greet your friend named “Jim”.  Such a script would be essentially this line:

echo "hello Jim!"

And from now on when Jim walks by when you're logged in you can run greet and Jim will be amazed.  The problem is, what if a different friend named Kim walks by instead?  You'd like to greet all your friends but it would be inconvenient to create a different script for each one.  All these greet scripts would be identical except for the name.

You have the same issue with most utilities.  Consider ls.  It would be a pain if there were a different command to list the files in a directory, one command for each directory!  And that assumes you know in advance all directory names.

The solution is to allow the user to supply the data that is different each time as command line arguments.  In the case of ls, you can supply the name of a directory to list on the command line.  For our greet script it would make sense to have the user supply the name of the person to greet on the command line.  Then when Kim walks by you can type “greet Kim” and if Jim walks by instead you can type “greet Jim”.

When running any command or script, the shell puts any command line arguments in specially named environment variables in the command's (or script's) environment (and not your login shell's environment).  These variables can be used instead of “hard-coding” names into your script.

The first command line argument is put into “$1”, the second into “$2”, and so on.  (If there are more than nine arguments, use the form “${10}” to refer to the tenth, etc.)  These are called the positional parameters because they are named by the position of the arguments on the command line (a parameter is just an argument with a fancier name).  Technically the actual name is “1” but I prefer to refer to these as “$1”, “$2”, etc.

The positional parameters get reset from the command line arguments every time a shell is started.

Normally you put scripts in a directory listed on PATH and run them as “script arg1 arg2 ...” or as “sh script arg1 arg2 ...”.  To supply command line arguments when running a script with I/O redirection you can use the “-s” option like this:
          sh -s arg1 arg2 ... < script

Each word after the command name becomes one argument, subject to quoting.  So the command line “myscript -a -bc yada "foo bar"” has four arguments:  “-a”, “-bc”, “yada”, and “foo bar”.  The first four positional parameters get set to these (and the rest are unset).

Change your greet script to this:

echo "hello $1!"

Now don't get confused by this!  If you try this command at the keyboard, you'll only see “hello !” (with no name showing)!  This happens because your login shell doesn't have any value set for the environment variable $1.  This should make sense; no command line arguments were given when you logged in and that shell was started.  But when you run a script a new shell is started and any command line arguments will be placed in that shell's environment.  Try running your new script with an argument like “greet Jim”.  Note this won't change the environment of your login shell, just of the shell reading your script file.  So at the end of the script when that shell exits it takes its environment with it.
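A quick way to try this without touching your real greet script is sketched below.  The scratch directory is an assumption, and the script is run as “sh greet ...” so no execute permission is needed:

```shell
scratch=$(mktemp -d)

# The one-line greet script from above.
printf '%s\n' 'echo "hello $1!"' > "$scratch/greet"

sh "$scratch/greet" Jim    # prints: hello Jim!
sh "$scratch/greet" Kim    # prints: hello Kim!
sh "$scratch/greet"        # prints: hello !   (no argument, so $1 is empty)
```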

When the shell starts up a number of other related environment variables will be set based on the command line arguments.  Some of the more useful include:

$0
Expands to the command’s (path)name.  When sourcing a script, “$0” will be unchanged (e.g., “-bash”).  (With Bash you can use “$BASH_SOURCE” instead, which is set correctly in either case.)
$#
This variable is set to the number of command line arguments provided.  For your login shell it is probably zero.  As will be seen later you can use this value to test in a script if the user supplied any arguments.
$*
This gets set to a string of all the command line arguments.  It is roughly equivalent to “$1 $2 $3 $4 $5 $6 $7 $8 $9”.  $* can be useful when you don't know how many words the user has supplied on the command line.  For example what happens when you run “greet Hymie Piffl”?  If you replace the “$1” in your greet script with “$1 $2”, running “greet J. R. R. Tolkien” will greet “J. R.” (a character from an old soap-opera and not a famous author).  Or if two friends walk by together and you try to greet them with “greet Dick and Jane”?

Using “$1 $2 $3 $4” won't always work either.  What happens when Cher walks by and you wish to greet her?  Using “$*” will work in all cases.
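To see “$#” and “$*” at work, here is a sketch of a variant of greet that also reports how many words it was given.  The leading count is added only for illustration and is not part of the original script:

```shell
scratch=$(mktemp -d)

# A variant of greet that prints the argument count, then all the words.
printf '%s\n' 'echo "$#: hello $*!"' > "$scratch/greet"

one=$(sh "$scratch/greet" Cher)                # 1: hello Cher!
four=$(sh "$scratch/greet" J. R. R. Tolkien)   # 4: hello J. R. R. Tolkien!
printf '%s\n' "$one" "$four"
```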

$@
This variable is exactly the same as “$*” with one exception.  The two behave differently when used inside double-quotes:

"$*"   becomes   "$1 $2 $3 $4 ...", while
"$@"   becomes   "$1" "$2" "$3" "$4" ...

The difference is rarely important but sometimes it is.  For example, suppose you need to use grep inside a shell script to search for the current user's name inside of one or more files listed on the command line.  Inside the script you might have the line “grep "$USER" "$*"”.  This won't work correctly if two or more files are listed.  And not using quotes at all won't work if the filenames contain any spaces.  Using “"$@"” instead will solve this problem.
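The difference can be made visible by counting how many words each form produces.  In this sketch “count_args” is a made-up helper function, the two filenames are invented, and the positional parameters are set with the set command described just below:

```shell
# count_args is a hypothetical helper: it reports how many arguments it got.
count_args() { echo "$#"; }

# Pretend the script was run with two filenames, one containing a space.
set -- "my file.txt" notes.txt

star_quoted=$(count_args "$*")   # 1: a single word, "my file.txt notes.txt"
at_quoted=$(count_args "$@")     # 2: each argument stays a separate word
unquoted=$(count_args $*)        # 3: the space inside the first name splits it
echo "$star_quoted $at_quoted $unquoted"    # prints: 1 2 3
```

Only the “"$@"” form hands grep (or any other command) exactly the filenames the user typed.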

It is possible to reset the positional parameters in the current shell's environment by using the set command:

set -- this is a test
echo $#;            # displays: 4
echo $1             # displays: this
echo "$*"           # displays: this is a test
echo "$@"           # displays: this is a test
echo $3 $4 $1 $2    # displays: a test this is  (Yoda-speak)

The “--” in the set statement is not strictly needed but it's a good idea.  It prevents errors when the first command line argument starts with a dash.  (You can use this with other commands too, such as rm.)  Using “set -- args” from the command line will set the positional parameters for your login shell.  This means you can use the exact same commands at the command line prompt as you will eventually use in a script.
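A short sketch of why the “--” matters: here the first word happens to look like an option.  The words themselves are arbitrary:

```shell
# With "--", a first word that starts with a dash is still just data.
# Without it, "set -x ..." would turn on the shell's -x option instead.
set -- -x marks the spot

first=$1     # -x
count=$#     # 4
echo "$count words, starting with $first"   # prints: 4 words, starting with -x
```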

Command Substitution

Command Substitution is commonly used in shell scripts with expr and other utilities.  It allows one command's output to be included as a command line argument to another:

outer-command arg ... `inner-command arg ...` arg ...

The inner command runs first.  Then the outer command is run, with its command line modified to include the output of the inner command, substituted for that command:

outer-command arg ... output of inner-command arg ...

The inner command is surrounded by a pair of back-quotes, also known as the back-tick or grave accent character.  Since this can be hard to see (with many fonts it looks like a single quote a.k.a. apostrophe), POSIX also allows the inner command to be inside of “$(command)”.  Here are some simple illustrations:

echo today is $(date)!
echo today is `date`!
echo There are $(wc -l < /etc/passwd) user accounts.
sum=$(expr 2 + 2)

When used with the set command, the output of a command can be parsed into words.  This can be an effective way to access only a part of some command's output (if you don't care that the positional parameters get reset in the process):

set -- `date`
echo today is $1, $2 $3 $6.

Using this technique you can write a shell script that prints only the name and size of a file whose name is supplied as a command line argument.  The main part of this script will look something like this:

set -- `ls -l "$1"`
echo file \"$9\" contains $5 bytes.

The number of fields used for the date and time in the “ls -l” output will be two or three fields depending on the locale.  With “LC_TIME=POSIX” the date and time will take three fields.  For “LC_TIME=en_US” on Linux only two fields are used, so change “$9” above to “$8”.
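If the varying field count is a nuisance, one alternative (not the method used above) is to parse the output of “wc -c”, which always prints the byte count followed by the filename.  This sketch assumes a sample file in a scratch directory whose pathname contains no spaces:

```shell
scratch=$(mktemp -d)
printf 'hello\n' > "$scratch/sample.txt"    # six bytes, counting the newline

# wc -c prints "<bytes> <name>"; set -- splits that into $1 and $2.
set -- $(wc -c "$scratch/sample.txt")
bytes=$1
name=$2
echo "file \"$name\" contains $bytes bytes."
```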

The Colon Command

The colon command (“:”) is a shell built-in command that does nothing.  It is sometimes referred to as the no-op command (for no operation).  In some ancient shells there was no comment character, so you could use the colon instead:

: this is ignored

Today the only use of this command is to do nothing while making the shell evaluate the arguments (or as a place-holder when some command is required by the syntax of the shell).  This is useful with arithmetic expansion, which is discussed next.

Simple Arithmetic

There is sometimes a need to perform some simple math in a shell script.  You might want to count things, such as the total number of files/lines/users/whatever processed.  In this case you would like to add one to a variable (that is initialized to zero) for each item processed.  You might have a need to calculate a date 90 days in the future.  You might have a script that produces reports, which have columns that need totaling or averaging.  An interactive script might need to display a line number as a prompt.  Or you might be creating a script that plays a game and you need to translate a number from 1 to 52 to a suit (divide by 4) and rank (divide by 13).

The original Bourne shell contained no ability to do math.  However there are standard utilities such as expr for simple integer math, and other more powerful utilities such as bc for more complex calculations.  (I rarely will create a shell script when the task involves complex math, and prefer Perl or even a compiled language such as “C”.)

Most utilities that handle floating-point math default to either 0 or 14 decimal digits, so it pays to know how to control the format.  In these examples the output of 2÷3 (or any expression) is either rounded or truncated to 2 decimal places.  I generally prefer using Perl, however Perl isn't part of POSIX.  (Neither is the “-q” argument to bc, but it is needed with the GNU version.)

perl -e 'printf "%4.2f\n", 2/3'    # output: 0.67
perl -e 'printf "%.2f\n", 2/3'     # output: 0.67
awk 'BEGIN{printf "%4.2f\n", 2/3}' # output: 0.67
echo 'scale=2; 2/3' |bc -q         # output: .66
dc -e '2k 2 3 / p'  # (dc uses RPN)  output: .66

POSIX has standardized some simple integer-only math in the shell, and modern shells have added extra math functions as well.  Also note that Perl, AWK, and bc include some standard math functions such as square-root, sine, cosine, etc.

Some system administration scripts don't use the POSIX shell arithmetic and often use expr instead.  (Perhaps because the script writers can't depend on the users to have a POSIX shell; older systems may only have Bourne or C Shell.)  To be able to understand and modify such scripts, you should become familiar with both ways of performing basic math calculations.
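For reference, here is a sketch of the expr style you are likely to meet in such scripts.  The variable names are arbitrary.  Note that each operand and operator must be a separate word, and that “*” must be escaped so the shell doesn't treat it as a wildcard:

```shell
sum=$(expr 2 + 2)            # 4
product=$(expr 5 \* 2)       # 10 (the * is escaped from the shell)
quotient=$(expr 5 / 2)       # 2  (integer division)
remainder=$(expr 15 % 4)     # 3

# The traditional way to add one to a counter:
count=0
count=$(expr $count + 1)     # 1
echo "$sum $product $quotient $remainder $count"   # prints: 4 10 2 3 1
```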

Using arithmetic expansion allows you to include the results of a calculation as an argument to some command:

echo "3 + 4 = $((3+4))"  # shows: 3 + 4 = 7

You can use environment variables inside the expression.  The ones that contain numbers don't even need a dollar sign:

num=8
echo $((num + 7))   # shows: 15
echo $(($num + 7))  # the same
echo $((num+7))     # spaces are optional
sum=$(( 3 + 4 ))
echo 3 + 4 = $sum   # shows: 3 + 4 = 7

It is also possible to assign a value to an environment variable within the expression.  This is called an assignment expression (note the single “=”):

echo $((num = 2 + 3))  # shows: 5
echo $num              # shows: 5

To assign a variable without doing anything else you can use the colon command.  This works because the colon and the arithmetic expansion are evaluated in the current environment.  Let's look at some examples, showing the basic math operators.  Some of the results may appear strange if you're not used to “computer math”:

: $((w = 2 + 3)); echo $w  # shows: 5
: $((x = w - 1)); echo $x  # shows: 4
: $((w = 5 + 2)); echo $w  # shows: 7
echo $x                    # shows: 4  (x keeps its old value even though w changed!)

: $((y = 5 * 2)); echo $y  # shows: 10 (multiplication)
: $((z = 5 / 2)); echo $z  # shows: 2 (the quotient only), not 2.5!
echo $(( 2 / 3))           # shows: 0 (no rounding!)
echo $((15 % 4))           # shows: 3 ("%" is the remainder or modulus)

echo $(( 2 + 3 * 4))       # shows: 14, not 20
echo $(((2+3)*4))          # shows: 20 (but should add some spaces
                           # to make this more readable)

x=4
: $((x = x + 1)); echo $x  # shows: 5! (This is the common way to
                           # add one to a variable.)
num=2
echo $((2 = num))          # shows an error message
echo $((2 == num))         # shows: 1 (which means "true", unlike an exit status)
echo $((num == 3))         # shows: 0 (for "false")

Note that the assignment operator “=” means to calculate the value of the expression on the right and then assign it to the variable on the left.  Unlike normal math, you can't reverse this!  (E.g., “2 + 3 = x” is an error.)  The “==” operator is used for equality testing.  It means to compare the expression on the right with the one on the left.
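
Integer division and remainder are enough for the playing card task mentioned at the start of this section.  The following is only a sketch under one assumed numbering (cards 1 to 52, with 13 ranks in each of 4 suits); the variable names and the numbering scheme are mine, and other mappings work just as well:

```shell
# Assumed numbering: cards 1..52, with 13 ranks in each of 4 suits.
card=40
suit=$(( (card - 1) / 13 ))       # quotient: 0, 1, 2, or 3
rank=$(( (card - 1) % 13 + 1 ))   # remainder, shifted to 1..13
echo "card $card is suit $suit, rank $rank"
```

Subtracting one first keeps card 13 in the first suit and card 14 in the second.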

Bash and some other shells have a built-in let command that works the same way but produces no output, so it's useful only for assignment expressions.  Here's an example:

let 'x = 2 + 3'

Not all shells support POSIX features so using the expr utility may be more portable.  It works very much like arithmetic expansion, only it supports some additional operators.  To save the result in an environment variable you use expr with command substitution.  One difference is that with arithmetic expansion, the expression is treated as if it were double-quoted, so you can use “*” or parentheses without worry.  expr requires you to quote such operators.  Another difference is that expr expects each operator and operand to be a separate command line argument; that means you must use spaces between each item.  Here are a few examples:

expr -- 5 + 2       # shows: 7, returns exit status of 0 (for true)
expr -- '5 + 2'     # error, expression is only 1 argument, not 3
expr -- \( 4 + 3 \) \* 2  # shows: 14, exit status of 0
                          # (Don't forget to quote special characters!)
num=$(expr -- $num + 1) # adds 1 to num
num="-1"
expr $num \* 3          # may cause an error
expr -- $num \* 3       # shows: -3, exit status of 0

That last case explains why you should use the special end-of-options option of “--”.  Without it expr might get confused when the first argument starts with a hyphen.  (Some versions of expr will not get confused by this; it depends on which type of Unix system you have.)
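
Since you are likely to meet both styles, here is a short side-by-side sketch that adds one to a counter twice, once each way.  (The variable name “count” is just for illustration.)

```shell
count=0
count=$(expr -- "$count" + 1)   # the older expr style, with command substitution
count=$((count + 1))            # POSIX arithmetic expansion
echo "$count"
```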

Besides printing out a result, expr returns an exit status you can use with an if statement.  You may see code such as this:

if expr -- $num = 0 >/dev/null
then ...
fi

While you may come across expr in scripts you need to read, I prefer to use arithmetic expansions in scripts I write.  However there is one very useful operator that expr provides and isn't available in arithmetic expansions: “expr -- string : BRE”.  This matches a string of text against a Basic Regular Expression, and prints the number of characters matched (as well as setting an exit status appropriately), or the string matching the first group if the expression contains one.  While you don't need to understand regular expressions in any detail in this course, keep this use in mind; it is handy!  For example:

first_char=$(expr -- "$foo" : "\(.\).*")

will return the first character in some variable named “$foo”.
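
To see both behaviors of the match operator side by side, here is a minimal sketch.  The sample value and the variable name “matched” are made up for this example:

```shell
foo="hello"
matched=$(expr -- "$foo" : 'hel')          # no group: prints how many characters matched
first_char=$(expr -- "$foo" : '\(.\).*')   # with a group: prints what the group matched
echo "$matched $first_char"
```

Note the match always starts at the beginning of the string.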

Exit Status

When you enter commands interactively at the keyboard any errors are usually obvious.  Depending on the results you see you can do one thing or a different thing next.  But when you create a shell script of several commands you can't know what will happen in the future when the script is run.

To help out script writers all commands return an indication if they worked or failed.  Using some features of the shell you can test the results of running commands, and have your script do one thing or another.  This indication is known as an exit status.

The exit status is a small integer.  Commands return zero for success.  Any other exit status (the range is 0 to 255) indicates a failure.  (In the next section you'll see how to use the exit status with the shell's if statement.)  It is common (and recommended) to use “1” to indicate an error since some systems will behave strangely with a negative or out-of-range exit status.  Many standard utilities document the exit status values they might return in their man page, in a section named “exit status”, “return code”, “diagnostics”, or some such name.

The shell puts the exit status of the last command that ran in a special environment variable named “$?”.  Try the following set of commands:

who -e  # an error since who doesn't have an option "-e".
echo $? # shows: 1
echo $? # shows: 0

Can you see why the second echo command prints zero?  It's because at the point in time when the shell is expanding the $?, the last command was the first echo command and no longer the who command.  Since that first echo command worked successfully, $? gets set to 0 (zero) for success.
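
Because $? changes after every command, a script that needs the value more than once usually copies it into a variable right away.  Here is a sketch of that habit; the child shell started with “sh -c” is just a convenient stand-in for any command that fails with a known status:

```shell
sh -c 'exit 3'    # a child shell that does nothing but exit with status 3
status=$?         # save the value before any other command replaces it
echo "the command returned $status"
echo "and echo itself returned $?"
```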

Recall the shell's exit command doesn't really log you out; it exits the shell.  (If that's your login shell then you get logged out automatically.)  Using exit in a shell script just causes the shell reading your script to quit.  You can use “exit N” to exit the script early with N as its exit status, for example “exit 0” or “exit 1”.  (When you are ready to end your session today try to log out using “exit 0”.)

Without an explicit exit command a shell terminates when it hits the end of the script (known as end of file or EOF).  In that case the exit status of the shell is set to the value of $?.
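
You can check this without creating a file by handing a tiny “script” to a child shell with “sh -c”.  In this sketch (the variable names are mine) neither script has an exit command, so each one reports the status of its last command:

```shell
sh -c 'true; false'   # the last command run is false
ended_on_false=$?
sh -c 'false; true'   # the last command run is true
ended_on_true=$?
echo "$ended_on_false $ended_on_true"
```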

Finally, note the exit status of a pipeline is the exit status of the rightmost command, so running “who -e |sort” will show an exit status of 0 (zero) since even though the who command failed the sort command worked (it is not an error to have no data to sort).
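
The standard true and false commands make this easy to try, since they do nothing except succeed or fail.  A minimal sketch, assuming the shell's default pipeline behavior (some shells have options that change it):

```shell
false | true          # the left side fails, the right side succeeds
right_worked=$?
true | false          # the left side succeeds, the right side fails
right_failed=$?
echo "$right_worked $right_failed"
```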

The if Statement

The hard part of creating a script is to think about what might happen when a script is run.  You need to plan ahead for all the possibilities.  This can be especially hard when user input is involved since you can never tell what input (if any) the user will supply.  For example, consider the simple greet script created earlier.  What happens if the user fails to supply any command line arguments at all?  (Fortunately the various startup scripts users and administrators are likely to encounter are usually straightforward.)

When you enter a command at the keyboard you can see the result before deciding what to do next.  In a script you must think ahead and decide in advance what to do next when a command returns an unexpected result.

A powerful tool for this is called an if statement.  Using an if statement you can run some command and test the result, doing one thing or another next depending on that result.  A simple if statement works this way:

if any_command
then
     success_command_1
     success_command_2
     ...
else
     failure_command_1
     failure_command_2
     ...
fi

First the any_command is run normally.  Next that command's exit status is examined.  If 0 (zero for success) then the success commands (the commands following the then keyword) are run in order.  However if the exit status is non-zero, the failure commands (those following the else keyword) are run instead.  Only one set or the other will be run.  The keyword “fi” (which is “if” spelled backwards) marks the end of the if statement.

The else clause is optional.  If omitted and the exit status of any_command is non-zero, the “success” commands are simply skipped.

There is no restriction on the number or type of commands used in the then clause or the else clause.  You can have one or more commands (but not zero; Bash reports a syntax error for an empty clause) and can even use one if statement inside of another.  Here's a trivial example:

if who -e
then
      echo who worked, yea!
else
      echo who failed, shucks!
fi

Commands such as grep and expr are often used for any_command, although the test command (discussed next) is the most commonly used.  Also note that the indenting is optional as far as the shell is concerned.  Extra space makes a complex script easier to read by humans (including your instructor), but it is legal to omit the extra space.  In fact, you can put the whole if statement on one line as long as you remember to follow each command with a terminating semicolon:

if who | grep hpiffl; then echo "user hpiffl is logged in"; fi

(Here I omitted the else clause.)

Having one if statement inside another is called nested statements.  To have a series of if statements where you have multiple tests and only want to run one set of commands (for the first successful command), you can use “elif” clauses:

if command1
then
    success_set_1
elif command2
then
    success_set_2
elif command3
then
    success_set_3
...
else
    failure_set
fi

This is called an “if chain” or “if ladder”.
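
Here is a concrete ladder you can try.  It is only a sketch: the variable “users” holds made-up names and stands in for the output of who, so the result doesn't depend on who is actually logged in:

```shell
users="alice bob"   # stand-in for the output of who
if echo "$users" | grep carol >/dev/null
then
    found="carol"
elif echo "$users" | grep bob >/dev/null
then
    found="bob"
else
    found="nobody"
fi
echo "first match: $found"
```

The first grep fails, so the shell moves on to the elif test; once that succeeds, the else clause is skipped.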

What if you only want to do something if any_command fails?  One way is to have a then clause containing only the colon (the do-nothing) command:

if who | grep hpiffl
then
    :
else
     echo "user hpiffl is NOT logged in!"
fi

However a more elegant approach is to use the logical NOT operator.  The shell allows you to run any command (or pipeline) as “! command”.  This reverses the sense of the exit status of the command.  So the above if statement can be written more clearly as:

if ! who | grep hpiffl
then
     echo "user hpiffl is NOT logged in!"
fi

A common mistake made with if statements is to try to compare two things, like this:

some_command
if "$? = 0"; then ...; fi         # won't work
if "$?" = "0"; then ...; fi       # won't work
if "$USER" = "root"; then ...; fi # won't work
if "$1" = "--help"; then ...; fi  # won't work
if "$#" > "0"; then ...; fi       # REALLY won't work

The problem is an if statement only runs a command and tests the resulting exit status.  The expr command does include some numerical comparison operators and the regular expression matching operator (for string comparisons), however in most cases the test command (discussed next) is easier to use.

The shell supports an alternative syntax for simple if statements that you might come across when reading scripts:

any_command && success_command

has a similar meaning to:

if any_command
then
     success_command
fi

And:

any_command || failure_command

has a similar meaning to:

if ! any_command
then
     failure_command
fi

The shorter versions are often more readable, once you get used to the syntax.  These can be combined:

any_command && success_command || failure_command

(This is not exactly the same as if-then-else-fi, since failure_command also runs when success_command itself fails, but it is commonly used.)
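
A minimal sketch of that difference, with true standing in for any_command and false standing in for a success_command that happens to fail (the variable name “outcome” is mine):

```shell
outcome="nothing ran"
true && false || outcome="failure command ran"
echo "$outcome"
```

With a real if-then-else-fi the else clause would have been skipped here, because the first command succeeded.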

The test Statement

Unlike most commands, the test command doesn't produce any output.  Instead it evaluates an expression given as command line arguments and returns an appropriate exit status: zero if the expression evaluates to true, and one if the expression evaluates to false.  This command is perfect to use with if statements, for example:

if test "$USER" = "root"
then
     ...
fi

All sorts of tests can be done with this command, on strings, numbers, and files.

Like the shell itself, test supports a “logical NOT” operator of “!”.  Although test currently has other logical (Boolean) operators for AND and OR, and allows (quoted) parentheses to be used for grouping, their use is discouraged (there are plans to remove these features from test at some point).  Until a new shell feature is available to support Boolean operators (likely “[[ expression ]]” will be adopted), you should use the shell's Boolean operators, for example:

	test expression1 && test expression2

Without going over all the supported operators and other nuances, here are some examples to illustrate the more common cases.  (Consult the man page for a complete list.) 

test "x$1" = "x-h"      # true if "$1" equals "-h"; safe
test "$USER" = "root"   # true if $USER is root; common but unsafe
test "$USER" != "root"  # false if $USER is root
test ! "$USER" = "root" # the same
! test "$USER" = "root" # the same

test -z "$1"     # true if $1 has zero length
test -n "$1"     # true if $1 has non-zero length
test "$1"        # the same but unsafe

test -r ~/foo.txt    # true if the file exists and is readable
test -x ~/foo.txt    # true if the file exists and is executable
test -d ~/foo        # true if ~/foo is a directory
test ! -r ~/foo.txt  # false if the file exists and is readable
test "-r" = "$foo"   # a string test, not a file test!

test "$#" -eq "0"   # true if $# equals zero
test "$#" -gt "1"   # true if $# is greater than one

# Also -ne (not equal), -lt (less than),
# -le (less than or equal to), and -ge (greater than or equal to)

num="007"; test "$num" -eq 007  # true
num="007"; test "$num" -eq 7    # true
num="007"; test "$num" = "7"    # false
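
The numeric operators combine nicely with the shell's “&&” operator.  As a sketch, here is one way to check that a number lies within a range; the limits 1 and 10 and the variable names are arbitrary choices for this example:

```shell
num=7
if test "$num" -ge 1 && test "$num" -le 10
then
    in_range="yes"
else
    in_range="no"
fi
echo "$num in range 1..10? $in_range"
```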

A common mistake is to use “>” for “greater than”, rather than “-gt”.  “>” and “<” are the shell's redirection characters.

It is important to quote environment variables, since if one isn't set at all test will not see anything and will report an error.  (For example “unset name; test $name = root”.  Quoting prevents this error since empty quotes denote a zero length string, which isn't the same thing as no string at all.)  Of course in some cases it is safe to omit the quotes, if you know the value could never confuse test.  However rather than worry about when it is safe and when it isn't, a good practice is to always use quotes for variables and even numbers and words.  (I know of several script writers who also advocate using “${name}” instead of “$name”, because it is safer.)
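
You can see the difference quoting makes by comparing exit statuses.  This is a sketch using a string comparison on an unset variable (the names are mine); the error message is discarded, and the exact error status may vary by shell, though it should be greater than one:

```shell
unset name
test $name = "root" 2>/dev/null    # test sees only two arguments: = and root
unquoted_status=$?
test "$name" = "root"              # test sees three arguments: "", =, and root
quoted_status=$?
echo "unquoted: $unquoted_status, quoted: $quoted_status"
```

The quoted version simply reports “false”, which is what you want when the variable is empty.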

Many programmers are used to using parentheses with an if statement to indicate the test.  While you can't use regular parentheses for this in the shell language (since they have another use, to indicate a sub-shell) you can use square brackets.  So the following are equivalent:

test EXPRESSION
[ EXPRESSION ]

Spaces around the brackets and operators such as “=” and “-r” are required!  It is a common error to forget the spaces and try to use something such as “if ["$#"="0"]”.
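
With test (or its bracket form) available, the broken comparisons shown in the section on the if statement can be written correctly.  This is only a sketch: “set --” is used here to fake two command line arguments so the example can be tried at the keyboard, and the variable names are mine:

```shell
set -- --help extra     # pretend the script was run with two arguments
if [ "$1" = "--help" ]; then asked_for_help="yes"; else asked_for_help="no"; fi
if [ "$#" -gt "0" ]; then have_args="yes"; else have_args="no"; fi
true                    # stand-in for some_command
if [ "$?" -eq "0" ]; then last_worked="yes"; else last_worked="no"; fi
echo "$asked_for_help $have_args $last_worked"
```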

Combining all these commands and shell features can lead to some elegant (some would say cryptic and confusing!) statements.  Many login and startup scripts contain statements such as the following; see if you can understand what it does:

[ -r ~/.bashrc ] && . ~/.bashrc

The first part tests if the file .bashrc exists in your home directory and is readable.  If so the second part runs, which sources that file.  (This line is commonly seen in Bash login scripts.)

Think of all the uses for an if statement with test.

Now try to use some of these features:  Modify the greet script described previously (see Positional Parameters) so that if no command line arguments are given (that is, no name to greet), print an error message to standard error and exit with an error status.

greet.sh:
#!/bin/sh -
# Script to display a friendly greeting to the
# person named on the command line.
# Note how the error message was redirected
# to the standard error.
#
# Written 3/2008 by Wayne Pollock, Tampa FL USA
#
# Version history:
#   3/22/2008 - initial version
#   7/18/2008 - added check for arguments;
#     if none, display an error message and quit

if [ "$#" -eq "0" ]
then
   echo 'Error: you need to provide a name' >&2
   exit 1
fi

echo "Hello $*!"

Debugging Shell Scripts

When a script longer than a few lines has an error, it may be difficult to pinpoint the problematic statement.  The shell supports a “debug” mode, where it echoes each command just before running it.  The command is marked in the output with the special PS4 prompt (which defaults to “+ ”).  Also, all expansions are performed first, so you can see the results after substituting variables, commands, etc.

Add the command  “set -x” near the top, or at least before the part of the script you suspect has the problem.  (You can turn this mode off with the command  “set +x”.)
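
To see what the trace looks like without editing a script, you can run a few commands in a child shell.  This is a sketch; the exact trace format (quoting in particular) differs a little from shell to shell, so no output is shown here:

```shell
# The child shell traces each command to standard error; 2>&1 lets
# command substitution capture the trace together with the normal output.
trace=$(sh -c 'set -x; name=World; echo "Hello $name"' 2>&1)
echo "$trace"
```

Each traced line starts with the PS4 prompt, and the echo command appears with $name already replaced by its value.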

Summary

  1. Shell scripts are used for login, system startup, administration, and to save a solution to a complex task so it need not be solved each time.
  2. Anything that can be done at the keyboard can be done in a script, and vice-versa.
  3. The only differences between running commands from a script and from the keyboard (interactively) are that history expansion is not done, and no prompts are displayed.
  4. Start creating scripts by first solving the task at the keyboard.  The commands can later be saved in a script file.
  5. Shell (or other) scripts are not machine code but plain text files.  An interpreter is actually started, which then reads the commands from the script file.
  6. To run any script the file must be readable.  Marking it as executable too is just a convenience.
  7. If the script is in a directory not listed on your PATH (or if it has the same name as a shell built-in), you will need to enter a pathname to run the script.  For a script in the current directory use “./script”.
  8. To run a script in the current shell (to be able to change the environment for instance) you must source a script with the dot (“. script”) command.  Many shells provide “source” as a synonym for the dot command.
  9. A word starting with an unquoted “#” starts a comment, which extends to the end of that line.
  10. All well written scripts will contain some comments, usually including a few lines near the top with an overview and author information.  Any potentially confusing commands in the script should also have comments, usually on the end of that line if there is room, or on the line above.
  11. To determine which shell or other interpreter to run you can have a she-bang line at the top.  This special comment line contains the absolute pathname of the interpreter and optionally a single word for an argument.  The line starts with “#!”.
  12. Any command line arguments (including options) supplied when the script is run are passed into the script's environment, in variables called the positional parameters.
  13. In addition to the numbered positional parameters (“$1”, “$2” ...) the shell will set additional related variables including “$*”, “$@”, and “$#”.
  14. One command line can be embedded in another.  The inner command runs first, and is replaced with its output as an argument in the outer command line.  This is called command substitution.
  15. The colon command (“:”) is a do nothing command, useful to have arguments expanded (e.g., arithmetic expansions) and as a place-holder.
  16. The shell supports arithmetic expansions of “$(( expression ))”.  Non-POSIX shells can use the expr command.  Either way only integer arithmetic is supported.  For more complex math use a utility such as bc, dc, awk, or perl.
  17. All commands report an exit status back to their parent shell.  The exit status is zero for success or some other value for failure.  Status values are in the range 0 to 255.  Generally “1” is used as the error exit status value.
  18. An if statement can be used to do one thing or another, depending on the exit status of some command.
  19. The test command (and the expr command) can be used to test some expression involving numbers, strings, or files.
  20. Error messages and prompts can be redirected to standard error (“stderr”) using code like this:  “echo 'message' >&2”.
  21. The command set -x turns on the shell's debugging mode.  It can be turned off with set +x.