Shell Scripting Course

Simple Shell Scripting for Scientists
Day Two
Anna Langley
Ben Harris
University of Cambridge Information Services

Introduction
§ Anna Langley, Infrastructure Division, UIS
§ Ben Harris, Infrastructure Division, UIS
• What:
§ Simple Shell Scripting for Scientists course, Day Two
§ Part of the Scientific Computing series of courses
• Contact (questions, etc):
§ scientific-computing@uis.cam.ac.uk
• Health & Safety, etc:
§ Fire exits
• Please use mobiles considerately
• Differences between versions of bash
• Very advanced shell scripting – try
one of these courses instead:
§ “Python 3: Introduction for Absolute Beginners”
§ “Python 3: Introduction for Those with
Programming Experience”
scientific-computing@uis.cam.ac.uk Simple Shell Scripting for Scientists: Day Two 3 scientific-computing@uis.cam.ac.uk Simple Shell Scripting for Scientists: Day Two 4
The course officially finishes at 17.00, so don't expect to finish before then. If you need to leave before 17.00 you are free to do so, but don’t expect us to have covered all today's material by then. How quickly we get through the material varies depending on the composition of the class, so whilst we may finish early you should not assume that we will. If you do have to leave early, please leave quietly.

If, and only if, you will not be attending the next day of the course then please make sure that you fill in the Course Review form online, accessible under “feedback” on the main MCS Linux menu, or via:
http://feedback.training.cam.ac.uk/uis/

bash is the most common shell on modern Unix/Linux systems – in fact, on most modern Linux distributions it will be the default shell (the shell users get if they don’t specify a different one). Its home page on the WWW is at:
https://www.gnu.org/software/bash/

We will be using bash 4.4 in this course, but everything we do should work in bash 2.05 and later. Version 4, version 3 and version 2.05 (or 2.05a or 2.05b) are the versions of bash in most widespread use at present. Most recent Linux distributions will have one of these versions of bash as one of their standard packages. The latest version of bash (at the time of writing) is bash 5.0, which was released in January 2019.

For details of the “Python 3: Introduction for Absolute Beginners” course, see:
https://www.training.cam.ac.uk/ucs/course/ucs-python
For details of the “Python 3: Introduction for Those with Programming Experience” course, see:
https://www.training.cam.ac.uk/ucs/course/ucs-python4progs
Version: Lent 2020 3 Version: Lent 2020 4
Outline of Course
1. Recap of day one
2. Shell functions
SHORT BREAK
3. Command substitution
4. The mktemp command
VERY SHORT BREAK
5. Handling data from standard input
§ Reading values from standard input
§ Pipelines
§ Loop constructs: while
SHORT BREAK
6. More while loops:
§ Shell arithmetic
§ Tests
Exercise
The course officially finishes at 17.00, but the intention is that the lectured part of the course will be finished by about 16.30 or soon after, and the remaining time is for you to attempt an exercise that will be provided. If you need to leave before 17.00 (or even before 16.30), please do so, but don’t expect the course to have finished before then. If you do have to leave early, please leave quietly.

If, and only if, you will not be attending the next day of the course then please make sure that you fill in the Course Review form online, accessible under “feedback” on the main MCS Linux menu, or via:
http://feedback.training.cam.ac.uk/uis/

As this is a shell scripting course, we are going to need to interact with the Unix shell.
To start a shell, click on “Activities” in the top-left corner of the screen, then click on the “Terminal” icon in the desktop application bar.
A Terminal window will then appear.
Recap: Day One
• Simple shell scripts: linear lists of commands
• Simple use of shell variables and parameters
• Simple command line processing

Recap: What is a shell script?
• Text file containing commands understood by the shell
• Very first line is special:
#!/bin/bash

• Just the commands you’d type interactively put into a file
• Simplest shell scripts you’ll write

Shell parameters are special variables set by the shell:
§ Positional parameter 0 holds the name of the shell script
§ Positional parameter 1 holds the first argument passed to the script; positional parameter 2 holds the second argument passed to the script, etc
§ Special parameter @ expands to values of all positional parameters (starting from 1)
§ Special parameter # expands to the number of positional parameters (not including 0)
We create shell variables by simply assigning them a value (as above for the shell variable VAR). We can
access the value of a shell variable using the construct $VARIABLE where VARIABLE is the name of the shell
variable. Note that there are no spaces between the name of the variable, the equal sign (=) and the variable’s
value in double quotes. This is very important as whitespace (spaces, tabs, etc) is significant in the names and
values of shell variables.
Also note that although we can assign the value of one shell variable to another shell variable, e.g.
VAR1=${VAR}, the two shell variables are in fact completely separate from each other, i.e. each shell variable can be
changed independently of the other. Changing the value of one will not affect the other. So VAR1 (in this
example) is not a “pointer” to or an “alias” for VAR.
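As a small illustration of the point above, here is a minimal sketch. The values "argon" and "hydrogen" are arbitrary and chosen only for this example; VAR and VAR1 are the names used in the notes.

```shell
#!/bin/bash
VAR="argon"        # no spaces around the equals sign
VAR1="${VAR}"      # VAR1 gets a copy of the current value of VAR
VAR="hydrogen"     # changing VAR afterwards does not affect VAR1
echo "VAR is ${VAR}"
echo "VAR1 is ${VAR1}"
```

Running this should show that VAR1 still holds the value VAR had at the moment of the assignment.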
Shell parameters are special variables set by the shell. Many of them cannot be modified, or cannot be directly
modified, by the user or by a shell script. Amongst the most important parameters are the positional parameters
and the other shell parameters associated with them.
The positional parameters are set to the arguments that were given to the shell script when it was started, with
the exception of positional parameter 0, which is set to the name of the shell script. So, if myscript is a shell
script, and I ran it by typing:
./myscript argon hydrogen mercury
then positional parameter 0 = ./myscript
1 = argon
2 = hydrogen
3 = mercury
and all the other positional parameters are not set.
The special parameter @ is set to the value of all the positional parameters, starting from the first parameter,
passed to the shell script, each value being separated from the previous one by a space. You access the value
of this parameter using the construct ${@}. If you access it in double quotes – as in "${@}" – then the shell
will treat each of the positional parameters as a separate word (which is what you normally want).
The special parameter # is set to the number of positional parameters not counting positional parameter 0.
Thus it is set to the number of arguments passed to the shell script, i.e. the number of arguments on the
command line when the shell script was run.
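The following is a minimal sketch of these parameters in use. It is not the course's params script. To keep it self-contained it uses the shell builtin "set --" to fill in the positional parameters, as a stand-in for running a script with the arguments argon, hydrogen and mercury.

```shell
#!/bin/bash
# Illustrative only: "set --" replaces the positional parameters, standing in
# for running a script as: ./myscript argon hydrogen mercury
set -- argon hydrogen mercury

echo "Number of arguments: ${#}"
echo "First argument: ${1}"
echo "Second argument: ${2}"

# Quoted "${@}" keeps each argument as a separate word.
for arg in "${@}" ; do
    echo "Argument: ${arg}"
done
```

In a real script you would normally leave out the "set --" line and let the shell set the positional parameters from the command line.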
In the examples subdirectory of your home directory there is a script called params. If you run this script with some command line arguments it will illustrate how the positional parameters and related shell parameters work. Note that even if you type exactly the command line on the slide above your output will probably be different as the script will be in a different place for each user.

The positional parameter 0 is the name of the shell script (it is the name of the command that was given to execute the shell script).

The positional parameter 1 contains the first argument passed to the shell script, the positional parameter 2 contains the second argument passed and so on.

The special parameter # contains the number of arguments that have been passed to the shell script. The special parameter @ contains all the arguments that have been passed to the shell script.

We can repeat a set of commands using a for loop. A for loop repeats a set of commands once for each value in a collection of values it has been given. We use a for loop like this:
for VARIABLE in <collection of values> ; do
    <some commands>
done

where <collection of values> is a set of one or more values (strings of characters). Each time the for loop is executed the shell variable VARIABLE is set to the next value in <collection of values>. The two most common ways of specifying this set of values are by putting them in another shell variable and then using the ${} construct to get its value (note that this should not be in quotation marks), or by using a wildcard or file name glob (e.g. *) to specify a collection of file names (pathname expansion). <some commands> is a list of one or more commands to be executed.

Note that you can put the do on a separate line, in which case you can omit the semi-colon (;):
for VARIABLE in <collection of values>
do
    <some commands>
done

There are some examples of how to use it in the for1 and for2 scripts in the examples directory of your home directory. Note that a for loop can contain another for loop (the technical term for this is nesting).
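Here is a minimal sketch of a for loop over values held in a shell variable, followed by a nested loop. It is not one of the for1 or for2 scripts, and the variable names and values are made up for illustration.

```shell
#!/bin/bash
# Illustrative values only.
elements="argon hydrogen mercury"

# Unquoted ${elements} is split on whitespace into three separate values.
for element in ${elements} ; do
    echo "Element: ${element}"
done

# A nested loop: the inner loop runs in full for each value of the outer loop.
pairs=""
for letter in a b ; do
    for number in 1 2 ; do
        pairs="${pairs}${letter}${number} "
    done
done
echo "Pairs: ${pairs}"
```

If ${elements} were put in double quotes in the first loop, the loop would run only once, with all three names as a single value.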
Recall the name of this course (“Simple Shell Scripting for Scientists”) and its purpose: to teach you, the scientist, how to write shell scripts that will be useful for your scientific work.

As mentioned on the previous day of the course, one of the most common (and best) uses of shell scripts is for automating repetitive tasks. Apart from the sheer tediousness of typing the same commands over and over again, this is exactly the sort of thing that human beings aren’t very good at: the very fact that the task is repetitive increases the likelihood we’ll make a mistake (and not even notice at the time). So it’s much better to write (once) – and test – a shell script to do it for us. Doing it via a shell script also makes it easy to reproduce and record what we’ve done, two very important aspects of any scientific endeavour.

So, the aim of this course is to equip you with the knowledge and skill you need to write shell scripts that will let you run some program (e.g. a simulation or data analysis program) over and over again with different input data and organise the output sensibly.

The zombie program is in your home directory. It is a program written specially for this course, but we’ll be using it as an example program for pretty general tasks you might want to do with many different programs. Think of zombie as just some program that takes some input on the command line and then produces some output (on the screen, or in one or more files, or both), e.g. a scientific simulation or data analysis program.

The zombie program takes 5 numeric arguments on the command line: 4 positive floating-point numbers and 1 positive integer. It always writes its output to a file called zombie.dat in the current working directory, and also writes some informational messages to the screen.

The zombie program is not as well behaved as we might like (which, sadly, is also typical of many programs you will run). The particular way that zombie is not well behaved is this: every time it runs it creates a file called running-zombie in the current directory, and it will not run if this file is already there (because it thinks that means it is already running). Unfortunately, it doesn’t remove this file when it has finished running, so we have to do it manually if we want to run it multiple times in the same directory.
The zombie program uses a variant of the SIR model from epidemiology
to simulate an outbreak of a zombie infection in a closed (i.e. no one
enters or leaves) population. Obviously, since zombies don’t actually
exist, it would be a mistake to try and take this program too seriously. You
should think of zombie as just a program that takes some input on the
command line and then produces some output on the screen and in a file,
and whose output can then be fed to yet other programs for further
processing (as we’ll see later this afternoon).
However, as it happens, the program is based on actual academic
modelling of the spread of disease, as found in Chapter 4 (pp. 133-150) of
Infectious Disease Modelling Research Progress (2009), which is entitled
“When Zombies Attack!: Mathematical Modelling of an Outbreak of Zombie
Infection”, and which you can find here:
http://mysite.science.uottawa.ca/rsmith43/zombies.pdf
And in case you are interested in the book from which that chapter is taken, the ISBN of Infectious Disease Modelling Research Progress is 978-1-60741-347-9; it’s edited by J. M. Tchuenche & C. Chiyaka and published by Nova Science Publishers, Inc.
Note that the zombie program writes its output to a file of numbers rather
than producing graphical output. At the end of this afternoon we will see
how to produce graphs of its output.
We are specifically using the gnuplot program and the output of the zombie program we met on the previous day of the course. (gnuplot is a program that creates graphs, histograms, etc from numeric data.) Think of this task as basically: I have some data sets and I want to process them all in the same way. My processing might produce graphical output, as here, or it might produce more data in some other format.

If you haven’t met gnuplot before, you may wish to look at its WWW page:
http://www.gnuplot.info/

If you think you might want to use the gnuplot program for creating your own graphs, then you may find the “Introduction to Gnuplot” course of interest – the course notes are on-line at:
https://www-uxsup.csx.cam.ac.uk/courses/moved.Gnuplot/

If you want to get an idea of what we’re trying to do, you can try the following:
$ cd
$ scripts/multi-run 500
$ cp gnuplot/zombie.gplt .
$ cp zombie-500.dat zombie.dat
$ ls zombie.png
/bin/ls: zombie.png: No such file or directory
$ gnuplot zombie.gplt
$ rm zombie.dat
$ ls zombie.png
zombie.png
$ eog zombie.png &

Note that the output of “ls zombie.png” may look slightly different – in particular, the colours may be slightly different shades (assuming you are reading these notes in colour).
The exercise set at the end of the previous day of the course was to create a shell script that does the above task. Basically, for each of the .dat files produced by the multi-run script, the shell script should run gnuplot on it to create a graph (which will be stored as a .png file).

The zombie.gplt file provided will only work if the .dat file is called zombie.dat and is in the current directory.

Also, gnuplot should not be allowed to overwrite each .png file, so the shell script must rename each .png file after gnuplot has created it.

So here’s one solution to that exercise. This file (multi-gnuplot1) is in the gnuplot directory.

It takes each file whose name is of the form zombie-<something>.dat (where the <something> can be any set of characters that can appear in a filename) in turn and renames it to zombie.dat, runs gnuplot, then renames the file back to its original name, and renames the zombie.png file to zombie-<something>.dat.png.
# Run gnuplot program once for each output file
for data_file in zombie-*.dat; do
    # Copy output file to zombie.dat
    cp -f "$data_file" zombie.dat

    # Run gnuplot
    gnuplot zombie.gplt

    # Delete zombie.dat file
    rm -f zombie.dat

    # Rename zombie.png
    mv zombie.png "$data_file.png"
done

# Run gnuplot program once for each output file
for data_file in zombie-*.dat ; do
    # Create symbolic link called zombie.dat to output file
    ln -s -f "$data_file" zombie.dat

    # Run gnuplot
    gnuplot zombie.gplt

    # Delete zombie.dat symbolic link
    rm -f zombie.dat

    # Rename zombie.png
    mv zombie.png "${data_file}.png"
done
…and here’s another solution. This file (multi-gnuplot2) is in the gnuplot directory.

It takes each file whose name is of the form zombie-<something>.dat (where the <something> can be any set of characters that can appear in a filename) in turn and copies it to zombie.dat, runs gnuplot, then deletes the copy, and renames the zombie.png file to zombie-<something>.dat.png.

These two shell scripts are functionally equivalent – you can use whichever you like and the results will be identical.

Note that one purely cosmetic difference between them is that one has the do keyword on the same line as the for keyword (with a semi-colon (;) before the do) whilst the other has the do keyword on a separate line (and no semi-colon). Some people feel that it makes scripts more readable to put the do on a separate line.

However, whether you put the do on the same line as the for (and use the semi-colon) or put it on a different line is entirely a matter of style and personal preference – well, you want some outlet for your individuality, don’t you? ☺

…and here’s yet another solution. This file (multi-gnuplot3) is also in the gnuplot directory.

It takes each file whose name is of the form zombie-<something>.dat (where the <something> can be any set of characters that can appear in a filename) in turn and creates a symbolic link to it called zombie.dat, runs gnuplot, then deletes the symbolic link (not the original file), and renames the zombie.png file to zombie-<something>.dat.png.

This shell script is functionally equivalent to the previous two – you can use whichever you like and the results will be identical.

There is, though, one way in which this script is better than the previous two. Since it only creates a symbolic link to each file in turn rather than making a copy of the file (like multi-gnuplot2), it uses considerably less disk space (symbolic links take up almost no space on disk), which can be an issue if the files you are processing are large. Also, since it does not rename the original file (like multi-gnuplot1), if it is interrupted part way through its execution you don’t need to worry about potentially “losing” any output files. If multi-gnuplot1 was interrupted after it had renamed a file to zombie.dat but before it had a chance to rename it back, then, unless the person running it realised this had happened and dealt with it, the zombie.dat file would be deleted next time the script was run(!).
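The multi-gnuplot1 script itself is not reproduced in these notes, so the following is only a sketch of the rename-based approach as it is described, not the actual file. The function name plot_by_renaming and its plot_command argument are hypothetical; plot_command stands in for "gnuplot zombie.gplt". It also uses a file test and the local builtin, which are not covered until later.

```shell
#!/bin/bash
# Sketch only: this is NOT the course's multi-gnuplot1 script.
# plot_command is a hypothetical stand-in for "gnuplot zombie.gplt".
function plot_by_renaming()
{
    local plot_command="${1}"
    local data_file
    for data_file in zombie-*.dat ; do
        # Skip the unexpanded pattern if no files match
        [ -e "${data_file}" ] || continue
        # Temporarily rename the data file to the name the plot step expects
        mv "${data_file}" zombie.dat
        # Deliberately unquoted so a command with arguments is split into words
        ${plot_command}
        # Restore the original name, then give the graph a matching name
        mv zombie.dat "${data_file}"
        mv zombie.png "${data_file}.png"
    done
}
```

Under these assumptions it might be called as plot_by_renaming "gnuplot zombie.gplt". The window between the first and second mv is exactly where an interruption would leave a data file stranded under the name zombie.dat.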
Sample output: zombie-3000.dat.png

Shell functions
$ cd
$ cat hello-function
#!/bin/bash
function greet()
{
# This is a shell function.
echo "Hello."
echo "I am function ${FUNCNAME}."
}
$ ./hello-function
$
You can try out one of these scripts if you want. First, create some output files for the script to process:
$ cd
$ rm -f *.dat stdout-* logfile
$ scripts/multi-run 50 100 500 1000 3000 5000 10000 50000

Now, make sure that the zombie.gplt file is in your current directory:
$ cp gnuplot/zombie.gplt .

Now run one of the scripts, either multi-gnuplot1 or multi-gnuplot2 or multi-gnuplot3, it doesn’t matter which:
$ gnuplot/multi-gnuplot1

Now do an ls to see what files have been created, and then try viewing some of them:
$ eog zombie-50000.dat.png &

Your solutions to this exercise (you did do it, didn’t you?) should have been similar to the ones presented here. If they weren’t, or if you had problems with the exercise, please let the course giver or demonstrator know.

Shell functions are similar to functions in most high-level programming languages. Essentially they are “mini-shell scripts” (or bits of shell scripts) that are invoked (called) by the main shell script to perform one or more tasks. When called they can be passed arguments (parameters), as we will see later, and when they are finished they return control to the statement in the shell script immediately after they were called.

To define a function, you just write the following at the start of the function:
function function_name()
{
where function_name is the name of the function. Then, after the last line of the function you put a line with just a closing curly brace (}) on it:
}

Note that unlike function definitions in most high level languages you don’t list what parameters (arguments) the function takes. This is not so surprising when you remember that shell functions are like “mini-shell scripts” – you don’t explicitly define what arguments a shell script takes either.

Like functions in a high-level programming language, defining a shell function doesn’t actually make the shell script do anything – the function has to be called by another part of the shell script before it will actually do anything.

FUNCNAME is a special shell variable (introduced in version 2.04 of bash) that the shell sets within a function to the name of that function. When not within a function, the variable is unset.
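To make the shape of a function definition concrete, here is a minimal sketch. The function name report_name is made up for this example and is not part of the course files.

```shell
#!/bin/bash
# report_name is a hypothetical function name used only for illustration.
function report_name()
{
    # FUNCNAME holds the name of the function that is currently running.
    echo "Inside function: ${FUNCNAME}"
}

# Defining the function above printed nothing; it only runs when called.
report_name
```

Note that the call on the last line has no brackets after the function name.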
greet
$ ./hello-function
Hello.
I am function greet.
$

greet
$ ./hello-function Dave
Hello,
I am function greet.
$
Start your favourite editor (or gedit if you don’t have a preference) and modify the file hello-function in your home directory as shown above. Make sure you save the file after you’ve modified it or your changes won’t take effect.

You call a shell function by just giving its name (just as you would with any of the standard Unix commands (or shell builtin commands) that we’ve met). Note that you don’t put brackets after the name of the function when you call it. You only do that when you first define the function. That’s one of the ways that the shell figures out that you are trying to define a shell function.

Modify the file hello-function in your home directory as shown above. Make sure you save the file after you’ve modified it or your changes won’t take effect.

Apparently not. Maybe something’s wrong with our shell script? Maybe positional parameter 1 isn’t being set correctly? Let’s try some debugging and see.
Modify the file hello-function in your home directory as shown above. Make sure you save the file after you’ve modified it or your changes won’t take effect.

This is a simple but useful debugging trick for shell scripts. When something isn’t working right, make the shell script print out the values of all the shell variables, environment variables or shell parameters that you are interested in just before the point where you think it is going wrong.

In this case, what this shows us is that positional parameter 1 is being set correctly. So that’s not the problem.

The problem is that within a function the positional parameters (from 1 onward, 0 doesn’t change) are set to the arguments that the function was given when it was called. (Similarly, within a function the special parameters @ and # are set to all the arguments passed to the function, and the number of arguments passed to the function, respectively.) Since we called the function greet without any arguments, while the function greet is executing positional parameter 1 is unset, and so when we try to print its value, nothing is printed.

The way you call a shell function with arguments is to list those arguments immediately after the name of the shell function, e.g. in our script:
greet Dave
would call the function greet with one argument: “Dave”.

Modify the file hello-function in your home directory as shown above. Make sure you save the file after you’ve modified it or your changes won’t take effect.

So, if we call our function with an argument (in this case the argument is “Hal”), then the value of the positional parameter 1 is indeed set to that argument within the function.

So, if we want our function to have the same first argument as the shell script itself, then we need to call the function with the first argument with which the shell script was invoked.

You can probably guess how we do this…
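The behaviour described here can be seen in a minimal sketch like the one below. It reuses the name greet from the notes, but the body of the function is a variant written for this illustration and is not the hello-function file.

```shell
#!/bin/bash
function greet()
{
    # Inside the function, ${1} and ${#} refer to the function's own arguments,
    # not to the arguments the script was started with.
    echo "Hello, ${1}. (${#} argument(s) received)"
}

greet Hal            # one argument given directly
greet "${1}"         # pass on the script's own first argument
greet                # no arguments: ${1} is unset inside the function
```

The second call is one way of giving the function the same first argument as the script itself; what it prints depends on how the script was run.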
Modify the file hello-function in your home directory as shown above. Make sure you save the file after you’ve modified it or your changes won’t take effect.

Note that now we think we’ve cracked it, we can get rid of our debugging effort. We could delete that line, but, if we were wrong, we’d only have to put it back in again as we tried to figure it out. So it is easier to just comment it out by inserting a hash character (#) at the start of the line – recall that the shell treats everything after a hash at the start of a line as a comment.

If you’re familiar with computer programming, you’ll probably have already come across the concept of functions in whatever programming languages you are familiar with. The advantages of using shell functions are basically the same as the advantages of using functions in a programming language, as you can probably tell from the slide above.
If you’ve implemented your shell script entirely as shell functions, there is a really nice trick you can use when something goes wrong and you need to debug your script, or if you want to re-use some of those functions in another script.

As you’ve implemented the script entirely as a series of functions, you have to call one of those functions to start the script actually doing anything. For the purposes of this discussion, let’s call that function main. So your script looks something like that shown on the slide above. (You can see an example of a script like this in the examples directory in the file function-script.)

By commenting out the call to the main function, you now have a shell script that does nothing except define some functions. You can now easily call the function(s) you want to debug/use from another shell script using the source shell builtin command (as we’ll see on the optional final day of this course). This makes debugging much easier than it otherwise might be, even of really long and complex scripts.

On the previous day of this course we met the scripts multi-run and run-once. Together, these scripts gave us a nice way of running a program several times with different parameter sets. However, they are not as versatile as we might hope. run-once requires that the program it runs (zombie) be in the current directory. Since, in this example, zombie is a special program for us (imagine it were your program that you had written from scratch), that’s not such a bad limitation, since we quite probably would have a working copy of the program in the directory where we were going to store its output.

multi-run, on the other hand, depends on the run-once script, and has the location of that script hard-coded into it. If we move the run-once script for some reason, then multi-run will immediately stop working. Wouldn’t it be nice if we could somehow avoid this problem, but still keep the functionality of the two scripts somewhat separate?

One way of doing exactly that would be to incorporate run-once into multi-run as a shell function. That should be quite easy. We define a function in multi-run that does exactly the same thing as the run-once script, and in our for loop, instead of calling the run-once script, we call our function.

So, let’s do that and see what happens.
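A minimal sketch combining the two ideas in these notes, a script written entirely as functions and a helper function called from a for loop, might look like this. It is not the function-script example or the real multi-run script; the name run_once is hypothetical, and echo stands in for running a real program.

```shell
#!/bin/bash
# Sketch only: run_once is a hypothetical name, and echo stands in for
# running a real program such as zombie.
function run_once()
{
    echo "Would run the program with population ${1}"
}

function main()
{
    for population in "${@}" ; do
        run_once "${population}"
    done
}

# The only line outside a function. Comment it out and the script merely
# defines its functions, ready to be called one at a time for testing.
main "${@}"
```

With the last line commented out, another script could load this file with the source builtin and then call run_once on its own with a test argument.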
The multi-run and run-once shell scripts are in the scripts directory of your home directory.

Your task is to get the functionality of the run-once script into the multi-run script as a shell function. (Hint: it's essentially a cut-and-paste exercise.)

Above I’ve given you the skeleton of what the modified script should look like. You should be able to fill in the rest.

You can check that you’ve done it correctly by trying to run your modified multi-run script (remember to save it after you’ve made your modifications!):
$ cd
$ rm -f *.dat stdout-* logfile
$ ls
$ scripts/multi-run 50 100 500 1000 3000 5000 10000 50000
$ ls

This should be a quick exercise, so when you finish it, take a short break and then we’ll start again with the solution.

One thing worth noticing from the exercise we’ve just done:

The original script had the line:
"${HOME}/scripts/run-once" ${fixed_parameters} "${population}"

The new script has the line:
run_program ${fixed_parameters} "${population}"

Note that the arguments that we are passing have not changed in the slightest. In the original script we were calling another shell script with some arguments. In our new script we are calling a shell function with the same arguments. The syntax for these is almost identical: the main change is the name (and location) of the things being called. See?, I told you shell functions were like “mini-shell scripts”. ☺
run_program $fixed_parameters 80

# Run zombie program once for each argument
# Note: *no* quotes around $fixed_parameters
# or they'll be interpreted as one argument!
#for population in "$@" ; do
#    run_program $fixed_parameters "$population"
#done

$ cd /tmp
$ dir="$(pwd)"
$ echo "I will use directory: $dir"
I will use directory: /tmp
One of the advantages of writing a shell script using shell functions should be immediately apparent. The main body of this shell script – the for loop – is nice and simple. It just calls a function over and over varying one parameter each time. Because we’ve hidden the commands that do the real work in a shell function, we can see this immediately just by looking at the script.

If we’d put all the lines in the run_program function in the for loop it would have obscured the script’s structure, and we might have spent a lot of time trying to figure out what the individual lines of script did before realising what was going on. It also helps that we’ve chosen a meaningful name for our shell function. So just by looking at the script we can immediately say “Aha! This script probably runs a program (run_program) several times, varying one of its parameters each time.” (Of course, at this point we’d be taking it on faith that the author of the shell script wasn’t an evil troll who deliberately chose misleading names for his shell functions. Fortunately, most of those spend the majority of their time under bridges harassing goats.)

Another advantage is that we can easily test our shell function by just commenting the other complicating bits of the shell script out (as above) and just running the function once with some test arguments. This is worth doing every time you’ve written a new function (especially if it is complicated) so that you know it behaves the way you expected it to. It also means that you know that, if there is an error, it is not in that part of the shell script (that shell function). That makes it much easier to track down errors.

Command substitution is the process whereby the shell runs a command and substitutes the command’s output for wherever the command originally appeared (in a shell script or on the command line).

So, for example, the following line in a shell script:
starting_directory="$(pwd)"
would set the shell variable starting_directory to the full path of the current working directory. (We don’t have to surround the $(pwd) in quotes, but it is a good idea: the path may contain spaces.) This is how it works:

1. The shell runs the pwd command. The pwd command prints out the full path of the current working directory, i.e. its output is the full path of the current working directory. Let’s suppose we were in /tmp, so the output of the pwd command would be “/tmp”.
2. The shell takes this output (“/tmp”) and substitutes it for where the original expression $(pwd) appeared. So what we now have is:
   starting_directory="/tmp"
3. As you probably know by now, this is just the normal way of assigning a value to a shell variable, and, sure enough, that’s exactly what the shell does: it assigns the value “/tmp” to the shell variable starting_directory.
You can save the above modifications and try out the script if you want: it should just run the
run_program function once, producing two output files (zombie-80.dat and stdout-80)
and writing some information about what it is doing to the log file logfile. Instead of the $() construct you can also use backquotes, i.e. you can use `command`
instead of $(command), and you are likely to come across these in many shell scripts.
If you do try it out, make sure that you undo those modifications and return the shell script to its However, the use of backquotes is generally a very bad idea for two reasons: (1) it’s very
former state (and save it) as we will be using the shell script later. easy to misplace or overlook a backquote (with catastrophic results) as the backquote
character (`) is so small, and (2) it’s very difficult to use backquotes to do nested
command substitution (one command substitution inside another one).
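
As a small illustration of why $() nests more comfortably than backquotes, the sketch below works out the name of the directory that contains a file. The path /tmp/zombie-runs/run1/output.dat is a made-up example used purely as a string (nothing is created or read), and basename and dirname are standard utilities that these notes do not otherwise cover.

```shell
#!/bin/bash
# Hypothetical path, used only as a string: no file is created or read.
example_path="/tmp/zombie-runs/run1/output.dat"

# Nested command substitution: the inner $(dirname ...) runs first,
# and its output becomes the argument of basename.
parent_name="$(basename "$(dirname "$example_path")")"

# The same thing with backquotes: the inner pair has to be escaped.
parent_name_bq=`basename \`dirname $example_path\``

echo "Parent directory name: $parent_name"
```

Both variables end up holding run1, but the backquote version is noticeably harder to read, and it gets worse with each extra level of nesting.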
Improving multi-run (2)

File: scripts/multi-run

    #!/bin/bash
    # Program to run: zombie
    program="$(pwd)/zombie"
    # Set up environment variables for program
    export ZOMBIE_FORMAT="NORMAL"

    # Parameters that stay the same each run
    fixed_parameters="0.005 0.0175 0.01 0.01"

    function run_program()
    {
    …
        # Run program with passed arguments
        "$program" "$@" > "stdout-${5}"
    …
    }

The mktemp command

Safely makes temporary files or directories for you

Options:
    -d    make a directory instead of a file
    -t    make file or directory in a temporary directory (usually /tmp)

    $ mktemp -d -t zombie.XXXXXXXXXX
    /tmp/zombie.khhcE30735
Let’s make a small, but major, improvement to the multi-run script (this script is in the scripts directory of your home
directory). Change the lines:

    # Run zombie with passed arguments
    ZOMBIE_FORMAT="NORMAL" ./zombie "${@}" > "stdout-${5}"

to:

    # Run program with passed arguments
    "${program}" "${@}" > "stdout-${5}"

Why is this such a major improvement?

Firstly, by replacing the hard-coded ./zombie with a shell variable, we have made it much easier to modify the script to use
other programs instead of zombie. (Not to mention making it much more obvious where we make such a modification. And by
explicitly setting the ZOMBIE_FORMAT environment variable in an adjacent part of the script we have also made it more obvious
where any environment variables the program uses should be changed, should we need to do so.)

Secondly, by obtaining the full path of the zombie program our shell script can now work in another directory than the one we
start off in, as we now have a full path to the zombie program and so can run it from whatever directory we may be in. We’ll
see why this is a good idea in a minute.

You should check that this modified multi-run script still works – remember to save it after you’ve made your modifications –
with the same sequence of commands given for this purpose on page 33 of your notes.

The mktemp command is an extremely useful command that allows users to
safely create temporary files or directories on multi-user systems. It is very
easy to unsafely create a temporary file or directory to work with from a shell
script, and, indeed, if your shell script tries to create its own temporary files or
directories using the normal Unix commands then it is almost certainly doing
so unsafely. Use the mktemp command instead.

How do you use mktemp? You give it a “template” which consists of a name
with some number of X’s appended to it (note that is an UPPER CASE letter
X), e.g. zombie.XXXXX. mktemp then replaces the X’s with random letters
and numbers to make the name unique and creates the requested file or
directory. It outputs the name of the file or directory it has created.
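
Because mktemp prints the name of whatever it creates, command substitution is the natural way to capture that name in a script. The following is only a sketch of that idea: the template example.XXXXXXXXXX and the file name results.dat are placeholders rather than part of the course files, and the rm -rf clean-up at the end is an addition made for this sketch.

```shell
#!/bin/bash
# Remember where we started so that we can come back.
starting_directory="$(pwd)"

# Ask mktemp for a fresh directory; it prints the name it created.
# "example.XXXXXXXXXX" is a placeholder template.
temp_dir="$(mktemp -d -t example.XXXXXXXXXX)"
echo "Working in: $temp_dir"

# Do some work in the temporary directory...
cd "$temp_dir"
echo "pretend results" > results.dat

# ...then copy the output back and tidy up.
cp -p results.dat "$starting_directory"
cd "$starting_directory"
rm -rf "$temp_dir"
```

Each run gets a differently named directory, so two copies of the script running at the same time cannot trample on each other’s files.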
Modify the multi-run script in the scripts directory as shown above.

The improvement we’ve made here is to now do all our calculations in a temporary directory, and only
copy the output files (and log file) back to our working directory when we’ve finished.

(You should understand what all the lines of shell script we’ve just added are doing – if you don’t
please ask the course giver or demonstrator to explain.)

Why is this an improvement? Well, if, as in this course, the directory we are working from (our home
directory) is actually on a network filesystem, then this can have a major impact on performance,
particularly when the network is busy (like when a whole classroom is doing this course). By working
in /tmp, which is usually a local filesystem (as it is for MCS Linux machines) we no longer have to
deal with the network overheads and bottlenecks except right at the very end of the process. This
should make things much quicker. It also potentially makes things more reliable as well, as it
minimises the opportunity for network problems to mess up our work. (Hurrah!)

One other important thing to note is that we’ve told our script to abort as soon as it hits an error.
That’s what adding the “set -e” line immediately after “#!/bin/bash” at the start of the file does
(you did remember to make that modification, right?). (We can also get the same effect by starting the
bash shell with the -e option, for instance by changing the “#!/bin/bash” line at the start of the file
to “#!/bin/bash -e” although it is better to use “set -e”.)

Why do this now? The reason is that our shell script is now doing something dangerous: it is
changing the working directory. Why is that dangerous? Well, imagine I tried to change to a
directory and failed for some reason. Thinking I’m in a different directory than I actually am, I promptly
delete everything in it. Oops!

We have one more change to make (see the next slide) and then you can check that you’ve modified
your script correctly by trying to run your modified multi-run script (remember to save it after you’ve
made your modifications!).

Now modify the multi-run script in the scripts directory as shown
above.

We’ve made two improvements here. The first is to use a shell variable to
hold the location of our log file (so we only have to change its location in
one place in the future). The second (and more important) is to make our
script write to the end of the existing logfile in the current directory
when we run the script rather than overwriting logfile each time we run
the script. Since the log file is supposed to contain a record of all the runs
of the script that we do for posterity (and debugging), we normally wouldn’t
want it to be replaced with a new log file each time we run the script.

(You should understand what all the lines of shell script we’ve just added are
doing – if you don’t please ask the course giver or demonstrator to explain.)

You can check that you’ve done it correctly by trying to run your modified
multi-run script (remember to save it after you’ve made your
modifications!):

    $ cd
    $ rm -f *.dat stdout-*
    $ ls
    $ scripts/multi-run 50 100 500 1000 3000 5000 10000 50000
    $ ls
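
The difference between overwriting a file (>) and writing to the end of it (>>) can be seen in a few lines. This sketch deliberately writes to a throwaway file created with mktemp rather than to the course’s logfile; the template name and the messages are made up for illustration.

```shell
#!/bin/bash
# Throwaway log file; "example-log.XXXXXXXXXX" is a placeholder template.
log_file="$(mktemp -t example-log.XXXXXXXXXX)"

echo "first run"  >  "$log_file"   # > replaces whatever was there
echo "second run" >> "$log_file"   # >> adds to the end
echo "third run"  >> "$log_file"

# Count the lines to confirm nothing was overwritten.
line_count="$(wc -l < "$log_file")"
echo "Log file now holds $line_count lines"

rm -f "$log_file"
```

Had every line used > instead, only “third run” would have survived, which is exactly what we want to avoid for a log that records every run.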
    # My current directory
    starting_directory="$(pwd)"
    # Location of log file
    log_file="$starting_directory/logfile"

    function run_program()
    …
        # Write to logfile
        echo "" >> "$log_file"
        date >> "$log_file"
        echo "Running $program with $@" >> "$log_file"
    …
        # Write to logfile
        echo "Output file: zombie-$5.dat" >> "$log_file"
        echo "Standard output: stdout-$5" >> "$log_file"
    …

Make the changes to multi-run indicated on
the previous slides (37, 39 & 40) and
then try the improved script.

Then take a short break. We’ll start again
in 8 minutes or thereabouts.
The multi-run shell script is in the scripts directory of your
home directory. Make the modifications indicated on the
previous slides (37, 39 & 40), if you haven’t already.

Now check that you’ve done it correctly by trying to run your
modified multi-run script (remember to save it after you’ve
made your modifications!):

    $ cd
    $ rm -f *.dat stdout-*
    $ ls
    $ scripts/multi-run 50 100 500 1000 3000 5000 10000 50000
    $ ls

And when you finish doing this, please do take a quick break
before we continue. (And that’s “break” as in “break from the
computer” not “break to check my e-mail”.)
The read shell builtin command takes input from standard input (usually the
keyboard) and returns it in the specified shell variable. If you don’t specify a
shell variable, it will return it in a shell variable called REPLY.

The -p option gives read a string that it displays as a prompt for the user.

You can give read more than one shell variable in which to return its input.
What happens then is that the first word it reads goes into the first shell variable,
the second word into the second shell variable and so on.

If there are more words than shell variables, the extra words are all put into the
last shell variable.

If there are more shell variables than words, each of the extra variables is set
to the empty string.

In the scripts directory there is a shell script called
run-once-using-read. Open this up with your
favourite editor (or gedit) and have a look at it.

The first line (that doesn’t start with a # character) is a
read shell builtin command that reads some values from
standard input and puts them in some shell variables.

(You should be able to work out how the rest of the script
has been modified to use these shell variables – if there
is anything you don’t understand, ask the course giver or
demonstrator.)
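
The way read shares words out among its shell variables can be checked without typing anything, by handing it a fixed line of input. The sketch below uses a bash “here string” (<<<) to supply that line; here strings are not otherwise covered in these notes, and the variable names and input words are arbitrary.

```shell
#!/bin/bash
# More words than variables: the last variable takes all the rest.
read first second <<< "alpha beta gamma delta"
echo "first='$first' second='$second'"

# More variables than words: the extra variables are set to empty strings.
read one two three <<< "alpha beta"
echo "one='$one' two='$two' three='$three'"
```

In the first case second holds “beta gamma delta”; in the second case three is empty.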
So, on first try it seemed to do what we’d expect. However, if we give it some
input that should be invalid something slightly strange happens. If we give it 6
input parameters instead of 5, instead of complaining, or only using the first 5
parameters, it puts the last two parameters together to form one argument
(“70 garbage” in the above example) and runs the zombie program with that
(we can see this is what is happening by inspecting the contents of the log file
logfile). This causes the zombie program to crash with an error message
that is less clear than one might hope (an indication that the zombie program is
(yet again) not as well written as we might like (an all too common complaint
with software)). (Also, as a result of zombie crashing, the mv command our
shell script uses to rename the zombie.dat file then complains that there is no file
for it to rename.)

Regardless of how well or badly the zombie program handles invalid
parameters, the fact that our script gives it mangled input to work with is an
indication that our script is broken. What is the problem and how can we fix it?

Recall how read works: if it reads more words (values) than it was given shell
variables, it puts all the extra ones together in the last shell variable. This is
what is happening here, and it is undesirable. We can fix this by giving read an
extra “dummy” shell variable that we never use, but that is simply there to hold
any extra junk it may read in.

Modify the run-once-using-read shell script in the scripts directory as
shown above (remember to save it when you’ve finished).

Now it works better. If we give it more than 5 input parameters it doesn’t mangle
the 5th argument that it passes to the zombie program.

(Note that the output of the ls command may not exactly match what is shown above – in particular there may be other files
or directories shown, and the colours may be slightly different.)

Now this may seem like a lot of trouble to go to for not much in the way of
improvement to our script. After all, the original run-once script could perfectly
well accept a single set of 5 parameters without all these problems – it just
wanted them on the command line rather than from standard input.

So, what’s the big deal about standard input? After all, if I have lots of
parameter sets to run I’m hardly going to sit there and type them all in one at a
time!

Well, how many command line arguments can a shell script have? The answer
is quite a few but not an unlimited number. In fact, if I have thousands of
parameter sets, that’s definitely going to be too many for me to pass to my shell
script all in one go (or even a small number of goes) on the command line. So,
how do we deal with this situation?

Hmmmm, maybe if I could put all my thousands of parameter sets into a file,
and then could somehow get my shell script to read in that file, one parameter
set at a time, that might do it… we need to be able to do a few more things to
make that particular idea fly, so let’s have a look at some of them now…
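
The effect of the extra “dummy” variable can be seen in isolation with the sketch below. It feeds read the same kind of six-word line discussed in these notes (the first four values are the fixed parameters used elsewhere in the course, followed by “70 garbage”). The here string (<<<) is just a convenient way to supply input and is an assumption of this sketch, as is the helper variable mangled.

```shell
#!/bin/bash
line="0.005 0.0175 0.01 0.01 70 garbage"

# Without a dummy variable, the fifth variable swallows the extra word.
read alpha beta zeta delta population <<< "$line"
echo "Without junk: population='$population'"
mangled="$population"

# With a dummy variable, the extra word is kept out of the way.
read alpha beta zeta delta population junk <<< "$line"
echo "With junk:    population='$population'"
```

The first read leaves population holding “70 garbage”; the second leaves it holding just “70”, with “garbage” parked in junk.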
Pipes

A pipe takes the…
    …output of one command…
    …and passes it to another command as input…

    command1 | command2

Pipes can be combined:

    command1 | command2 | command3

A set of one or more pipes is known as a pipeline

Using pipes

    $ cd
    $ rm -f *.dat stdout-* logfile
    $ cat scripts/basic_param_set
    0.005 0.0175 0.01 0.01 50
    0.005 0.0175 0.01 0.01 100
    0.005 0.0175 0.01 0.01 500
    0.005 0.0175 0.01 0.01 1000
    0.005 0.0175 0.01 0.01 3000
    0.005 0.0175 0.01 0.01 5000
    0.005 0.0175 0.01 0.01 10000
    0.005 0.0175 0.01 0.01 30000
    0.005 0.0175 0.01 0.01 50000
    0.005 0.0175 0.01 0.01 500000
    $ cat scripts/basic_param_set | scripts/run-once-using-read
    $ ls
    answers   gnuplot         zombie.gplt    scripts
    bin       hello-function  logfile        source
    Desktop   hello           zombie-50.dat  stdout-50
    examples  zombie          run-zombie     treasure.txt
A pipe takes the output of one command and feeds it to another
command as input. We tell the shell to do this using the |
symbol. So:

    ls | less

takes the output of the ls command and passes it to the
less command, which displays the output of the ls command
one screenful at a time. We can combine several pipes by
taking the output of the last command of each pipe and passing
it to the first command in the next pipe, e.g.

    ls | grep 'fred' | less

takes the output of ls and passes it to grep, which searches for
lines with the string “fred” in them, and then the output of grep
is passed to the less command to display one screenful at a
time. A set of one or more pipes is known as a pipeline. This
pipeline would show us all the files with the string “fred” in their
name, one screenful at a time.

In the scripts directory there is a file called basic_param_set that
contains a number of parameter sets. We can use the cat command to
display the contents of this file. In fact, if we use the cat command on this
file, the output of the cat command will be a list of parameter sets…

…and our run-once-using-read shell script will accept a complete
parameter set as its input, so…

…if we connect the output of the cat command to the input of our shell
script – by, say, using a pipe – maybe that will give us what we want? Let’s
try it!

Well, it almost does! That is, it does it for the first parameter set, but none of the
others. If we try running it again and again it will still only do it for the first
parameter set in the file, so we’re not quite there, but close. What we want
is some way of telling the script to keep reading until there is no more stuff
to read.

In fact, what we want is for the script to do some sort of loop: reading in a
set of values, then running the zombie program, then reading in the next
set of values, and so on. How can we get it to do that? Before we look at
that, we need to understand something else first…
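
For a pipeline whose result does not depend on what happens to be in your directory, the sketch below uses printf to stand in for ls, so the “file names” are known in advance. The names are invented, and wc -l (which counts lines) is a standard utility these notes do not otherwise introduce.

```shell
#!/bin/bash
# printf stands in for ls here so the "file names" are known in advance.
matches="$(printf '%s\n' fred.txt alfred.dat barney.txt wilma.dat | grep 'fred')"
echo "$matches"

# A longer pipeline: count the matching lines with wc -l.
count="$(printf '%s\n' fred.txt alfred.dat barney.txt wilma.dat | grep 'fred' | wc -l)"
echo "Number of matches: $count"
```

grep passes on only fred.txt and alfred.dat, so the second pipeline counts two lines.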
Exit Status (1)

• Every program (or shell builtin command)
  returns an exit status when it completes
• Number between 0 and 255
• Not the same as the program’s (or shell builtin
  command’s) output
• By convention:
  § 0 means the command succeeded
  § Non-zero value means the command failed
• Exit status of the last command run stored in
  special shell parameter named ?

Exit Status (2)

    $ ls
    answers   gnuplot         zombie.gplt    scripts
    bin       hello-function  logfile        source
    Desktop   hello           zombie-70.dat  stdout-70
    examples  zombie          run-zombie     treasure.txt
    $ echo $?
    0
    $ ls zzzz
    /bin/ls: zzzz: No such file or directory
    $ echo $?
    2
The exit status of a program is also called its exit code,
return code, return status, error code, error status,
errorlevel or error level.

You get the value of the special parameter ? by using the
construct ${?} (or its shorter form $?, which is what the above example uses).

Please note that the output of the ls command may not exactly
match what is shown on this slide – in particular, the colours may
be slightly different shades and there may be additional files and/or
directories shown (and/or – if you’ve recently cleaned up your
home directory – you may not have all of the files shown here).
    $ false
    $ echo $?
    1

[Figure: flow diagram; the only surviving label is “rest of script”.]
It’s worth introducing a couple of commands at this point which do nothing.
(No, really.)

true does nothing and always succeeds, i.e. its exit status is 0.

false does nothing and always fails, i.e. its exit status is non-zero.

You may be wondering what possible use there could be for such
commands. The most obvious use is for debugging: suppose you have a
script that runs a program that takes a long time, and you want to test the
script to make sure it works. You could replace the program that takes a
long time with true to see what your script does if it thinks the program has
succeeded. Similarly, you could replace the program your script is calling
with false if you want to see what your script will do if it thinks the program
has failed.

Another use for true is when you want the shell to do nothing (this is known
as a NOP or no-op command): for instance, shell functions and for loops
must contain at least one command. If, for some reason, you want a shell
function or a for loop that does nothing (maybe because you haven’t gotten
around to writing it yet but you want to be able to test the rest of your script)
you can use true. Then the shell won’t complain about the definition of
your function or the syntax of your for loop being incorrect, but they won’t
actually do anything.

Now that we know about the exit status of a command we
are ready to meet the loop structure alluded to earlier:

We can repeat a collection of one or more commands using
a while loop. A while loop repeats a collection of
commands as long as the result of some command is
successful. The result of a command is considered to be
successful if it returns an exit status of 0 (i.e. if the command
succeeded). (The command we use in a while loop could
also be a test of whether some expression is true. We’ll see
how to do that shortly.)

Note that even if set -e is in effect, or the first line of our
shell script is

    #!/bin/bash -e

the shell script will not exit if the command the
while loop depends on fails, since if it did, this would make
while loops unusable(!).
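
The two uses of true and false described in these notes can be sketched as follows. The function name analyse_results and the idea of holding the program name in a variable called program are illustrative assumptions, not taken from the course scripts.

```shell
#!/bin/bash
# A function we have not written yet: true keeps the definition valid.
function analyse_results()
{
    true
}

analyse_results
echo "analyse_results exit status: $?"

# Stand-in for a slow program while testing the script's logic:
# setting program to "false" makes the script see a failure at once.
program="false"
"$program"
program_status=$?
echo "stand-in program exit status: $program_status"
```

Swapping "false" for "true" in the assignment lets you check the other branch of your script’s behaviour just as quickly.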
while

Repeat while <command> returns true

    while <command> ; do
        <some commands>
    done

while

    while <command> ; do        (while and do: keywords)
        <some commands>         (commands to repeat)
    done                        (keyword indicating end of loop)
We use a while loop like this:

    while <command> ; do
        <some commands>
    done

where <command> is a command (which could be a test; more on
tests later), and <some commands> is a collection of one or more
commands. Note that if <command> is false the shell script will not
exit, even if set -e is in effect or the first line of the shell script is

    #!/bin/bash -e

As with a for loop, you can put the do on a separate line, in which
case you can omit the semi-colon (;).

There are some examples of how to use while loops in the
following files in the examples directory:

    while1
    while2

…but don’t look at those files just yet as we need to meet a few more
things first…

To recap: we can repeat a collection of commands using a while loop. A while
loop repeats a collection of commands as long as the result of some command is
true. The result of a command is considered to be true if it returns an exit status
of 0 (i.e. if the command succeeded). (The command we use in a while loop
could also be a test of whether some expression is true. We’ll see how to do that
shortly.) We use a while loop like this:

    while <command> ; do
        <some commands>
    done

where <command> is a command (which could be a test), and <some
commands> is a collection of one or more commands. Note that even if set -e
is in effect, or the first line of the shell script is #!/bin/bash -e, the shell script
will not exit if the result of <command> is not true.

As with a for loop, you can put the do on a separate line, in which case you can
omit the semi-colon (;).

There are some examples of how to use while loops in the following files in the
examples directory:

    while1
    while2

…but don’t look at those files just yet as we need to meet a few more things first…
Using while (1)

    $ cd
    $ cp -p scripts/run-once-using-read scripts/run-while-read
    $ gedit scripts/run-while-read &

File: scripts/run-while-read

    #!/bin/bash

Third exercise

Make a copy of multi-run and make it
read all the arguments for zombie in from
standard input using a while loop:

    $ cd
    $ cp -p scripts/multi-run scripts/multi-run-while

Create a copy of the run-once-using-read shell script in the scripts directory
called run-while-read. Open this up with your favourite editor (or gedit) and
modify it as shown above.

Basically, replace the line:

    read -p "Input parameters for zombie: " alpha beta zeta delta population junk

with:

    # and then run zombie with them
    # and run it again and again until there are no more
    while read alpha beta zeta delta population ; do

And at the very end of the file add the following line:

    done

Remember to save the script when you’ve finished.

Now let’s try this script out and see if it does what we want:

    $ cd
    $ rm -f *.dat stdout-* logfile
    $ cat scripts/basic_param_set | scripts/run-while-read
    $ ls

The multi-run shell script is in the scripts directory of your home directory. Make a copy
of it called multi-run-while, also in the scripts directory, and work on that. Your task is
to get multi-run-while to read in all the arguments for zombie from standard input (all its
arguments, not just the fifth one) using a while loop.

Start by deleting the following two lines:

    # Parameters that stay the same each run
    fixed_parameters="0.005 0.0175 0.01 0.01"

…and you should also get rid of any other references to the shell variable fixed_parameters –
you won’t be using it in this script.

We have gone through everything you need to do this exercise. You should comment the
modifications you make to your shell script, preferably as you are writing it.

And when you finish this exercise, please do take a short break before we start again with the
solution. (And that’s “break” as in “break from the computer” not “break to check my e-mail”.)

You can check that you’ve done it correctly by trying to run your multi-run-while script
(remember to save it after you’ve made your modifications!):

    $ cd
    $ rm -f *.dat stdout-* logfile
    $ ls
    $ cat scripts/basic_param_set | scripts/multi-run-while
    $ ls

Hint: Try copying the run-while-read script…
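
The while/read pattern can also be tried in a self-contained form that does not need zombie or any input file. In this sketch printf supplies two parameter sets in the same format as basic_param_set, the loop body only reports what it would do, and the trailing junk variable is an addition of this sketch to catch stray extra words.

```shell
#!/bin/bash
# Two parameter sets, supplied by printf instead of a file.
output="$(printf '%s\n' "0.005 0.0175 0.01 0.01 50" "0.005 0.0175 0.01 0.01 100" |
    while read alpha beta zeta delta population junk ; do
        # A real script would run its program here; we just report.
        echo "Would run with population $population"
    done)"
echo "$output"
```

read returns a non-zero exit status when there is nothing left to read, and that is what brings the while loop to an end after the last line.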
Recap: standard input/while loops

• Command substitution $(command) can be used to get
  the output of a command into a shell variable
• Use mktemp to make temporary files and directories
• read gets values from standard input
• Pipes connect one command’s output to another’s input
• The command true does nothing but is considered to
  be true (its exit status is 0); the command false does
  nothing but is not considered to be true (non-zero exit
  status).
• while loops repeat some commands while something
  is true – can be used to read in multiple lines of input
  with read

Tests

Test to see if something is true:

    [[ <expression> ]]

    [[ $a -eq $b ]]
    [[ $a -le 42 ]]
    [[ $a -gt 0 ]]
    [[ $str = "aardvark" ]]
Note that while loops can contain other while
loops, and they can also contain for loops (or
both). Similarly, for loops can contain while
loops or other for loops (or both).

A test is basically the way in which the shell evaluates an expression to see if it is true.
There are many different tests that you can do, and we only list a few here:

    [[ $a -lt $b ]]    true if and only if the integer a is less than the integer b
    [[ $a -le $b ]]    true if and only if the integer a is less than or equal to the integer b
    [[ $a -eq $b ]]    true if and only if the integer a is equal to the integer b
    [[ $a -ne $b ]]    true if and only if the integer a is not equal to the integer b
    [[ $a -ge $b ]]    true if and only if the integer a is greater than or equal to the integer b
    [[ $a -gt $b ]]    true if and only if the integer a is greater than the integer b

In the above tests, a and b can be any integers. Recall that shell variables can hold pretty
much any value we like – they can certainly hold integer values, so a and/or b in the above
expressions could come from shell variables, e.g.

    [[ $VAR -eq 5 ]]

Or, equivalently:

    test "${VAR}" -eq "5"

is true if and only if the shell variable VAR contains the value “5”.

Note that you must have a space between each pair of square brackets ([[ and ]]) and the
expression they enclose.

N.B.: Use -eq for testing integers, and use == or = for testing the equality of
strings.

N.B. Use help test | less to list the available tests and what they do.
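
Since a test is just a command with an exit status, you can try tests out and inspect ? afterwards. The values below (7, 42 and “aardvark”) and the variable name not_true_status are chosen purely for illustration.

```shell
#!/bin/bash
a=7
b=42
str="aardvark"

# Integer comparison: use -lt, -le, -eq, -ne, -ge, -gt.
[[ $a -lt $b ]]
echo "is $a less than $b? exit status $?"

# String comparison: use = (or ==).
[[ $str = "aardvark" ]]
echo "is str aardvark? exit status $?"

# A test that is not true has a non-zero exit status.
[[ $a -gt $b ]]
not_true_status=$?
echo "is $a greater than $b? exit status $not_true_status"
```

The first two tests are true and so report an exit status of 0; the third is not true and reports a non-zero exit status.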
Example:
$(( VAR + 56 ))
The shell can also do (primitive) integer arithmetic, which can be
very useful.

The construct $((<arithmetic-expression>)) means replace
$((<arithmetic-expression>)) with the result of the integer
arithmetic expression <arithmetic-expression>. This
is known as arithmetic expansion. (The arithmetic expression is
evaluated as integer arithmetic.)

Note that C syntax is used within the brackets, so you
should use the bare variable name there, without a $. (This is, alas,
inconsistent with the shell’s behaviour elsewhere.) We can put quotes
around the entire arithmetic expansion construct, although this
should not be necessary because the output should be numeric.

Use help let on the bash command line to find out what
operations are available and how to use them.

When we put together arithmetic tests, while loops and
arithmetic expansion, we can construct a while loop that
counts for us, as in the above example. Can you figure
out what the above loop will do?

When you think you know, try running the script while2
in the examples directory of your home directory. That
will show you the output of the above while loop,
immediately followed by the output of a very similar
while loop where counter starts off with the value 0
rather than 1.

Note that while loops can (and often do) contain other
while loops (or for loops). We say that one loop is
nested inside the other one.
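
A counting loop of the general kind these notes describe can be sketched as below. This is not necessarily the loop in the while2 example file: the limit of 5 and the running total are assumptions made for this illustration.

```shell
#!/bin/bash
counter=1
total=0

# Keep going while the arithmetic test is true.
while [[ $counter -le 5 ]] ; do
    echo "counter is $counter"
    total=$(( total + counter ))      # bare variable names inside $(( ))
    counter=$(( counter + 1 ))
done

echo "total is $total"
```

The loop body runs five times. When counter reaches 6 the test is no longer true, so the loop ends with total holding 15 (1+2+3+4+5).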
Examine the file called generate-params in the scripts
directory of your home directory (shown above).

Modify the multi-run-while script in the scripts directory as shown
above.
(Remember to save it when you’ve finished.)

If you want to do the exercise outside of class, the files you’ll need can be
found at:
https://help.uis.cam.ac.uk/help-support/training/downloads/course-files/programming-student-files/shellscriptingsci/shellscripting-files/exercises/day-two

http://feedback.training.cam.ac.uk/uis/
If you are coming to further sessions of this This exercise should be fairly straightforward. One sensible way of approaching it
would be as follows:
course, then you should fill in the feedback
form at the last session you attend. 1. Figure out the full path of the zombie.gplt file. Store it in a shell variable
(maybe called something like gnuplot_file).
3. Rename the zombie.png file produced by gnuplot along the same lines as
the zombie.dat file produced by zombie is renamed.
Make sure you test the script after you’ve modified it and check that it does
what you would expect.
This exercise highlights one of the advantages of using functions: we can improve
or change our functions whilst leaving the rest of the script unchanged. In
particular, the structure of the script remains unchanged. This means two things:
(1) if there are any errors after changing the script they are almost certainly in the
function we changed, and (2) the script is still doing the same kind of thing (as we
can see at a glance) – we’ve just changed the particulars of one of its functions.
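To make the idea concrete, here is a rough sketch of what a modified run_program function might look like. It is not the course’s model answer. The locations in gnuplot_file and myprog, the naming scheme for the renamed files, and the assumption that zombie.gplt reads zombie.dat and writes zombie.png in the current directory are all guesses made for illustration.

```shell
#!/bin/bash
# Hypothetical sketch: the paths and file naming scheme below are assumptions,
# not taken from the course's multi-run-while script.

# Full path of the gnuplot command file (assumed location).
gnuplot_file="${HOME}/scripts/zombie.gplt"

# Full path of the zombie program (assumed location).
myprog="${HOME}/bin/zombie"

run_program()
{
    # Build a suffix from the five parameters so each run gets unique file names.
    suffix="$1-$2-$3-$4-$5"

    # Run the program with the parameter set passed to this function.
    "${myprog}" "$@" > "stdout-${suffix}"

    # Run gnuplot on the command file; it is assumed to turn zombie.dat into zombie.png.
    gnuplot "${gnuplot_file}"

    # Rename both output files so the next run does not overwrite them.
    mv zombie.dat "zombie-${suffix}.dat"
    mv zombie.png "zombie-${suffix}.png"
}
```

Only the body of the function changes. The code that reads the parameter sets and calls run_program stays exactly as it was.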
Final exercise – Part Two
Now create a new shell script based on multi-run-while that will run zombie three times for each parameter set the script reads in on standard input, changing the fifth parameter each time as follows:
For a given parameter set a b c d e, first your script should run zombie with the parameter set:
a b c d 50
…then with the parameter set:
a b c d 500
…and then with the parameter set:
a b c d 5000

Final exercise – Part Three
Now create a new shell script, based on the script you created in the previous part of the exercise, that does the following:
Instead of running zombie three times for each parameter set it reads in, this script should accept a set of values on the command line, and use those instead of the hard-coded 50, 500, 5000 previously used.
Thus, for each parameter set it reads in on standard input, it should run zombie substituting, in turn, the values from the command line for the fifth parameter in the parameter set it has read in.
So, if the script from the previous part of the exercise was called multi-50-500-5000, and we called this new script multi-sizes (and stored both in the scripts directory of our home directory), then running the new script like this:
$ cd
$ cat scripts/param_set | scripts/multi-sizes 50 500 5000
should produce exactly the same output as running the old script with the same input file:
$ cd
$ cat scripts/param_set | scripts/multi-50-500-5000
An example may help to make the Part Two task clearer. Suppose your script reads in the parameter set:
0.005 0.0175 0.01 0.01 70
…it should then run the zombie program 3 times, once for each of the following parameter sets:
0.005 0.0175 0.01 0.01 50
0.005 0.0175 0.01 0.01 500
0.005 0.0175 0.01 0.01 5000

The first thing to do is to make a copy of the multi-run-while script and work on the copy – I suggest you call your copy something like multi-50-500-5000:
$ cd
$ cp -p scripts/multi-run-while scripts/multi-50-500-5000

Now, currently the script will read in a parameter set and then call the run_program function to process that parameter set. Clearly, instead of passing all five parameters that the script reads in, your new script will now only be passing the first (alpha), second (beta), third (zeta), and fourth (delta) parameters that it has read in. However, the zombie program requires 5 parameters (and it cares about the order in which you give them to it), so your script still needs to give it 5 parameters; it is just going to ignore the fifth parameter it has read (population) and substitute values of its own instead.

There are two approaches you could take. One would be to call the run_program function 3 times, once with 50 as the fifth parameter, once with 500 as the fifth parameter and once with 5000 as the fifth parameter. The other would be to use some sort of loop that calls the run_program function, using the appropriate value (50, 500 or 5000) for the fifth parameter on each pass of the loop. I want you to use the loop approach.

Hint: Use a for loop.

For Part Three, the first thing to do is to make a copy of the previous script (which I suggested you call multi-50-500-5000) and work on the copy – I suggest you call your copy something like multi-sizes:
$ cd
$ cp -p scripts/multi-50-500-5000 scripts/multi-sizes

You may be wondering what the point of the previous script and this script is. Consider what these scripts actually do: they take a parameter set, vary one of its parameters and then run some program with the modified parameter sets. Why would we want to do this?

Well, in this example the parameter we are varying specifies the size of the population which our program will model. You can easily imagine that we might have a simulation or calculation for which, for any given parameter set, interesting things happen at various population sizes. These scripts allow us to take each parameter set and run it several times for different sizes of populations. We can then look at each parameter set and see how varying the size of the population affects the program’s output for that parameter set.

If we were using the parameter sets in the scripts/param_set file, we might notice that these parameter sets are the same except for the second parameter, which varies. So if we pipe those parameter sets into one of these scripts, we are now investigating how the output of the zombie program varies as we vary two of its input parameters, which is kinda neat, doncha think? ☺

Hint: Modify the loop you used in the previous script to loop over all the command line arguments rather than some hard coded values. If you don’t remember the construct that gives you all the command line arguments have a look at the recap of the previous day of this course.
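The loop structure described in the hints could be sketched roughly as follows. This is an illustration under stated assumptions, not the course’s model answer. The run_program below is a stand-in that only prints what it would run, and vary_population is a hypothetical name used to wrap the loop in a function so the example is self-contained.

```shell
#!/bin/bash
# Illustrative sketch only. run_program here is a stand-in that prints its
# arguments; the real function in the course scripts runs the zombie program.

run_program()
{
    echo "would run zombie with: $*"
}

# vary_population is a hypothetical wrapper. Inside a function "$@" holds the
# function's arguments; at the top level of a script it holds the script's
# command line arguments.
vary_population()
{
    # Read one parameter set (five values) per line from standard input.
    while read alpha beta zeta delta population; do
        # Ignore the population value that was read in and substitute each
        # value given as an argument in its place.
        for size in "$@"; do
            run_program "${alpha}" "${beta}" "${zeta}" "${delta}" "${size}"
        done
    done
}

# Usage example: prints three lines, ending in 50, 500 and 5000 respectively.
echo "0.005 0.0175 0.01 0.01 70" | vary_population 50 500 5000
```

Replacing "$@" in the for loop with the fixed list 50 500 5000 gives the hard-coded behaviour of Part Two. Looping over "$@" lets the same script take any set of population sizes.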
Final exercise – Files
All the files (scripts, zombie program, etc) used in this course are available on-line at:
https://help.uis.cam.ac.uk/help-support/training/downloads/course-files/programming-student-files/shellscriptingsci/shellscripting-files/exercises/day-two