
\documentclass[11pt]{article}

\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textwidth}{6.5in}
\setlength{\textheight}{9.0in}
\setlength{\parindent}{0in}
\setlength{\parskip}{0.12in}

\usepackage{times}    
\usepackage{html}
\usepackage{fancyvrb}
\usepackage{relsize}

\begin{rawhtml}
<body bgcolor=white>
\end{rawhtml}

\begin{document}

 %\newcommand{\bfs}[1]{{\bf #1}}

\title{Unix Shell Scripts}

\author{Norman Matloff}

\date{July 30, 2008}

\maketitle

\tableofcontents

\section{Introduction}

In previous discussions we have talked about many of the facilities 
of the C shell, such as command aliasing, job control, etc.  In addition, 
any collection of {\bf csh} commands may be stored in a file, and 
{\bf csh} can be invoked to execute the commands in that file.  Such a 
file is known as a shell {\bf script} file.  The language used in that file 
is called the shell script language.  Like other programming languages, it 
has variables and flow-control statements (e.g. if-then-else, while, foreach, 
goto).

Several shells are available in Unix.  The most commonly used are the C
shell ({\bf csh}, and its extension, the TC shell, {\bf tcsh}) and the
Bourne shell ({\bf sh}, and its extensions, the Bourne Again shell, {\bf
bash}, and the highly programmable Korn shell, {\bf ksh}).

Note that you can run any shell simply by typing its name.  For example,
if I am now running {\bf csh} and wish to switch to {\bf ksh}, I simply
type
{\bf ksh}, and a Korn shell will start up for me.  All my commands
from that point on will be read and processed by the Korn shell
(though when I eventually want to log off, exiting the Korn shell
will still leave me in the C shell, so I will have to exit from it
too).

\section{Invoking Shell Scripts}

There are two ways to invoke a shell script file.

\subsection{Direct Interpretation}  

In direct interpretation, the command 

\begin{verbatim}
csh filename [arg ...] 
\end{verbatim}

invokes the program {\bf csh} to interpret the script contained in the file 
`filename'.

\subsection{Indirect Interpretation}  

In indirect interpretation, we must insert as the first line of the file

\begin{verbatim}
#! /bin/csh
\end{verbatim}

or

\begin{verbatim}
#! /bin/csh -f
\end{verbatim}

(there are situations in which this is not necessary, but it won't hurt
to have it), and the file must be made {\em executable} using {\bf chmod} 
(see previous discussion).  Then it can be invoked in the same way as any 
other command, i.e., by typing the script file name on the command line.

The -f option says that we want fast startup, which the shell achieves by
\underline{not} reading or executing the commands in .cshrc.  Thus, for
example, we won't have the `set' values for shell variables or the
aliases from that file, but if we don't need them, startup will be much
faster (if we need a few of them, we can simply add them to the script
file itself).
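
As a minimal sketch of indirect interpretation, suppose the following two
lines are stored in a file.  The file name Hello used below is purely
illustrative.

\begin{verbatim}
#! /bin/csh -f
echo "hello from a csh script"
\end{verbatim}

The file could then be made executable and run as follows (the leading
./ is needed only if the current directory is not in your search path):

\begin{verbatim}
chmod u+x Hello
./Hello
\end{verbatim}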

\section{Shell Variables}

Like other programming languages the {\bf csh} language has variables.
Some variables are used to control the operation of the shell, such as
\$path and \$history, which we discussed earlier.  Other
variables can be created and used to control the operation of a shell 
script file.

\subsection{Setting Variables}

Values of shell variables are all character-based:  A value is formally
defined to be a {\bf list} of zero or more {\bf elements}, and an
element is formally defined to be a character string.  In other words,
the value of a shell variable is an array of strings.

For example,

\begin{verbatim}
     set X 
\end{verbatim}

will create the variable \$X with the null (empty) string as its value.  
The command

\begin{verbatim}
     set V = abc
\end{verbatim}

will set V to have the string `abc' as its value.  The command
 
\begin{verbatim}
     set V = (123 def ghi)
\end{verbatim}

will set V to a list of three elements, which are the strings 
`123', `def' and `ghi'.  

The several elements of a list can be treated like array elements.
Thus for V in the last example above, \$V[2] is the string `def'.
We could change it, say to `abc', by the command

\begin{verbatim}
    set V[2] = abc 
\end{verbatim}
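
The following sketch pulls these list operations together in one small
script.  The variable name V follows the examples above, and the
comments show what each {\bf echo} should print.

\begin{verbatim}
#! /bin/csh -f
set V = (123 def ghi)    # a list of three elements
echo $V[2]               # prints: def
set V[2] = abc           # replace the second element
echo $V                  # prints: 123 abc ghi
\end{verbatim}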

\subsection{Referencing and Testing Shell Variables}

The value of a shell variable can be referenced by placing a {\bf\$}
before the name of the variable.  The command

\begin{verbatim}
    echo $path
\end{verbatim}

will output the value of the variable \$path.  Or you can access the 
variable by enclosing the variable name in curly brace characters, and 
then prefixing it with a {\bf\$}.  The command

\begin{verbatim}
    echo ${path}
\end{verbatim}

would have the same result as the last example.  The second method is 
used when something is to be appended to the contents of the variable.  
For example, consider the commands

\begin{verbatim}
    set fname = prog1
    rm ${fname}.c
\end{verbatim}

These would delete the file `prog1.c'.

To see how many elements are in a variable's list, we prefix the variable
name with {\bf\$\#}.  The command

\begin{verbatim}
    echo $#V
\end{verbatim}

would print 3 for the three-element list V set earlier, while

\begin{verbatim}
    echo $#path
\end{verbatim}

would reveal the number of directories in your search path.

The @ command can be used for integer computations.  For example, if you
have shell variables \$X and \$Y, you can set a third variable \$Z
to their sum by the command below.  Note that the @ must be followed by
a space.

\begin{verbatim}
@ Z = $X + $Y
\end{verbatim}
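
Here is a short sketch of the @ command in a complete script.  The
starting values 3 and 4 are arbitrary choices for illustration.

\begin{verbatim}
#! /bin/csh -f
set X = 3
set Y = 4
@ Z = $X + $Y     # Z is now 7
@ Z++             # increment; Z is now 8
echo $Z           # prints: 8
\end{verbatim}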

\section{Command Arguments}

Most commands have arguments (parameters), and these are accessible
via the shell variable \$argv.  The first parameter will be \$argv[1],
the second \$argv[2], and so on.  You can also refer to them as \$1,
\$2, etc.  The number of such arguments (analogous to argc in the C
language) is \$\#argv.

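For example, consider the following script file, say named Swap:

\begin{verbatim}
#! /bin/csh -f

# swap the contents of the two files named on the command line
set tmp = /tmp/Swap.$$    # temporary file name; $$ is the process ID
cp $argv[1] $tmp          # save the first file
cp $argv[2] $argv[1]      # second file overwrites the first
cp $tmp $argv[2]          # saved copy overwrites the second
rm $tmp
\end{verbatim}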

This would do what its name implies, i.e. swap two files.  If, say,
I have files x and y, and I type

\begin{verbatim}
Swap x y
\end{verbatim}

then the new contents of x would be what used to be y, and the new
contents of y would be what used to be x.
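
To see these variables in action, one could try a small script such as
the following sketch.  The name ShowArgs is hypothetical, and \$0 (the
name by which the script was invoked) is a csh feature not covered
above.

\begin{verbatim}
#! /bin/csh -f
echo "script name:     $0"
echo "argument count:  $#argv"
echo "all arguments:   $argv"
\end{verbatim}

Typing `ShowArgs a b c' should then report an argument count of 3.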

\section{Language Constructs}

The shell script language, like other programming languages, has
constructs for conditional execution (if-then-else, switch), 
iterative execution (while, foreach), and a goto 
statement:

{\bf 1. if-then-else}

The syntax of the if-then-else construct is

\begin{verbatim}
     if ( expr ) simple-command 
\end{verbatim}

or 

\begin{verbatim}
     if ( expr ) then
           commandlist-1
     [else
           commandlist-2]
     endif
\end{verbatim}

If the expression {\em expr} evaluates to true, {\em commandlist-1} is
executed; otherwise {\em commandlist-2} is executed, if it is present.  The
portion of the construct enclosed in `[' and `]' is optional.\footnote{This is
standard notation in the software world, so remember it.}

As an example, suppose we write a shell script which is supposed to have
two parameters, and that the code will set up two variables, `name1' and
`name2' from those two parameters, i.e.

\begin{verbatim}
     set name1 = $argv[1]
     set name2 = $argv[2]
\end{verbatim}

(which presumably it would make use of later on).  But suppose we also
wish to do error-checking, emitting an error message if the user
gives fewer than two, or more than two, parameters.  We could use the
following code

\begin{verbatim}
     if ($#argv != 2) then
         echo "you must give exactly two parameters"
     else
         set name1 = $argv[1]
         set name2 = $argv[2]
     endif
\end{verbatim}

{\bf 2. while}

The syntax of the {\em while} loop construct is

\begin{verbatim}
     while ( expr ) 
           commandlist
     end
\end{verbatim}

The {\em commandlist} will be executed repeatedly for as long as 
{\em expr} evaluates to true.
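
As a small sketch, the following loop counts from 1 to 3.  The variable
name i and the limit 3 are arbitrary, and the counter is updated with
the @ command described earlier.

\begin{verbatim}
#! /bin/csh -f
set i = 1
while ($i <= 3)
    echo "pass $i"
    @ i++
end
\end{verbatim}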

{\bf 3.  foreach}

The syntax of the {\em foreach} loop construct is

\begin{verbatim}
     foreach var ( wordlist )
         commandlist
     end
\end{verbatim}

The {\em commandlist} is executed once for each word in the {\em wordlist}, 
and each time the variable {\em var} will contain the value of that word.
For example, the following script can search all immediate subdirectories 
of the current directory for a given file (and then quit if it finds one):

\begin{verbatim}
#! /bin/csh -f
     set f = $1
     foreach d (*)
         if (-e $d/$f) then
               echo FOUND: $d/$f
               exit(0)
         endif
     end
     echo $f not found in subdirectories
\end{verbatim}

For example, say I call this script FindImm, and my current directory
consists of files s, t and u, with s and t being subdirectories, and
with t having a file x.  Typing

\begin{verbatim}
FindImm x
\end{verbatim}

would yield the message

\begin{verbatim}
FOUND: t/x
\end{verbatim}
 
Here is how it works:  In the line

\begin{verbatim}
     foreach d (*)
\end{verbatim}

the `*' is a wild card, so it would expand to a list of all files in
my current directory, i.e. the list (s t u).  So, the foreach loop will
first set d = s, then d = t and finally d = u.

In the line

\begin{verbatim}
         if (-e $d/$f) then
\end{verbatim}

the -e means existence; in other words, we are asking if the file
\$d/\$f exists.  If we type `FindImm x' as in the example above,
\$f would be x, and \$d would start out as s, so we would be asking
if the file s/x exists (the answer would be no).
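
Other file tests work the same way as -e.  The sketch below, which is an
illustration rather than part of the script above, uses -d (is it a
directory?) and -f (is it an ordinary file?) to classify each entry in
the current directory.  It assumes that the directory is not empty,
since otherwise the `*' would match nothing.

\begin{verbatim}
#! /bin/csh -f
foreach d (*)
    if (-d $d) then
        echo "$d is a directory"
    else if (-f $d) then
        echo "$d is an ordinary file"
    endif
end
\end{verbatim}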

{\bf 4.  switch}

The switch command provides a multiple branch similar to the switch
statement in C.  The general form of switch is:

\begin{verbatim}
    switch ( str )
        case string1:
             commandlist1
             breaksw
        case string2:
             commandlist2
             breaksw
        default:
             commandlist
    endsw
\end{verbatim}

The given string {\em str} is matched against the case patterns in
order, and control is transferred to the first case that matches; if
none matches, control goes to the default case, if there is one.  As in
file name expansion, a case label may be a literal string, may contain
variable substitution, or may contain wild-card characters such as * and ?.
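
For instance, the following sketch uses wild-card case labels to
classify a file name by its suffix.  It assumes that the script is
given exactly one argument, and the particular suffixes and messages are
chosen only for illustration.

\begin{verbatim}
#! /bin/csh -f
switch ($1)
    case *.c:
        echo "$1 looks like C source"
        breaksw
    case *.tex:
        echo "$1 looks like LaTeX source"
        breaksw
    default:
        echo "$1 has an unrecognized suffix"
endsw
\end{verbatim}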


{\bf 5.  goto}

The {\em goto} command provides a way to branch unconditionally to a line
identified by a label.

\begin{verbatim}
goto lab
\end{verbatim}

where {\em lab} is a label on a line (by itself) somewhere in the script in the form

\begin{verbatim}
lab:
\end{verbatim}
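
One plausible use, sketched here with a hypothetical label named usage,
is to jump to a common error message when the script is called without
arguments:

\begin{verbatim}
#! /bin/csh -f
if ($#argv == 0) goto usage
echo "arguments: $argv"
exit 0

usage:
echo "usage: $0 arg ..."
exit 1
\end{verbatim}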

\section{Escape Characters}

If you download files from the Web, they may have been created under
Windows, with names inconsistent with Unix.  Here are a couple of tips
for handling this:

\begin{itemize}

\item The most common problem is file names with embedded spaces, say a
file named {\bf before July}.  To reference such a file from a C shell
command line, simply precede each space by a backslash.  For instance,
to remove the file {\bf before July}, type

\begin{Verbatim}[fontsize=\relsize{-2}]
rm before\ July
\end{Verbatim}

\item Suppose you have a file whose name begins with the character `-'.
The problem here is that most Unix commands use that character to
signify options to the commands.  For example, 

\begin{Verbatim}[fontsize=\relsize{-2}]
ls -ul
\end{Verbatim}

is the command to list the files and their latest access times.

Say you have a file named {\bf -trendy}, which you want to copy to {\bf
xyz}.  You could not simply type

\begin{Verbatim}[fontsize=\relsize{-2}]
cp -trendy xyz
\end{Verbatim}

but could type

\begin{Verbatim}[fontsize=\relsize{-2}]
cp -- -trendy xyz
\end{Verbatim}

The double hyphen tells {\bf cp} (not the shell) that there will be no
more options on this line, so that anything after it is treated as a
file name.

\end{itemize}
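
Two alternatives to the tips above may also be worth knowing; they are
offered here as a sketch, using the same file names as in the examples.
A name with embedded spaces can be enclosed in quotes instead of
escaping each space, and a name beginning with `-' can be prefixed with
./ so that it no longer looks like an option:

\begin{verbatim}
rm "before July"
cp ./-trendy xyz
\end{verbatim}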

\section{Examples}

\subsection{A Shell Script For Deleting Files}

This code, which we will call Del, will delete files like {\bf rm} does, 
prompting for your confirmation for each file to be deleted, including 
directory files (which the -i option of {\bf rm} won't do).

\begin{verbatim}
#! /bin/csh -f

   foreach name ($argv)
      if ( -f $name ) then
         echo -n "delete the file '${name}' (y/n/q)? "
      else
         echo -n "delete the entire directory '${name}' (y/n/q)? "
      endif
      set ans = $<
      switch ($ans)
            case n: 
               continue
            case q: 
               exit
            case y: 
               rm -r $name
               continue
      endsw
   end
\end{verbatim}

(Before reading further, try this program yourself.  Set up a test
directory, with several files in it, at least one of which is a
subdirectory, with at least one file there.  Then type `Del *'.)

The line 

\begin{verbatim}
      if ( -f $name ) then
\end{verbatim}

tests to see if the file whose name is in \$name is an ordinary file,
as opposed to a directory file.

The -n option of {\bf echo} tells the shell not to print the newline
character, so that our answer, y/n/q, will be on the same line.

In the line

\begin{verbatim}
      set ans = $<
\end{verbatim}

the symbol `\$$<$' is replaced by a line read from the standard input,
normally the keyboard.

The keyword `continue' means to go to the top of the enclosing
loop.

The -r option of the {\bf rm} command means that if an argument is a
directory, then remove that directory, and all the files (and subdirectories,
etc.) within it.

\section{Further Information}

There are several books dealing with the C shell, but you should first
read the man page for {\bf csh}.  You will find all kinds of features
not mentioned here.

\end{document}







