UNIX Unleashed, System Administrator's Edition

- 12 -

The C Shell

by John Valley and Sean Drew

As a UNIX user, you have a wide variety of shells available to you: the Bourne shell, Bourne Again shell, POSIX shell, C shell, TC shell, Z shell, and Korn shell. Although this is not an all-encompassing list of available shells, it does cover the more commonly used ones. Most UNIX systems come pre-installed with some subset of the shells mentioned. If you wish to use a shell that was not pre-installed on your system, you will more than likely find the one you want on the Internet. The C shell--the subject of this chapter--is one of the more popular and widely available shells in UNIX. It was developed after the Bourne shell but before the Korn shell. The C shell incorporates many features of the Bourne shell and adds many new ones that make your UNIX sessions more efficient and convenient.

Each shell has certain advantages and disadvantages. You might want to review Chapter 13, "Shell Comparison," to help you decide which one to use.

The C shell, written by Bill Joy (also the author of the vi text editor), was not patterned after the Bourne shell. Bill chose the C programming language as a syntax model. The C shell commands--especially if, while, and the other structured programming statements--are somewhat similar in syntax to the equivalent statements in C. A shell is quite a different animal from a compiler, though, so the C programming language served only as a model; many forms and structures in the C shell have nothing to do with the C programming language.

Because the C shell is not just an extension of the Bourne shell syntax, this chapter will cover all aspects of C shell operation. You therefore can read it independently of Chapter 9, "The Bourne Shell," Chapter 10, "The Bourne Again Shell," and Chapter 11, "The Korn Shell."

Invoking the C Shell

Each time you log onto UNIX, you're placed in an interactive shell referred to as your logon shell. If your logon shell is using the default prompts, you can tell if your logon shell is the C shell by its command-line prompt: the percent sign (%). The default C shell prompt differs from the default dollar-sign prompt ($) of the Bourne shell to remind you that you're using the C shell. You can customize your command-line prompt when using the C shell; for more information, see the definition of prompt in "Variables," later in this chapter.

The most foolproof way to determine your logon shell is to query the passwd file. The seventh field contains the path to your logon shell. The command

grep `whoami` /etc/passwd | cut -f7 -d:

will print the path of your logon shell. If your site uses the Network Information Service (NIS) to manage user information, your entry might not appear in the local /etc/passwd file, in which case the command

ypcat passwd | grep `whoami` | cut -f7 -d:

will do the trick.
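
To make the field layout concrete, the sketch below runs the same cut invocation on a made-up passwd entry. The user dylan and every value in the entry are hypothetical; only the position of the seventh, colon-separated field matters. The line uses only syntax that the C shell shares with Bourne-style shells.

```shell
# Hypothetical passwd entry; real entries vary from system to system.
# Fields: name:password:UID:GID:comment:home directory:logon shell
echo 'dylan:x:1001:100:Dylan:/users/dylan:/bin/csh' | cut -f7 -d:
```

For this sample entry, the command prints /bin/csh.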

If your logon shell is not the C shell, and the C shell is available on your system, you can invoke it as an interactive shell from the command line. Even when you're already running the C shell, there will be times when you want to launch the C shell again--for example, to run a shell script or to temporarily change the shell's options. To invoke the C shell interactively, use this command:

$ csh
%


NOTE: The csh command usually is located in the /bin or /usr/bin directory. Because both directories are usually in your search path, you shouldn't have any trouble finding the csh command if your system has it. If you don't find it right away, you might look in the directory /usr/ucb (the standard home for BSD components in a UNIX System V system) or in /usr/local/bin. /usr/local/bin is a home for programs your site has acquired that were not provided with the original system software. Remember that, for many years, the C shell was available only to those sites using the BSD variant of UNIX; unlike the Bourne shell, there is no guarantee that you will have the csh command on your system.

The csh command also supports a number of options and arguments (described later in this chapter in "Shell Options"), but most options are not relevant to running an interactive shell.

Whenever csh is invoked, whether as the logon shell or as a subshell, it loads and executes a profile script named .cshrc. If it is a logon shell, the C shell also executes a profile script on startup named .login and another on exit named .logout. Note that the .login script is executed after .cshrc--not before. For additional information about C shell profile scripts, see "Customizing Your Shell Environment," later in this chapter.

Most versions of the C shell import environment variables such as PATH into local array variables at startup. The C shell does not refer to the public environment variables (including PATH) for its own operation. This means that you'll usually want to maintain the path variable for directory searches--not PATH. Some versions of the C shell do not properly import environment variables, which can give you confusing results. If it appears that you have no search path set, but the PATH variable is set and accurate (as shown by echo $PATH), check that the variable path has a matching value. If not, you'll need to import critical environment variables into local variables yourself.


NOTE: If you are familiar with the Bourne shell, you won't notice much difference when working with the C shell unless you use advanced shell features such as variables, command substitution, and so on.
Important differences do exist, however. Among these are the set of punctuation characters that have a special meaning to the shell (often called metacharacters). The C shell is sensitive to all the special characters of the Bourne shell, as well as the tilde (~), the commercial at sign (@), and the exclamation point (!). Don't forget to quote or escape these characters when writing commands unless you intend to use their special shell meaning. (See "Quoting and Escaping from Special Characters," later in this chapter, for a discussion of the details.)

Shell Basics

When you enter commands at the shell prompt, you are providing input to the shell. The shell sees a line of input as a string of characters terminated with a newline character; the newline is usually the result of pressing Enter on your keyboard. Input to the C shell can be anything from a single, simple command to multiple commands joined with command operators. Each command line you enter is actually a shell statement. In addition to providing input to the shell manually by entering shell statements on the command line, you can provide input to the shell by putting shell statements into a file and executing the file. Files of shell statements commonly are known as shell scripts.

This section covers the basics of interacting with the shell by entering shell statements on the command line. (Of course, anything you can enter on the command line also can be put into a file for later, "canned" execution.) The subsection "Shell Statements: A Closer Look" provides a more detailed, technical look at the components of shell statements. If you plan to write shell scripts, you'll definitely want to read that subsection.

When you finish this section, you will feel like you know a good deal about the C shell, but this is really just the beginning. In addition to the C shell's basic service of providing a means to instruct the computer, the C shell also provides a number of tools you can use to expedite your work flow. These tools or features of the shell are described in subsequent sections of this chapter.

Executing Commands: The Basics

The C shell accepts several types of commands as input: UNIX commands, built-in shell commands, user-written commands, and command aliases. This section describes the types of commands you can execute and the ways you can execute them.

Command Names as Shell Input

A command is executed by entering the command's name on the command line. The C shell supports any of the following as command names:

  • Built-in C shell command. The shell provides a number of commands implemented within the shell program. When you invoke a built-in command, it therefore executes very quickly because no program files need to be loaded. A built-in command is always invoked by a simple name--never by a pathname (for example, never by /usr/bin/command).

    Because the shell first checks a command name for built-in commands before searching for a file of the same name, you cannot redefine a built-in command with a shell script. You can use aliases, however, to redirect a built-in command to a shell script. The next subsection, "Built-In Shell Commands," briefly describes each built-in command. Detailed descriptions of built-in commands with examples are presented in the task-oriented sections of this chapter.

  • Filename. You can specify the filename (% filename), a relative pathname (% ../filename), or an absolute pathname (% /bin/filename) of a file as a command. The file must be marked as executable and must be a binary load file or a shell script in the C shell language. Additionally, if pathnames are not used, the file must exist in one of the directories listed in your path shell variable. The C shell cannot process shell scripts written for the other shells unless your UNIX variant supports the #! notation for specifying the correct command processor. (See "Shell Programming," later in this chapter, for notes about using shell scripts with the C shell.)

    Most standard UNIX commands are provided as executable files, typically in the /bin or /usr/bin directory. A UNIX command generally is invoked by entering its filename or full pathname.

  • Command alias. A command alias is a name you define by using the alias built-in shell command.

    An alias can have the same name as a built-in shell command or an executable file. You always can invoke an executable file that has the same name as an alias by using the file's full pathname. An alias that has the same name as a built-in command effectively hides the built-in command, however. Aliases are described in detail in "Aliases," later in this chapter.

Built-In Shell Commands

The C shell provides a number of commands implemented within the shell program. Built-in commands execute very quickly, because no external program file needs to be loaded. Table 12.1 lists the built-in C shell commands. The remainder of this chapter groups these commands into subsections dedicated to particular tasks you'll perform in the shell and describes how to use each command.

Table 12.1. Built-in C shell commands.

Command     Description
alias       Defines or lists a command alias
bg          Switches a job to background execution
break       Breaks out of a loop
breaksw     Exits from a switch statement
case        Begins a case in switch
cd          Changes directory
chdir       Changes directory
continue    Begins the next loop iteration immediately
default     Specifies the default case in switch
dirs        Lists the directory stack
echo        Echoes arguments to standard output
eval        Rescans a line for substitutions
exec        Replaces the current process with a new process
exit        Exits from the current shell
fg          Switches a job to foreground execution
foreach     Specifies a looping control statement
glob        Echoes arguments like echo, but without processing backslash escapes
goto        Alters the order of command execution
hashstat    Prints hash table statistics
history     Lists the command history
if          Specifies conditional execution
jobs        Lists active jobs
kill        Signals a process
limit       Respecifies maximum resource limits
login       Invokes the system logon procedure
logout      Exits from a logon shell
newgrp      Changes your Group ID
nice        Controls background process dispatch priority
nohup       Prevents termination on logout
notify      Requests notification of background job status changes
onintr      Processes an interrupt within a shell script
popd        Returns to a previous directory
pushd       Changes directory with pushdown stack
rehash      Rehashes the directory search path
repeat      Executes a command repeatedly
set         Displays or changes a shell variable
setenv      Sets environment variable
shift       Shifts parameters
source      Interprets a script in the current shell
stop        Stops a background job
suspend     Stops the current shell
switch      Specifies conditional execution
time        Times a command
umask       Displays or sets the process file-creation mask
unalias     Deletes a command alias
unhash      Disables use of the hash table
unlimit     Cancels a previous limit command
unset       Deletes shell variables
unsetenv    Deletes environment variables
wait        Waits for background jobs to finish
while       Specifies a looping control
%job        Specifies foreground execution
@           Specifies expression evaluation

Executing Simple Commands

The most common form of input to the shell is the simple command, where a command name is followed by any number of arguments. In the following command line, for example, ftp is the command and hostname is the argument:

% ftp hostname

It is the responsibility of the command, not the shell, to interpret the arguments. Many commands, but certainly not all, take this form:

% command -options filenames

Although the shell does not interpret the arguments of the command, the shell does interpret some of the input line before passing the arguments to the command. Special characters entered on a command line tell the shell to redirect input and output, start a different command, search the directories for filename patterns, substitute variable data, and substitute the output of other commands.

Entering Multiple Commands on One Line

Ordinarily, the shell interprets the first word of command input as the command name and the rest of the input as arguments to that command. The semicolon (;) directs the shell to interpret the word following the symbol as a new command, with the rest of the input as arguments to the new command. For example, the command line

% echo "<h1>" ; getTitle; echo "</h1>"

is the equivalent of

% echo "<h1>"
% getTitle
% echo "</h1>"

except that, in the second case, the results of each command appear between the command input lines.

When the semicolon is used to separate commands on a line, the commands are executed in sequence. The shell waits until one command is complete before executing the next command. You also can execute commands simultaneously (see "Executing Commands in the Background," later in this chapter) or execute them conditionally, which means that the shell executes the next command if the command's return status matches the condition (see "Executing Commands Conditionally," later in this chapter).

Entering Commands Too Long for One Line

Command lines can get quite lengthy. Editing and printing scripts is easier if command lines are less than 80 characters, the standard terminal width. Entering commands that span multiple lines is accomplished by escaping the newline character, as in the following command, which translates some common URL-encoded sequences back to a readable format:

% sed -e "s/%3A/:/" -e "s@%2F@/@g" -e "s@%3C@<@g" \
-e 's/%5C/\\/g' -e "s/%23/#/g" -e "s/%28/(/g" \
-e "s/%29/)/g" -e "s/%27/'/g" -e 's/%22/\"/g' infile > outfile

The shell sees a line of input as a statement terminated with a newline character; however, the newline character also is considered to be a white-space character. If you end a line with a backslash (\), the next character--the newline character--is treated literally, which means that the shell does not interpret the newline character as the end of the line of input.

Executing Commands in the Background

Normally, when you execute commands, they are executed in the foreground. This means that the C shell will not process any other commands, and you cannot do anything else until the command finishes executing. If waiting for long commands to complete is not in your top 10 list of things to do, you can have your current shell handle more commands without waiting for a command to finish. You can execute the command in the background by putting an ampersand (&) at the end of the command:

% find . -name "*.c" -print &
[2] 13802
%

You also can run multiple commands in the background simultaneously:

% xterm & xclock & xload &

A command executing in the background is referred to as a job, and each job is assigned a job number--the bracketed number in the previous example. The C shell provides you with several commands for managing background jobs; see "Job Control," later in this chapter, for more information.
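
The following is a minimal sketch of background execution, assuming it is run as a script rather than typed at the prompt (the # comments are recognized only in non-interactive sessions). The file name bg_demo.txt is a hypothetical scratch file, and the lines use only syntax that the C shell shares with Bourne-style shells.

```shell
# Start a slow command list in the background; the shell does not wait.
(sleep 1; echo slow > bg_demo.txt) &

# This line runs immediately, before the background job has finished.
echo "shell continues at once"

# wait pauses until every background job has completed.
wait
cat bg_demo.txt
```

The built-in wait command is what makes the final cat safe: without it, the script could try to read bg_demo.txt before the background job had written it.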

Repeatedly Executing a Command: repeat

You can use the repeat command to execute some other command a specified number of times. Although the repeat command isn't used frequently, it can be quite handy on occasion. If you are writing a shell script to print a document, for example, you might use the command

repeat 5 echo ################################

to mark its first page clearly as the start of the document.

The syntax of the repeat command follows:

repeat count command

For count, specify a decimal integer number. A count of zero is valid and suppresses execution of the command.

For command, specify a simple command that is subject to the same restrictions as the first format of the if statement. The command is scanned for variable, command, and history substitutions; filename patterns; and quoting. It cannot be a compound command (foo;bar), a pipeline (foo|bar), a statement group (using {}), or a parenthesized command list ( (foo;bar|bas) ).

Any I/O redirections are performed only once, regardless of the value of count. For example,

repeat 10 echo Hello >hello.list

results in 10 lines of Hello in a file named hello.list.

Executing Commands in a Subshell: ()

A command (or a list of commands separated by semicolons) enclosed in parentheses groups the command or commands for execution in a subshell. A subshell is a secondary invocation of the shell, so any change to shell variables, the current directory, or other such process information lasts only while executing the commands in the subshell. This is a handy way, for example, to switch to another directory, execute a command or two, and then switch back without having to restore your current directory:

% (cd /usr/local/etc/httpd/htdocs; cp *.html /users/dylan/docs)

Without the parentheses, you would have to write this:

% cd /usr/local/etc/httpd/htdocs
% cp *.html /users/dylan/docs
% cd /previous/directory

The syntax for grouping commands follows:

( commands )

Enclosing a list of commands in parentheses is a way to override the default precedence rules for the &&, ||, and | operators, at the expense of invoking a subshell and losing any environmental effects of the commands' execution. For example, (grep || echo) | pr pipes the output of the grep command, and possibly that of echo if grep sets a nonzero exit code, to the pr command.

I/O redirections can be appended to the subshell just as for a simple command; the redirections are in effect for all the commands within the subshell. For example,

(cat; echo; date) > out

writes the output of the cat, echo, and date commands to a file named out without any breaks. If you look at the file afterward, first you'll see the lines written by cat, followed by the lines written by echo, and finally the lines written by date. Similarly, input redirections apply to all commands in the subshell, so that each command in turn reads lines from the redirected file, starting with the line following those read by any previously executed commands in the subshell.
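
The two subshell behaviors described above can be sketched as follows, assuming the lines are run as a script (the # comments are recognized only in non-interactive sessions). The directory sub_demo and the .txt file names are hypothetical scratch names, and the syntax is limited to what the C shell shares with Bourne-style shells.

```shell
# sub_demo is a hypothetical scratch directory.
mkdir -p sub_demo

pwd > before.txt
(cd sub_demo; pwd > ../inside.txt)   # the cd affects only the subshell
pwd > after.txt                      # still the original directory

# One redirection covers every command in the group.
(echo first; echo second; echo third) > sub_demo/out
cat sub_demo/out
```

Afterward, before.txt and after.txt hold the same pathname, inside.txt holds the pathname of sub_demo, and sub_demo/out contains the three echoed lines in order.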

Executing Commands Conditionally

Compound commands are actually two or more commands combined so that the shell executes all of them before prompting for (or, in the case of shell scripts, reading) more input.

Compound commands are not often needed for interactive work. Compound commands form a very useful extension to the C shell's syntax, however, especially in shell scripts. Some compound command formats, such as & (background job) and | (the pipe operator) are essential to work effectively with UNIX.

Conditional Execution on Success: && (And)

You use the double ampersand operator (read and) to join two commands: command1 && command2. It causes the shell to execute command2 only if command1 is successful (that is, command1 has an exit code of zero).

For command1 or command2, you can write a simple command or a compound command. The && operator has higher precedence than || but lower precedence than |. For example,

grep mailto *.html | pr && echo OK

echoes OK only if the pipeline grep | pr sets a zero exit code. (For pipelines, the exit code is the exit code of the last command in the pipeline.)

The compound command

tar cvf docs.tar docs && rm -rf docs

shows one possible benefit of using &&: The rm command deletes the docs directory only if it first is backed up successfully in a tar file.

Conditional Execution on Failure: || (Or)

You use the double bar operator (read or) to join two commands: command1 || command2. It causes the shell to execute command2 only if command1 failed (that is, returned a nonzero exit code).

For command1 or command2, you can write a simple command or a compound command. The || operator has lower precedence than both the && and | operators. For example, in the following command

grep mailto *.html || echo No mailto found | pr

either grep succeeds and its output is written to standard output, or the words No mailto found are piped to the pr command.

Use the || operator to provide an alternative action. In the following case, if the mkdir command fails, the exit command prevents further execution of the shell script:

mkdir $tmpfile || exit
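
As a rough illustration of both operators and of their precedence relative to the pipe, consider the sketch below. It assumes execution as a script (the # comments are recognized only in non-interactive sessions); cond_demo is a hypothetical scratch directory, and the syntax is limited to what the C shell shares with Bourne-style shells.

```shell
# cond_demo is a hypothetical scratch directory.
mkdir -p cond_demo && echo created > cond_demo/status
test -f cond_demo/missing || echo absent > cond_demo/fallback

# The pipe binds tighter than ||, so only the second echo is piped
# to tr. The first echo succeeds, so this prints lowercase "first".
echo first || echo second | tr a-z A-Z

# Parentheses regroup the list so that its output is piped as a unit.
# This prints "FIRST".
(echo first || echo second) | tr a-z A-Z
```

The last two commands differ only in grouping, which is why the parenthesized form is the usual way to pipe the output of a whole || list.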

Shell Statements: A Closer Look

A command is a basic command or a basic command embellished with one or more I/O redirections.

A basic command is a series of words, each subject to replacements by the C shell, which, when fully resolved, specifies an action to be executed and provides zero or more options and arguments to modify or control the action taken. The first word of a basic command, sometimes called the command name, must specify the required action.

In plainer terms, a statement is the smallest executable unit. When the shell is operating in interactive mode, it displays its prompt when it requires a statement. You must continue to enter shell statement components, using multiple lines if necessary, until you complete a full statement. If the statement is not completed on one line, the shell continues to prompt you, without executing the line or lines you have entered, until it receives a full statement.

Shell statements are formed from a number of tokens. A token is a basic syntactic element and can be any of the following:

  • Comments. A comment begins with any word having a pound sign (#) as its first character and extends to the end of the line. This interpretation can be avoided by enclosing the pound sign (or the entire word) in quotes. (See "Quoting and Escaping Special Characters," later in this chapter.) The # is considered a comment for non-interactive C shell sessions only.

  • White space. White space consists of blanks and tabs, and sometimes the newline character. White space is used to separate other tokens which, if run together, would lose their separate identity. Units of text separated by white space are generically called words.

  • Statement delimiters. Statement delimiters include the semicolon (;) and the newline character (generated when you press Return). You can use the semicolon to place commands together on the same line. The shell treats the commands as if they had been entered on separate lines.

    Normally, every command or shell statement ends at the end of the line. The Return (or Enter) key you press to end the line generates a character distinct from printable characters, blanks, and tabs, which the shell sees as a newline character. Some statements require more than one line of input, such as the if and while commands. The syntax description for these commands shows how they should be split over lines; the line boundaries must be observed, and you must end each line at the indicated place, or you will get a syntax error.

  • Operators. An operator is a special character, or a combination of special characters, to which the shell attaches special syntactic significance. Operators shown as a combination of special characters must be written without white space between them, or they will be seen as two single operators instead of a two-character operator. For example, the increment operator ++ cannot be written as + +.

    Punctuation characters that have special significance to the shell must be enclosed in quotes to avoid their special interpretation. For example, the command grep '(' *.cc uses quotes to hide the left parenthesis from the shell so that the left parenthesis can be passed to grep as an argument. See "Quoting and Escaping from Special Characters," later in this chapter, for details about using quotes.

  • Words. A word is any consecutive sequence of characters occurring between white space, statement delimiters, or operators. A word can be a single group of ordinary characters, a quoted string, a variable reference, a command substitution, a history substitution, or a filename pattern; it also can be any combination of these elements. The final form of the word is the result of all substitutions and replacements, together with all ordinary characters, run together to form a single string. The string then is used as the command name or command argument during command execution.

Filename Substitutions (Globbing)

Filename generation using patterns is an important facility of the Bourne shell. The C shell supports the filename patterns of the Bourne shell and adds the use of {} (braces) to allow greater flexibility. Globbing also is known as wildcarding.

Several shell commands and contexts allow the use of pattern-matching strings, such as the case statement of switch and the =~ and !~ expression operators. In these cases, pattern strings are formed using the same rules as for filename generation, except that the patterns are matched to another string.

When any of the pattern expressions described in Table 12.2 are used as arguments of a command, the entire pattern string is replaced with the filenames or pathnames that match the pattern. By default, the shell searches the current directory for matching filenames, but if the pattern string contains slashes (/), it searches the specified directory or directories instead. Note that several directories can be searched for matching files in a single pattern string: a pattern of the form dir/*/*.cc searches all the directories contained in dir for files ending with .cc.

Table 12.2. Pattern expressions.

Expression Definition
* The asterisk, also known as a star or splat, matches any string of characters, including a null string (the asterisk matches zero or more characters). When the asterisk is used by itself, it matches all filenames. When the asterisk is used at the beginning of a pattern string, leading prefixes of the filename pattern are ignored: *.cc matches any filename ending with .cc. When the asterisk is used at the end of a pattern string, trailing suffixes of the filename pattern are ignored: foo* matches foo.cc, foobar.html, and any filename beginning with foo. An asterisk in the middle of a pattern means that matching filenames must begin and end as shown but can contain any character sequences in the middle: pay*.cc matches filenames beginning with pay and ending with .cc, such as payroll.cc, paymast.cc, and paycheck.cc. Multiple asterisks can be used in a pattern: *s* matches any filename containing an s, such as sean.txt or apps.hh.
? The question mark matches any one character. For example, ? as a complete word matches all filenames one character long in the current directory. The pattern pay?.cc matches pay1.cc and pay2.cc but not payroll.cc. Multiple question marks can be used to indicate a specific number of don't-care positions in the filename: pay.?? matches filenames beginning with pay. and ending in any two characters, such as pay.cc and pay.hh, but does not match pay.o.
[] The square brackets enclose a list of characters. Matching filenames contain one of the indicated characters in the corresponding position of the filename. For example, [abc]* matches any filename beginning with the letter a, b, or c. Because of the asterisk, the first character can be followed by any sequence of characters.
Use a hyphen (-) to indicate a range of characters. For example, pay[1-3].cc matches filenames pay1.cc, pay2.cc, and pay3.cc, but not pay4.cc or pay11.cc. Multiple ranges can be used in a single bracketed list. For example, [A-Za-z0-9]* matches any filename beginning with a letter or a digit. To match a hyphen, list the hyphen at the beginning or end of the character list: [-abc] or [abc-] matches an a, b, c, or hyphen.
~ The tilde (~) can be used at the beginning of a word to invoke directory substitution of your home directory. The (~) is substituted with the full pathname of your home directory. Also used in the form ~/path to refer to a file or directory under your home directory. If the tilde does not appear by itself as a word and is not followed by a letter or a slash, or it appears in any position other than the first, it is not replaced with the user's home directory. Thus, /usr/rothse/file.cc~ is a reference to the file file.cc~ in the directory /usr/rothse.
~name Substituted with the full pathname of user name's home directory. For example, ~ken/bin refers to /usr/ken/bin if the home directory for user ken is /usr/ken. The password file /etc/passwd is searched for name to determine the directory pathname; if name is not found, the shell generates an error message and stops.
{} Braces enclose a list of patterns separated by commas. The brace expression matches filenames having any one of the listed patterns in the corresponding position of the name. For example, the pattern
/usr/home/{kookla,fran,ollie}/.cshrc
expands to the path list
/usr/home/kookla/.cshrc /usr/home/fran/.cshrc /usr/home/ollie/.cshrc
Unlike *, ?, and [], brace-enclosed lists are not matched against existing filenames; they simply are expanded into words subject to further substitution regardless of whether the corresponding files exist. Brace-enclosed lists can be nested--for example, /usr/{bin,lib,home/{john,bill}} refers to any of the directories /usr/bin, /usr/lib, /usr/home/john, and /usr/home/bill.

It is important to realize that filename generation using pattern strings can cause a replacement of one word with many. A filename pattern must be a single word. The ordinary characters and pattern-matching characters in the word describe a rule for choosing filenames from the current or specified directory. The word is replaced with each filename or pathname found that matches the pattern. If you had three files in your current directory (ch1.txt, ch2.txt and chlast.txt), then the pattern *.txt would expand to match those three files:

% echo Files: *.txt
Files: ch1.txt ch2.txt chlast.txt

You can use pattern expansion in many ways; for example, the expansion below is used to set a shell variable to be an array of three items. The array is then queried for its length and second element. See "Using Array Variables" later in this chapter, for more information about C shell arrays.


% set files=(*.txt)
% echo Found $#files files
Found 3 files
% echo $files[2]
ch2.txt
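
Several of the pattern expressions from Table 12.2 can be tried safely with the sketch below, which builds its own files first. It assumes execution as a script (the # comments are recognized only in non-interactive sessions); glob_demo is a hypothetical scratch directory, and only patterns that the C shell shares with Bourne-style shells are used.

```shell
# glob_demo is a hypothetical scratch directory populated for this sketch.
mkdir -p glob_demo
(cd glob_demo; touch pay1.cc pay2.cc pay3.cc pay4.cc payroll.cc pay.hh pay.o)

echo glob_demo/pay?.cc        # any one character between pay and .cc
echo glob_demo/pay[1-3].cc    # a range of characters inside brackets
echo glob_demo/pay.??         # exactly two characters after the dot
echo glob_demo/*roll*         # an asterisk on each side of roll
```

With only those seven files present, the four echo commands list pay1.cc through pay4.cc, then pay1.cc through pay3.cc, then pay.hh alone, and finally payroll.cc alone, each prefixed with glob_demo/.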


TIP: Another powerful C shell feature for determining files in a directory is the command/filename viewing feature that uses control D (Ctrl-D). This feature enables you to determine the files available for a command without aborting the command. For example, if you type cp ~sdrew/docs/ind and want to see which files match the specification, pressing Ctrl-D displays a list of matching files in a multicolumn format. Certain files have a character appended to indicate the file type (this behavior is similar to the output of ls -F): executables are marked with an asterisk (*), directories are marked with a slash (/), and links are marked with an at sign (@). After the column-sorted list is displayed, the command is redisplayed so that you can continue typing. The files listed will be those that match the pattern ~sdrew/docs/ind*. Note that ~(Ctrl-D) prints a list of all users who have accounts on the system.
Commands can be completed in a similar manner if the C shell is expecting a command as part of your current input. The current command pattern is looked for in each directory specified in the PATH environment variable. Note that aliases are not expanded by Ctrl-D. If my PATH is set to /bin:~/bin:/usr/bin and I complete the command pri using Ctrl-D, the output is roughly the same as ls /bin/pri* ~/bin/pri* /usr/bin/pri*.
In addition to listing commands and filenames, the C shell can complete partially typed commands and filenames when you press the Escape (Esc) key. The automatic completions are known as command completion and filename completion, depending on whether you are completing a command or a filename. The pattern matching is done as in the Ctrl-D viewing utility. If the partial name is unique, the name is completed; otherwise, the terminal bell is sounded. If ambiguities are encountered (that is, more than one file matches), the name is completed to the ambiguity, and the terminal bell is sounded. Suppose that you have two files in a directory named veryLongName.txt and veryLong_Name.txt, and you want to edit veryLong_Name.txt. You can save yourself a great deal of typing by using filename completion. You can type vi ve(Esc), which completes to vi veryLong and rings the bell. Then, if you type _(Esc), the name completes to vi veryLong_Name.txt, at which point you can press Enter and begin your vi session.

Redirecting Input and Output

The C shell provides several commands for redirecting the input and output of commands. You might be familiar with the input (<) or output (>) redirection characters from earlier chapters. The C shell provides you with these and more.

An I/O redirection is an instruction to the shell that you append to a command. It causes one of the standard file descriptors to be assigned to a specific file. You might have encountered standard files in the discussion of the Bourne shell in Chapter 9. The UNIX operating system defines three standard file descriptors: standard input (stdin), standard output (stdout), and standard error (stderr).


NOTE: The UNIX operating system actually provides at least 25 file descriptors for use by a command. It is only by convention that the first three are set aside for reading input, writing output, and printing error messages. Unless you instruct otherwise, the shell always opens these three file descriptors before executing a command and assigns them all to your terminal.

A file descriptor is not the file itself; it is a channel, much like the audio jack on the back of your stereo--you can connect it to any audio source you want. Similarly, a file descriptor such as standard input must be connected to a file--your terminal by default, or the disk file or readable device of your choice.

You can change the location where a command reads data, writes output, and prints error messages by using one or more of the I/O redirection operators. Table 12.3 lists the operators.

Table 12.3. I/O redirection operators.

Format Effect

Input Redirection

< filename Uses the contents of filename as input to a command.
<< word Provides shell input lines as command input. Lines of the shell input that follow the line containing this redirection operator are read and saved by the shell in a temporary file. Reading stops when the shell finds a line beginning with word. The saved lines then become the input to the command. The lines read and saved are effectively deleted from the shell input and are not executed as commands; they are "eaten" by the << operator. Shell execution continues with the line following the line beginning with word. If you use the << operator on a command you type at the terminal, be careful: Lines you type afterward are gobbled up by the shell--not executed--until you enter a line beginning with whatever you specified as word. The << operator most often is used in shell scripts. This technique is known as providing a here document.

Output Redirection

> filename

Writes command output to filename.

>! filename

Writes command output to filename and ignores the noclobber option. The noclobber option is fully explained in "Using Predefined Variables," later in this chapter. Briefly, noclobber causes the shell to disallow the > filename redirection when filename already exists; noclobber is therefore a safety you can use to prevent accidentally destroying an existing file. Sometimes, you will want to redirect output to a file even though it already exists. In such a case, you must use the >! operator to tell the shell you really want to proceed with the redirection. If you don't set the noclobber option, you don't need to use the >! operator.

>& filename

Writes both the command output and error messages to filename.

>&! filename

Writes both the command output and error messages to filename and ignores the noclobber option.

>> filename

Writes command output at the end of filename (Append mode).

>>! filename

Writes command output at the end of filename (Append mode) and ignores the noclobber option.

>>& filename

Writes command output and error messages at the end of filename (Append mode).

>>&! filename

Writes command output and error messages at the end of filename (Append mode) and ignores the noclobber option.

In Table 12.3, filename represents any ordinary filename or pathname, or any filename or pathname resulting after variable substitution, command substitution, or filename generation.

I/O redirection operators are appended to a command; for example, date >curdate writes the current date to the file curdate instead of to your terminal. You also can use more than one redirection per command: Simply list them one after another at the end of the command. The order doesn't matter: for example, both cat <infile >outfile and cat >outfile <infile have the same effect.
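
As a minimal sketch of combining redirections, the following commands copy a file using both orderings and then append to one of the copies. The filenames infile, out1, and out2 are placeholders chosen for illustration, and the sketch assumes that the noclobber option is not set.

```shell
# Create a one-line input file.
echo "first line" > infile
# Both orderings connect infile to standard input and write a copy.
cat < infile > out1
cat > out2 < infile
# The append operator adds to out1 only, so out2 keeps a single line.
echo "second line" >> out1
cat out1
```

After these commands run, out2 is identical to infile, and out1 holds both lines.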

Input Redirection

Some commands make no special use of the standard input file, such as the date and the ls system commands; others require an input file to function properly, such as the cat and awk commands. You can use the < redirection operator in the form command < filename to designate a file as the source of input for commands such as cat and awk; if you do not, these commands read data from your keyboard--sometimes useful, but usually not. If you provide an input redirection, but the command does not read data (such as ls), the I/O redirection still is performed by the shell; it is just ignored by the command. Note that it is an error to redirect standard input to a file that doesn't exist.

The redirection << word is a special form of the input-redirection operator. Instead of taking input from a file, input to the command comes from the current shell input stream--your keyboard, if you append << to a command you type in, or your shell script if you use << on a command in a shell script.

For word, you choose an arbitrary string to delimit the lines of input. Then write the lines to be provided to the command as input immediately following the command line, and follow the last line of desired input with a line beginning with word. The shell reads the lines up to word, stores the lines in a temporary file, and sets up the temporary file as standard input for the command.

The << word form of input redirection is called a here document, because it is located here, in line with your shell commands. Here documents are useful when you want to provide predefined data to a command, and they save you from having to create a file to hold the data.

Unlike the filename part of other I/O redirection operators, word for the here document is not scanned for variable references, command substitutions, or filename patterns; it is used as is. All the following shell input lines are checked for the presence of word as the only word on the line before any substitutions or replacements are performed on the line.

Normally, lines of the here document are checked for variable references and command replacements; this enables you to encode variable information in the here document. If you quote any part of word, however, the lines are read and passed to the command without modification. The redirection << STOP reads lines up to STOP and performs substitutions on the lines it reads, for example. The redirection << "STOP" reads lines up to the line beginning with STOP and passes the lines directly to the command, as they are, without substitutions or replacements of any kind.

The line beginning with word is discarded; it is not passed to the command in the here document or executed by the shell.

The following example shows the use of a here document to create an HTML form:

cat <<HERE
 <FORM method=post action=http://host.com/cgi-bin/addTime.sh>
 <select NAME=username>
 `./doUserQuery;./parseList.sh users_$$.txt "$userName"`
 </select>
 <input type=submit value=Submit>
 </form>
HERE

The line containing the word HERE will not appear in the output; it is simply a mark to let the shell know where the redirected lines end.
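
The effect of quoting word can be seen in a small sketch such as the following, which writes the same line through two here documents. The output filenames expanded.txt and literal.txt are placeholders, and the sketch assumes that the HOME environment variable is set, as it normally is in a logon session.

```shell
# Unquoted word: the variable reference is replaced before cat reads the line.
cat << STOP > expanded.txt
Home directory: $HOME
STOP
# Quoted word: the line reaches cat exactly as written.
cat << "STOP" > literal.txt
Home directory: $HOME
STOP
cat expanded.txt literal.txt
```

The first file contains the pathname of your home directory; the second contains the characters $HOME unchanged.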

Output Redirection

Output redirections have the general form > filename and >> filename. The first operator creates a new file of the specified filename. The file is opened before command execution begins, so even if the command fails or cannot be found, or if the shell finds an error on the command line and stops, the output file still is created.


NOTE: For purposes of understanding shell syntax, you should note that appending an I/O redirection to a simple command yields a simple command. Except where specifically prohibited, a command with redirections appended can be used wherever a simple command is allowed, such as on the single-line if statement.

If you've set the noclobber option (with set noclobber), the shell refuses to create the named output file if it already exists; doing so would destroy the file's current contents. If you want to perform the output redirection even if the file filename already exists, use the redirection operator >! instead; it overrides the noclobber option.

The >> command arranges for command output to be added to the end of the named file. For this redirection operator, the noclobber option requires that the named file already exist. If you use the alternative form >>!, or if you use >> and the noclobber option is not set, the shell creates the named file if necessary.

The >& and >>& operators redirect both the standard output and standard error files to filename. The Bourne shell enables you to redirect the standard output and standard error files separately; the C shell does not.


TIP: Although the C shell offers no direct means for redirecting standard error and standard output to separate files, you can achieve the net result at the expense of a subshell. In the subshell, redirect standard output via the > operator to the desired location for non-error messages and then redirect standard error and standard output from the subshell via the >& operator to the desired location for error messages. Because the standard output was redirected in the subshell, the standard output and standard error redirection from the subshell will contain only the standard error.
Suppose that you want to run a script, buildSystem, that builds a large software class library and generates nearly 1MB of output, of which a few messages might be errors. The following command places standard output in a file named build.log and error messages from standard error in buildErr.log:

			
% (buildSystem -version 5.1 > build.log) >& buildErr.log

			

Quoting or Escaping from Special Characters

As you saw in previous sections, certain characters have special meanings for the shell. When the shell encounters a special character, the shell performs the action defined by the special character. The following punctuation characters available on the standard keyboard are special to the shell and disrupt the scanning of ordinary words:

~ ` ! @ # $ % ^ & * ( ) \ | { } [ ] ; ' " < > ?

In some contexts, particularly within the switch statement, the colon (:) is also a special character. The colon is recognized as a special character only when expected, in a case or default statement, and as a statement label. It does not need to be quoted except to avoid these specific interpretations.

To use one of these characters as a part of a word without its special significance, you can escape the character by placing a backslash (\) immediately in front of the character. Note that a backslash intended as an ordinary character must be written as two backslashes in succession: \\. To escape a two-character operator such as >>, you must insert a backslash in front of each character: \>\>. The $ character does not need to be escaped if it is followed by white space:

% echo escaped $ sign
escaped $ sign

Alternatively, you can enclose the special character or any portion of a word containing the special character in quotes. The C shell recognizes three kinds of quotes: the apostrophe ('), the quote ("), and the backquote (`). The C shell does not consider the enclosing quotes as part of the input passed to commands. The output for

% echo "Enter name>"
Enter name>

does not contain quotes. Use two apostrophes (also called single quotes, foreticks, or just simply ticks) to enclose a character sequence and avoid all interpretation by the shell. I often call a string enclosed in apostrophes a hard-quoted string, because the shell performs absolutely no substitution, replacement, or special interpretation of characters that appear between the apostrophes (except for history substitutions). Even the backslash character is treated as an ordinary character, so there are no escapes (except for \ newline and \!) within an apostrophe-enclosed string. As a result, you cannot embed an apostrophe in such a string. That is, the string 'who's there' causes a shell error; the C shell interprets the string as who concatenated with an s, followed by a white-space delimiter, followed by there, and then the starting apostrophe of another string. When the shell does not find the matching apostrophe, an error is generated: Unmatched '..

One of the uses of quoted strings is to specify a single word containing blanks, tabs, and newline characters. The following code, for example, shows the use of a single echo command to print two lines of output:

% echo -n 'Hello.\
Please enter your name: '
Hello.
Please enter your name:

The double apostrophe or quote (") also provides a special bracket for character strings. The quote hides most special characters from the shell's observation. Quoted strings are subject to two kinds of scan and replacement: variable references and command substitutions.

Any of the reference forms for shell variables ($1, $name, ${name}, $name[index], $*, and others) are recognized inside quoted strings and are replaced with the corresponding string value. The replacement occurs inside the quoted string, leaving its unity as a single word intact (even if the substituted value includes blanks, tabs, or newline characters).

Command substitution occurs for strings enclosed in backquotes (`). The entire string enclosed between matching backquotes (also known as backticks) is extracted and executed by the current shell as if it were an independent command. The command can be two or more commands separated by semicolons, a pipeline, or any form of compound statement. Any data written to standard output by the command is captured by the shell and becomes the string value of the backquoted command. The string value is parsed into words, and the series of words replaces the entire backquoted string. Using backquotes to perform command substitution can be thought of as an I/O redirection to the command line.
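
A minimal sketch of command substitution follows. The directory subst_demo and the files in it are hypothetical names created only for the illustration.

```shell
# Build a small directory to work with.
mkdir -p subst_demo
echo one > subst_demo/a.txt
echo two > subst_demo/b.txt
# The words printed by ls replace the backquoted string,
# so cat receives two filename arguments.
cat `ls subst_demo/*.txt`
# A pipeline is allowed inside the backquotes.
echo Lines in all files: `cat subst_demo/*.txt | wc -l`
```

In the last command, the count printed by wc becomes one more argument to echo.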


TIP: Although command substitution is a powerful feature of the C shell, it does have limitations. Some commands generate more output than a command line can hold, for example. (Command-line length is determined by the LINE_MAX and ARG_MAX system parameters; consult your limits man page or look over /usr/include/limits.h.) Additionally, at times, you will need to process each item of output individually, in which case command substitution is not of much use.
Suppose that you want to find all your C++ source files (*.{hh,cc}), starting from your home directory down your entire directory tree, and search the files found for the use of a certain class (RWString). The command

% grep RWString `find $home -name "*.[ch][ch]" -print -follow`

generates the message /bin/grep: Arg list too long on my system. The UNIX command xargs was tailor-made to solve this problem. The general use of xargs follows:


			
xargs [options] [command]



			

xargs reads from the standard input and places that input on the command-line of command. As many arguments as possible are passed to command on its command line. As a result, the command executed by xargs may be called multiple times in order to use up all the input read from standard input. Transforming the preceding command to use xargs results in this command:

% find $home -name "*.[ch][ch]" -print -follow | xargs grep RWString

This command produces the desired results. (Note that xargs is more efficient than the -exec option of find, because command is executed as few times as possible with as many arguments as possible. xargs -i is equivalent to the -exec option.) xargs also can be set to process each line of input individually by using the -i option, which is useful for commands that take only one argument, such as basename. When using the -i option, a replacement string--{} is the default replacement string--must be added to the command supplied for xargs to execute. The following command finds all directories that contain C++ source files:

% find $home -name "*.[ch][ch]" -print -follow | xargs -i dirname {} | sort -u



			

The xargs command has a few other nifty options and is worth a perusal of your friendly local man page.
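
To try both xargs forms on a small scale, you might use a sketch such as the following. The directory xargs_demo and its files are hypothetical, created only for the illustration, and -I {} is used in place of -i because it is the spelling specified by POSIX.

```shell
# Build a tiny source tree: one file uses RWString, the other does not.
mkdir -p xargs_demo/src xargs_demo/include
echo 'RWString name;' > xargs_demo/src/main.cc
echo 'int unused;' > xargs_demo/include/util.hh
# Batch form: grep receives as many filenames as fit on one command line.
find xargs_demo -name "*.[ch][ch]" -print | xargs grep -l RWString
# Per-item form: dirname runs once for each filename read.
find xargs_demo -name "*.[ch][ch]" -print | xargs -I {} dirname {} | sort -u
```

The first pipeline names only the file that mentions RWString; the second lists each directory that holds a matching source file.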


All forms of shell substitution occur inside backquoted command strings, including variable replacement, nested command executions, history substitutions, and filename patterns.

A backquoted command string (or any number of them) can appear inside a quoted string and will have its normal effect; this is the second form of substitution performed on "-quoted strings. A quoted command substitution (echo "xxx`commands`xxx") generates new words only at the end of each line, except at the end of the last line. If the executed command prints only one line of text, the text replaces the backquoted expression without introducing any word breaks.

Both quoting forms '...' and "..." suppress filename generation. For example, note the difference in the following echo commands:

% echo *.cc
main.cc io.cc parse.cc math.cc
% echo "*.cc"
*.cc

Apostrophes can appear inside a double-quoted string. The apostrophe has no special significance when appearing inside a double-quoted string and does not need to be backslashed. The following example shows quotes inside quoted strings:

% echo '<input type=submit value="Return to Tracking Screen">'
<input type=submit value="Return to Tracking Screen">
% echo "Your shell: '$SHELL'"
Your shell: '/bin/csh'

A backslash that appears inside an apostrophe-quoted string is retained and appears in the string's value, because no substitutions occur inside an apostrophe-quoted string, as in the example below.

% echo 'Single \' quote
Single \ quote

Inside a double-quoted string or a command substitution using `, or in a normal unquoted word, a backslash has the effect of suppressing shell interpretation of the character that follows it. The backslash then is removed from the string. The following examples show the effect of a backslash removing shell interpretation of quoting characters:

% echo Double \" quote
Double " quote
% echo Single \' quote
Single ' quote


TIP: For some particularly complicated shell commands, it is necessary to get many instances of quotes and apostrophes and still have desired variable and command substitution. Simple awk commands are loaded with special shell characters, for example, and must be hard quoted:

			
ls -l | awk '{printf("\t%s\t\t%s\n", $9, $5)}'

If you want to turn this command into an alias, the combination of quotes seems impossible, because there are only two types of quotes and three levels of nesting. The following alias command yields incorrect results (note that the first command sets the alias, and the second command displays the alias):

% alias myls "ls -l | awk '{printf("\t%s\t\t%s\n", $9, $5)}'"
% alias myls
ls -l | awk '{printf(t%stt%sn, , )}'

The solution is to alternate quoting methods as needed to get the desired results. The following command alternates between double quotes and single quotes: the segments "ls -l | awk '" and "'" are enclosed in double quotes, and the awk program between them is enclosed in single quotes. As before, the second command displays the alias:

% alias myls "ls -l | awk '"'{printf("\t%s\t\t%s\n", $9, $5)}'"'"
% alias myls
ls -l | awk '{printf("\t%s\t\t%s\n", $9, $5)}'


			

Working with Directories and the Directory Stack

The C shell provides you with several built-in commands for working with directories. The cd, chdir, pushd, and popd commands all change the current working directory.

The pushd and popd commands provide a pushdown stack mechanism for changing directories, and the dirs command displays the contents of the stack. If you switch to another directory by using pushd instead of cd, the pathname of your previous directory is "saved" in the directory stack. A subsequent popd then returns you to the previous directory. Be aware that the cd command does not maintain the directory stack; you cannot use popd to return to a directory that you left using cd.

Changing Directories: cd and chdir

In the C shell, you can choose from two commands for changing your current working directory: cd and chdir. The chdir command is equivalent to cd in every way. The syntax for these commands follows:

cd [ pathname ]
chdir [ pathname ]

If you omit the pathname argument, the command attempts to change to the directory whose pathname is given by the value of the C shell variable home. See "Using Predefined Variables," later in this chapter, for more information about home.

If you specify a name, the cd or chdir command uses a search hierarchy to attempt to locate the referenced directory. It follows this process:

1. If pathname begins with /, ./, or ../, the command attempts to switch to the named directory; failure terminates the command immediately. In other words, if you use a relative or absolute pathname, the specified directory must exist and must be accessible to you; otherwise, the command fails.

2. The command searches your current directory. A partial pathname of the form name1/name2/namen implies searching your current directory for the entire subtree.

3. If pathname cannot be found in your current directory, the command checks to see whether the shell variable cdpath exists and has a value. If it does, each of the directories named in cdpath is checked to see whether it contains pathname. If successful, the command changes to the pathname in that directory and prints the full pathname of the new current directory.

4. If no variable cdpath exists, or if pathname cannot be found in any of the directories listed in cdpath, the command checks to see whether pathname is a variable name and has a value with / as the first character. If so, the command changes to that directory.

5. If the name still cannot be found, the command fails.

For more information about the cdpath variable, see "Using Predefined Variables," later in this chapter.

The cd and chdir commands as implemented by the C shell provide a great deal of flexibility in generating shortcuts for directory names. There is nothing more painful than having to repeatedly supply long directory names to the cd command. The purpose of the cd command's search hierarchy is to provide some mechanisms you can use for shortening a reference to a directory name. The cdpath variable is your principal tool. If you set it to a list of directories you often reference, you can switch to one of those directories just by giving the base directory name. If cdpath is not sufficiently flexible to suit your needs, you can define a shell variable as an alias for a directory's full pathname, and cd varname switches you to that directory for the price of a few keystrokes.


NOTE: When using a shell variable as a pseudonym for a directory path, you do not need to include $ in front of the variable name. Doing so is permitted and also works because of the shell's variable substitution mechanism but is not required. Only shell variables (use the set command) work as a directory alias--not environment variables (the setenv command).

Listing the Directory Stack: dirs

The directory stack is a mechanism you can use to store and recall directories to which you have changed by using the special change-directory commands pushd and popd, which are discussed in the next two sections. The dirs command lists the directories in the directory stack:

% dirs
/usr/local/bin ~/html/manuals /users/wadams/bin

Three directories are on the directory stack in this example. The first directory listed is the current directory (the one you see if you enter the pwd command). Directories to the right are previous directories, and the farthest to the right are the least recent. In this example, the directory /users/wadams/bin was the first directory to be changed to--that is, "pushed" onto the pushdown directory stack; ~/html/manuals was the next directory, and /usr/local/bin was the directory most recently changed to (the current directory).

Changing to a Directory by Using the Directory Stack: pushd

To save the pathname of a directory on the directory stack, you can use the pushd command to change to another directory. Using pushd saves the pathname of your previous directory on the directory stack so that you can return to it quickly and easily by using the popd command. Use dirs to display the directories currently saved on the pushdown stack.

Three forms of the pushd command exist:

pushd
pushd name
pushd +n

Used in the form pushd, the command exchanges the top two directory-stack elements, making your previous directory the current and your current directory the previous. Successive pushd commands used without an argument switch you back and forth between the top two directories.

Used in the form pushd name, the command changes to directory name in the same way as cd would have; pushd uses the cdpath directory list to resolve name and succeeds or fails in the same cases as cd. The pathname of the current directory is saved in a directory stack prior to the change. The directory stack is an implicit array variable maintained by the shell (which you cannot access directly), and each pushd adds the current directory to the left and pushes all existing entries to the right. The top (or first) element is always your current directory, and subsequent entries are the pathnames of your previous directories in reverse order. The popd command discards the top stack entry and changes to the new top entry, reducing the total number of items on the stack by one.

Use the form pushd +n to perform a circular shift of the directory stack by n positions, changing to the new top directory. A circular shift treats the list of elements as if they were in a ring, with the first preceded by the last and the last followed by the first. The shift changes your position in the ring without deleting any of the elements. Consider the following example:

% dirs
/home/john /home/mary /home/doggie /home/witherspoon
% pushd +2
/home/doggie
% dirs
/home/doggie /home/witherspoon /home/john /home/mary

Note that both before and after the pushd, /home/john precedes /home/mary, and /home/doggie precedes /home/witherspoon. The example also shows that, for the purpose of the pushd +n command form, /home/witherspoon (the last entry) is effectively followed by /home/john (the first entry).

Returning to a Previous Directory by Using the Directory Stack: popd

After you have saved directories on the directory stack with pushd, you can use popd to return to a previous directory. The syntax for the popd command follows:

popd [ +n ]

Used in the form popd +n, the command deletes the nth entry in the stack. Stack entries are numbered from 0, which is your current directory. The following example shows the use of pushd, dirs, and popd together:

% pwd
/usr/home/john
% pushd /usr/spool
% pushd uucppublic
% pushd receive
% dirs
/usr/spool/uucppublic/receive /usr/spool/uucppublic /usr/spool /usr/home/john
% popd
/usr/spool/uucppublic
% dirs
/usr/spool/uucppublic /usr/spool /usr/home/john
% popd +1
/usr/spool/uucppublic /usr/home/john
% popd
/usr/home/john
% dirs
/usr/home/john

Changing the Active Shell

The C shell provides a number of commands for changing the active shell. Although your logon shell may be the C shell, you are not limited to it; you can change your shell to the Bourne shell or the Korn shell at any time by using the exec command. The exit and logout commands also change the active shell by returning you to the shell that was active before your current shell. When you issue these commands from your logon shell, they return you to the logon screen, which is itself a kind of shell (of somewhat limited functionality).

Other commands, such as umask and nohup, change the manner in which UNIX treats the shell.

Invoking a New Process: exec

The exec command transfers control to the specified command, replacing the current shell. The command you specify becomes your new current shell. The syntax of the exec command follows:

exec command

Control cannot be returned to the invoking environment, because it is replaced by the new environment. Shell variables exported with the setenv command are passed to the new shell in the usual manner; all other command contexts, including local variables and aliases, are lost.
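
The replacement can be observed safely with a small sketch like the one below. The parentheses are an addition for the illustration: they run the commands in a subshell, so exec replaces the subshell rather than the shell you are typing in. The filename exec_demo.txt is a placeholder.

```shell
# exec replaces the subshell with the echo program, so the
# third command in the list is never reached.
(echo before exec; exec echo replaced by echo; echo never printed) > exec_demo.txt
cat exec_demo.txt
```

The file receives the first two messages only, because nothing remains of the subshell to run the final echo.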

The exec command is used mainly in C shell scripts. Normally, running a command from a C shell script involves two child processes: a new C shell that interprets the script, and the command being executed. The exec command eliminates one of these processes by replacing the C shell process with the command process. exec is used most often when a script sets up an execution environment for a command (setting environment variables and checking arguments, for example) and then runs the command.

The exec command is equivalent to the Bourne shell exec.

Exiting from the Current Shell: exit

The exit command causes the current shell invocation to be exited. Its syntax follows:

exit [ (exitExpression) ]

If the exit command is issued from within a shell script, the shell script is terminated, and control returns to the invoking shell. If exit is issued from your logon shell, the .logout script in your home directory is executed before the shell exits. Normally, the UNIX operating system redisplays a logon screen after an exit from the logon shell.

If you provide the optional exitExpression argument (which must be enclosed in parentheses), the argument is evaluated as an arithmetic expression, and the resulting value is used as the shell's exit code; otherwise, the current value of the status variable is taken as the shell's exit code. The status variable is described in "Using Predefined Variables," later in this chapter.

Invoking the System Logon Procedure: login

You can use the login command to log out from your current shell and to immediately log on under the same or a different User ID. Its syntax follows:

login name [ arg ... ]

Using this built-in shell command is not quite equivalent to logging out in the normal manner and then logging on. If you use the login command from a remote terminal, the line connection is not dropped, whereas logging out in the normal manner drops the line and requires you to reestablish the connection before you can log on again.

You cannot execute the login built-in command from a subshell; it is legal only for your logon shell.

For name, specify the user name with which you want to log on. Any arguments you specify after name are passed to the /bin/login command and are defined by /bin/login--not by the shell.

Exiting from a Logon Shell: logout

You can use the logout command to log out from your logon shell:

logout

You also can terminate the logon shell (or any subshell) with the exit command. If you have the ignoreeof option set, you cannot use the EOF key (usually, Ctrl-D) to exit from the shell; in such a case, use logout or exit. See "Using Predefined Variables," later in this chapter, for a definition of the ignoreeof option.

Preventing a Command from Terminating Execution After Logout: nohup

You can use the nohup command to run a command that is insensitive to the hang-up signal:

nohup [ command ]

The UNIX operating system always sends a hang-up signal (signal 1) to a process when its process group leader logs out. The net effect is that, generally, any command you are running when you log out is terminated. (Although you can't ordinarily issue the logout or exit command or enter an EOF character while you are running a command, you always can force a logout by turning off your terminal; or, if you are using a remote terminal connection, you can hang up the line.)

When you invoke command with nohup, the shell effectively disables the hang-up signal so that command cannot receive it, which enables command to continue executing after you log out.

The Bourne shell lets you disable the hang-up signal for an interactive shell or from within a shell script by using its trap built-in command; the C shell has no trap command and provides nohup for this purpose. Programs written in the C or C++ languages also can disable or ignore the hang-up signal. Not all commands are able to ignore a hang-up signal, however. If you use nohup to invoke the command, you are assured that the hang-up signal will be ignored regardless of whether the command disables the signal.

Use nohup with no arguments from within a shell script to disable the hang-up signal for the duration of the script. A job placed in the background (see "Executing Jobs in the Background: &," later in this chapter) using the & operator has nohup automatically applied to it.

Displaying and Setting the File-Creation Mask: umask

The file-creation mask (commonly called the umask) is an attribute of the shell process, just like the current directory is an attribute. The file-creation mask specifies the default permissions assigned to new files you create. When redirecting the output of a command to a file with the > operator, for example, it would be extremely inconvenient if the system prompted you for file permissions every time it created a file. Prompting would be especially annoying because, most of the time, you would assign the same permissions to all new files.

If you're not familiar with file permissions, you might want to review Chapter 4, "The UNIX File System." Briefly, file permissions indicate who may read, write, or execute the file.

The file-creation mask is a device you use to indicate what permissions UNIX is to assign to a new file by default. If you want some other access permissions for a file, the usual approach is to first create the file and then change the file's permissions with the chmod command.

The file-creation mask itself is a binary value consisting of nine bits; each bit corresponds to the permission bits for a file. As a matter of convention, the nine bits are represented by three octal digits; each digit represents three bits. Using octal number representation for the file-creation mask is a matter of convention, not necessity, yet the umask command does not enable you to use any other number form for displaying or setting the file-creation mask. You must use octal to set the mask, and you must interpret octal values to understand the mask when displayed.

As for the mask itself, each bit in the mask indicates whether the corresponding bit of the file permission should be turned off (set to zero). By default, virtually all UNIX commands attempt to set reasonable permission bits to 1 when creating a file. A command that creates a data file (such as a text file) tries to create the file with permissions of 666. In octal, this grants Read and Write permissions to you (the file's owner), to other members of your UNIX group, and to all other system users; however, it leaves the Execute permission unset. Commands that create executable files (such as cc and ld) attempt to set the file's permissions to 777, which, in octal, sets the Read, Write, and Execute bits for all users.

Because of this default action by UNIX commands, it is the function of the file-creation mask to specify permissions you don't want set. When you set a bit in the file-creation mask, it causes the corresponding bit of the file's permissions to be forced to 0. Bits not set in the file-creation mask are interpreted as don't care: the file-permission bit remains unchanged.

The bits of the file permissions are written as rwxrwxrwx. The first three bits represent Read, Write, and Execute permissions for the file's owner; the second set of three bits represents Read, Write, and Execute permissions for the file's group; and the third set of three bits specifies the permissions for other users. To grant Read and Write permissions to the file's owner but only Read access to other users, the appropriate file-permission setting is the bits 110100100. Writing this in octal, you arrive at the familiar permissions value of 644, which you already may have seen in the output of the ls command.

Remember that UNIX commands try to create files with all reasonable permissions set. For a data file, these bits are 110110110, corresponding to rw-rw-rw-. To get the permissions switched to rw-r--r--, you need to set off the fifth and eighth bits. A file-creation mask of 000010010 (in octal 022) would do the trick. When the file is created, UNIX lines up the bits in the file permissions requested by the command and your file-creation mask like this:

1 1 0 1 1 0 1 1 0       attempted file permissions
0 0 0 0 1 0 0 1 0       file creation mask
----------------
1 1 0 1 0 0 1 0 0       actual file permissions

What you have to do when using the umask command, therefore, is to first decide what file permissions you want to assign to your new files by default and then write a bit mask as an octal number that sets the appropriate file-permission bits to 0.

As it happens, most UNIX users want to reserve Write permission for their files to themselves, but they are willing to let other people look at the files. The appropriate file-creation mask for this is 022 in octal. In many cases, the system administrator sets up the system so that the umask 022 command is executed for you when you log on. If the administrator has not set up a default, or you want to use another file-creation mask, you can set a new mask in your logon profile.

The actual syntax of the umask command is straightforward. To display the current process file-creation mask, use the umask command like this:

% umask
022

You also can use umask to set the process file-creation mask by specifying the octal argument:

% umask octal

The process file-creation mask is set to the bit pattern corresponding to the low-order three bits of each digit in the octal number octal.
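The masking arithmetic described above can be checked with a few commands. The following is only a sketch: the mask value 027 and the names report.txt and scratch are chosen for illustration and do not come from the text. The commands are the same in the C shell and the Bourne shell; run them as a script rather than typing them, because an interactive C shell does not treat # as the start of a comment.

```shell
# Mask 027 is 000 010 111 in binary: it clears Write for the group
# and clears Read, Write, and Execute for other users.
umask 027

# touch requests 666 (rw-rw-rw-) for a new data file;
# with the mask applied, the file is created as 640 (rw-r-----).
touch report.txt

# mkdir requests 777 (rwxrwxrwx) for a new directory;
# with the mask applied, the directory is created as 750 (rwxr-x---).
mkdir scratch

# Display the resulting permission strings.
ls -ld report.txt scratch
```

The mask affects only files created after the umask command runs; a file that already exists keeps its permissions until you change them with chmod.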

Echoing Arguments to Standard Output

C shell provides two commands for echoing arguments to standard output: echo and glob. The only difference between them is the delimiter used to separate words in the output line.

The echo command, although most often used when writing shell scripts, also comes in handy in a number of keyboard situations--for example, when constructing a pipe to a non-interactive command (echo arg1 | command). One of the best examples of the echo command is using it to display the value of a shell variable:

% echo $path
/usr/bin /bin /usr/local/bin /users/chen/bin

In this case, the variable substitution expression $path does the real work; the echo command provides only the step of printing the value on the terminal. Nonetheless, without the echo command, it would be cumbersome to check the value of a variable. The set command with no arguments prints all shell variables, but the resulting list can be lengthy and takes time to search for the entry you want.

The glob command, on the other hand, rarely is used in any context. Originally, it was intended to be called from a C program (not a shell script) to get the shell to expand a filename wildcard expression. Most C programmers don't use this technique, though, because it relies on the existence of the C shell.

Using the echo Command

The echo command prints a line containing its arguments to standard output. The syntax for the command follows:

echo [ -n ] wordlist

The arguments are printed with one intervening blank between them and a newline character after the last one. The echo command does not modify the words in wordlist in any way, but the arguments as seen by echo might differ from those on the original command because of variable, command, and history replacement and filename globbing. For example, the command

echo Directory $cwd contains these files: *.cc

might generate the following line to standard output:

Directory /usr/lib1 contains these files: bigprog.cc myprog.cc

Specify option -n to suppress printing a newline character; this enables the next input or output to occur on the same line as the output of the echo command.
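A brief sketch of these points follows. It assumes a directory that holds no other .cc files; the filenames and messages are illustrative only. The commands behave the same way in the C shell and the Bourne shell, and are meant to be run as a script so that the # comments are recognized.

```shell
# Create two empty files so that the *.cc pattern has something to match.
touch bigprog.cc myprog.cc

# The shell expands *.cc (sorted by name) before echo runs;
# echo only prints the words it receives, separated by single blanks.
echo Source files: *.cc

# Quoting the pattern suppresses filename expansion, so echo prints it as is.
echo 'Pattern was: *.cc'

# -n suppresses the trailing newline, so the next echo continues the line.
echo -n "Checking... "
echo finished
```

The last two commands together produce the single line Checking... finished, which is the usual way to build up a progress message or to leave the cursor after a prompt.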

Using the glob Command

The glob command also prints a line containing its arguments to standard output. The syntax for the command follows:

glob [ wordlist ]

Use glob to print the words in wordlist to standard output. The words are printed with a null character between each (not white space, as with echo). The last word in wordlist is not followed by a newline character.

The words in wordlist are subject to variable, command, and history substitution and filename expansion in the usual manner. After scanning for substitutions, the resulting strings are redivided into words, which then are written using the null character delimiter.

The glob command is similar to echo and differs only in the delimiter used to separate words in the output line. Because most terminals cannot print a null character, glob generally is not used to generate terminal output. It is intended to be called from a C language program, in the form

system("/bin/csh -c 'glob *.doc'");

to invoke the shell substitution and filename-expansion mechanisms.


TIP: The C shell provides no direct means of logging messages to standard error. The lack of direct support from the C shell to log specifically to standard error can be very problematic when writing scripts--especially scripts intended to be part of a pipeline (e.g., command1 | yourScript | command2) or otherwise have the script's output redirected in some fashion (e.g., yourScript > outputFile). Error messages not placed on the standard error will be redirected, while error messages placed on the standard error will be seen unless the user specifically redirects the standard error. In short, placing a message on standard error ensures that the user will see the message. The following code shows an alias named stderr that places a message on the standard error. It is a bit cumbersome because it requires three extra processes to accomplish the task, but this should not be an issue because logging to standard error is infrequent and occurs only in error situations.
% alias stderr 'echo \!*|sh -c '"'cat 1>&2'"
% stderr Unable to locate file $file in directory $cwd.
Unable to locate file main.cc in directory /users/ziya.
% stderr 'multi line output \
line two'
multi line output
line two
%

The alias saves a process over using a script file but suffers from two drawbacks. It does not accept input from standard input (cat errFile | stderr), and the alias does not permit redirection on the same command line (stderr message > errFile). The following script file, at the expense of an extra process, provides command-line redirection and handles input from standard input or command-line arguments:

#!/bin/csh
# echo to standard error
alias say 'echo "$*"'; if ($#argv == 0) alias say cat
say | sh -c 'cat 1>&2'

Rescanning a Line for Substitutions: eval

eval [ arg ... ]

You can use eval to rescan the arguments arg for variable, command, and history substitutions; filename expansion; and quote removal, and then execute the resulting words as a command. For example, if eval were passed the argument 'ls foo*' and the files foo.txt and foo.doc matched the pattern foo*, eval would expand the foo* expression and then execute the following command:

ls foo.txt foo.doc

With eval, you essentially can write shell script lines with a shell script and execute the resulting generated commands. Remember that to embed variable symbols in a string, however, you must hide the leading dollar sign from earlier shell substitutions.

eval is useful when used with commands that generate commands, such as resize or tset. For example, resize generates a series of setenv commands. Without eval, using resize is a three-step task:

resize > /tmp/out; source /tmp/out; rm /tmp/out

With eval, using resize is considerably simpler:

eval `resize`

The eval command implemented by the C shell is equivalent to the Bourne shell eval command.
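The effect of the rescan is easiest to see with a generated command that contains a shell metacharacter. The following sketch uses a small file as a stand-in for a command generator such as resize; the name gen_cmds and the generated commands are hypothetical. It is written to run as a script, and the same lines work in the Bourne shell.

```shell
# Stand-in for a command generator: a file holding two commands
# separated by a semicolon. gen_cmds is a hypothetical scratch file.
echo 'echo first; echo second' > gen_cmds

# eval rescans the words produced by the backquotes, so the semicolon
# is recognized as a command separator and both echo commands run.
eval `cat gen_cmds`

# Remove the scratch file.
rm gen_cmds
```

With eval, the output should be first and second on separate lines. Without eval, the words produced by the backquotes are not rescanned, so the semicolon is passed to the first echo as ordinary text instead of separating two commands.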

Changing Your Group ID: newgrp

The newgrp command is the same as the UNIX newgrp command:

newgrp groupname

When issued from your logon shell (not to be confused with a logon shell script--the logon shell is simply the shell started up for you when you log on), newgrp causes the current shell to be replaced by a new shell with the real and effective Group IDs both changed to the specified group groupname. Because the shell is replaced, all context, including exported variables and aliases, is lost.

Use the newgrp command when you have been authorized by the system administrator for membership in two or more user groups, and you want to change your Group ID from your current or logon group to another group. Your Group ID is used by the system when determining whether to grant you access to files.

Timing the Execution of a Command: time

You can use time with no argument to display the amount of CPU time in seconds used by the current shell and all commands and subshells invoked since its start. This form of the command is usually of interest only to folks who are being billed for the amount of machine time they use, as might be the case if you are renting time on a commercial machine. By occasionally entering the command with no arguments, you can monitor how much machine time you have used and limit your online time accordingly.

time [ command ]

Only for your logon shell will this be the amount of machine time used since you logged on. Also, note that the time reported is not elapsed wall-clock time--it is only the machine (or CPU) time used.

Use the form time command to execute command and report the amount of CPU time used by the command's execution. The command must be a simple command--not a compound command, statement group, or parenthesized statement--and cannot be a pipeline.

You might be interested in timing the execution of a command if you are a production operations manager and you want to find out how much time a new application is adding to your daily workload. A development programmer would use the time command to determine whether a new program has a performance problem. The average interactive user, however, would have infrequent occasion to use the time command.

Aliases

One of the handier features of the C shell is the alias feature. An alias is a shorthand method of referring to a command or part of a command. If you have several favorite options that you always supply to the ls command, for example, instead of having to type the whole command every time, you can create a two-character alias. Then you can type the two-character alias, and the shell executes the alias definition. In addition to providing shortcuts, aliases are a convenient way of handling common typos. I often type mroe for more or jbos for jobs, for example. Setting up aliases for mroe and jbos therefore saves me time, because I don't have to retype those commands.

An alias can represent not only a command name, but also leading options and arguments of the command line. Any words you type following the alias name are considered to follow options and arguments included in the alias definition, enabling you to customize the command with key options and arguments.

You can achieve more complex processing by using shell scripts, because the work performed by a script invoked by name as a command can be arbitrarily complex. The command alias feature was provided only for use as a keyboard shortcut, and anything that can be achieved by using an alias can be done with shell scripts.

You should add command aliases you use often to your .login file, so that the alias is defined every time you log on. It is often handy, however, to define command aliases at the keyboard for special commands you'll be using during your current session. If you don't incorporate the alias into your .login file, it is lost when you log out.

Defining, Listing, and Changing Command Aliases: alias

The alias command enables you to list currently defined aliases, to define a new command alias, or to change an existing alias. The command format follows:

alias [ name [ definition ... ]]

For name, choose a word consisting of upper- and lowercase letters and digits. For definition, write any sequence of words that defines the command string for which you want name to stand. The following defines two aliases for the rlogin command, each providing a different host. It's shorter to type the alias name for the destination host than it is to type the rlogin command and options.

alias druid rlogin druid -l root
alias ducati rlogin ducati.moto.com

If you want to change the definition of an alias, just define the alias again.

After you define aliases, you can display a list of their names and definitions by entering the alias command without arguments, as in this example:

% alias
druid   (rlogin druid -l root)
ducati  (rlogin ducati.moto.com)

You also can display the definition of a specific alias by specifying its name as an argument:

% alias druid
rlogin druid -l root

Alias substitution occurs early in the shell's processing cycle for commands, thereby enabling you to use globbing (filename replacement), variable substitution, command substitution, and command-history substitution in the definition. You therefore often will need to quote at least one of the words of definition and perhaps the entire alias definition. Some people always enclose the alias definition in quotes to avoid surprises. Consider the following alias:

alias lc ls *.{cc,hh}

For a C++ language programmer, the alias would be rather natural: Simply by typing lc, you get a listing of all source program files in the current directory, devoid of any other file clutter.


NOTE: Substitutions occur when the alias command is processed unless you quote all or part of the definition.

The preceding alias definition does not work as expected, however. The filename pattern *.{cc,hh} is substituted on the alias command itself, and the actual alias stored (depending on the actual directory contents when you enter the alias command) follows:

% alias lc
ls CIM_EnvImp.cc CIM_Util.hh EventManager.cc LogInstances.cc

Because the filename pattern is replaced before the alias definition is stored by the shell, the lc alias doesn't list all files ending in .cc or .hh. It attempts to list the files CIM_EnvImp.cc, CIM_Util.hh, EventManager.cc, and LogInstances.cc, whether or not they exist in the current directory.

The alias should have been defined as this:

% alias lc ls '*.{cc,hh}'

An alias definition also can use command aliases. During alias substitution, the alias definition is scanned repeatedly until no further substitutions can be made. An alias definition for name, however, cannot invoke the name alias within itself; a reference to name in the definition is taken as a reference to the built-in shell command or executable file name, not as a reference to the alias. This enables you to use an alias to redefine a system command or a built-in shell command. For example,

% alias pg pg -cns -p"Page %d:"

You can refer to arguments of the original command line--before any substitutions are made--by using the command-history substitution syntax (see "Command History," later in this chapter). For example, the command

alias print 'pr \!* | lp'

defines an alias named print that executes the pr command using all the arguments of the original command line (\!*) and then pipes the output to lp for printing.

To properly understand and use the alias command, you must be clear about the way an alias is used. When you define an alias by entering the alias command, the only thing that happens at that time is that the system stores the alias in computer memory. Later, when you enter a command with the same name as the alias, the C shell does a little magic. The command you typed is not executed in the form in which you typed it. Instead, the command name (which is an alias name) is replaced by the value of the alias. The result is a new command text--the first part is the alias definition, and the rest consists of any other arguments you typed.

Suppose that you define an alias for the ls command as this:

% alias lax ls -ax

If you later enter the command

% lax big*.txt

the command actually executed is

ls -ax big*.txt

The command alias (lax) is replaced by its definition (ls -ax). Remaining arguments on the command line (big*.txt) simply are tacked on after the alias substitution to yield the command the computer actually executes.

Using history substitutions (for example, !*, !^, !:2 ...) in an alias provides additional flexibility by enabling the executed command to use arguments in a different order or a different form than entered; this requires a little extra work from the shell. Consider the following alias definition:

alias lsp 'ls \!* | lp'

Entering the command lsp *.cc *.csh results in alias substitution for lsp. The symbol !* causes the arguments you entered on the line *.cc *.csh to be inserted into the alias definition instead of being tacked on after the alias definition. In other words, if an alias definition contains a history substitution, the shell suspends its normal action of tacking on command arguments after the alias value. The command actually executed is ls *.cc *.csh | lp. Without this special mechanism, the executed command would have been ls | lp *.cc *.csh, with the final *.cc *.csh tacked on in the usual manner. This would lead to an undesirable result: Instead of printing a directory listing, the lp command would print the full contents of the files.

When writing an alias, you therefore need to visualize what will happen when the alias is substituted in later commands.

Deleting a Command Alias: unalias

You can use unalias to delete one or more aliases. You can delete a specific alias by specifying its name as an argument, or you can delete multiple aliases by using pattern-matching:

unalias name
unalias pattern

If you specify a specific alias name, only that alias definition is deleted. If you specify a pattern, all those currently defined aliases whose names match the pattern are deleted. pattern can contain the pattern-matching characters *, ?, and [...]. In the following example, the first line deletes the lx alias, and the second line deletes all currently defined aliases:

unalias lx
unalias *

Shell Options

The C shell command-line options provide a convenient way of modifying the behavior of a C shell script to suit your needs. Options can be specified on a command line, such as this:

% csh -f

Or, if your UNIX version supports the #! notation, you can specify an option on the first line of a script, such as this:

#!/bin/csh -f
echo *

If an option is needed temporarily, the command line is the best place to supply the option. If the option is needed permanently, place it on the #! line.

Unless one of the -c, -i, -s, or -t options is set, the C shell assumes that the first argument on the command line is the command to be executed and that each additional argument is intended for the command being executed. For example, the command

% csh command arg1 arg2 arg3

causes the C shell to execute command with three arguments (arg1, arg2, arg3), which will be assigned to the argv array variable. When the -i, -s, or -t option is set, the shell assigns all arguments, including the first, to the argv array variable. The -c option allows only one command-line argument and takes it as a list of commands to be executed; after execution of the argument string, csh exits.

Command-line options are used by the C shell itself--not by the command to be executed. A command-line option is indicated with a dash (-) followed by the option letter, as in csh -v. If multiple options are needed, they can share a single dash (csh -fv) or each take its own dash (csh -f -v). Mixing option-specification methods is allowed (csh -f -vx). The following command shows the mixing of command-line options with normal command execution:

% csh -fv command arg1 arg2 arg3

Table 12.4 provides a summary of C shell command-line options.

Table 12.4. C shell command-line options.

Option Name Description
-b Break Delimits a break in command-line option processing between arguments intended for the C shell and arguments intended for a C shell script. All command options before -b are interpreted as C shell arguments, and all command options after -b are passed on to the C shell script. Note: The -b option is not available on all UNIX platforms.
-c commandString Commands Executes commands from the commandString parameter that immediately follows the -c option. The commands in commandString may be delimited by newlines or semicolons. All command-line arguments after -c are placed in the argv variable. The -c option is used when calling the C shell from a C or C++ program. -c is the only option that requires a parameter.
-e Exit Exits the current shell if any command returns a nonzero exit status or otherwise terminates abnormally. Setting this option is easier than checking the return status of each command executed.
-f Fast Uses fast-start execution. The C shell does not execute the .cshrc or .login file, which speeds up the execution of a C shell script. This is a good optimization if a C shell script does not need any of the variables or aliases set up in the initialization files.
-i Interactive Forces interactive-mode processing. Without this option, command-line prompts are not issued if shell input does not appear to be from a terminal.
-n Not Parses shell syntax but does not execute commands. The -n option is useful for debugging shell syntax without actually having to execute resultant commands after all shell substitutions are made (for example, aliases, variables, and so on).
-s Standard Reads command input from standard input. All command-line arguments after -s are placed in the argv variable. Using -s can prevent unnecessary temporary files by piping output of commands directly into the shell (genCshCmds | csh -s).
-t execuTe Reads and executes a single line of input. You can use the backslash (\) to escape the newline to continue the input on the next line.
-v Verbose Sets the predefined verbose variable. When verbose is set, all commands are echoed to standard output after history substitutions are made but before other substitutions are made. -v is useful for debugging C shell scripts.
-V Very Sets the predefined verbose variable before the .cshrc is executed.
-x eXecution Sets the predefined echo variable. Commands are echoed to standard output right before execution but after all substitutions are made.
-X eXtra Performs extra command echoes. The -X option sets the predefined echo variable before .cshrc is executed.

The shell supports additional options that you can switch on or off during shell operation. These options are controlled by variables; if the variable is set, the corresponding option is activated; if it is not, the option is off. These options are described in "Using Predefined Variables," later in this chapter. Their names are echo, ignoreeof, noclobber, noglob, nonomatch, notify, and verbose.

Additionally, the shell variables cdpath, history, mail, path, prompt, and shell, although not options as such, enable you to control certain shell behaviors such as searching for commands and checking for mail. See "Using Predefined Variables," later in this chapter, for further information.

Command History

The C shell's command-history service maintains a list of previously executed commands. You can use the command history for two purposes:

  • As a reference to determine what you've already done

  • With history substitution, as a shorthand method to reuse all or part of a previous command to enter a new command

Displaying the Command History

The history command enables you to print all or selected lines of the current command history:

history [ -r ] [ -h ] [ n ]

To display all the lines currently held in the history list, simply enter the history command with no arguments:

% history
1  cd src
2  ls
3  vi foo.cc
4  cc foo.cc
5  grep '#include' foo.cc

The C shell displays each line preceded with a line number. You can use the line number to refer to commands with the history-substitution mechanism. Line numbers start with 1 at the beginning of your session, assuming that no previous saved history exists. (See "Using Predefined Variables," later in this chapter, for more information on the shell variable savehist.)

The amount of history a shell maintains depends on the amount of memory available to the shell. History is kept in memory during the session and is written to an external disk file only when the session exits, and then only if savehist is set, so capacity is somewhat limited. You can set the history variable to a value indicating the number of lines of history you want the shell to maintain; it keeps that number of lines and more if possible, but your specification is only advisory. The value of history must be a simple number to be effective. For example, set history=25 retains at least 25 lines of history.


CAUTION: The history service retains command lines--not commands. As the history area becomes full, the shell discards old lines. This might result in some lines containing incomplete, partial commands. You need to use caution with the history-substitution facility to avoid calling for the execution of an incomplete command.

To limit the display to the last n lines of history, specify a positive integer for n.

Specify the -r option to print history lines in reverse order, from the most recent to the oldest.

The -h option lists the history buffer without the line numbers. This can be useful for creating scripts based on past input (history -h > script.csh) or for cutting and pasting a series of commands by using your mouse.

Using History Substitutions to Execute Commands

History substitutions are introduced into a command with the ! (exclamation point, usually called the bang operator). You append one or more characters to ! to define the particular kind of history substitution you want. If followed by a blank, tab, newline, equal sign (=), or open parenthesis ((), the exclamation point is treated as an ordinary character.


NOTE: The exclamation point is an ordinary character to other shells, but it is special to the C shell. You must precede it with \ (backslash) to avoid its special meaning, even inside hard-quoted strings (for example, echo '!!' does not echo !! but the previous command). The shell attempts a history substitution wherever it finds an exclamation point in the command line, without regard to any quoting; only the backslash can avoid interpretation of ! as a history-substitution mark.

You can write a history substitution anywhere in the current shell input line, as part or all of the command. When you enter a command containing one or more history substitutions, the shell echoes the command after performing the substitutions so that you can see the command that actually will be executed. (You do not have an opportunity to correct the command; it is executed immediately after being displayed.)

The simplest forms of history substitution are !! and !number. The !! symbol is replaced with the entire previous command line. The expression !number is replaced with the line numbered number from the command-history list.

Suppose that the command history currently contains the following lines:

1  cd src
2  ls
3  vi foo.cc
4  cc foo.cc
5  grep '#include' foo.cc

If you now enter the command !!, the shell repeats the grep command in its entirety. Press Return to execute the grep command, or type additional words to add to the end of the grep command:

% !! sna.hh
grep '#include' foo.cc sna.hh

Now suppose that, after running grep, you want to edit the foo.cc file again. You could type the vi command as usual, but it already appears in the command history as line 3. A history substitution provides a handy shortcut:

% !3
vi foo.cc

That's almost all there is to basic history substitution. Actually, the shell supports any of the forms listed in Table 12.5 for referring to command-history lines.

Table 12.5. Forms for referring to command-history lines.

Form Replaced With
!! The preceding command line (the last line of command history).
!number The command-history line whose line number is number; for example, !3.
!-number The history line that is number lines back from the current line; !-1 is equivalent to !!.
!string The most recent history line that has a command beginning with string. For example, use !v to refer to a previous vi command.
!?string? The most recent history line containing string anywhere in the line. For example, use !?foo? to repeat a previous vi foo.cc command. Most C shell versions support not supplying the trailing question mark, so !?foo would execute the same vi command.

You can do more with history substitutions than merely reuse a previous command. The shell also provides extensions to the history operator that enable you to select individual words or a group of words from a history line, inserting the selected word or words into the current command. These extensions are in the form of a suffix beginning with a colon (:). For example, !vi:1 is replaced not with the most recent vi command, but with its first argument word. Similarly, !3:3-4 is replaced with arguments 3 and 4 of history line 3. You can use any of the expressions listed in Table 12.6 as word selectors by appending the expression to a line reference preceded by a colon.

Table 12.6. Using command history word selectors.

Expression Specifies
0 First word of the command (usually, the command name).
n nth argument of the command. Arguments are numbered from 1. Note that 0 refers to the command name, which is actually the first word of the line, whereas 1 refers to the second word of the line.
^ Same as :1, the first argument.
$ Last argument word of the command.
% For the !?string? format, the word matched by string. Use this word selector only with the !?string? history reference. Its value is the entire word-matching string, even though string might have matched only a part of the word.
m-n Multiple word substitution. Replaced with words m through n of the history line. For m and n, specify an integer number or one of these special symbols: ^, $, or %.
m- Substitution of words beginning with the mth word and extending up to but not including the last word.
-n Same as 0-n; substitutes words beginning with the first word of the history line (the command name) through the nth word.
m* Same as m-$; substitutes words beginning with the mth word and extending through the last word of the line.
* Same as ^-$; substitutes all argument words of the line.

If the word selector expression you want to write begins with ^, $, *, -, or %, you can omit the colon between the line selector and the word selector. For example, !vi* refers to all the arguments of the previous vi command and is the same as !vi:* or !vi:^-$.


NOTE: Some versions of the C shell require the : between the ! operator and the selector if you are not using a line number. The command !3^ would give the first argument of command 3, but !vi^ would return an error.

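You can use any number of word selectors in the same command line. By combining multiple word selectors, you can reuse arguments of a previous command in a different order and use arguments that originally appear on different commands. For example, after the command cp foo.cc ~/project/src/new, the command chmod -w !$/!^ expands to chmod -w ~/project/src/new/foo.cc. The two commands must be entered on separate lines, because !$ and !^ refer to the preceding command line, not to the line being typed. Similarly, the command rm !115^ !117^ removes files that were named on two earlier commands.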

When counting words of a previous command line, the shell takes quoting into consideration but uses the line as it appears in the history list. Words generated by variable or command substitution or filename generation are not accessible. The following example demonstrates the effects of quoting and command substitution:

% echo "one two three" four
one two three four
% echo !^
echo "one two three"
one two three
% echo `ls *.cc`
bar.cc foo.cc
% echo !^
echo `ls *.cc`
bar.cc foo.cc

You can append modifiers to a word selector to alter the form of the word before insertion in the new command. A modifier is written in the form :x, where x is a letter specifying how the word should be modified. For example, !vi^:t substitutes the tail of the first argument of the vi command: for the argument /usr/X/lib/samples/xclock.c, the value of :t is xclock.c.

Table 12.7 lists the modifiers that can be appended to a word selector to alter the selected word before substitution.

Table 12.7. History substitution modifiers.

Modifier Function
:e Removes all but the filename suffix (the part after the last dot). For the argument foo.sh, :e returns sh.
:h Removes a trailing path component. Successive :h modifiers remove path components one at a time, right to left. Thus, for the argument /usr/local/etc/httpd/htdocs/index.html, :h returns /usr/local/etc/httpd/htdocs, whereas :h:h returns /usr/local/etc/httpd.
:p When used in any history-substitution expression on the command line, causes the shell to print the command after substitutions but not to execute it. Use :p to try the effect of a history substitution before executing it.
:q Encloses the substituted word or words in quotes to prevent further substitutions.
:r Removes a filename suffix of the form .string. For example, for the argument foo.cc, :r returns foo. Successive :r operators remove one suffix at a time. For example, :r:r applied to arch.tar.Z returns arch.
:s/x/y/ Replaces the string x in the selected word with the string y. String x cannot be a regular expression. The symbol & appearing in y is replaced with the search string x--for example, :s/bill/&et/ substitutes billet for bill. Any character can be used in place of the slash--for example, :s?/usr?/user?. The final / can be omitted if followed by a newline. You can use the delimiter (/ or your delimiter) or & as a text character by escaping it with a backslash (\)--for example, :s/\/usr/\/user/. The search string x can be omitted, in which case the search string of the previous :s on the same line is used. Or, if no previous :s occurred, the string of !?string? is used.
:t Removes all leading components of a path, returning just the filename part. For the word /usr/bin/ls, the value of :t is ls.
:x Breaks the selected word or words at blanks, tabs, and newlines.
:& Reuses the previous string-substitution modifier. For example, if :s appears in the same command line,

!grep:2:s/bill/marty/ !:3:&

is the same as

!grep:2:s/bill/marty/ !:3:s/bill/marty/

Normally, a modifier affects only the first selected word. When selecting multiple words, such as with !12:2*, you can apply a modifier to all the selected words by inserting a g in front of the modifier letter. For example, !12:2*:gh applies the :h modifier to all the words. The g is not valid with the :p, :q, and :x modifiers.
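One low-risk way to see what the pathname modifiers in Table 12.7 produce is to apply them to a variable reference in a short script, because the C shell also accepts :h, :t, :r, and :e on variable references. The sketch below assumes a variable named file (a name chosen for illustration) and uses one modifier per reference, which is all that older C shell versions allow.

```csh
# Sample pathname, chosen only for illustration.
set file=/usr/local/etc/httpd/htdocs/index.html

echo $file:h    # prints /usr/local/etc/httpd/htdocs
echo $file:t    # prints index.html
echo $file:r    # prints /usr/local/etc/httpd/htdocs/index
echo $file:e    # prints html
```

Nothing is executed from the history list here, so this is a safe way to check a modifier before relying on it in a history substitution.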

You can omit the command identifier from a history substitution when using two or more ! expressions in the same line; successive history references then refer to the same command as the first. For example,

% vi !grep^:t !:3:t !:4:t

all refer to the same grep command but select the first, third, and fourth arguments.

The history mechanism supports the special abbreviation ^, which is useful for correcting a keying error in the preceding line. The general form of the abbreviation is ^x^y, where x and y are strings. The preceding command line is selected and searched for string x; if found, it is replaced with y and then executed. After the command cd /usr/ban, for example, enter the line ^ban^bin (or ^a^i) to execute the command as cd /usr/bin. The caret (^) must be the first nonblank character of the line to be recognized as a line-editing substitution. This abbreviation is available only for the immediately preceding command line; you must use the full history expression !line:s/x/y/ to edit any line other than the last.

One final, important provision of the history-substitution mechanism is that you can enclose any history reference in braces {} to isolate it from characters following it. Thus, !{vi^:h}.cc forms a word beginning with the selected history reference and ending in .cc.


TIP: Use the history substitution !* to prevent unintentional file removal. When creating a file expression to delete files, use the ls command to see the results. After you are satisfied that only the intended files are being removed, issue the rm command with the !* substitution. In the following example, the user is trying to delete all publish-related C++ source files and a Makefile.

% ls pub* Makefile
Makefile pubList.txt publish.cc publish.hh
% ls publ* Makefile
Makefile publish.cc publish.hh
% rm !*


Variables

You can use shell variables to hold temporary values, and shell scripts can use variables to manage changeable information. The shell itself also has variables of its own that you can use to customize features of the C shell and your C shell environment.

A variable is actually an area of the shell's memory set aside to hold a string of characters. The string value is dereferenced by using a variable name. You assign the name of a variable when you define it with the built-in set command. You can change the value of a variable in several ways.

The shell provides a complex syntax set for referring to the value of a variable. Any variable reference, when scanned in a command line, is replaced by the corresponding value of the reference before the command is executed. In its simplest form, a variable reference simply replaces the name of a variable with its string value.

This section looks at the kinds of variables the C shell supports and the rules for naming variables and referring to variable values.

Variable Names

The C shell imposes no set limit on the size of variable names. People commonly use variable names of six to eight characters, and names consisting of up to 16 characters are not unusual.

A variable name can consist of only letters (uppercase and lowercase), underscores (_), and digits. A variable name cannot begin with a digit, because names beginning with a digit are reserved for use by the C shell. Generally, all capital letters are used for the names of environment variables, and all lowercase letters are used for local variables, although the C shell imposes no such restriction.

You assign a value to a variable by using the set or setenv built-in commands, depending on the type of variable you are setting.


NOTE: The C shell does not support the assignment statement name=value, which might be familiar to you from the Bourne and Korn shells.

Creating Shell Variables

You can use the set statement to create new local variables and, optionally, to assign a value to them. Local variables are known only to the current shell and are not passed to shell scripts or invoked commands.

Use the setenv statement to create new environment variables. Environment variables are passed to shell scripts and invoked commands, which can reference the variables without first defining them (no setenv statement is required or should be used in a shell script for passed environment variables you want to access). See the next section, "Displaying and Setting Global Environment Variables," for more about environment variables.

A shell variable can contain any characters, including unprintable characters, as part of its value. A shell variable also can have a null value, which is a zero-length string containing no characters. A variable with a null value differs from an unset variable. A reference to the null value merely deletes the variable reference, because it is replaced with a zero-length string. A reference to an unset variable is an error; it generates an error message and causes the shell interpretation of commands to stop.

Displaying and Setting Local Shell Variables: set

You can use the set command to display or set local variables:

set
set name=word
set name=(wordlist)
set name[index]=word

You can use set with no arguments to list the currently defined variables and their respective values. The listing includes exported variables as well as local variables, although many versions of the C shell print only nonexported variables.

Any of the operand formats can be combined in a single set statement. Each operand assigns a value to a single shell variable or element of an array variable (for example, set var1=(foo bar bas) var2=value2 var1[1]=phou). Note that no white space can separate the variable name, equal sign, or value when writing an assignment; any white space appearing in word or wordlist must be hidden with quotes.

You can use set name to define a variable name and to initialize it with a null string. You can use this form to set a number of shell options (such as set ignoreeof). A variable with a null value is not the same as an unset variable. A variable with a null value exists but has no value, whereas an unset variable does not exist. A reference to an unset variable results in a shell error message; a reference to a null variable results in substitution of the null string.

You can use set name=word to assign the string word as the current value of variable name. The string replaces the current value of name if the variable already is defined; otherwise, a new variable called name is created. If word contains characters special to the shell (including blanks or tabs), it must be enclosed in single or double quotes.

You can use the form set name=(wordlist) to assign each word in wordlist to successive elements of the array variable name. After the assignment, the expression $name[1] refers to the first word in wordlist, $name[2] refers to the second word, and so on. Any word in wordlist must be quoted if it contains characters special to the shell (including blanks or tabs).

You can use the form set name[i]=word to assign the string word as the current value of the ith element of the array variable name. For i, specify a positive integer number not less than 1. Note that you do not have to assign a value to every element of an array. The number of elements in an array is effectively the highest-numbered element to which a value has been assigned. Elements to which no values have been assigned have effective values of the null (zero-length) string. Also note that you cannot assign a (wordlist) to an array element; an array variable can have multiple values, but each element can represent only one string value.
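The assignment forms described above can be sketched in a short script. The variable name colors and its values are hypothetical; the element assignment targets a position that already exists, so it should behave the same way in C shell versions that do not allow sparse arrays.

```csh
# Assign a wordlist; the quoted word becomes a single element.
set colors=(red green "dark blue")

echo $#colors      # prints 3
echo $colors[3]    # prints dark blue

# Replace one existing element by subscript.
set colors[2]=lime
echo $colors       # prints red lime dark blue
```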


NOTE: Many versions of the C shell do not support sparse arrays. A sparse array is an array that does not have a value for every index. In these versions, an array element cannot be addressed unless the array already has an element in that position. A Subscript out of range message (for example, set: Subscript out of range) is generated when the assignment

set name[4]=foo

or the reference

echo $name[4]

is attempted on a three-element array.


Deleting Local Shell Variables: unset

You can use the unset command to delete one or more shell variables from the shell's memory:

unset pattern

The unset command is effective for variables defined with the set command only; use the unsetenv command to delete variables defined with setenv.

For pattern, specify a string that might optionally contain one or more occurrences of the pattern-matching characters *, ?, or [...]. All local variables known to the shell whose names match the specified pattern are deleted. You receive no warning message if nothing matches pattern and no confirmation that the variables were deleted.
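As a minimal sketch of pattern-based deletion, the script below creates three variables with hypothetical names and removes the two whose names begin with tmp_. The $?name form reports 1 for a variable that is set and 0 for one that is not.

```csh
# Three local variables; the names are illustrative only.
set tmp_a=1 tmp_b=2 keep=3

# Delete every local variable whose name matches the pattern.
unset tmp_*

echo $?tmp_a $?tmp_b $?keep    # prints 0 0 1
```

Because unset gives no confirmation, checking with $?name afterward is one way to verify which variables were removed.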

Displaying and Setting Global Environment Variables: setenv

You can use the setenv statement to create new environment variables. Environment variables are passed to shell scripts and invoked commands, which can reference the variables without first defining them (no setenv statement is required or should be used in a shell script for passed environment variables you want to access). See "Customizing Your Shell Environment," later in this chapter, for more about environment variables.

The format of the setenv command follows:

setenv [name value]

When issued without arguments, the setenv command lists all global environment variables currently in effect and their values. When used in the form setenv name value, the shell creates a new global variable with the specified name and assigns the string value as its initial value. If the value contains characters such as a space or tab, be sure to enclose the value string in quotes. See "Quoting and Escaping Special Characters," later in this chapter, for information about C shell special characters and using quoting techniques.

UNIX also provides a command (env) for displaying the current list of environment variables and their values. The env command supports a number of options and arguments for modifying the current environment.
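The difference between a local variable and an environment variable can be seen by asking an external command, such as env, what it received. The following is a sketch only; the names PROJECT_DIR and scratch and their values are hypothetical.

```csh
# Environment variable: passed to invoked commands.
setenv PROJECT_DIR /tmp/demo
env | grep '^PROJECT_DIR='     # prints PROJECT_DIR=/tmp/demo

# Local variable: known only to the current shell.
set scratch=/tmp/scratch
env | grep '^scratch='         # prints nothing

# Remove the environment variable again.
unsetenv PROJECT_DIR
echo $?PROJECT_DIR             # prints 0
```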

The section "Using Predefined Variables," later in this chapter, provides a list of all variables (local and environment) that are defined by the C shell. Environment variables defined by other UNIX components are defined in the documentation for those components. Unfortunately, no comprehensive list of environment variables exists, because some variables are defined by non-shell programs. The mailx command, for example, defines some variables, and the vi command looks for some variables of its own. Altogether, the environment-variable pool is optional, anyway: If you don't know of a variable a UNIX command uses, the command still works without it. At any rate, be aware that the C shell is not responsible for defining all environment variables; the shell merely provides a means for manipulating and accessing variables.

Deleting Global Environment Variables: unsetenv

To delete global environment variables, you use the unsetenv command:

unsetenv variablename
unsetenv pattern

Use the unsetenv command to delete one or more environment variables from the shell's memory. The unsetenv command is effective only for variables defined with the setenv command; use the unset command to delete variables defined with set.

To delete a particular variable definition, specify its name as variablename. To delete multiple variable definitions, use pattern to specify a string that might optionally contain one or more occurrences of the pattern-matching characters *, ?, or [...]. All environment variables known to the shell whose names match the specified pattern are deleted. You receive no warning message if nothing matches pattern and no confirmation that the variables were deleted.

Obtaining Variable Values with Reference Expressions

You obtain the value of a shell variable by writing a variable reference on the command line. A variable reference results in the replacement of the entire reference expression--including the $ that introduces the reference, the variable's name, and any other characters that might adorn the reference--with a string value of the reference.

A variable reference does not itself define the start or end of a word; the reference can be a complete word or part of a word. If the reference is part of a word, the substituted string is combined with other characters in the word to yield the substituted word. If the reference value substitutes one or more blanks or tabs into the word, though, the word is split into two or more words unless it is quoted. If the value of shell variable var is "two words," for example, the reference expression $var appears as two words after substitution, but the quoted string "$var" appears as the one token "two words" afterward.

A variable reference can result in the substitution of the value of a local or a global variable. A local variable is used if it exists; otherwise, the value of an environment variable is taken. If a shell variable and an environment variable have the same name, the shell variable effectively hides the value of the environment variable. If the value of the environment variable is needed, the shell variable must be unset.

You can use any of the variable reference forms shown in Table 12.8 in a word.

Table 12.8. Shell variable references.

Syntax Meaning
${name} or $name Replaced with the value of name. It is an error if the variable name is not defined.
${name[n]} or $name[n] Replaced with the value of elements of array variable name. For n, use an element number or a range of element numbers in the form m-n. Use -n to substitute elements 1-n, and use m- to substitute elements m through the end of the array.
${#name} or $#name Replaced with the number of elements in array variable name.
${?name} or $?name Replaced with 1 if the variable name is set; otherwise, replaced with 0.

Variable names are terminated by the first illegal variable name character--in other words, any character that is not a digit, letter or underscore (_). As a result, variable references can be used without braces when the next character is not a legal variable name character. If the shell variable var is set to foo, the variable references $var.cc, $var$var, and $var"bar" resolve to foo.cc, foofoo, and foobar, respectively.

The reference forms using braces (for example, ${name} and ${#name}) are useful when the variable name would run onto the remainder of the current word, yielding an undefined variable name. If the variable dir, for example, contains the path prefix /usr/bin/, the word ${dir}name.cc forms the full pathname /usr/bin/name.cc upon expansion. The simpler form $dirname.cc, however, is taken as a reference to variable dirname, which is not at all what was intended. The net effect of the braces is to set off the variable reference from the remainder of the word.
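The rules for where a variable name ends can be tried out with a short script. This sketch reuses the dir and var examples from the text; nothing else is assumed.

```csh
set dir=/usr/bin/
# Braces mark the end of the name, so name.cc is appended as literal text.
# The unbraced form would look up a variable called dirname instead.
echo ${dir}name.cc                 # prints /usr/bin/name.cc

set var=foo
# A dot, a dollar sign, and a quote all end the variable name.
echo $var.cc $var$var $var"bar"    # prints foo.cc foofoo foobar
```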

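A reference to an unset variable generates a shell error message and, if the reference occurs inside a shell script, causes reading of the shell script to terminate. You can use the $?name or ${?name} forms to handle the case where a variable might not be set. Use the multiline if...then form for the test: the shell substitutes all the variables on a one-line if statement before it evaluates the condition, so a one-line if that also references $nfiles can still fail when nfiles is unset. For example,

if ($?nfiles) then
echo "File count is $nfiles"
endif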

Using Array Variables

Unless you provide otherwise, a variable can have only one value. An array variable, on the other hand, can have any number of values (as long as the shell has sufficient memory available to store the values). The path variable, for example, which is used by the shell as a list of directories to search for commands, is an array variable in which each element is a directory path.

You can assign values to an array variable in one of two ways: all at once or one at a time. Not all C shell versions allow the one-member-at-a-time method of assignment, though. To assign many values at once, use a wordlist argument to the set command. A wordlist is a parenthesized list of words. For example, the following array contains four values:

set path=(/bin /usr/bin ~/bin .)

Each of the words in a wordlist is assigned to the next available element of the array variable. Assigning a wordlist to a variable automatically defines the variable as an array.

To assign values individually to elements of an array, you must use array subscript notation. Written in the form name[index], the index must be a number designating an array element; elements are numbered starting with 1, so $name[1] is a reference to the first element of an array. The following example assigns three values to the array planets and then prints one of them using an array reference:

% set planets[1]=Mercury
% set planets[2]=Venus
% set planets[3]=Earth
% echo Planet 3 is $planets[3]
Planet 3 is Earth

If you reference the array variable name without an index, the shell replaces the reference with a wordlist:

% echo The planets are $planets
The planets are (Mercury Venus Earth)

Many versions of the C shell do not put the parentheses in the output. If your C shell adds the parentheses, you can also use the reference $name[*] to obtain all the words of the array without the surrounding parentheses:

% echo The planets are: $planets[*]
The planets are: Mercury Venus Earth

You can reference a specific range of elements by using the notation $name[m-n], where m and n are the beginning and ending index numbers of the elements you want. For example, the following lists only the earth-like planets:

% set planets=(Mercury Venus Earth Mars Jupiter Saturn Uranus Neptune Pluto)
% echo The terraform planets are: $planets[2-4]
The terraform planets are: Venus Earth Mars

The special form $name[-n] refers to elements of the array, beginning with the first and extending through n:

% echo The inner planets are: $planets[-4]
The inner planets are: Mercury Venus Earth Mars

The special form $name[n-] refers to the elements of the array, beginning with n and extending through the last:

% echo The outer planets are: $planets[5-]
The outer planets are: Jupiter Saturn Uranus Neptune Pluto

One of the primary reasons for using array variables is to permit looping through the array, inspecting and manipulating each array element in turn. This programming technique, often used in shell scripts, can be used at the keyboard as well:

% set files=(main io math dbase)
% foreach file ($files)
? cp $file.cc $file.cc.bak
? end

This example first assigns the root names of a list of files to an array variable and then uses the foreach shell statement to process each of the files in turn by copying the file to a backup file and changing its filename in the process. In the preceding example, the question mark (?) is the shell's prompt when it requires additional lines to complete an outstanding statement; it signals that you haven't finished the command yet.

Arrays can be used as a more efficient way of parsing output from commands. Using cut or awk requires an extra process, and the shell script more than likely will be harder to read. If different fields are needed, multiple calls to cut and awk can be avoided, which provides even more efficiency. Suppose that you need to repeatedly refer to the month and year in a shell script. The script snippet that follows shows the use of cut and awk to do this. (The use of both cut and awk is for illustrative purposes. Normally, only cut or awk would be used.)

set month=`date | cut -f2 -d" "`
set year=`date | awk '{print $6}'`

This snippet requires two calls to date and two calls to parse the output: one for awk and one for cut. That means that four processes are run to extract the needed date information. You can reduce the number of processes down to two with some awk trickery:

eval `date | awk '{print "set month=" $2, "year=" $6}'`

Although the command is elegantly done on one line, it is not very readable and therefore not very maintainable. Using a C shell array to parse the output improves efficiency by using only one process--that of the date program. The use of an array also increases readability:

set current_date=(`date`)
set month=$current_date[2] year=$current_date[6]

The array parsing is similar to the default parsing done by awk, because all white space is ignored; cut, however, treats each white space character as a field delimiter. Parsing output that has a variable amount of white space between fields, such as output from ls or date, is not a good candidate for cut, but it is a good candidate for awk or a C shell array. If non-white space field delimiters are needed or the parsing is part of a pipeline, awk or cut must be used.
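The way an array ignores runs of white space can be checked without running date at all. The sketch below uses a fixed sample string in place of command output; the string, its layout, and the variable names line and fields are assumptions made for the illustration.

```csh
# Sample text standing in for command output; note the uneven spacing.
set line="Mon   Jan  5 10:15:00 EST 1998"

# An unquoted reference is split into words at blanks and tabs.
set fields=($line)

echo $#fields                  # prints 6
echo $fields[2] $fields[6]     # prints Jan 1998
```

With cut and a single-space delimiter, the extra blanks in the sample would produce empty fields, which is why the field numbers would not line up.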


TIP: A good rule to follow when programming is the Once and Only Once rule. This rule states that script elements such as commands and values should appear at most once in any given script. Variables and aliases provide a powerful and convenient way to accomplish the goal of "oneness." Consider this script snippet:

echo Jim and I, the twins > /tmp/foo.$$
ls > /tmp/foo.$$

Here, the filename /tmp/foo.$$ appears twice. If that filename needs to be changed, multiple locations must be changed. If the filename is stored in the variable

set filename=/tmp/foo.$$

only one place must be updated. Using a variable for the filename helps prevent errors. If you accidentally forget to type the dot (.) and enter the filename /tmp/foo$$, an extra temporary file is created. If the variable name is mistyped, the shell reports an error.

An extension of the Once and Only Once rule is that command-line arguments should be dereferenced at most once by using the $n or $argv[n] syntax. Always create a new variable if the command-line argument must be looked at multiple times. The script

if (-e $1 && -d $2) then
mv $1 $2
echo $1 moved to $2
endif

is not only hard to read because $1 has little semantic meaning; it also is hard to maintain if the argument use changes (new arguments are introduced, arguments are removed, or arguments are reordered). Transform the script by referencing the command-line arguments directly only once:

#argument checking omitted for brevity
set srcFile="$1" targetDir="$2"
if (-e $srcFile && -d $targetDir) then
mv $srcFile $targetDir
echo $srcFile moved to $targetDir
endif


Using Special Read-Only Variables

In addition to ordinary variables you define with the set and setenv commands, a number of variables are defined by the shell and have preset values. Often, the value of a special variable changes as the result of a command action. You can use these variables to acquire specific information that isn't available in any other way. You cannot use set or setenv to define these variables, however, and you can't assign new values to them.

The special variables can be referenced by using the notations shown in Table 12.9.

Table 12.9. Shell special variables.

Variable Meaning
$0 Replaced with the name of the current shell input file, if known. If unknown, this variable is unset, and a reference to it is an error. $0 is shorthand for $argv[0]. $0 can be used with other variable reference operators. For example, $?0 returns 1 if the current filename of the shell is known and 0 if the filename is not known. (Note that argv is the only shell array where referencing the zeroth element returns a value other than the null string.)
$1, $2, ... $9 Replaced with the value of the shell command's first (second, third, ...) argument. If used within a shell script invoked by name, these symbols refer to the command-line arguments. Up to nine arguments can be referenced this way. To reference arguments beyond nine, you must use the reference notation $argv[n] or the built-in command shift.
$* Equivalent to $argv[*]. Replaced with all the arguments passed to the shell.
$$ Replaced with the process number of the current shell. When a subshell is invoked, $$ returns the process ID of the parent shell.
$< Replaced with a line of text read from the standard input file.

The variables $1, $2, ... $9 have special significance when used inside a shell script, because they refer to the arguments of the command line that invoked the shell script. The same command arguments are accessible via the array variable argv. By using the argv variable, you can refer to all command-line arguments, not just the first nine. For example, $argv[10] references the tenth argument, and $argv[$n] references whichever argument is designated by another variable $n.

The shift built-in command can be used to manipulate command arguments. See "Shell Programming," later in this chapter, for details about the shift command.
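
The following fragment is a minimal sketch of these argument references. The argument values are invented for illustration, and argv is assigned by hand only so that the fragment can be run without a real command line; in an actual script, the shell fills in argv for you.

```csh
# Simulate a command line with eleven arguments (hypothetical values).
set argv = (a b c d e f g h i j k)

echo "count: $#argv"          # number of arguments
echo "first: $1"              # same as $argv[1]
echo "tenth: $argv[10]"       # beyond $9, use the argv array

set n = 11
echo "nth:   $argv[$n]"       # index taken from another variable

shift                         # discard the first argument
echo "after shift: $1 ($#argv left)"
```

After the shift command, the old second argument becomes $1 and the argument count drops by one.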

Using Predefined Variables

The C shell also recognizes a number of conventionally named variables as having special meaning. Some are initialized automatically when the shell starts; you set others yourself via the set command or by using command-line options when the C shell program csh is invoked. You can assign a value to most of these variables, but some variables are set automatically by the shell when a corresponding event occurs.


NOTE: All predefined shell variables have lowercase names. This avoids conflicts with environment variables, which usually have uppercase names.

To set any predefined variable, use the set command. You need to specify a value only if the variable requires one; otherwise, you can omit the value string. For example, use set noclobber to enable the noclobber option, but use set prompt='$cwd: ' to assign a new command-line prompt string. See "Displaying and Setting Local Shell Variables: set," earlier in this chapter, for more information about set.

You can use the unset built-in command to destroy the variable and any associated value, but be aware that an unset variable does not revert to its initial or default value and is not the same as a variable having a null value; an unset variable simply doesn't exist. See the unset built-in command in "Deleting Local Shell Variables: unset," earlier in this chapter, for more information about unset.
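
As a short sketch of the difference between a set variable, a null-valued variable, and an unset one, the fragment below uses the $?name reference form to test for existence. The variable name empty is made up for the example.

```csh
set noclobber                 # a switch: no value is needed
set history = 100             # a variable that takes a value
if ($?noclobber) echo "noclobber is set"

unset noclobber               # the variable no longer exists
if (! $?noclobber) echo "noclobber is gone"

set empty = ""                # exists, but has a null value
if ($?empty) echo "empty exists with a null value"
```

Note that the last test succeeds: a variable with a null value still exists, whereas an unset variable does not.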

Table 12.10 describes the variables to which the shell is sensitive and indicates any initialization or assignment restrictions.

Table 12.10. Predefined shell variables.

Variable Description
argv An array variable containing the current shell parameters. A reference to argv[1] is equivalent to $1, argv[2] to $2, and so on, up to $9. The value of argv is set by the shell at startup and just prior to the execution of each command.
cdpath An array variable specifying a list of directories to be searched by the cd command. The C shell does not provide an initial value for cdpath. If you do not provide a value, the cd command searches only the current directory to resolve unanchored pathnames (pathnames starting with . or / are considered to be anchored).
cwd Contains the full pathname of the current directory. On startup, the shell initializes cwd to the pathname of your home directory. Each cd command you execute changes the value of cwd. Note that $cwd may return a different value than the UNIX command pwd if a link was used to go to the current directory.
echo If set, the shell prints each command before execution. The echo variable is initialized to the null string if the -x or -X option is present on the csh command line; otherwise, the variable is left unset. You can activate command tracing at any time by executing the command set echo; to turn it off, use unset echo. Command tracing is effective only for the current shell invocation; it is not propagated into called shell scripts. For built-in commands, the echo occurs after all expansions are performed except command and filename substitution. Commands that are not built in are echoed immediately before execution.
history Specifies the number of commands to be maintained in the history list. The shell retains at least this many lines of command history if sufficient memory is available. The history variable is not initialized automatically and does not need to be assigned a value. If unset, the shell maintains an optimum amount of command history for the size of available memory. You can set the value of history at any time.
home Initialized to the value of the HOME environment variable at shell startup. The value of home is used as the default directory for cd and as the value substituted for ~. It is almost always improper for you to change the value of home, but you are not prevented from doing so.
ignoreeof If set, the shell ignores an end-of-file (EOF) character typed at the beginning of a line. If not set, an EOF character typed at the beginning of the line signals the shell to exit, which, for your logon shell, also logs you out. The specific key corresponding to the EOF character can be displayed and changed by using the UNIX stty command. Many C shell versions still log you out if a large number of consecutive EOF characters are received.
mail An array variable listing the files to be monitored for change. If the first value is numeric, it specifies the frequency in seconds that the shell should check for new mail. The default frequency varies between C shell versions but is commonly five or 10 minutes. If the last modification date of any one of the files is observed to change, the shell issues the message New mail in name, where name is the name of the file that changed. (If mail lists only one file to be monitored, the notification message is You have new mail.) The following command monitors two mail files and specifies a 10-second interval for mail checking: set mail=(10 /usr/mail/taylort /usr/spool/mail/taylort).
noclobber If set, the shell does not replace an existing file for the I/O redirection >. For >>, it requires that the target file already exist. You can activate the option with the command set noclobber and turn it off with unset noclobber. When noclobber is set, you can use >! and >>! to perform the redirection anyway. The noclobber variable is unset initially.
noglob If set, filename expansion using the pattern characters *, ?, and [...] is disabled. The noglob variable is unset initially.
nonomatch If set, a filename pattern that matches no files is passed through unchanged to the command. By default, the shell issues an error message and ignores a command if no matching files can be found for a filename pattern argument. (Note that nonomatch is the default behavior of the Bourne shell.) Use set nonomatch to accept unmatched pattern arguments and unset nonomatch to force a shell error message. The nonomatch variable is unset initially.
notify If set, the shell writes a message to your terminal at once if the status of a background job changes. By default, the shell does not notify you of status changes until just before issuing the next command-line prompt. Be aware that setting notify can cause messages to appear on your screen at inopportune times, such as when using a full-screen editor. The initial value of notify is unset.
path An array variable listing the directories to be searched for commands. If the path variable is not set, you must use full, explicit pathnames to execute non-built-in commands--even those in your current directory (./mycmd, for example). The initial value of path is the same as the PATH environment variable.
The shell maintains a hash table of all the executable files in your search path. The hash table is initialized at startup time and is rebuilt whenever you change the value of path or PATH. Note that if a new command is added to one of the directories in your search path (including your current directory), however, the shell might not necessarily be aware of the addition and might fail to find the command even though it exists. Similarly, removing an executable file from a directory early in your search path might not allow the execution of a like-named command in some other directory. In either of these cases, use the rehash built-in command to force rebuilding of the shell hash table.
Other than the cases mentioned earlier, the shell hash table is invisible to you. It exists to speed up the search for commands by skipping directories where a command is known not to exist.
prompt Your prompt string. The value of prompt is printed at the start of each line when the shell is ready to read the next command. The value of prompt is scanned for variable and command substitutions before printing; history substitutions are allowed in the prompt string and refer to the command you last entered. The initial value of prompt is the string "% " (a percent sign followed by a blank). Or, if you are the superuser, the value is "# " (a pound sign followed by a blank).
savehist Specifies the number of history lines to save to ~/.history when you exit your logon shell. When you log on the next time, the C shell executes the equivalent of source -h ~/.history. Not all versions of the C shell support the savehist variable.
shell Because the C shell is capable of executing only shell scripts written in the C shell language, a mechanism is needed so that shell scripts written for the Bourne shell can be detected and passed to the proper program for execution. Any shell script in which the first line begins with a nonexecutable command is considered to be a Bourne shell script. To support this convention, Bourne shell scripts usually specify the : built-in command on the first line; there is no : command in the C shell. Similarly, scripts intended for the C shell usually begin with a comment line that has the pound sign (#) in the first position. (Note that, for versions of UNIX that support the #!commandInterpreter notation, the first line for a C shell script is #!/bin/csh or, for the Bourne shell, #!/bin/sh. The #! notation helps eliminate the need for the shell variable.)

When the shell recognizes that a shell script has been invoked but is not a valid C shell script, the value of shell is used as the initial part of a command to execute the script. The value of shell is initialized to the full pathname of the C shell by using a system-dependent directory prefix (usually, /bin/csh). Any number of options and arguments can be specified along with the shell pathname, however; the filename of the shell script is appended to the value of shell.

You should change the value of shell if you intend to execute Bourne shell scripts. (Note that many commands supplied with UNIX are implemented as Bourne shell scripts.)
status Contains the exit code of the last command executed as a decimal number. The value of status is changed after the execution of each command, so it generally is useless for you to assign a value to status.
time If set, the value of time should specify a number of seconds. Any command you execute that exceeds this time limit causes the shell to print a warning line giving the amount of time that the command used and the current CPU-utilization level as a percentage. The initial value of time is unset.
verbose If set, causes each command to be printed after history substitutions but before other substitutions. The verbose option generally is used within a shell script to echo the commands that follow. The initial value of verbose is unset. The verbose variable can be set automatically by the -v and -V options.
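
To illustrate one of the variables in Table 12.10, here is a small sketch of noclobber and the >! override. The scratch file name under /tmp is an assumption made for the example; any writable file would do.

```csh
set tmp = /tmp/noclobber_demo.$$     # hypothetical scratch file
echo first > $tmp                    # create the file
set noclobber                        # plain > now refuses to overwrite it
echo second >! $tmp                  # >! overwrites despite noclobber
set result = "`cat $tmp`"
unset noclobber
rm -f $tmp
echo $result
```

With noclobber set, a plain echo second > $tmp would have been rejected by the shell because the file already exists.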

Shell Programming

Although the C shell provides a number of useful extensions to the keyboard interface (such as the command-history mechanism, job control, and additional filename wildcards), its most significant departure from the traditional Bourne shell probably is its syntax for programming constructs--array variables; variable reference forms in general; arithmetic expressions; and the if, while, foreach, and switch statements.

Array variables were discussed in "Using Array Variables," earlier in this chapter. The syntax of variable references was discussed in "Obtaining Variable Values with Reference Expressions." The section "Using Expressions and Operators in Shell Statements," later in this chapter, discusses arithmetic expressions and the special @ command used for calculations. This section looks at the shell statements for flow control: the conditional statements if and switch and the loop control statements while and foreach.

What Is a Shell Script?

A shell script is simply a text file containing shell commands. What makes shell scripts especially handy is the capability to execute the commands in the file simply by typing the file's name as if it were a command. To put it another way, shell scripts provide a fairly painless way to add new commands to your UNIX system. A shell script can be as simple or as complicated to write as you choose. It can be designed to be used by yourself alone, or by many people as a general-purpose command.

Generally, you'll want to write a shell script when you recognize either of two situations:

  • You find yourself repeating a lengthy series of commands over and over to accomplish one general task. Any time you need to accomplish a task on a fairly frequent basis (daily, weekly, or maybe several times a day), and the task requires more than one UNIX command, the task is a good candidate for packaging in a shell script.

  • A repeatable procedure needs to be established for a formal activity. Printing a weekly customer invoicing report, for example, might require a complex procedure--extracting billing information from a master file, computing the invoice data, setting up the printer, and actually generating the print file.

As a general rule, shell scripts written for the first purpose tend to be straightforward to write, whereas the more formal procedures demand generalized shell scripts of greater complexity.

Writing Shell Scripts: An Overview

Writing a shell script is much like entering commands at the keyboard, with a few important differences:

  • You might want to give arguments to your command. The shell automatically puts any words entered on the command line following your script's name into a set of parameters held by the shell variable argv. You don't need to take any special action to get arguments from the command line; they're already available in the parameter array argv when your script begins its execution. See "Using Predefined Variables," earlier in this chapter, for information on accessing argv.

  • You might want to support one or more options with your new command. The shell passes options to your script the same as other command-line arguments. Options, however, can have a complicated structure, especially if you intend to support the standard UNIX convention for options. See the description of the UNIX getopt command for help with processing command-line option strings.

  • Keyboard commands usually are entered with all information customized to the command's use (ls -l foo.html bar.gif), whereas commands inside shell scripts often are parameterized (ls $opt $1 $file) and can be executed conditionally. You parameterize a command by providing variable references and filename substitutions as the command's arguments instead of literal text. To write alternative sets of commands to handle different situations, you need to use the shell's if, switch, while, and foreach commands. These commands rarely are used at the keyboard but occur heavily in shell scripts.

You use the same general procedure for writing shell scripts, regardless of their purpose:

1. Create a text file containing the required commands.

2. Mark the text file as executable by using the chmod command:

chmod +x filename


			
3. Test the shell script.

4. Install the script in its permanent location.

5. Use your script.

You probably already know how to prepare text files by using a text editor. If not, see Chapter 3 of Volume II, "Text editing with vi and emacs." You can use any text editor you want, because the shell is interested only in the file's contents, not in how you created it. The text file cannot contain the formatting characters generated by some word processors, however; the shell script must contain lines identical in format and content to those you would enter at the keyboard. For this reason, you'll probably use a general text editor, such as vi, to prepare shell script files.

A text file must be marked as executable in order to be invoked as a command by entering its filename. You can execute a file as a command even if it is not marked as executable by naming it as the first argument of a csh command. For example, csh payroll causes the shell to search for a file named payroll using the standard search path (defined by the path variable), to open the file for reading, and to proceed to execute the commands in the file. But if you mark the payroll file as executable, you don't have to type csh first: payroll becomes a new command.

The shell uses the same search path for locating script files as it does for locating the standard UNIX commands. To invoke a shell script by name, you must store it in a directory listed in your search path. Alternatively, you can add the directory in which the shell script resides to your search path. Naming too many directories in the search path can slow down the shell, though, so shell scripts commonly are gathered into a few common directories.

You'll find that if you do any shell script writing at all, having a directory named bin under your home directory is very handy. Place all the shell scripts you write for your personal use in ~/bin, and include the directory ~/bin in your search path. Then, to add a new command to your personal environment, simply write a command script file, mark it as executable, and store it in the ~/bin directory: it's ready for use. Because it generally is not a good idea to place your ~/bin ahead of the standard bin directories, /bin and /usr/bin, use aliases to reference customized common UNIX commands so you will not have to prepend the customized command with ~/bin/. For example, if you create a script that replaces/enhances the UNIX rm command, you should set up the following alias:

alias rm ~/bin/rm

Now you can use your new rm without typing ~/bin/rm and without compromising your search path efficiency.

Shell scripts intended for use by a community of users generally are installed in a general directory not owned by any specific user, such as /usr/bin or /usr/local/bin. Most system administrators prefer to store locally written script files in a separate directory from the standard UNIX commands; this makes system maintenance easier. If your installation practices this procedure, you probably already have the path of the local commands directory in your search path. You'll need the help of the system administrator to store a shell script file in the public directory, though, because you probably won't have write access to the directory (unless you're the administrator).

There is nothing magical about testing shell scripts. As a rule, you'll develop a new shell script in a directory you set aside for that purpose. The directory might contain data files you use to test the shell script, and possibly several versions of the script. You won't want to make the script file accessible to others until you finish testing it.

If you find the behavior of a shell script confusing or otherwise unexplainable, you might find it helpful to see the commands the shell actually executes when you run the script. Simply invoke the script with the -x option (for example, csh -x payroll), or embed the command set echo in the script file or modify the #! directive to be #!/bin/csh -x while you are testing the script. With the echo variable set, the shell prints each command just before executing it. You'll see variable substitutions, filename expansions, and other substitutions all expanded, so that you'll know exactly what the shell is doing while running your script. With this trace to look at, you probably will have no difficulty finding errors in your script file.

If the -x output is especially voluminous, you can cut down the range of commands displayed by the shell by bracketing the commands you want to trace. Put the command set echo in front of the range of commands to be traced and the command unset echo at the end. The shell prints just the commands between set and unset while running your script file. Don't forget to remove the set and unset commands after you finish testing and before putting the shell script into production use.
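
A minimal sketch of this bracketing technique follows. The file names and the arithmetic are placeholders; only the commands between set echo and unset echo are traced.

```csh
set files = (a.txt b.txt)     # hypothetical file names; not traced
set echo                      # start tracing here
set count = $#files
@ total = $count * 2
unset echo                    # stop tracing (this command is still echoed)
echo "total is $total"        # not traced
```

The trace output appears alongside the script's normal output, so the final echo line is printed once, without a preceding trace of the command itself.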

A Simple Shell Script

Shell scripts can be very easy to write. The following lines, if entered into a file named lld, implement a new command that lists all the directories and subdirectories (often called the directory tree) contained in a directory:

# lld - long listing of directories only
if ($#argv < 1) set argv=(.)
find $argv[*] -type d -exec /bin/ls -ld \{\} \;

The lld script contains only three commands. The first, a line containing only a shell comment, serves as a heading and description of the file for anyone displaying it. Many shell script writers place one or more comment lines at the beginning of their script files to provide some documentation for others, in case anyone ever needs to read, change, or enhance the script. Actually, a well-written script file contains many comment lines to help explain the script's operation. Scripts you write for your own use don't need to contain as many comments as scripts written for more public consumption.

The operative statements in the lld script do two things:

  • Provide a default command-line argument if the user didn't provide any. In this case, if the user specifies no directory names, the lld command lists the current directory.

  • Execute the UNIX find command to locate just the directory and subdirectory files contained in the named directory. The -exec option invokes the ls command for each subdirectory located.

Even though the lld shell script is short, it serves the useful purpose of hiding the relatively complicated find command from its users. Even for users very familiar with the find command, it is much quicker to type lld than to type the complete find command.


TIP: Most UNIX commands are very Spartan in their output. The undecorated output is used to help ease processing when passed to another program. When the output of a command is to the user, however, more adorned output (status messages, debug messages, and header lines, for example) often is desired. As a result, shell scripts should have an option for more or less verbosity. The alias vprint (short for verbose print) provides a mechanism to control verbosity. Use the vprint alias whenever output should be based on a verbosity level:
alias vprint 'if ($?bewordy || $?BEWORDY) echo \!*'

Checking both a shell variable and an environment variable provides flexibility. If a verbose option is given to a C shell script, the bewordy variable should be set. The environment variable BEWORDY provides a mechanism to control the verbosity of scripts called by scripts. The following example shows a simple script for replacing the token DATE in files with the system date; the script takes advantage of the verbosity alias:

#!/bin/csh
# Replace the token DATE in every .html file with the system date.
if ("x$1" == "x-v") then
    set bewordy; shift
endif
set DATE="`date`"
foreach F (*.html)
    vprint Processing file $F
    sed "s/DATE/$DATE/" $F > $F.$$
    mv -f $F.$$ $F
end


			

Using Expressions and Operators in Shell Statements

In a number of contexts, the shell requires you to write an expression. An expression is a combination of terms and operators which, when evaluated, yields an arithmetic or logical result. An arithmetic result always is represented as a string of decimal digits. A logical value is true or false. In the C shell, a true condition is indicated by 1, and a false condition is indicated by 0. An arithmetic value can be used where a logical value is expected. Any nonzero value is construed as true, and a zero value is construed as false.


NOTE: The logical values of true and false used by expressions are different from the values of true and false used by the conditional execution operators && and ||. In fact, the values are reversed: true is zero and false is any nonzero value. This can be demonstrated by using the UNIX commands true and false:

			
% true
% echo $status
0
% false
% !ec
echo $status
1



			

The reason for this reversal is that only a zero return code indicates successful execution of a command; any other return code indicates that a command failed. So true indicates successful completion, and false indicates failure, as demonstrated in the following example with the C shell && and || operators:


			
% true && echo command OK
command OK
% false || echo command Failed
command Failed


			

A digit string beginning with 0 (for example, 0177) is considered an octal number. The shell generates decimal numbers in all cases, but wherever a number is permitted, you can provide a decimal or an octal value.

Expressions can be used in the @ (arithmetic evaluation), exit, if, and while commands. For these commands, most operators do not need to be quoted; only the < (less than), > (greater than), and | (bitwise or) operators must be hidden from the shell. It is sufficient to enclose an expression or subexpression in parentheses to hide operators from the shell's normal interpretation. Note that the if and while command syntax requires the expression to be enclosed in parentheses.

When writing an expression, each term and operator in the expression must be a separate word. You usually accomplish this by inserting white space between terms and operators. Observe the shell's response to the following commands, for example. (The @ built-in command is described later in this chapter; it tells the shell to evaluate the expression appearing as its arguments. Note that a space is required between the @ command and its first argument.)

% set x=2 y=3 sum
% @ sum=$x*$y
2*3: no match
% @sum = 1 + 2
@sum: Command not found.
% @ sum=$x * $y
% echo $sum
6

In the first @ command, after substitution, the shell sees the statement @ sum=2*3. Because 2*3 is a single word, the shell tries to interpret it as a number or an operator. It is neither, so the shell complains because the word starts with a digit but contains non-digit characters.

Most operators have the normal interpretation you might be familiar with from the C programming language. Both unary and binary operators are supported. A complete list of the expression operators supported by the C shell appears in Table 12.11.

Operators combine terms to yield a result. A term can be any of the following:

  • A literal number--for example, 125 (decimal) or 0177 (octal).

  • An expression enclosed in parentheses--for example, (exp). Using a parenthesized expression hides the <, >, and | operators from the shell's normal interpretation. The parenthesized expression is evaluated as a unit to yield a single numeric result, which then is used as the value of the expression. Parentheses override the normal operator precedence.

  • Any variable, command, or history substitution (or combination of these) that, when evaluated, yields a decimal or octal digit string. The usual shell replacement mechanisms are used when scanning an expression. The only requirement you must observe is that, after all substitutions, the resulting words must form decimal or octal digit strings or expressions.

Arithmetic and Logical Operators

You can use the operators shown in Table 12.11 to combine numeric terms. Arithmetic operators yield a word consisting of decimal digits. Logical operators yield the string "1" or "0". Remember that expressions containing the < (less than), > (greater than), and | (bitwise or) characters must be hidden from the shell by using parentheses--for example, @ x = ($val << 2).

Table 12.11. Arithmetic and logical shell operators.

Operator Syntax Operation
~ ~a Bitwise 1's complement. Each bit of a is inverted, so that 1 yields 0 and 0 yields 1. Looking only at the four low-order bits, 5 is 0101 and ~5 is 1010. Because every bit of the integer is inverted, not just the low-order four, the shell reports the result of ~5 as -6.
! !a Logical negation. If the value of digit string a is 0, the value of the expression is 1; if the value of digit string a is nonzero, the value of the expression is 0.
* a*b Multiplication. The value of the expression is the arithmetic product of a times b.
/ a/b Division. The value of the expression is the integer quotient of a divided by b.
% a%b Remainder (also known as modulo). The value of the expression is the remainder from the integer division of a by b. The expression 12 % 5 yields 2, for example. Modulo often is used to test for odd or even numbers--for example, if n % 2 yields 0, the number is even.
+ a+b Addition. Yields the sum of a and b.
- a-b Subtraction. Yields the difference of a minus b.
<< a << b Left shift. Shifts the bit representation of a left the number of bits specified by b. Equivalent to a * 2^b. The low-order bit pattern for 5 is 0101; applying the << operator with a value of 2 (5 << 2) yields 010100 or 20 in decimal.
>> a >> b Right shift. Shifts a right the number of bits specified by b. Equivalent to a / 2^b. The low-order bit pattern for 20 is 010100; applying the >> operator with a value of 2 (20 >> 2) yields 0101 or 5 in decimal.
< a < b Less than. Yields 1 if a is less than b; otherwise, the expression evaluates to 0.
> a > b Greater than. Yields 1 if a is greater than b; otherwise, the expression evaluates to 0.
<= a <= b Less than or equal to. Yields 1 if a is not greater than b; otherwise, yields 0.
>= a >= b Greater than or equal to. Yields 1 if a is not less than b; otherwise, yields 0.
=~ a =~ b Pattern matching. Yields 1 if string a matches pattern b.
!~ a !~ b Pattern matching. Yields 1 if string a does not match pattern b.
== a == b String equivalency. Yields 1 if a is identical to b when a and b are compared as strings.
!= a != b String non-equivalency. Yields 1 if string a is not identical to string b.
| a | b Bitwise or. Yields the inclusive or of a and b. 1010 and 1100 are the low order bit patterns for 10 and 12, respectively. Applying the | operator (10 | 12) yields a pattern of 1110 or decimal 14.
^ a ^ b Bitwise exclusive or. Yields the exclusive or of a and b. 1010 and 1100 are the low order bit patterns for 10 and 12, respectively. Applying the ^ operator (10 ^ 12) yields a pattern of 0110 or decimal 6.
& a & b Bitwise and. Yields the and of corresponding bits of a and b. 1010 and 1100 are the low order bit patterns for 10 and 12, respectively. Applying the & operator (10 & 12) yields a pattern of 1000 or decimal 8.
&& a && b Logical and. Yields 1 if a is not 0 and b is not 0; otherwise (if either a or b is 0), the expression evaluates to 0.
|| a || b Logical or. Yields 1 if a is not 0 or b is not 0 (one or both are true); otherwise, the expression evaluates to 0.
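
The worked values in Table 12.11 can be checked with a short fragment such as the following. The variable names are arbitrary, and the parentheses are there to hide the characters that the shell otherwise would treat as redirection or pipe symbols.

```csh
set a=10 b=12
@ ior = ($a | $b)             # bitwise or:           14
@ xor = ($a ^ $b)             # bitwise exclusive or:  6
@ band = ($a & $b)            # bitwise and:           8
@ shl = (5 << 2)              # left shift:           20
@ shr = (20 >> 2)             # right shift:           5
@ rem = 12 % 5                # remainder:             2
echo $ior $xor $band $shl $shr $rem

# Pattern matching compares a string with a filename-style pattern.
if ("report.html" =~ *.html) echo "pattern matched"
```

Each @ command stores its result as a string of decimal digits, so the values can be passed to echo or used in later expressions.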

Assignment Operators: Evaluating Expressions and Assigning the Results to Variables

You can use the @ command to evaluate an expression and assign the result to a variable or to an element of an array variable. The special characters <, >, and | must be quoted or enclosed in parentheses if they are part of the expression; other expression operators can be used without quoting. Used with no arguments, @ displays the values of all shell variables. The command has three forms:

@
@ name=expr
@ name[i]=expr

The assignment operators +=, -=, *=, /=, %=, <<=, >>=, |=, ^=, and &= also are supported. The format name operator= expr is equivalent to writing name = name operator expr; for example, @ x = $x + $y can be written as @ x += $y.

The C operators ++ and -- are supported in both postfix and prefix forms within expr. This usage is allowed for the @ command, but not for expr generally.

Use the form @ name[i]= to assign the result to the ith element of the array variable name.

The variable name (or array element name[i]) must exist prior to execution of the @ command; the @ command does not create it. A variable or array element is considered to exist even if it has a null value.
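
The fragment below is a small sketch of these assignment forms. The variable names and values are invented for the example, and the array v is created with set first because @ does not create variables or array elements.

```csh
set x=7 y=3
@ x += $y                     # x is now 10
@ x *= 2                      # x is now 20
@ x++                         # x is now 21
@ x--                         # x is back to 20

set v = (0 0 0)               # the array must exist before @ assigns to it
@ v[2] = $x / 4               # second element becomes 5
echo $x $v
```

Only the postfix forms of ++ and -- are used here, since those are the forms most commonly seen in C shell scripts.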

Operator Precedence for Arithmetic and Logical Operators

The C shell uses precedence rules to resolve ambiguous expressions. Ambiguous expressions are expressions containing two or more operators, as in a+b*c. This expression could be interpreted as (a+b)*c or as a+(b*c). In fact, the latter interpretation applies. Using the values a=3, b=5, and c=7, the expression a+b*c evaluates to 38--not 56.


NOTE: To make life easier for everyone, the shell's rules are identical to those of the C language and a superset of the same precedence rules used by the common, hand-held calculator.

In Table 12.11, operators appear in decreasing order of precedence. Operators fall into eight precedence groups:

  • Unary operators !, ~, and -. These operators have the highest priority. In succession, they associate right to left, so !~a is equivalent to the parenthesized expression !(~a).

  • Multiplicative operators *, /, and %.

  • Additive operators + and -.

  • Shift operators << and >>. The second argument is used as a count and specifies the number of bits by which the first argument should be shifted left or right. Bits shifted out are discarded--for example, 5 >> 1 yields 2.

  • Relational operators <, <=, >, and >=. These operators compare their operands as numbers and yield 1 (true) if the relation is true or 0 (false) if it is not.

  • Equality operators ==, !=, =~, and !~. Note that, unlike other operators, these operators treat their arguments as strings. This requires caution, because the strings " 10", "10 ", and " 10 " all appear unequal even though they are equivalent numerically. To compare strings numerically, use an expression such as $val == ($x + 0).

  • Bitwise operators |, ^, and &. These operators combine the internal binary form of their operands, applying an inclusive or, exclusive or, or an and function to corresponding bits. Definitions of these operations follow:

Inclusive or: Generates a 1 if either of the argument's bits is 1--thus (in binary), 0110 | 1010 yields 1110.

Exclusive or: Generates a 1 if corresponding bits are different--thus (in binary), 0110 ^ 1010 yields 1100.

And: Generates a 1 if both source bits are 1--thus, 0110 & 1010 yields 0010.

  • Logical operators && and ||. These operators accept numeric values and yield 1 or 0.
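
The shift, bitwise, and equality rules above can be sketched as follows. The variable names are illustrative, and the expressions containing >, |, or & are parenthesized because those characters are otherwise special to the shell.

```csh
@ s = ( 5 >> 1 )      # shift right one bit: 2
@ o = ( 6 | 10 )      # 0110 | 1010 = 1110 (14)
@ e = 6 ^ 10          # 0110 ^ 1010 = 1100 (12)
@ n = ( 6 & 10 )      # 0110 & 1010 = 0010 (2)
echo $s $o $e $n      # prints: 2 14 12 2

set val = " 10"       # note the leading blank
if ("$val" == 10) then
    echo "equal as strings"
else
    echo "not equal as strings"    # this branch runs
endif
@ num = $val + 0      # arithmetic yields the plain number 10
if ($num == 10) echo "equal as numbers"
```

Forcing the value through arithmetic before comparing is the same idea as the $val == ($x + 0) expression mentioned in the equality bullet.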

Operators for Command Execution and File Testing

The shell also supports an additional, unconventional set of operators for command execution and file testing in expressions.

Within an expression, you can write a command enclosed in braces ({}). The value of a command execution is 1 if the command executes successfully; otherwise, the value is 0. In other words, a zero exit code yields a value of 1 (logical true) for the command expression { command }; a nonzero exit code yields a value of 0. Many versions of the C shell require spaces inside the braces--for example, @ x = { true } sets x to 1, whereas @ x = {true} yields a syntax error.

Operators for file testing enable you to determine whether a file exists and what its characteristics are. These operators have the form -f filename and are treated in expressions as complete subexpressions. For filename, specify the name or path of a file, optionally using pattern characters. The argument is subject to all forms of shell substitution, including filename expansion before testing.

Table 12.12 summarizes the file-testing operations supported within expressions.

Table 12.12. File-testing expressions.

Expression     Condition When True
-r filename    True if file exists and is readable
-w filename    True if file exists and is writable
-x filename    True if file exists and is executable
-e filename    True if file exists
-o filename    True if file exists and is owned by the current real User ID
-z filename    True if file exists and is zero length
-f filename    True if file exists and is a regular file
-d filename    True if file exists and is a directory

The following are examples of an expression that mixes file test operators with other operators. In the first case, the expression tests whether the file is readable and not a directory. The second case is somewhat contrived, but it shows how these operators can be part of an equation if necessary.

if (-r $thisfile && ! -d $thisfile) echo Good file
@ x = -f ~/.cshrc + { grep -q foo ~/.cshrc }
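
A slightly longer sketch shows the file tests and the brace form side by side. The scratch file name is hypothetical, and the fragment assumes the standard true and false commands are on the search path.

```csh
set f = /tmp/csh_ftest.$$        # hypothetical scratch file name
cp /dev/null $f                  # create an empty file
if (-e $f && -z $f) echo "exists and is empty"
if (-f $f && ! -d $f) echo "regular file, not a directory"
@ ok = { true }                  # exit code 0 yields 1
@ bad = { false }                # nonzero exit code yields 0
echo $ok $bad                    # prints: 1 0
rm -f $f
```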


TIP: Checking for the value of parameters passed into a shell script is a common task. This task is complicated by the fact that your shell script may take option flags in the form of -option, where option is a single character. Suppose that you want to test for the presence of a -d option, which places your script into a verbose debug mode. The if statement

if ($1 == -d) set debug

works as long as the user does not pass in the -d, at which point the shell responds with a message like this:

if: Missing file name

Reversing the order of comparison--if (-d == $1)--only worsens the problem, because now the shell reports the error every time the script runs. The problem is that the -d argument is interpreted as the -d directory existence operator, whose syntax requires a filename to be present. You can use a technique known as double aliasing to overcome the undesired behavior. Simply place a lowercase x in front of both arguments to the operator ==. The if statement then is transformed to

if (x$1 == x-d) set debug

Now the test for the -d option works. Unfortunately, all is not well yet; some perverse, pesky user will do something subversive that can break this syntax (and because I am the only one who uses most of my scripts, I know of what I am speaking). Suppose that you pass a quoted argument that is two words--for example,

x.csh "two words" more args

The C shell complains with a message like this:

if: Expression syntax

This last hurdle can be cleared by enclosing the test in quotes:

if ("x$1" == "x-d") set debug

The test now is as bulletproof as possible.


Entering Comments in Shell Programs

Program code that seemed quite logical six months ago often seems fairly obscure today. Good programmers annotate their programs with comments. Comments are entered into shell programs by inserting the pound sign (#) special character. When the shell interpreter sees the pound sign, it considers all text to the end of the line as a comment.

The pound sign introduces a comment only if the current shell is not interactive. If the command

% echo a pint is a # the world round
a pint is a # the world round

is entered interactively, the results include the # symbol. If the line is executed from a shell script, the words after the # symbol are not displayed, as in this example:

% cat pound.csh
echo a pint is a # the world round
% ./pound.csh
a pint is a

A special form of the comment can be used to specify the command processor of a script file. The #! comment tells the current shell what program will run the script. If you are writing C shell scripts, your first line should always be

#!/bin/csh

This ensures that your script always is run by the C shell. Not all versions of UNIX support this, but most do. Even if a version does not support the #! notation, the # is interpreted as a comment, so no harm comes by using the #! comment. If a Bourne shell user wants to run your script, he will not have to enter "csh script"--simply "script" suffices.

Conditional Statements

A conditional statement provides a way to describe a choice between alternative actions to the shell. The choice actually is made by the shell while executing commands, based on decision criteria you specify. You write a conditional statement when you want your shell script to react properly to alternative real-world situations--for example, to complain when the user omits required command-line arguments or to create a directory when it is missing.

The shell supports two (well, three) commands for conditional execution: if, which evaluates an expression to decide which commands should be executed next; and switch, which chooses commands based on matching a string. The if statement is more appropriate for deciding whether to execute a command or to choose between two commands. The switch statement poses a multiple-choice question; it is designed to handle a situation in which many different actions can be taken, depending on the particular value of a string.

The goto command, although not strictly a conditional statement because it makes no decision, is nonetheless generally used with a conditional statement to move around to arbitrary places in a shell script. The goto command, although valuable in some limited contexts, generally leads to poorly structured shell scripts that are difficult to test and maintain. Experience with the Bourne and Korn shells, which have no goto command, shows that goto is never necessary. You should try to avoid using the goto statement whenever possible.

The following subsections look at the if and switch statements in more detail.

Using the if Statement

There are really two different forms of the if statement: a single-line command and a multiline command.

The single-line command has the general syntax

if (expr) command

Use this form when you need to conditionally execute only one command. This form of the if statement provides the basic type of conditional execution: either you execute the command or you don't. expr can be any valid expression, as described in "Using Expressions and Operators in Shell Statements," earlier in this chapter. If the expression evaluates to a nonzero value at runtime, the expression is considered to be true, and the shell executes command. But if the value of the expression after evaluation is 0, the shell simply skips command, doing nothing. In either case, the shell continues to the next consecutive line of the script file.


CAUTION: Some implementations of the C shell perform an I/O redirection on command even if expr evaluates to false. Unless you have confirmed that your version of csh works otherwise, you should use redirections on the single-line if statement with this presumption in mind.
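
One way to sidestep the redirection problem is to use the multiline form, where the redirected command is skipped entirely when the test fails. The sketch below assumes two hypothetical file names; on an affected shell, the single-line equivalent, if (-f $infile) wc -l $infile > $outfile, could create $outfile even though the input is missing.

```csh
set infile = /tmp/no_such_input.$$    # hypothetical input file, deliberately absent
set outfile = /tmp/if_redirect.$$     # hypothetical output file

# The redirection belongs to a command that is skipped when the test is false.
if (-f $infile) then
    wc -l $infile > $outfile
endif
if (! -e $outfile) echo "no output file was created"
```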

The multiline command has a more complex syntax:

if (expr) then
commands
else if (expr) then
commands
else
commands
endif

In this case, the if statement consists of all lines beginning with if up to and including the endif line. The multiline form provides a way to tell the shell "either do this or do that." More precisely, the shell executes a multiline if statement as follows: It evaluates the expr expression. If the result is nonzero, the shell executes the command group (commands) following then, up to the next else or endif, and skips any remaining else groups. If the result is 0, the shell skips the command group following then and moves on to the next else. For an else if, it evaluates the new expr and makes the same decision again; for a plain else, it executes the commands that follow. On reaching endif, the shell simply resumes normal command execution. The endif clause performs no action itself; it merely marks the end of the if statement.

Notice that, in its basic form, if...then...else, the multiline form of the if statement provides for choosing between two mutually exclusive actions based on a test. The expr expression provides the basis for the choice. The special words then and else introduce command groups associated with the true and false outcomes, respectively.
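
The following fragment is a small sketch of the full if...else if...else chain. It assumes /tmp exists and is a directory; the variable names target and kind are illustrative.

```csh
set target = /tmp
if (-d $target) then
    set kind = directory
else if (-f $target) then
    set kind = "regular file"
else
    set kind = missing
endif
echo "${target}: $kind"
```

Only one of the three command groups runs. The braces in ${target} keep the colon that follows from being read as a variable modifier.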

Because both the single-line and multiline forms of the if statement form complete commands, and you can (indeed, you must) embed commands within an if statement, you can nest if statements by writing one inside the other. Programmers refer to this construct as a nested if statement. Nested if statements are legal but can be confusing if the nesting is carried too far. Generally, one level of nesting (an if inside an if) is considered fair and reasonable; two levels deep (an if inside an if inside an if) is treading on thin ice, and three or more levels of nesting implies that you, as the writer, will forever be called on to make any necessary changes to the script file (the flypaper theory of programmer management). Of course, you are helpless to a certain extent; the amount of nesting you use depends on the job you are trying to do, and not very much on your sense of aesthetics.

In case you don't have a clear idea of how if statements work, here's an example of a single-line statement:

if (-d ~/bin) mv newfile ~/bin

This simple if statement provides an expression that is true only if a file named bin exists in your home directory (~/bin) and is a directory. If the directory exists, the shell proceeds to execute the mv command in the normal fashion. If the directory ~/bin doesn't exist, the entire expression (-d ~/bin) is false, and the shell goes on to the next line in the script file without executing the mv command; the mv command is skipped. The entire statement can be interpreted as the directive move the file newfile to the directory ~/bin if (and only if) the directory ~/bin exists; otherwise, do nothing.

Here's a more complex example using the multiline if statement. In this example, the shell is directed to move the file newfile into the directory ~/bin if it exists, and otherwise to write an error message to the user's terminal and abandon execution of the shell script:

if (-d ~/bin) then
    mv newfile ~/bin
else
    echo ~/bin: directory not found
    exit 1
endif

The longer, multiline if statement is the more appropriate of the two examples for many situations, because it provides the user with some feedback when the script can't perform an expected action. Here, the user is given a helpful hint when the if statement fails to move the file as expected: Either create the missing directory or stop asking to move files there.

Even the dreaded nested if statement can arise from natural situations. For example, the following nests a single-line if statement inside a multiline if statement:

if (-f newfile) then
    if (! -d ~/bin) mkdir ~/bin
    mv newfile ~/bin
else
    echo newfile: file not found
    exit 1
endif

This last example uses a slightly different approach than the previous two; it begins by dealing with the basic choice between the case where the file to be moved exists or doesn't. If newfile doesn't exist, you can reasonably conclude that the user doesn't know what he's talking about--he should never have invoked the shell script containing these lines, so describe the problem to him and abandon the shell script (the error message is a good candidate to be printed on the standard error using the stderr alias described in "Echoing Arguments to Standard Output"). All the error work is done by the lines following else. If the file newfile exists, however, the script moves the file as expected, creating the directory ~/bin on-the-fly if it doesn't already exist.

As the previous examples show, the if statement often is used in shell scripts as a safety mechanism; it tests whether the expected environment actually exists and warns the user of problems. At the keyboard, you simply would enter the mv command by itself and analyze any error message it reports. When used inside a shell script, the script must decide how to proceed when the mv statement fails, because the user didn't enter the mv command himself--in fact, he might not even realize that invoking the shell script implies executing an mv command. The responsible shell script writer takes into account command failures and provides proper handling for all outcomes, producing scripts that behave in a predictable fashion and appear reliable to their users.

Using the switch Statement

The switch statement is like if but provides for many alternative actions to be taken. The general form of the statement follows:

switch (string)
case pattern:
   commands
default:
   commands
endsw

Literally, the shell searches among the patterns of the following case statements for a match with string. In actual use, string generally is the outcome of variable and command substitution, filename generation, and possibly other forms of shell substitution. Each case statement between switch and endsw begins a different command group. The shell skips over command groups following case statements up to the first case statement that matches string. It then resumes normal command execution, ignoring any further case and default statements it encounters. The default: statement introduces a statement group that should be executed if no preceding case statement matched the string. The required endsw statement provides an ending boundary to the switch statement in case the shell still is skipping over commands when it reaches that point; the shell then reverts to normal command execution.

In practice, you'll usually place a breaksw statement after each commands group to prevent the shell from executing the commands in case groups after the one that matched. On rare occasions, you'll have two cases where one case requires some additional preliminary processing before the other case. You then can arrange the two case groups so that the shell can continue from the first case commands group into the second case commands group, by omitting a breaksw. Being able to arrange case groups to allow fall-through is rare, however.
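
Fall-through can be sketched as follows, assuming a hypothetical filename held in the variable file. The first group deliberately omits breaksw so that execution continues into the second group.

```csh
set file = main.c                 # hypothetical file name
switch ($file)
case *.c:
    echo "C source file"
    # no breaksw here, so execution falls through into the next group
case *.h:
    set kind = code
    breaksw
case *.txt:
    set kind = text
    breaksw
default:
    set kind = unknown
    breaksw
endsw
echo $kind                        # prints: code
```

With file set to a name ending in .h, only the second group runs; the message from the first group appears only for names ending in .c.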

Suppose that you want your shell script to prompt the user for a choice. The user should respond by typing y (for yes) to proceed or n (for no). The switch statement provides a natural implementation because of its string pattern-matching capability:

echo -n "Do you want to proceed?"
set reply=$<
switch ($reply)
case [Yy]*:
   mv newfile ~/bin
   breaksw
default:
   echo newfile not moved
endsw

The echo statement writes a prompt message to the terminal. The -n option causes the cursor to remain poised after the message so that the user can type a reply on the same line. The set statement uses the shell special variable $< to read a line from the terminal, which then is stored as the value of the reply variable. The switch statement tests the value of reply. Although the syntax of switch calls for a simple string between parentheses, variable substitution is performed before analysis of the switch statement, so by the time the shell executes switch, it sees the user's typed response as a string instead of a variable reference. In other words, if the user types yes, after substitution, the shell sees the switch statement as if it had been written switch ("yes").

There is only one case in the switch, plus a default case. If the user types any line beginning with the letter y or Y, the value of $reply matches the pattern string for the first case; the shell then executes the lines that follow the case statement. When it reaches breaksw, the shell skips forward to the next endsw statement.

If the user's typed reply does not begin with the letter y or Y, it won't match any of the case-statement patterns (there is only one). This causes the shell to reach the default: case while still in skipping mode. The effect of default is to start executing statements if the shell is in skipping mode (which means that the default case must be last in the list of cases), so the effect is to provide a case where the user doesn't type a y or Y. The shell script prints a little message to the terminal confirming that nothing was done. Normal execution then continues to and beyond the endsw.

Here's a slightly more advanced example, where the first command-line argument of the shell script could be an option beginning with - (dash). If the argument is an option, the script saves an indication of the option it found for later reference and discards the option. If it finds an unexpected option, it complains with an error message to the user and abandons execution.

if ($#argv >= 1) then
   switch ($argv[1])
      case -all:
         set flagall
         breaksw
      case -first:
         set flagfirst
         breaksw
      case -last:
         set flaglast
         breaksw
      default:
         echo Invalid option: "$1"
         exit 1
   endsw
   shift
else
   echo ${0:t}:  "Usage: [ -first | -last | -all ] filename ..."
   exit 1
endif

This example nests a switch statement inside a multiline if statement. If the user provides no command-line arguments, the script skips all the way down to the else statement, prints a brief description of the command's expected argument format, and exits the script. (Note the use of the :t modifier applied to $0 in order to strip off the path information when displaying the shell script name to the user in the Usage message.) If the user provides at least one argument, a switch statement analyzes the first argument to see which option it is. If the argument matches any of the three strings -first, -last, or -all, the shell script discards the argument after setting an indicator variable. If the argument doesn't match any of the strings, the default: case prints the error message Invalid option and terminates the script.

Beginning a Case in switch: case

For label, specify a pattern-matching expression to be compared to the control expression of the enclosing switch command:

case label:

If, for a given execution of the switch command, the control expression of switch matches the pattern label, statements following case are executed. Otherwise, the case statement and statements up to the next case, default, or endsw statement are skipped.

The pattern-matching expression label can consist of ordinary characters as well as the wildcard symbols *, ?, and [...]. The pattern is matched against the argument of switch in the same manner as filenames are matched, except that the search here is for a case statement label that matches the switch argument.

For additional information about switch, see "Conditional Statements," earlier in this chapter.

The case statement is intended primarily for use in shell scripts.

Using the Default Case in switch: default

Use default to designate a group of statements in the range of a switch statement that should be executed if no other case label matches the switch argument.

For consistent results, you should place the default statement group after the last case statement group in the switch.

For more information about the default statement, see "Conditional Statements," earlier in this chapter.

The default command is intended primarily for use in shell scripts.

Exiting from a switch Statement: breaksw

You can use the breaksw command to exit from the immediately enclosing switch statement. The breaksw command transfers control to the statement following the endsw statement. Note that breaksw can exit only from the immediately enclosing switch; any outer switch statements remain active.

For more information on breaksw, see "Conditional Statements," earlier in this chapter.

The breaksw command is intended primarily for use in shell scripts.

Iterative Statements

You use iterative statements to repeatedly execute a group of commands. The iterative statements are while and foreach.

Using the while Loop

You use the while statement to execute a group of commands repeatedly as long as a calculated expression yields a true result. Because the condition can be any shell expression, the while command is very general.


CAUTION: Some care is needed when writing a while loop, because an improper design could cause the commands to be repeated forever in an unending loop or never to be executed at all.

The general syntax of the while command follows:

while (expr)
commands...
end

For expr, write a shell expression (see "Using Expressions and Operators in Shell Statements," earlier in this chapter). For commands, write one or more commands to be executed on each iteration of the while loop. Simple and compound commands, pipelines, and parenthesized command lists are all valid.

It is customary when writing shell scripts to indent commands included in the scope of the while or foreach statement. The indentation helps to clarify the commands' subordination to the while or foreach statement and graphically highlights their inclusion in the loop.

The shell evaluates expr before the first iteration of the loop and before each subsequent iteration. If the value of expr is nonzero (in other words, true), commands is interpreted and executed. Any substitutions contained in commands are performed each time the command is encountered, allowing a different value to be substituted on each iteration.

When first encountered, the shell processes a while statement much like an if. It evaluates the expression expr, and, if it is true (nonzero), the shell proceeds with the next statement. Similarly, if expr is false when the shell first encounters the while statement, it skips forward to the end statement, effectively bypassing all the commands between while and end. When you write a while statement, you need to write the test expression expr carefully, realizing that the shell might entirely skip the while statement for certain cases of the expression.

Here is a simple example of a while statement:

while ($#argv > 0)
    if (! -f $1) echo ${1}: missing
    shift
end

The while statement evaluates the expression $#argv > 0 on each repetition--that is, it tests to see whether there are any command-line arguments. As long as the answer is yes, it executes the following if and shift commands. It stops when the number of command-line arguments has gone to 0, which, after enough repetitions of shift, it will eventually do. For each repetition, the if command simply tests whether a file exists with the same name as the command-line argument--if not, it writes a warning message. Given a list of filenames as arguments, the loop writes to standard output each argument for which the corresponding file is missing. You could obtain a similar effect simply by entering the command ls name name name.... The difference is that you would have to pick out the filenames generating a not found message from among the normal ls output, whereas the while example simply lists the files that don't exist.
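
A while loop driven by a counter is another common pattern. The sketch below sums the integers 1 through 5; the names i and sum are illustrative. Because the expression is tested before every iteration, the body would never run if i started out greater than 5.

```csh
@ i = 1
@ sum = 0
while ($i <= 5)
    @ sum += $i      # accumulate the running total
    @ i++            # without this step the loop would never end
end
echo $sum            # prints: 15
```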

The end statement must be used to mark the end of the range of the while loop. It is a valid statement only within the range of the foreach and while statements. Elsewhere, it generates a shell error message, and the C shell halts processing.

Using the foreach Loop

The foreach command is intended for processing lists. It executes a command group once for each word given as an argument to the foreach command. The shell sets a variable to indicate which argument word the iteration is for; you can use the variable in the repeated commands to take the same general action for each word in the list--hence the name of the command.

The general syntax of the foreach statement follows:

foreach name (wordlist)
commands
end

For name, specify the name of a shell variable to which the words of wordlist will be assigned in succession. The named variable does not need to be a new one; it can be an existing variable. Any current value of the variable will be lost, though. On exit from the loop, name contains the value of the last word in wordlist.

For wordlist, specify one or more words enclosed in parentheses. The words can be quoted strings ("foo bar" 'bas'), strings of ordinary characters (one two three), variable references ($var $array $arr[1]), command-substitution strings quoted with backquotes (`cat /tmp/x.$$`), filename patterns (*.{cc,hh}), or history substitutions introduced with ! (!$). All the words are scanned and substitutions are performed, and the resulting strings are redivided into words (except where prevented by quoting) before the first loop iteration. The parenthesized wordlist is required; to loop over the command-line arguments, name them explicitly, as in foreach name ($argv).

For commands, specify one or more complete commands using the normal shell syntax. commands can be a simple or compound command and can be any of the legal command types, including aliases and built-in shell commands.

The last command in commands must be followed by end on a line by itself; the shell does not recognize end when it shares a line with another command. Note that the end command is a valid shell command only when used with a foreach or while statement. In other contexts, it is considered an illegal command and causes a shell error.

The loop is executed once for each word in wordlist. The variable name is set to the current word before each iteration of the loop, in effect stepping through the wordlist word by word from left to right. It stops when the loop has been executed once for each word. In commands, you can use the value of $name to identify which word the repetition is for, or you can ignore its value. You even can change the value of $name--the shell won't complain. It simply sets name to each word in turn, stopping when it runs out of words.
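
The way a wordlist is assembled from several kinds of words can be sketched as follows. The variable names names and w are illustrative, and the angle brackets in the output only make the word boundaries visible.

```csh
set names = (alpha beta)
foreach w ($names "two words" `echo gamma`)
    echo "<$w>"
end
echo "after the loop: $w"
```

The loop runs four times, because the quoted string stays a single word, and w still holds gamma after the loop ends.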

The foreach statement is a very handy tool, because it enables you to repeat an action for each item in a list. It is as useful at the keyboard as inside shell scripts. In the following example, it is used to change the suffix of a series of files, renaming them from .c to .cc:

foreach file (*.c)
  mv $file $file:r.cc
end

Altering Loop Execution: continue and break

You can use two additional special shell commands in the command list within the scope of foreach or while: the continue and break commands.

The continue command, which takes no arguments, can be used as part of a conditional statement to terminate execution of the current loop iteration, skip the remaining statements in the command list, and immediately begin the next loop iteration. The continue command is provided as a convenience so that you don't have to use complex if statements to thread a path through the foreach loop. After you execute all the commands you want to for the current loop iteration, simply invoke continue to skip the remaining commands and start the next iteration of the loop with the first command following foreach or while.

The break command terminates the current and all subsequent iterations of the foreach or while loop. After break, the next statement executed is the one following the end statement. Like continue, break skips all intervening commands between itself and the end statement. Unlike continue, break also halts iteration of the loop.
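
The difference between the two commands can be sketched with a short loop. The list variable seen is a name chosen here for illustration; it collects the values that reach the end of the loop body.

```csh
set seen = ()
foreach n (1 2 3 4 5 6)
    if ($n == 2) continue      # skip the rest of this iteration only
    if ($n == 5) break         # leave the loop entirely
    set seen = ($seen $n)
end
echo $seen                     # prints: 1 3 4
```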

You can nest foreach and while loop-control statements within each other, constructing nested loops. If you do so, you usually will want to use a different control-variable name on each inner foreach statement, although the shell doesn't enforce such a restriction. Keep in mind, however, that after execution of an inner foreach loop, the control variable will be changed. Changing the value of the control variable in one of the command statements does not affect the behavior of the foreach statement; on the next iteration, it is assigned the next word in wordlist in the usual manner.

When using break and continue, you must remember that they affect only the foreach statement on the same level. You cannot use break or continue to abandon an iteration of any outer loop. To break out of a foreach loop nested two or more levels deep, you need to use conditional statements (such as if) to test some condition and execute another break or continue statement. (Note that this is one of the few programming situations in which a well-documented goto actually can improve the quality of the code.)
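
One way to leave two loops at once, sketched here with a hypothetical flag variable named found, is to set the flag before breaking out of the inner loop and test it again in the outer loop.

```csh
set found = 0
foreach dir (a b c)
    foreach num (1 2 3)
        if ("$dir$num" == b2) then
            set found = 1
            break              # leaves only the inner loop
        endif
    end
    if ($found) break          # a second test is needed to leave the outer loop
end
echo "stopped at $dir$num"     # prints: stopped at b2
```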

Altering the Order of Command Execution: goto

You can use goto to change the order of command execution:

goto word

Ordinarily, commands are executed one after another in succession. The looping statements foreach and while enable you to repeat a group of statements a fixed or variable number of times, and the if and switch conditional statements enable you to choose between two or more alternative statement groups. Other than this, the general flow of control in statement execution is from the first to the last statement in a shell script or input command sequence. The goto command makes it possible to change the flow of control in an arbitrary way.

For word, specify an ordinary symbol (a string of characters not enclosed in quotes, not containing blanks or tabs, and not containing any punctuation characters having special meaning to the shell). word is subject to filename and command substitution. Assuming that a file named index exists in the current directory, all the following goto commands will jump to the label index: "goto index", "goto `echo index`", and "goto ind*".

The shell searches the command-input stream for a line beginning with word followed immediately by a colon (word:); this forms a statement label. If the statement label is found, execution resumes with the first command following the label. If the statement label cannot be found, the shell writes an error message and stops.
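
A forward jump in a script can be sketched as follows. The label finish and the variables mode and ran_long are hypothetical names used only for this illustration.

```csh
set mode = quick
if ($mode == quick) goto finish
echo "long-running steps would run here"
set ran_long = 1
finish:
echo "done"
```

With mode set to quick, the two lines between the if and the label are never executed, so ran_long stays unset.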

The goto command generally is used inside a shell script, in which case the range of statements searched for the label is restricted to the contents of the script file. In any other context, the shell backspaces the input medium as far as possible and then searches forward to the statement label. Backspacing is not supported for the terminal, so the goto statement is limited to the current available command-history lines when goto is issued from the keyboard.
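
As a minimal sketch of goto inside a script, the following fragment uses a label to leave two nested loops in one step. The label name found and the word lists are assumptions chosen for the illustration.

```csh
#!/bin/csh -f
# Sketch: jump out of two nested foreach loops with a single goto.
# The label found and the loop variables are illustrative names.
foreach dir (a b c)
    foreach file (x y z)
        if ("${dir}${file}" == "by") goto found
    end
end
found:
echo "stopped at ${dir}${file}"
```

If neither loop ever executes the goto, execution simply falls through the label, so a real script usually needs to distinguish the two cases.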


CAUTION: Using goto from the keyboard probably will put your session in an infinite loop if a label to jump to actually exists. Consider the history buffer

1 ls /usr/bin
2 jumpHere:
3 echo Infinite loop

with the label jumpHere:. If you type goto jumpHere, the C shell inserts the goto command into your history as command 4 and repeats commands 3 and 4 on your history list forever.


Specifying the Response to a Signal: onintr

Use onintr to specify the action to be taken when the shell receives a signal. For example,

onintr
onintr -
onintr label

The onintr command is roughly equivalent to the Bourne shell trap command but differs in syntax and usage.

When specified without arguments, the onintr command sets the default signal action for all signals. When used within a shell script, this causes most signals to result in termination of the shell script. When used from the keyboard, this resets any special signal-handling you established with previous onintr commands.

You can use onintr - to disable and ignore all signals. This form is handy when used within a shell script to protect a sensitive series of commands, which, if interrupted (abandoned because of shell script termination on receipt of a signal), might leave unwanted files or generate invalid results. You can use onintr without arguments to restore the normal default signal actions.

You can use onintr label to cause the shell to perform an implicit goto to the statement label label on receipt of a signal. The shell provides no indication of which signal was received. Because most signals represent a request for termination, though, this form of onintr can be used to perform orderly cleanup before exiting from a shell script. You might use onintr label in a shell script, for example, to provide a cleanup routine if the user presses the INTR key (the interrupt key, often configured to be Ctrl-C; use stty -a to query your terminal settings) to cancel the shell script's execution. After performing any desired actions, exit the shell script with the exit command.

For more information about statement labels, see the goto command description in this section.
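
The three forms of onintr might be combined in a script along the following lines. This is a sketch under stated assumptions: the scratch-file name and the commands standing in for the real work are placeholders.

```csh
#!/bin/csh -f
# Sketch: remove a scratch file whether or not the script is interrupted.
# The file name and the commands doing the "work" are placeholders.
set tmpfile = /tmp/sketch$$

onintr cleanup                  # on interrupt, jump to the label cleanup
sort /etc/passwd > $tmpfile
wc -l < $tmpfile

onintr -                        # ignore interrupts during the final steps
rm -f $tmpfile
exit 0

cleanup:
    rm -f $tmpfile
    exit 1
```

The normal path exits before reaching the label, so the cleanup code runs only when an interrupt arrives while onintr cleanup is in effect.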


TIP: One of the basic tenets of programming is based on this old joke: I went to the doctor and told her, "It hurts when I do this." The doctor replies, "Then don't do that." Believe it or not, this is great programming wisdom. Keeping track of temporary files, which includes remembering to delete temporary files regardless of whether an interrupt was passed in, can be problematic. Applying the preceding pearl of wisdom, if it hurts to create temporary files, do not create temporary files if it is at all avoidable. By judiciously using command substitution, pipelining, eval, shell variables, and xargs, you can greatly reduce the need for temporary files. (See "Quoting or Escaping from Special Characters," earlier in this chapter, for a discussion on the use of xargs.)
If you must create a temporary file, following a few simple guidelines can save a great deal of pain. Always use the $$ notation when naming a temporary file, and always place the file in /tmp. Using $$ (the process ID of the shell) makes a name collision with another running script unlikely, and, on most UNIX systems, files in /tmp that have not been accessed recently are deleted regularly. The following command creates a temporary filename for use in a script:

set tmpfile=/tmp/foo$$

Processing an Arbitrary Number of Parameters: shift

You can use shift to shift the shell parameters ($1, $2, ..., $n) to the left:

shift
shift name

After execution, the value of $2 moves to $1, the value of $3 moves to $2, and so on. The original value of $1 is discarded, and the total number of shell parameters (as indicated by $#argv) is reduced by 1.

You can use shift name to perform the same type of action on the named array variable.
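
A typical use of shift is a loop that consumes one argument per iteration. In the sketch below, the set argv line only stands in for real command-line arguments, and the variables seen and dirs are illustrative names.

```csh
#!/bin/csh -f
# Sketch: process every argument with shift.
set argv = (alpha beta gamma)    # stand-in for real command-line arguments
set seen = ()
while ($#argv > 0)
    echo "argument: $1"
    set seen = ($seen $1)
    shift                        # discard $1; $2 becomes $1, and so on
end
echo "remaining: $#argv"

# shift name does the same thing to a named array variable.
set dirs = (/bin /usr/bin /usr/local/bin)
shift dirs
echo $dirs
```

Testing $#argv before each shift matters, because shifting an empty list is an error.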

Interpreting a Script in the Current Shell: source

You can use source to read and interpret a script of shell commands within the current shell environment:

source [-h] name

No subshell is invoked, and any changes to the environment resulting from commands in the script remain in effect afterward. Possible changes that can result from execution of a script file with source include changing the current directory, creating or altering local and environment variables, and defining command aliases.

An exit statement encountered in a script interpreted with source results in an exit from the current shell level; if this is your logon shell, you are logged out.

For name, provide the filename or pathname of a file containing shell commands and statements. Some C shell versions search the current directory path (path variable) for the file if you do not specify a name beginning with /, ./, or ../.

The -h option enables the file to be sourced into the history buffer without the commands being executed.
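
To illustrate that sourced commands take effect in the current shell, the following sketch writes two commands to a scratch file and then sources it. The file name, the variable greeting, and the alias ll are all made up for the example.

```csh
#!/bin/csh -f
# Sketch: commands read with source run in the current shell, so the
# variable and alias they define are still present afterward.
set snippet = /tmp/snippet$$            # hypothetical scratch file
echo 'set greeting = hello' >  $snippet
echo 'alias ll ls -l'       >> $snippet
source $snippet
echo $greeting
rm -f $snippet
```

Had the file been run as a separate script instead, greeting and ll would have disappeared when that script's shell exited.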

Customizing Your Shell Environment

The C shell provides for two initialization scripts--the .cshrc and .login files--and one shutdown procedure--the .logout file.

The C shell always looks for a file in your home directory named .cshrc whenever it is invoked, whether as a logon shell, as a command, implicitly by entering the filename of a shell script as a command, or by a subshell expression enclosed in parentheses.

The .cshrc script should perform only those initializations you require for any C shell environment, including shells you invoke from other commands such as vi and pg.

When invoked as a logon shell, the .login script is executed to perform any one-time-only initializations you require. These can include issuing the stty command to define your preferred Erase, Kill, and INTR keys; setting your cdpath, path, and mail variables; and printing the news of the day.

When you exit a logon shell by typing the EOF key (the end-of-file key, often configured to be Ctrl-D; use stty -a to query your terminal settings) at the start of a line or by entering the exit or logout command, the shell searches for a file named .logout in your home directory. If found, the shell executes it and then terminates. You could use the .login and .logout scripts to maintain a time-sheet log recording your starting and ending times for terminal sessions, for example.

What to Put in Your .cshrc Initialization File

You should define command aliases, variable settings, and shell options in your ~/.cshrc file. This file always is executed before the .login script, and by placing such definitions in .cshrc, you ensure that the definitions are available in subshells.

Typical items you will want to have in your .cshrc file include the following:

  • alias lx /usr/bin/ls -FC

    You probably will want one or more aliases for the ls command. After developing some experience with UNIX, you'll find that you prefer certain options when listing directory contents. On some occasions, you'll want the long listing given by the -l option, but, more often, a multicolumn listing of some form will provide the quick overview of directory contents that helps orient you. You can have as many aliases for the ls command as you want, but only one named ls. If you define an alias for ls, remember that it affects your use of the command in pipelines.

  • set ignoreeof

    The ignoreeof option prevents you from logging out by accidentally typing the EOF character (usually, Ctrl-D). When this option is set, you must explicitly invoke the exit or logout command to exit from the shell.

  • set noclobber

    Some users prefer to use the noclobber option, and some don't. If this option is set, you can't accidentally destroy an existing file by redirecting a command's output to it with > filename. If you develop a feeling of frustration after destroying useful files too often with the > operator, by all means, try noclobber. Note that it provides no protection from accidentally deleting the wrong files with rm, though.

  • set path=(dirname dirname ...)

    You might want to define your search path in .cshrc instead of .login. By defining your path in .cshrc, you ensure that you always have the same search path available for all invocations of the shell. However, because .cshrc then resets the path in every new shell, a subshell can no longer inherit a search path that its parent shell modified. Most people find that it is sufficient to define the search path in the .login script.

You might want to place portions of your initialization into separate files and have the .cshrc source the separate files. For example, place all your aliases in a separate file, ~/.cshrc.alias, and set your path in ~/.cshrc.path. Separate alias and path files enable you to run most of your scripts using the -f option, and each script can source the portion of the initialization it needs. (See "Shell Options," earlier in this chapter, for a description of -f.)

For further information about variables and how to set them, see "Variables," earlier in this chapter.
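
Pulling the items above together, a ~/.cshrc might look something like the following sketch. The test on $?prompt is a common idiom for detecting an interactive shell and is an addition not discussed above; the file names ~/.cshrc.alias and ~/.cshrc.path follow the example in the text.

```csh
# ~/.cshrc -- a sketch only; adjust to taste.

# prompt is set only in interactive shells, so scripts skip this part.
if ($?prompt) then
    set ignoreeof                    # require exit or logout to leave
    set noclobber                    # refuse to overwrite files with >
    alias lx /usr/bin/ls -FC         # multicolumn listing
endif

# Keep aliases and the search path in separate files so that scripts
# started with -f can source just the part they need.
if (-r ~/.cshrc.alias) source ~/.cshrc.alias
if (-r ~/.cshrc.path)  source ~/.cshrc.path
```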

What to Put in Your .login Initialization File

The .login script is an excellent place to do the following things:

  • Identify the kind of terminal you are using--perhaps by prompting the user to enter a code.

  • Set the TERM environment variable to match the terminal type. TERM is used by the vi command to send the correct terminal-control codes for full-screen operation; it can't work correctly with an incorrect TERM.

  • You can issue the stty command to set your preferred control keys:
stty erase '^H' kill '^U' intr '^C'

  • You can set global environment variables:
setenv TERM vt100
setenv EDITOR /usr/bin/vi
setenv PAGER /usr/bin/pg

  • You can set local variables:
set path=(/usr/bin /usr/ucb /usr/X/bin $home/bin .)
set cdpath=(. .. $home)
set mail=(60 /usr/spool/mail/$logname)

  • And you can execute any system commands you find interesting:
news
df

For further information about variables and how to set them, see "Variables," earlier in this chapter.
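
Prompting for the terminal type, as suggested in the first items of the list, could be sketched like this. The variable name term_reply and the default of vt100 are assumptions made for the example; $< reads one line from the terminal.

```csh
# ~/.login fragment -- a sketch only.
# Ask for the terminal type and fall back to vt100 on an empty reply.
echo -n "Terminal type [vt100]: "
set term_reply = $<
if ("$term_reply" == "") set term_reply = vt100
setenv TERM "$term_reply"
```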

What to Put in Your .logout File

There is no standard use for the .logout file. If you don't have a use for the .logout file, you can omit it without incurring any shell error messages.
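
One possible use, picking up the time-sheet idea mentioned earlier, is to have .login and .logout each append a time-stamped line to a log. The log file name ~/.timesheet is hypothetical.

```csh
# In ~/.login (sketch):
# >>! appends, and creates the file even if noclobber is set.
echo "login  `date`" >>! ~/.timesheet

# In ~/.logout (sketch):
echo "logout `date`" >>! ~/.timesheet
```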

Job Control

When you type a command on the command line and press Return, the command executes in the foreground, which means that it has your shell's undivided attention and ties up your shell until the job finishes executing. You therefore must wait until that command finishes before you can do any other work in that shell. For commands or programs that finish quickly, this usually isn't a problem. It is a problem for commands or programs that take minutes or hours to finish. Graphical applications, such as xterm, usually are run as background jobs for this reason. By executing commands or programs in the background, you can free up your shell immediately to do other tasks.

The C shell provides you with a job-control mechanism for executing and managing background jobs.


NOTE: When csh was implemented years ago, its job-control mechanism was quite an advancement. In fact, when the Korn shell was implemented to provide C shell features in a Bourne shell style, the csh job-control interface was carried over virtually intact. The description of job control for the Korn shell in Chapter 11 is essentially accurate for the C shell as well.

Table 12.13 lists the C shell commands provided for managing background processes started with & (called jobs).

Table 12.13. C shell commands to manage background processes started with &.

Command   Function
&         Executes a command in the background
bg        Resumes execution of stopped jobs in the background
fg        Switches background jobs to foreground execution
jobs      Lists active background jobs
kill      Sends a signal to specified jobs
wait      Waits for all jobs to finish

Executing Jobs in the Background: &

You can use & to execute a command in the background, where it runs without access to terminal input:

command &

If the process attempts to read from your terminal, its execution is suspended until you bring the process into the foreground (with the fg command) or cancel it. A command executed in the background is called a job by the C shell.

For command, write a simple command or a compound command. The & operator must appear at the end of the command. The & operator also serves as a statement delimiter; any commands following & on the same line are treated as if they were written on the following line:

xterm & xclock & xload &

The & operator also has lower precedence than any other compound operator. In the following example, all the commands are executed in the background as a single job:

grep '#include' *.cc | pr && echo Ok &

When you execute a command in the background by appending an &, the shell writes a notification message to your terminal identifying the job number assigned to the job. Use this job number, in the form %number, as the operand of kill, fg, bg, or wait to manipulate the job.

Listing Active Background Jobs: jobs

The jobs command simply lists the process group leaders you have active in background execution. A process group leader is the process at the head of a group made up of itself and zero or more additional subprocesses. A simple command appended with & launches one process and one process group leader (one job with one process). A pipe of three commands all executed in the background (for example, ls | sed | xargs &) launches three processes but is still one job.

You can use the jobs statement to list the current set of background jobs:

jobs [ -l ]

The output of jobs has the following general format:

% jobs
[1] +  Stopped    vi prog.cc
[2]    Done       cc myprog.cc

A plus sign (+) marks the shell's current job; a minus sign (-), if shown, marks the preceding job. Various messages, including Stopped and Done, can be shown to indicate the job's current status.

Use option -l to print the process identifier of each job beside its job number:

% jobs -l
[1] +  2147 Stopped    vi prog.cc
[2]    1251 Done       cc myprog.cc

Referring to Job Numbers: fg and bg

Both the bg and fg commands require you to specify a job number. A job number can be any of those listed in Table 12.14.

Table 12.14. Job numbers.

Job Number   Description
%n           A reference to job number n. When you start a job using the & operator, the shell prints a job number you can use to refer to the job later. For example,

                 % du | sort -nr &
                 [1] 27442

             The number in brackets is the job number n. The other number is the process identifier of the job.
%string      A reference to the most recent background command you executed beginning with string. For string, you can specify only the first command name of the line, but you don't need to specify the entire command name: Any unique prefix of the command name will be accepted. Thus, you can use %da to mean the date command, but you can't safely use %pr to refer to a print command if you also have used the pr command in the same logon session.
%?string     A reference to the most recent background command containing string anywhere in the line. For example, %?myprog is a valid reference to the job cc myprog.cc.
%+           A reference to the current job--the job you last started, stopped, or referenced with the bg or fg command. In the listing produced by the jobs command, the current job is marked with + and can be referenced with the shorthand notation %+.
%%           Same as %+.
%            Same as %+.
%-           A reference to the preceding job. In the listing produced by the jobs command, the preceding job is marked with - and can be referenced by the shorthand notation %-.

Moving Foreground Jobs into the Background: bg

You can use the bg command to switch the specified jobs (or the current job, if no job arguments are given) to background execution. If any of the jobs currently are stopped, their execution resumes. bg is used most often after stopping a foreground job with the SUSP key (the suspend key, often configured to be Ctrl-Z; use stty -a to query your terminal settings). For example,

bg [ job ... ]

A job running in the background is stopped automatically if it tries to read from your terminal. The read is not carried out unless the job is switched to foreground execution. If you use the bg command to resume a job that has been stopped for terminal input, the job immediately stops again when it repeats the pending terminal read request, making the bg command appear to have been ineffective. In such a case, you must terminate the job (by using the kill command) or switch the job to foreground execution and respond to its input request (see the fg command).

You must use a percent expression when referring to the job--for example, fg %3 or fg %cc. The C shell also supports an abbreviation for the fg command: %10 by itself switches job 10 to foreground execution, acting as an implied fg command. (The Korn shell doesn't exactly support this, although you can set up an alias to achieve the same effect.)

Pausing and Resuming Background Jobs

The Ctrl-Z mechanism provides a handy way to stop doing one thing and temporarily do another, and then switch back. Although some interactive commands such as vi enable you to escape to the shell, not all do. Whether or not the command does, simply press Ctrl-Z to temporarily stop the command; you'll immediately see a shell prompt. Now you can do whatever you want. To resume the interrupted command, enter fg %vi (or %vi, or just %). When you have several outstanding jobs, the jobs command provides a quick summary to remind you of what commands you currently have stacked up.
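
A session using this mechanism might run along the following lines. This is an annotated sketch rather than a captured transcript: the shell's own messages are omitted, and the text after each # is commentary that you would not type (an interactive C shell does not treat # as a comment character).

```csh
% vi prog.cc          # start the editor in the foreground
                      # ...press Ctrl-Z (the SUSP key) to stop it...
% cc myprog.cc &      # start a compile as a background job
% jobs                # list the stopped editor and the running compile
% fg %vi              # bring the editor back to the foreground
```

Entering %vi or, if the editor is the current job, just % in place of fg %vi has the same effect.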

Moving Background Jobs into the Foreground: fg

You can use fg to switch the specified jobs into foreground execution and restart any that were stopped. For example,

fg [ job ... ]

If you specify no job arguments, the current job is assumed. The current job is the last job you started, stopped, or referenced with the bg or fg command and is identified with a + in the listing produced by the jobs command.

For job, specify any percent expression, as described in "Referring to Job Numbers: fg and bg," earlier in this chapter. Note that %5 or %vi (or any of the allowable percent expressions), entered as a command, is equivalent to issuing the fg command with that argument. Thus, %5 restarts job 5 in the foreground, and %vi restarts the most recent vi command if it is one of your active jobs. (See also the related bg, wait, and jobs commands.)

Stopping a Background Job: stop

You can pause a job that is executing in the background with stop. For example,

stop [ %job ]

This command sends a stop signal (SIGSTOP) to the named job, as if the SUSP key were pressed (usually, Ctrl-Z). The job is stopped.

You can use the bg command to resume execution of the stopped job or fg to bring the job to the foreground and resume its execution.

To terminate the execution of a background job, use the kill command. See "Signaling a Process: kill," later in this chapter, for details.

Stopping the Current Shell: suspend

The suspend command suspends execution of, or stops, the current shell. Its effect is the same as pressing the SUSP key (ordinarily, Ctrl-Z). For example,

suspend

Waiting for Background Jobs to Finish: wait

You can use the wait command to wait for all background jobs to finish. For example,

wait

The shell simply stops prompting for command input until it receives notification of the termination of all background jobs.

To stop waiting, press the INTR key (usually, Ctrl-C). The shell prints a summary of all outstanding background jobs and then resumes prompting for commands in the normal fashion.

Requesting Notification of Background Job Status Changes: notify

You can use the notify command to request that the shell always report any change in the status of a background job immediately. For example,

notify [ %job ]

By default, the shell reports the completion, termination, stoppage, or other status change by writing a message to your terminal just before the command prompt.

You can use notify with no arguments to request immediate notification of status changes for the current job. Be aware that a notification message might be written to your terminal at inopportune times, however, such as when it is formatted for full-screen operation; the message could garble a formatted screen.

You can use notify %job to request a notification of status change for only the specified job. This form is handy when you run a background command and later decide you need its results before continuing. Instead of repeatedly executing jobs to find out when the background job is done, just issue notify %job to ask the shell to tell you when the job is done.

For %job, specify any of the job-reference formats, as described for the bg command.

Controlling Background Process Dispatch Priority: nice

You can use the nice command to change the default dispatch priority assigned to batch jobs. For example,

nice [ +number ] [ command ]

The idea underlying the nice facility (and its unusual name) is that background jobs should demand less attention from the system than interactive processes (interactive graphical user interface jobs are the exception). Background jobs execute without a terminal attached and usually are run in the background for two reasons:

  • The job is expected to take a relatively long time to finish.
  • The job's results are not needed immediately.

Interactive processes, however, usually are shells where the speed of execution is critical because it directly affects the system's apparent response time. It therefore would be nice for everyone (others, as well as yourself) to let interactive processes have priority over background work.

UNIX provides a nice command you can use to launch a background job and at the same time assign it a reduced execution priority. The nice built-in command replaces the UNIX command and adds automation. Whereas the UNIX nice command must be used explicitly to launch a reduced-priority background job, the shell always assigns a reduced execution priority to background jobs. You use the nice command to change the priority the shell assigns.

When invoked with no arguments, the nice built-in command sets the current nice value (execution priority) to 4. A logon shell always assumes a nice value of 0 (the same priority as interactive processes). You must execute nice or nice +value to change the nice value (until then, you aren't being nice; all your background jobs compete with interactive processes at the same priority).

Use nice +number to change the default execution priority for background jobs to a positive or zero value. A zero value (nice +0) is the same as interactive priority. Positive values correspond to reduced priority, so that nice +5 is a lower priority than nice +4, nice +6 is a lower priority than nice +5, and so on.

If you specify command, the nice command launches the command using the default or specified execution priority but doesn't change the default execution priority. For example, nice cc myprog.c launches the compilation using the default priority, whereas nice +7 cc myprog.c launches the compilation with an explicit priority of 7.

Note that nice does not place the command in the background by itself; append & to the command line if you want the command to run as a background job.
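
The two command forms from the example above might be typed as follows. This is a sketch only; the & is written out explicitly so that the compile runs as a background job whatever the shell's behavior, and the text after # is commentary.

```csh
nice cc myprog.c &       # compile at the default reduced priority
nice +7 cc myprog.c &    # compile at an explicit nice value of 7
```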

Signaling a Process: kill

You can use the kill built-in command to send a signal to one or more jobs or processes. For example,

kill [ -signal ] [%job ...] [pid ...]
kill -l

The built-in command hides the UNIX kill command. To invoke the UNIX kill command directly, use its full pathname (probably /bin/kill or /usr/bin/kill). The built-in command provides additional features that are not supported by /bin/kill and can be used in the same manner.

For signal, specify a number or a symbolic signal name. All UNIX implementations support signals 1 through 15; some implementations can support more. By convention, the signals listed in Table 12.15 are always defined.

Table 12.15. Signals.

Signal  Name  Meaning            Effect
1       HUP   Hang up            Sent to all processes in a process group when the terminal is disconnected by logout or, for a remote terminal, when the terminal connection is dropped.
2       INT   Interrupt          Sent after the user presses the INTR key (defined by the stty command; usually, Ctrl-C; sometimes, BREAK).
3       QUIT  Quit               Sent after the user presses the QUIT key (defined by the stty command; there is no default).
9       KILL  Kill               Sent only by the kill command; it forces immediate termination of the designated process and cannot be ignored or trapped.
10      BUS   Bus error          Usually caused by a programming error; a bus error can be raised only by a hardware fault or by a binary program file.
11      SEGV  Segment violation  Caused by a program reference to an invalid memory location; can be caused only by a binary program file.
13      PIPE  Pipe               Caused by writing to a pipe when no process is available to read the pipe; usually, a user error.
15      TERM  Termination        Caused by the kill command or system function. This signal is a gentle request to a process to terminate in an orderly fashion; the process can ignore the signal.

If you omit signal, the TERM signal is sent by default (unless the -l option is specified, in which case no signal is sent at all).

For job, specify one or more jobs or process identifiers. A job reference can be any one of the following:

  • Any of the valid % operators presented in "Referring to Job Numbers: fg and bg," earlier in this chapter.

  • A Process ID number.

The % operator and Process IDs may both be on the command line--for example, kill %2 1660 543 %3.

There is no default for job. You must specify at least one job or process to which the signal will be sent.

You can use the command kill -l to list the valid symbolic signal names. Always use the kill -l command to identify the exact signal names provided when using a new or unfamiliar version of csh.

Also see the bg, fg, wait, and jobs commands for more information about job control using the C shell.
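
The forms described in this section might be used as shown in the following sketch. The job numbers and process IDs are the illustrative values used above, not ones that will exist in your session.

```csh
kill -l                      # list the symbolic signal names this csh knows
kill %2                      # send the default TERM signal to job 2
kill -HUP %cc                # send HUP to the most recent job beginning with cc
kill -9 1660                 # send KILL (signal 9) to process 1660
kill -TERM %2 1660 543 %3    # job references and process IDs can be mixed
```

Trying TERM first and falling back to KILL only if the process ignores it gives the process a chance to clean up.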

Using the Shell's Hash Table

The C shell's hash table is used to expedite command searches by identifying the directory or directories where a command might be located. The hash table is created based on the directories specified in your path C shell variable. The order in which the directories are specified determines the search order as well as the efficiency of locating commands you execute.

For each directory in the search path or hash table, the shell invokes the exec UNIX operating system function to search for the command to be executed. If unsuccessful, the search continues with other possible locations for the command. However, the exec operating system function entails considerable operating system overhead; its use increases system load levels and degrades system performance. Consequently, the effectiveness of the shell's hash table is a matter of concern. C shell provides you with three commands for working with the hash table: hashstat, rehash, and unhash.

Determining the Effectiveness of the Hash Table: hashstat

You can use the hashstat command to determine the effectiveness of the shell's hash-table mechanism. For example,

% hashstat

The statistics printed by hashstat indicate the number of trials needed on average to locate commands, and hence the number of exec function calls per shell command issued. Ideally, every command would be found with one trial. If the hit rate is too low, many directory searches (exec invocations) are occurring for each command executed. You need to reorder the directories in your search path and, if possible, eliminate directories from your path that don't contain any commands you use. In other words, poor hash-table performance is caused by an improperly structured search path, as defined by the path C shell variable. The commands you use most frequently should be located in the directory named first in the path, and successive directories should be referenced less and less frequently. If you list directories in your path that don't contain any commands you use, the shell will waste time searching those directories for commands you do use.

Rebuilding the Hash Table: rehash

You can use the rehash command to rebuild the shell's hash table. The hash table is used to expedite command execution by reducing the set of directories that needs to be searched to locate a particular command. For example,

rehash

The hash table is updated automatically when you change the value of the path variable, but no automatic update is possible when you change the name of an executable file or move executable files in or out of directories in your search path. Changes made by the system administrator to directories containing system commands also go unnoticed. In such cases, use rehash to resynchronize the shell's hash table with the real world.

You need to execute the rehash command only when an attempt to execute a command that you know exists in your search path results in a not found message.
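
The situation typically arises right after you install a new command. In this sketch, mytool is a hypothetical program and ~/bin is assumed to be one of the directories in your path.

```csh
cp mytool ~/bin      # add a new executable to a directory in the path
mytool               # may fail: the hash table predates the new file
rehash               # rebuild the hash table from the current path
mytool               # the shell can now locate the command
hashstat             # optionally, check how well the hash table is doing
```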

Disabling the Use of the Hash Table: unhash

You can use the unhash command to discontinue the shell's use of a hash table to expedite directory searches for commands. The shell continues to search directories using the path variable for programs in the usual fashion, although with reduced efficiency. Use the rehash command to resume use of the hash table. For example,

unhash

You might want to issue the unhash command while developing a new shell script; when restructuring the contents of directories listed in your path variable; or if you have NFS mounts in your path, and the NFS server goes awry.

Managing Resource Limits: limit and unlimit

UNIX imposes a number of limitations on the amount of resources any system user can commit. For each type of resource, there is a system-defined maximum. The system administrator can increase or reduce the size of a limitation by using the limit command or restore the limitation to its normal value with unlimit. Normal users also can employ the limit and unlimit commands, but only to further restrict resource usage--not to increase it.

The specific types of resources you can control with the limit and unlimit commands are listed in Table 12.16, later in this section.

Unless you are the system administrator, changing a resource limit affects only the current process. It doesn't affect any other commands you are running as background jobs at the same time, and it doesn't affect any other users.

Manipulating resource limits is not something you do very often. It is of interest mainly to programmers and system administrators involved in problem determination. You should be aware of the kinds of limits that exist and what their values are, though, because a resource limit can cause a command to fail for spurious or misleading reasons. One of the resource limits sets an upper bound on the size of a disk file, for example. If a command you execute tries to write a file bigger than the file size limit, the command may fail, reporting that it is out of disk space. This may lead you to ask the system administrator to give you more disk space. Getting more disk space won't solve the problem, however, because the file size limit won't allow your command to use the space even if it's available. The proper resolution is to ask the system administrator to change the system's built-in file size limit or to stop trying to write such large files.

Displaying or Setting Maximum Resource Limits: limit

You can use the limit command to display or change system maximums that apply to the current invocation of the shell and all commands and jobs you launch from the shell. For example,

limit [ resource [ maximum ] ]

UNIX provides a ulimit command you can use to change the maximum file size you can write with any command. The limit built-in shell command can be used for the same purpose, as well as to change a number of other limits.

If you specify no arguments, the limit command lists all settable limits currently in effect.

For resource, specify one of the options shown in Table 12.16. (Note: The resource types you can specify depend on the particular implementation of csh and UNIX you are using.)

Table 12.16. Resource options.

Option Description
coredumpsize The maximum size of a core dump file that can be written. The system defines a maximum size for core files. You can reduce the limit or increase the limit up to the system-defined limit.
cputime The maximum number of CPU seconds any process can run. A process that exceeds this limit is terminated.
datasize The maximum amount of memory that can be allocated to a program's data and stack area. The system defines a default upper limit for the amount of memory a program can use. You can reduce the limit or, if you previously reduced it, you can increase it back up to the system-defined limit.
filesize The maximum number of bytes a file can contain. An attempt to create a new file or to append bytes to a file that would exceed this size causes the operating system to signal an end-of-medium condition to the program. The UNIX system specifies an upper limit for file size that you cannot exceed. You can use the limit command to display the limit or to reduce it; if you previously reduced it, you can increase it back up to the system-defined limit.
stacksize The maximum amount of memory the system should allow for a program's stack area. The system defines a maximum size to which any program's stack area can grow. You can reduce the limit or, if you previously reduced it, you can increase it back up to the system-defined limit.

If you specify resource but omit maximum, the limit command displays the current limit value for the specified resource. Otherwise, specify a number of seconds (for cputime) or a number of kilobytes for any other resource (limit filesize 32 sets the maximum file size to 32KB or 32,768 bytes). You can append m to the number to specify megabytes instead of kilobytes: limit datasize 2m sets the maximum program data area to 2,097,152 bytes (2,048KB).

Canceling a Previous limit Command: unlimit

You can use unlimit to cancel the effect of a previous limit restriction. The syntax is

unlimit [ resource ]

Because users other than the superuser can use the limit command only to tighten system-defined constraints, the unlimit command simply restores the named limit (or all limits, if you omit resource) to the system-defined maximum.

See "Displaying or Setting Maximum Resource Limits: limit" for a description of the allowable values for resource.

Summary

When compared to the Bourne shell, facilities provided by the C shell include extensions for both the keyboard environment and the shell programming environment. Besides more filename wildcards, command history, history substitution, and job control, the C shell also provides array variables, arithmetic expressions, a somewhat more convenient if statement, and briefer forms of while and foreach (dropping the useless do of the Bourne shell).

Virtually all the features of the C shell also are supported by the Korn shell in a form more consistent with the syntax and use of the Bourne shell. Because of its many extensions for both the keyboard user and shell script writer, the C shell is well worth your investigation; you might find that you like it.

This chapter provided a quick overview of the C shell syntax and features. You can find a more detailed, although turgid, presentation in the reference manuals for your particular version of UNIX; you should consult these for the last word on details of its operation. The C shell, being descended from BSD roots, has never been subjected to the same degree of standardization as the System V side of the UNIX family.



©Copyright, Macmillan Computer Publishing. All rights reserved.