Chapter 13

Proprietary Extensions to Servers



Behind every good CGI program is a good server, usually an HTTP server. An HTTP server is a program that intermediates between the end user (the browser) and the CGI program itself. It is the server that actually spawns the CGI program, passing along information from the browser, and it is the server that collects the output of the CGI program and sends it back to the end user who requested it.

At its most basic, that is all a Web server has to do: Be a sort of halfway house for information between the client and the machine storing the data. It didn't take long before people started to realize that the more tricks a server had up its sleeve, the less work they would have to do with external CGI programs. As a result, server extensions were born.

Server extensions are added bits of functionality that are either built straight into the server or dynamically added to the server as needed. Either way, they are part of the server. This means that when a client requests a function from a server, the server extension is right there. Unlike CGI programs, a server extension does not require an extra process to be run on the host machine.

A downside to server extensions is that they are largely proprietary, meaning that a nifty add-on to the NCSA server might not be present in the CERN server or vice versa. However, several server extensions have proven useful enough to be added to many of the most popular Web servers (which at the time of this writing means Apache, NCSA, and Netscape).

I'll start by looking at the two server extensions that are currently used the most: server push and cookies.

Server Push

Server push was first envisioned as a method for displaying simple animations over the Web. Unfortunately, it's not very good at it. At the time server push was implemented, it was the only option, but today Web animations are better served by Java or GIF89a extensions. Other applications have been found for server push, and its cousin, client pull, so it remains a useful extension supported by almost all servers.

To understand server push, you must first look at how the Web server handles data before it passes it on to the client. When asked to retrieve a file from the host, the server first looks at the extension of the file and based on that, assigns it a MIME type. MIME (Multipurpose Internet Mail Extensions) types tell the browser what type of file it is receiving. This is usually put in the initial header of the document, which the end user never sees. For example, if you were to Telnet to www.yahoo.com on port 80 (the HTTP port), and type GET / HTTP/1.0, you would receive Yahoo's home page with the following information attached to the beginning:

HTTP/1.0 200 OK
Last-Modified: Thu, 25 Jul 1996 21:45:35 GMT
Content-Type: text/html
Content-Length: 5490

The first line tells the browser what version of the HTTP protocol Yahoo's server is using; the 200 OK status indicates that it found the requested document. The second line gives the date the document was last modified, and the last line gives the size of the document in bytes. The third line, Content-Type: text/html, tells the browser to expect an HTML document and treat it accordingly. It is this content type header that makes server push possible.
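To make the structure of that header concrete, here is a minimal Perl sketch that splits a response header into its status line and its fields. The subroutine name parse_response_header is hypothetical, and the sample text is simply the Yahoo response shown above; a real server may send more fields than these.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw HTTP response header into the protocol
# version, the status code, the status text, and a hash of header fields.
sub parse_response_header {
    my ($raw) = @_;
    my @lines = split /\r?\n/, $raw;
    my $status_line = shift @lines;
    my ($version, $code, $text) =
        $status_line =~ m{^HTTP/(\d+\.\d+) (\d{3}) ?(.*)$}
        or return;
    my %field;
    foreach my $line (@lines) {
        last if $line eq '';                  # A blank line ends the header.
        my ($name, $value) = split /:\s*/, $line, 2;
        $field{lc $name} = $value;            # Field names are case-insensitive.
    }
    return ($version, $code, $text, \%field);
}

my $sample = join "\n",
    'HTTP/1.0 200 OK',
    'Last-Modified: Thu, 25 Jul 1996 21:45:35 GMT',
    'Content-Type: text/html',
    'Content-Length: 5490',
    '', '';

my ($version, $code, $text, $field) = parse_response_header($sample);
print "HTTP $version, status $code ($text)\n";
print "MIME type: $field->{'content-type'}\n";
```

The field names are stored in lowercase because HTTP treats Content-Type and Content-type as the same header.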

As intimated by the acronym, MIME was originally designed as a set of protocols to enable electronic mail to contain full multimedia enhancements. In fact, most modern mail clients do support MIME, but its usefulness has grown beyond that. One of the necessities of transferring multimedia mail is the capability to enclose several different types of documents in one message. MIME does this through a content type called multipart/mixed. A document of this type has a string of characters given in the header known as the boundary. This boundary can occur any number of times in the message, each time preceded by --. Each section of document between boundaries is considered a separate message with its own MIME type.
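As a rough illustration of how boundaries divide a multipart message, the following Perl sketch breaks a message body into its parts. The subroutine name split_multipart and the sample message are assumptions made for this example, and the code handles only the simple layout described above, not every case a real MIME parser must cope with.

```perl
use strict;
use warnings;

# Hypothetical helper: break a multipart body into its parts. Each part is
# returned as a hash reference holding its content type and its body.
sub split_multipart {
    my ($body, $boundary) = @_;
    my @parts;
    # Every part follows a line made of "--" plus the boundary; the final
    # boundary line carries an extra trailing "--".
    my @chunks = split /^--\Q$boundary\E(?:--)?[ \t]*\r?\n/m, $body;
    shift @chunks;              # Discard anything before the first boundary.
    foreach my $chunk (@chunks) {
        my ($head, $content) = split /\r?\n\r?\n/, $chunk, 2;
        my ($type) = $head =~ /^Content-type:\s*(\S+)/mi;
        push @parts, { type => $type,
                       body => defined $content ? $content : '' };
    }
    return @parts;
}

my $message = <<'EOM';
--I_Like_Snow
Content-type: text/plain

Hello there.
--I_Like_Snow
Content-type: text/html

<p>Hello there.</p>
--I_Like_Snow--
EOM

foreach my $part (split_multipart($message, 'I_Like_Snow')) {
    print "$part->{type}: $part->{body}";
}
```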

In 1995, some smart folks at Netscape, Inc., thought of a way to put a twist on the multipart/mixed MIME type that would allow animations to be displayed over the Web. A new MIME type was created: multipart/x-mixed-replace. The x- prefix marks it as an experimental type that is not part of the official MIME specification. The replace part is the key. This means that instead of one message with lots of different types, a multipart/x-mixed-replace message is really lots of separate messages all with the same type. As each message is displayed, it replaces the one before it. When applied to a MIME type such as image/gif, this multipart/x-mixed-replace causes one GIF to be displayed and then replaced by another and so on. Voilà! Instant "poor man's animation."

Because the key to server push lies in the header that appears before the document is sent, you can't use server push within a plain HTML document. You must use a CGI program to generate the document and send each part along as needed.

One of the great advantages of server push is that as long as the browser remains on the page, the server can keep the connection open for as long as it wants. Suppose you want to keep users around the world up-to-date on the local softball game. Using server push, a connection is kept open between your server and each viewer, and you could send an updated score whenever necessary.

Note
The advantage of a persistent connection can also be a disadvantage. Having a server keep the connection open keeps the server's process alive, which could cause added load to your system. This is usually not a problem, but if you get thousands of hits a day, it does add up.

For a simple example, you can go back to the original purpose of server push and make a poor man's animation. I'll use Perl for these examples, although this first one is simple enough for Bourne Shell. I start by cycling through a series of GIFs (assumed to be in the current directory and named 0.gif through 9.gif):

#!/usr/local/bin/perl -Tw

use strict; # The -Tw flags and "use strict" are always useful for
# keeping CGIs safe.

$| = 1; # Turn off buffering so each image is sent as soon as it is printed.

print <<EOP; # Print the HTTP header
Content-type: multipart/x-mixed-replace;boundary=I_Like_Snow

EOP
for my $i (0..9) {
    open (GIF, "$i.gif") or next; # Skip any image that can't be opened.
    binmode GIF;                  # GIFs are binary data.
    print "--I_Like_Snow\n"; # Print boundary to start new image.
    print "Content-type: image/gif\n\n"; # Tell client that it's a GIF
    while (<GIF>) { print; }   #  Send the GIF.
    close GIF;
    print "\n"; # Make sure the next boundary starts on its own line.
}

print "--I_Like_Snow--\n"; # Tell client that you're done.

Tip
It might be necessary to read the documentation that came with your server before running these examples. Some servers have trouble with certain aspects of server push. Notably, NCSA httpd has been known to have fits if the HTTP header isn't exactly right.

Label this script as a CGI program (for example, cycle.cgi) and then insert the tag <IMG SRC="cycle.cgi"> into any HTML file in the same directory, and suddenly you have a full-blown animation accessible from the Web. As it stands, this script pushes the GIFs to the client as fast as it can. Because most people use 14.4Kbps modems or live far away from your server, this rapid pushing guarantees that the end user receives the animations as quickly as possible. This also means that people using a good connection might receive the images too fast to display and could skip images. You can avoid this by using sleep statements in strategic parts of the script to pause between images. Using sleep statements is also a good way to create slide-show-type animations. You could cause each image to stay on the screen for however long is necessary before the next image is called up.
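A paced version of the same idea might look like the following sketch, which sends each frame as one part of the stream and sleeps between parts. The subroutine name push_frames and its argument order are assumptions for this example; the frames are passed in as strings so the pacing logic can be seen on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: send each frame as one part of a
# multipart/x-mixed-replace stream, pausing between frames.
sub push_frames {
    my ($out, $boundary, $type, $delay, @frames) = @_;
    my $old = select($out); $| = 1; select($old);  # Unbuffer the handle.
    print $out "Content-type: multipart/x-mixed-replace;boundary=$boundary\n\n";
    foreach my $i (0 .. $#frames) {
        print $out "--$boundary\n";
        print $out "Content-type: $type\n\n";
        print $out $frames[$i];
        print $out "\n";
        # Hold this frame on the screen before sending the next one.
        sleep $delay if $delay && $i < $#frames;
    }
    print $out "--$boundary--\n";
}

# Example: three text "slides" shown two seconds apart.
# push_frames(\*STDOUT, 'I_Like_Snow', 'text/plain', 2, "One", "Two", "Three");
```

A delay of a second or two suits an animation, while a slide show would use a longer pause for each image.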

As mentioned previously, you can also use server push to create an HTML document that updates itself as needed. As long as the client stays on the page, it receives the new information; as soon as the client leaves the page, the connection is broken and no more information is sent. For an example of this, consider the script in Listing 13.1, which displays the users currently logged into a UNIX machine and updates the display every time someone logs in or out.

Note
Listing 13.1 runs the external UNIX command w through a piped open, so it must be run from a UNIX machine with the necessary commands. You probably need to change it somewhat for your system to reflect the location and differences in syntax of the commands.


Listing 13.1. Dynamically display current users.
#!/usr/local/bin/perl -Tw

use strict;

# Taint mode refuses to run external commands until the environment
# has been made safe.
$ENV{PATH} = '/bin:/usr/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

$| = 1; # Send each update as soon as it is printed.

print <<EOP; # Print the HTTP header
Content-type: multipart/x-mixed-replace;boundary=I_Like_Snow

EOP

my @users; # The users who were logged in last time through.

while (1) {     # Perform an endless loop. We leave it up to the client
                # to break the connection.
    open(WHO, "w|") or die "Can't run w: $!"; # Run the 'w' command to list users.
    my @who = <WHO>;  # Collect the result in an array.
    close WHO;        # Close the command.

    my (@users2, @newusers, @oldusers); # Start each pass with empty lists.
    foreach my $i (2..$#who) { # Start at the third line of output.
                               # The first two are other information.
        my @fields = split(/ +/, $who[$i]); # Get the user name.
        push(@users2, $fields[0]);          # Add it to list of
                                            # current users.
    }
    foreach my $user2 (@users2) {
        # If any of the current users were not present last
        # time through, add them to a list of new users.
        if (!grep($_ eq $user2, @users)) { push(@newusers, $user2); }
    }
    foreach my $user (@users) {
        # If any of the users last time through are not
        # present this time, add them to a list of old
        # users.
        if (!grep($_ eq $user, @users2)) { push(@oldusers, $user); }
    }
    @users = @users2; # Remember who is logged in for the next pass.

    # If no one has logged in or out since last time, don't do
    # anything. This way a new HTML page is only sent when a
    # change is made.
    if (@newusers || @oldusers) {
        my $who = join("<br>\n", @who); # Translate linefeeds to <BR>s.

        # Print the boundary, headers, and beginning of message.
        print <<EOX;
--I_Like_Snow
Content-type: text/html

<HTML><HEAD><TITLE>Current Users</title></head><BODY>
Current Users:<br>
$who
<br>
EOX
        # List everyone who has logged on since last time.
        foreach (@newusers) {
            print "$_ has logged on!<br>\n";
        }
        # List everyone who has logged out since last time.
        foreach (@oldusers) {
            print "$_ has logged off!<br>\n";
        }
        print "</body></html>\n";
    }
    # Wait 10 seconds then do it again.
    sleep 10;
}

This script is just a sample of what you can do with server push. Even though it has lost the spotlight to newer and flashier technologies, server push can still yield great results when applied creatively.

HTTP Cookies

Aside from being great with milk, cookies have become one of the most discussed server extensions ever. HTTP cookies are an attempt to solve one of the great hurdles of working with CGI: state retention.

Transfer over the Web is transient. Once the server sends the document to the client, it forgets that the client ever existed. If a user browses through a hundred pages on the same site, they make a hundred separate connections to the same server.

This presents a big problem for CGI programs. Almost all CGI-based applications, from games to databases, need to store information about the person using them between calls to the program. Traditionally, this requires saving information in temporary files or hidden variables in HTML forms. For instance, a series of CGI programs that interact with a database might ask for a username on the first page, save the data to a temporary file, and then pass the username and temporary filename from CGI to CGI by using hidden variables.
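The hidden-variable approach can be sketched in a few lines of Perl. The subroutine names escape_html and hidden_fields, and the sample username and filename, are hypothetical; the point is only that each piece of state is written into the form so the next CGI program receives it with the rest of the input.

```perl
use strict;
use warnings;

# Escape the characters that have special meaning inside HTML attributes.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: emit one hidden form field per piece of state.
sub hidden_fields {
    my (%state) = @_;
    my $html = '';
    foreach my $name (sort keys %state) {
        $html .= sprintf qq{<INPUT TYPE="HIDDEN" NAME="%s" VALUE="%s">\n},
            escape_html($name), escape_html($state{$name});
    }
    return $html;
}

# The username and temporary filename here are made-up sample values.
print hidden_fields(username => 'pat', tempfile => 'session_1234.tmp');
```

Every form in the series has to repeat these fields, which is part of what makes the technique tedious.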

If this all sounds complicated, that's because it is. Cookies are an attempt to simplify this by allowing the browser to store information sent to it by the server. The browser then sends the information back each time it accesses that server. To avoid filling up the hard drive of the end user, cookies are generally very short pieces of information: usually an ID number or filename of a file on the server that contains more information.

Note
Although called server extensions, almost everything discussed in this chapter must also have support on the client side. This is especially true for HTTP cookies. Unless the browser has the capability to store the cookie information, your CGI can't do anything with it. At the time of this writing, several major browsers still have not implemented HTTP cookies, including Lynx-FM and NCSA Mosaic. If your pages absolutely depend on cookies, it's a good idea to provide an alternate mechanism for people who have no access to a cookie-capable browser.

Like server push, HTTP cookies are implemented through the HTTP header that is sent before the document itself. The value of a cookie is passed to the client through the header Set-Cookie. The full syntax, as it would appear to a browser receiving a cookie, is as follows:

Set-Cookie: name=value; expires=date; path=path; domain=domain; secure

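Everything except for name=value is optional. The following list outlines each part in more detail:

name=value: The name of the cookie and the data stored under it. Neither may contain semicolons, commas, or white space.

expires=date: The time after which the browser discards the cookie, written in the form Wdy, DD-Mon-YYYY HH:MM:SS GMT. Without it, the cookie lasts only until the user quits the browser.

path=path: The URL path the cookie applies to. It defaults to the path of the document that set the cookie.

domain=domain: The host or domain the cookie applies to. It defaults to the host that set the cookie.

secure: If present, the cookie is sent only over a secure (HTTPS) connection.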
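As a rough illustration of the Set-Cookie syntax, the following Perl sketch assembles a header from whichever parts are supplied. The subroutine names cookie_date and set_cookie_header and the named arguments are assumptions for this example; the date layout is the Wdy, DD-Mon-YYYY HH:MM:SS GMT form used by the original Netscape cookie proposal.

```perl
use strict;
use warnings;

my @DAY = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MON = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Format seconds since the epoch as "Wdy, DD-Mon-YYYY HH:MM:SS GMT".
sub cookie_date {
    my ($when) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($when);
    return sprintf('%s, %02d-%s-%04d %02d:%02d:%02d GMT',
        $DAY[$wday], $mday, $MON[$mon], $year + 1900, $hour, $min, $sec);
}

# Hypothetical helper: build a Set-Cookie header line. Only name and value
# are required; each optional part is added when it is supplied.
sub set_cookie_header {
    my (%c) = @_;
    my $header = "Set-Cookie: $c{name}=$c{value}";
    $header .= '; expires=' . cookie_date($c{expires}) if defined $c{expires};
    $header .= "; path=$c{path}"                       if defined $c{path};
    $header .= "; domain=$c{domain}"                   if defined $c{domain};
    $header .= '; secure'                              if $c{secure};
    return $header;
}

# A cookie that expires three hours from now.
print set_cookie_header(name => 'cart', value => 'Blue-3',
                        expires => time + 10800, path => '/~me/'), "\n";
```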

Every time a URL is entered into a cookie-aware browser, the browser checks to see if it has any cookies that belong to the domain of the URL. If it finds any, it checks them to see if the URL's path contains the path of any of the cookies. If any do, it sends the following as a header back to the server:

Cookie: name=value

If more than one cookie matches, the client returns

Cookie: name1=value1; name2=value2

for as many cookies as are valid.
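The browser's side of this exchange can be sketched as follows. The subroutine names cookie_applies and cookie_header and the layout of the stored cookies are assumptions for this example, and the sketch treats the path test as a simple prefix match; real browsers apply further rules.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether a stored cookie should be sent with
# a request for the given host and path.
sub cookie_applies {
    my ($cookie, $host, $path) = @_;
    # The cookie's domain must match the tail of the host name...
    (my $domain = lc $cookie->{domain}) =~ s/^\.//;
    $host = lc $host;
    return 0 unless $host eq $domain || $host =~ /\.\Q$domain\E$/;
    # ...and the cookie's path must be a prefix of the requested path.
    return index($path, $cookie->{path}) == 0 ? 1 : 0;
}

# Build the Cookie header for a request, or return an empty string when
# no stored cookie matches.
sub cookie_header {
    my ($jar, $host, $path) = @_;
    my @pairs = map  { "$_->{name}=$_->{value}" }
                grep { cookie_applies($_, $host, $path) } @$jar;
    return @pairs ? 'Cookie: ' . join('; ', @pairs) : '';
}

my @jar = (
    { name => 'cart',  value => 'Blue-3', domain => 'my.isp.com', path => '/~me/' },
    { name => 'theme', value => 'dark',   domain => 'my.isp.com', path => '/'     },
);
print cookie_header(\@jar, 'my.isp.com', '/~me/cart.cgi'), "\n";
```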

If two or more cookies have the same name (which is possible if they have different paths), the browser sends them all. If the server is cookie-aware, it sets the environment variable HTTP_COOKIE once it receives the cookie header. It is through this environment variable that CGI programs retrieve the cookie information.

Listing 13.2 is a simple Perl script that displays any cookies that are sent from the browser.


Listing 13.2. Display cookies sent from the browser.
#!/usr/local/bin/perl -Tw

print <<EOP; # Print headers and beginning of message.
Content-type: text/html

<HTML><HEAD><TITLE>Mmmmm.... Cookies</title></head><BODY>
EOP

if (!$ENV{'HTTP_COOKIE'}) {
    # The server puts cookie information into the environment
    # variable 'HTTP_COOKIE'. If there are no cookies, print a
    # message then leave.

    print <<EOP;
Sorry, the browser didn't send you any cookies!
</body></html>
EOP
    exit; # Leave quietly; 'die' would log an error for a normal case.
}

print "<H1>Your Cookies</h1>\n";

$cookies = $ENV{'HTTP_COOKIE'};

# Split the cookie string up into individual cookies.
@cookies = split(/; */, $cookies);

foreach (@cookies) {
    # For each of the cookies, split the cookie up into two parts:
    # The name (key) and the value.
    ( $key, $val ) = split(/=/, $_, 2); # Split on the first '=' only.

    # If more than one cookie has the same name, combine their
    # values, separated by commas. (The syntax of the HTTP header
    # already guarantees that there are no commas in the cookie.)
    if ($COOKIE{$key}) { $COOKIE{$key} .= ",$val"; }
    else { $COOKIE{$key} = $val; }
}

# Now go through each of the cookies in alphabetical order.
foreach (sort keys %COOKIE) {

    # If the cookie's value has a comma in it, it must be a multiple
    # cookie.
    if ($COOKIE{$_} =~ /,/) {
        # Split the multiple cookie up in to each value...
        @vals = split(/,/,$COOKIE{$_});
        foreach $valu (@vals) {
            # ...and print each value as a separate cookie.
            print "$_ => $valu<br>\n";
        }
    } else {
        print "$_ => $COOKIE{$_}<br>\n";
    }
}
# Say bye bye.
print "</body></html>\n";

You could easily turn the core code of the preceding script into a subroutine that returns the hash %COOKIE containing all the cookie values. It is useful to do this if you use cookies often. Including this subroutine in an external package allows you to check for cookies easily in any CGI script.
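One way such a subroutine might look is sketched below. The name get_cookies and the optional argument are assumptions for this example; by default it reads HTTP_COOKIE, and like Listing 13.2 it joins the values of same-named cookies with commas.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a Cookie header string (by default the
# contents of HTTP_COOKIE) into a hash of names and values.
sub get_cookies {
    my ($raw) = @_;
    $raw = $ENV{'HTTP_COOKIE'} unless defined $raw;
    my %COOKIE;
    return %COOKIE unless defined $raw && length $raw;
    foreach my $pair (split /; */, $raw) {
        my ($key, $val) = split /=/, $pair, 2;  # Split on the first '=' only.
        $val = '' unless defined $val;
        if (exists $COOKIE{$key}) { $COOKIE{$key} .= ",$val"; }
        else                      { $COOKIE{$key}  = $val;    }
    }
    return %COOKIE;
}

my %COOKIE = get_cookies();
print "$_ => $COOKIE{$_}\n" foreach sort keys %COOKIE;
```

Passing the string in as an argument makes the routine easy to try from the command line, where HTTP_COOKIE is normally not set.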

Now that you can store and retrieve cookies, what good are they? Originally, they were mainly used for storing user preferences and to facilitate "shopping cart" systems. The ability to store data on the client side has proven to be very versatile. If you have access to a Netscape browser (version 2.0 or greater), turn on the option that brings up a requester every time you are presented with a cookie. It is amazing to see the number of sites using cookies for one reason or another.

To some people, it is also frightening. Of the two main reasons for using cookies, storing user preferences has come under great scrutiny as a possible danger to privacy. Soon after cookies were developed, many sites advertised the capability to alter their pages to suit a user's preference, which is done using cookies. Every page a user visits on one of these sites sends the browser a cookie. The home page is then altered to include links to pages that the user visits frequently. Some users feel that having their movements tracked in that way is a violation of their privacy, despite the fact that all the information gathered from the cookies is also available, albeit harder to interpret, from the logs of the Web server. It's generally a safe plan, if you really want to track a user's preferences, to inform the user of this up front and allow him the option of not having his information logged.

For an example of HTTP cookies, turn to their other major application: the "shopping cart." A Web shopping cart is simply a method for allowing the end user to browse a number of pages, selecting any number of items from HTML forms on those pages and storing the selections for later use. There need not be any actual shopping involved. In fact, the next example would be impractical for real commerce because no security is involved.

Tip
In this example, a module called cgi_head.pm is used to gather the information from the form. This module places a form entry with the name 'foo' into an associative array entry with name $FORM{'foo'}. Included in this module is the subroutine you saw previously, which loads all cookie information into an associative array called %COOKIE. There are several freely available programs for several languages to accomplish this, including CGI.pm for Perl at http://www.perl.com/perl/CPAN/.

For simplicity's sake, the shopping cart is contained in one CGI program that produces one HTML page. The real power of shopping carts is that they can span any number of pages, remembering the user's items throughout. Listing 13.3 displays the simple shopping cart.


Listing 13.3. Simple shopping cart.
#!/usr/local/bin/perl -Tw

require cgi_head; # Get %FORM and %COOKIE hashes.

# Now we create an associative array that contains information about our
# product. This information could be gathered in any method: from a flat file,
# from a database, from another process, etc. This information isn't even
# necessary for the actual program to run, it just provides a more realistic
# simulation.
%whatzit = (
    Blue => '14.95',
    Red => '12.50',
    Big => '19.95',
    Tiny => '4.95',
    Paisley => '99.95'
);

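# We set a variable with the date as it will be 3 hours from now (10800
# seconds is 3 hours).
$later = time + 10800;
# We now convert the time into the format required by the Cookie syntax:
# Wdy, DD-Mon-YYYY HH:MM:SS GMT.
@t = gmtime($later);
$date = sprintf("%s, %02d-%s-%04d %02d:%02d:%02d GMT",
    (qw(Sun Mon Tue Wed Thu Fri Sat))[$t[6]], $t[3],
    (qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec))[$t[4]],
    $t[5] + 1900, $t[2], $t[1], $t[0]);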

# The browser didn't send us any cookies, this must be its first trip
# here. All we need to do is assign it an ID number and print out the
# introductory HTML form.
if (!%COOKIE) {

    # The user's ID number is a combination of the current time and
    # the current process id number. Since any two processes running
    # at the same time will have different id numbers, this creates a
    # pretty much unique number.
    $id = time . $$;

###########################IMPORTANT###################################
# The second line below is the actual 'Set-cookie' HTTP header.       #
# The name of the cookie is the ID number, and the value is '0'.      #
# (The ID number is preceded by 'mycartid' just in case someone       #
# else sends the user a cookie with the same ID number. Of course     #
# someone could send them a cookie that begins with 'mycartid' as     #
# well, but the more complex your cookie name is, the less likely     #
# it is to be accidentally duplicated.  The value will be replaced    #
# by whatever items the user chooses. The ''expires'' field is set to #
# the $date variable we create above which is 3 hours from now.       #
# This can be changed to whatever time is needed. The path should     #
# be changed from '/~me/' to whatever URL path points to the          #
# directory where this CGI is located. The domain should be changed   #
# from 'my.isp.com' to whatever your machine name is.                 #
#######################################################################

    print <<EOP;
Content-type: text/html
Set-cookie: mycartid$id=0; expires=$date; path=/~me/; domain=my.isp.com

<HTML><HEAD><TITLE>Welcome to the Whatzit Emporium!</title></head><BODY>
<H1 ALIGN=CENTER>The Whatzit Emporium</h1>
<p>Greetings and welcome to the Whatzit Emporium. Here you will find
Whatzitses of all shapes, sizes and colors to suit your Whatziting needs.

<p>To add a Whatzit to your cart, simply click on the 'Pick Me!' button
next to it. When you are ready to leave, simply click on the 'Check Out!'
button on the bottom of the screen.

<p>
<FORM ACTION="cart.cgi" METHOD=POST>
<UL>
<LI><INPUT TYPE="SUBMIT" NAME="Blue" VALUE=" Pick Me! "> Blue Whatzit
\$$whatzit{'Blue'}
<LI><INPUT TYPE="SUBMIT" NAME="Red" VALUE=" Pick Me! "> Red Whatzit
\$$whatzit{'Red'}
<LI><INPUT TYPE="SUBMIT" NAME="Big" VALUE=" Pick Me! "> Big Whatzit
\$$whatzit{'Big'}
<LI><INPUT TYPE="SUBMIT" NAME="Tiny" VALUE=" Pick Me! "> Tiny Whatzit
\$$whatzit{'Tiny'}
<LI><INPUT TYPE="SUBMIT" NAME="Paisley" VALUE=" Pick Me! "> Paisley
Whatzit
\$$whatzit{'Paisley'}
</ul>
<hr>
<INPUT TYPE=SUBMIT NAME="Checkout" VALUE=" Check Out! ">
</form>
</body></html>
EOP
    exit; # The page is complete, so leave without logging an error.
}
# Notice that in the form above we don't have lots of input fields and one
# submit button, but rather lots of submit buttons and no fields. This is
# a handy technique for when you have a set of several things for the user
# to choose from but no other information is needed.


# If there are cookies to look at, look at each one.
CLOOP:
foreach (keys %COOKIE) {

    # If the cookie begins with 'mycartid' we assume it is one of
    # ours.
    if ($_ =~ /^mycartid(.*)$/) {
        # Grab the $id number from the name.
        $id = $1;
        # Grab the cookie value.
        $cookie = $COOKIE{$_};
        # Since we only send one cookie to each browser, now that we
        # have it, we don't need to look at any other cookies.
        last CLOOP;
    }
}
# Dump the unneeded cookies.
undef %COOKIE;

# If the cookie value is '0' they just came from the intro page and have
# nothing in their cart.
if ($cookie eq '0') { $cookie = ""; }

# When we set the cookie value, we separate each type of Whatzit with a
# '+'. Now we split them into an array.
@items = split(/\+/,$cookie);

foreach (@items) {
    # When we set the cookie, we separate the type of Whatzit from the
    # amount requested with '-'. Now we split each one into a type and
    # an amount.
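    ( $type, $amount ) = split(/-/,$_);
    # The cookie comes from the browser and can't be trusted, so we
    # ignore anything that isn't one of our Whatzit types.
    next unless exists $whatzit{$type};
    # We set the variable ${$type} to the amount of that type. So
    # the variable $Blue will hold the number of Blue Whatzitses.
    ${$type} = $amount;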
}

foreach (keys %whatzit) {
    # For each type of Whatzit (defined at the top of the script) we
    # check to see if the user just added onto their cart. If so,
    # we increment the corresponding variable.
    if ($FORM{$_}) { ${$_}++; }
    # If, after adding the most recent entry, there are any of this
    # type, add it and the amount held to an array.
    if (${$_}) { push(@cstring, "$_-${$_}"); }
}
# Combine the types and amounts held into one big string.
# This string would look something like this:
# Blue-3+Red-1+Tiny-1
$cstring = join('+', @cstring);

if ($FORM{'Checkout'}) {

    # If the user wants to checkout, we first clear their cookie, then
    # print the start of the checkout message.
    print <<EOP;
Content-type: text/html
Set-cookie: mycartid$id=; expires=$date; path=/~me/; domain=my.isp.com

<HTML><HEAD><TITLE>The Whatzit Emporium!</title></head><BODY>
<H1 ALIGN=CENTER>The Whatzit Emporium</h1>
<p>Thank you for shopping the Whatzit Emporium! We hope you have enjoyed
your visit. Here are the totals of your shopping trip:

<p>
EOP

    # Now for each of the Whatzit types, we multiply the number of
    # Whatzit's in the cart by the price of each Whatzit.
    foreach (keys %whatzit) {
        if (${$_}) {
            my $subtotal = ${$_} * $whatzit{$_};
            print "$_ Whatzit: ${$_} \@ \$$whatzit{$_} ea. = \$$subtotal<br>";

            # Add them all up for the total.
            $total += $subtotal;
        }
    }
    # Say bye bye!
    print <<EOP;
<hr>
Your total is \$$total
<hr>
Have a nice day!
</body></html>
EOP
    exit; # The page is complete, so leave without logging an error.
}


# If the user isn't checking out, we send them a new cookie and resend
# the form so they can choose more Whatzitses to buy.
print <<EOP;
Content-type: text/html
Set-cookie: mycartid$id=$cstring; expires=$date; path=/~me/; domain=my.isp.com

<HTML><HEAD><TITLE>The Whatzit Emporium!</title></head><BODY>
<H1 ALIGN=CENTER>The Whatzit Emporium</h1>
<p>We hope you are enjoying your visit. If you have any questions please
feel free to ask one of our helpful associates.

<p>To add a Whatzit to your cart, simply click on the 'Pick Me!' button
next to it. When you are ready to leave, simply click on the 'Check Out!'
button on the bottom of the screen.

<p>
<FORM ACTION="cart.cgi" METHOD=POST>
<UL>
EOP

# As we print out the form, we tell the user how many Whatzitses they have
# in their cart.
foreach (keys %whatzit) {
    print <<EOP;
<LI><INPUT TYPE="SUBMIT" NAME="$_" VALUE=" Pick Me! "> $_ Whatzit
\$$whatzit{$_}
EOP
    if (${$_}) {
        print "<BR>( You currently have ${$_} $_ Whatzitses )\n";
    }
}

# Give the user the option to check out.
print <<EOP;
</ul>
<hr>
<INPUT TYPE=SUBMIT NAME="Checkout" VALUE=" Check Out! ">
</form></body></html>
EOP

The preceding example is about as simple as it gets for shopping carts. All the items are on one page. There is no mechanism for removing items from the cart or for buying multiple items at once. These topics are covered more in Chapter 23, "Shopping Carts." The point here is showing what can be done with cookies. The only data you passed through the HTML form is the most recent addition to the cart. Yet, through using cookies, you were able to remember every single past addition the user had ever made. If the user were to turn off his computer and come back an hour later, the data would still be there. You chose to make the cookie expire after three hours, but that is completely arbitrary. However, in the interest of good manners, you should set your cookies to expire as soon as they are not needed, so they won't clutter up the user's hard drive.
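The cookie format used by the cart (a type and an amount joined by '-', with items joined by '+') can be isolated in a pair of small routines. The names decode_cart and encode_cart are assumptions for this sketch, and it keeps the amounts in a hash rather than in separately named variables, which is a different design choice from the listing.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a cookie value such as "Blue-3+Red-1" into a
# hash of amounts keyed by type.
sub decode_cart {
    my ($cookie) = @_;
    my %cart;
    return %cart if !defined $cookie || $cookie eq '' || $cookie eq '0';
    foreach my $item (split /\+/, $cookie) {
        my ($type, $amount) = split /-/, $item, 2;
        next unless defined $amount && $amount =~ /^\d+$/;  # Skip bad items.
        $cart{$type} = $amount;
    }
    return %cart;
}

# Hypothetical helper: turn the hash back into a cookie value, leaving
# out any type whose amount is zero.
sub encode_cart {
    my (%cart) = @_;
    return join '+', map  { "$_-$cart{$_}" }
                     grep { $cart{$_} } sort keys %cart;
}

my %cart = decode_cart('Blue-3+Red-1');
$cart{Tiny}++;                      # The user just picked a Tiny Whatzit.
print encode_cart(%cart), "\n";     # Prints Blue-3+Red-1+Tiny-1
```

Because the cookie arrives from the browser, a hash also keeps a tampered cookie from touching anything other than the cart itself.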

Cookies have many other uses as well. In lieu of image-based counters, some sites have begun keeping track of hits by sending each visitor a cookie. Keeping track of this data, however, requires a modification to the server, or each page must be CGI generated to receive the cookie information.

Other Server Extensions

Although cookies and server push get the most press, many other extensions are written for servers. Most are proprietary to one server or another, and some require the browser to have special capabilities as well, but all are useful, one way or another, in extending the capabilities of ordinary CGI.

WebServer/400

An extreme example of what can be done with a Web server is the WebServer/400 from I/Net (http://www.inetmi.com/products/webserv/webinfo.htm). This is a server that runs only on IBM AS/400 mainframes and uses them to unique advantage. The AS/400 is usually run through TN/3270 terminals with no graphics capabilities. This means that the interface to any AS/400 program is easily reproducible on the Web. WebServer/400 allows this, in a sense making every program on the mainframe a CGI program. Input fields in the program are translated to HTML forms and the data is passed through the program to create new forms with the results. The entire machine can be run from a Web browser.

Apache Modules

The Apache server is a freely available Web server for UNIX that has swiftly become the most widely used server on the Web. At last count, 34 percent of all Web sites were using Apache as a server. The most recent versions of Apache have introduced a feature called modules that should make server extensions much more common in the future. Modules are server extensions that are loaded into the server while it's running and used as they are needed. This takes up less memory than keeping them in the server all the time, and it is faster than calling an external CGI program. Apache itself comes with several modules, and many more are available in the public domain. I'll discuss the ones that bear a direct relationship with CGI programming in the following sections.

XSSI

XSSI (Extended Server-Side Includes) was developed by Howard Fear as an enhanced version of the server-side includes provided by Apache HTTPD. A server-side include is part of the HTML document that is parsed before being sent to the client. XSSI provides several capabilities previously only accessible through CGIs, such as simple if-then-else flow control and access to all environment variables set by the client. Using XSSI, a page could detect which type of browser is calling it and display HTML accordingly. More information on XSSI is available at ftp://pageplus.com/pub/hsf/xssi/xssi-1.1.html.

mod_rewrite

mod_rewrite is a module that intercepts the URL sent from the client and rewrites it based on a set of regular expressions. This is similar to the concept of URL aliases provided by most servers but takes it one step further. The incoming URL can be mapped to any other URL on the host machine. mod_rewrite was written by Ralf S. Engelschall and is available at http://www.engelschall.com/sw/mod_rewrite/.
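The idea of rewriting a URL from a set of regular expressions can be sketched in plain Perl. This is only an illustration of the concept and does not use mod_rewrite's own configuration syntax; the rule table, the paths in it, and the subroutine name rewrite_url are all made up for the example.

```perl
use strict;
use warnings;

# Hypothetical rule table: each rule pairs a pattern with a subroutine
# that builds the new URL from the pieces the pattern captured.
my @rules = (
    [ qr{^/~(\w+)/(.*)$},   sub { "/home/$_[0]/public_html/$_[1]" } ],
    [ qr{^/old/(.*)\.htm$}, sub { "/new/$_[0].html" } ],
);

# Return the rewritten URL for the first rule that matches, or the URL
# unchanged when no rule matches.
sub rewrite_url {
    my ($url) = @_;
    foreach my $rule (@rules) {
        my ($pattern, $build) = @$rule;
        if (my @captures = $url =~ $pattern) {
            return $build->(@captures);
        }
    }
    return $url;
}

print rewrite_url('/~me/cart.cgi'), "\n";   # /home/me/public_html/cart.cgi
```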

mod_perl

mod_perl, written by Doug MacEachern, is a fully functional Perl interpreter linked dynamically to Apache, which cuts down on the startup cost of launching Perl for each CGI request. mod_perl allows you to write your own server extensions in Perl and dynamically link them to Apache. You can find information on mod_perl at http://www.osf.org/~dougm/apache/.

CGI_SUGid and suCGI

One of the biggest hurdles in programming CGI is that all CGI programs are run as the user of the Web server (usually user "nobody"). Without this, security would be hard to maintain, but it also means that CGI programs can only write to directories that are world writeable. This makes interfacing with flat-file databases rather difficult. If the database is world writeable, any user on the host machine can change or delete the database at any time. If it's not, the CGI program can't modify it.

CGI_SUGid and suCGI are attempts to remedy this. These modules are installed in Apache as a replacement for the default CGI handler (mod_cgi). When a CGI program is called, the replacement handler checks the owner of the program and executes the CGI as that owner. This way, CGI programs owned by you are run as you, allowing them write access to your directories while preserving security for the machine as a whole. CGI_SUGid was written by Philippe Vanhaesendonck and is available at http://linux3.cc.kuleuven.ac.be/~seklos/mod_cgi_sugid.c. suCGI is by Jason A. Dour and is available at http://www.louisville.edu/~jadour01/mothersoft/apache/.
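The first step such a handler has to take, finding out who owns a program, can be sketched in Perl with stat. The subroutine names and the policy in safe_to_run_as_owner are assumptions for this example and are not taken from either module; the sketch only inspects the file and does not switch users or run anything.

```perl
use strict;
use warnings;

# Hypothetical helper: report the numeric user ID and login name of the
# owner of a program file.
sub program_owner {
    my ($file) = @_;
    my @st = stat($file) or return;
    my $uid  = $st[4];
    my $name = getpwuid($uid);
    return ($uid, defined $name ? $name : '');
}

# Hypothetical policy check: refuse programs that are owned by root or
# that can be written by anyone other than their owner.
sub safe_to_run_as_owner {
    my ($file) = @_;
    my @st = stat($file) or return 0;
    return 0 if $st[4] == 0;      # Never run a CGI program as root.
    return 0 if $st[2] & 022;     # Group- or world-writable.
    return 1;
}

my ($uid, $name) = program_owner($0);
print "This script is owned by $name (uid $uid)\n" if defined $uid;
```

A check along the lines of the second routine matters because running a program as its owner is only safe if nobody else could have altered the program.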

WebCounter

Web counters are a hot commodity right now; everybody wants one. This popularity is despite the fact that they are horribly unreliable and that the odometer motif was cute for about one second. The WebCounter module is an attempt to rectify the "horribly unreliable" part. Having the server itself keep a count of how many times each page is hit is much more efficient than CGI programs that count hits to graphics (which, of course, ignore hits from text browsers or browsers with images turned off). WebCounter was written by Brian Kolaci and is available from ftp://ftp.galaxy.net/pub/bk/webcounter.tar.gz.

NeoWebScript

NeoWebScript is similar in concept to XSSI in that it allows basic scripting to be done within the HTML file itself. It takes a different approach, however. Instead of embedding the scripting commands in SSI-style includes, NeoWebScript commands are separated from the HTML by comment tags but otherwise resemble full scripts (as in JavaScript). NeoWebScript's language is based on Safe TCL, which in turn is based on TCL. With the recent announcement that Sun Microsystems is going to push TCL (and its graphical extension, Tk) as a companion to Java and the planned support of TCL/Tk in most major browsers (including Netscape), a server-side TCL interpreter provides a very useful complement to client-side implementations. More information on NeoWebScript is available at ftp://ftp.neosoft.com/pub/tcl/neowebscript/.

These are just some of the modules that extend the capabilities of the Apache server. An extremely useful list of modules for Apache is maintained by Zyzzyva Enterprises at http://www.zyzzyva.com/server/module_registry.

Jigsaw Resources

Jigsaw is a new HTTP server created by the World Wide Web Consortium (W3C). W3C consists of the people who create the Web, as well as many other entities, and its purpose is to promote the Web by creating official standards for HTTP and HTML and by writing cutting-edge software to push the frontiers of the Web.

The main difference between Jigsaw and all other Web servers in existence is that Jigsaw is written in Java. The W3C saw Java as an advantage in terms of portability and extensibility. Java support is available for all major platforms. Jigsaw has an extension capability similar to Apache's, except that the extensions are called resources rather than modules. Because Java source need only be compiled once into bytecode that runs on any Java interpreter, Java resources can easily be added to and removed from the server just by telling Jigsaw the location of the resource class.

Jigsaw is still under development and has some lingering bugs, but it looks promising. Unfortunately, no third-party Java resources have been announced at the time of this writing, but due to Java's popularity, they should be appearing soon. You can find more information about Jigsaw at http://www.w3.org/pub/WWW/Jigsaw/.

Netscape and Microsoft

The two commercial giants of the Internet, Microsoft and Netscape, also have their own Web servers. These servers come with their own proprietary extension APIs, which are covered elsewhere in this book. (Netscape Server API is covered in Chapter 26, "NSAPI," and Microsoft's Internet Server API is covered in Chapter 25, "ISAPI.")

Summary

The concept of creating extensions to HTTP servers adds whole new worlds to the capabilities of CGI programming. Open standard extensions such as server push and cookies allow all servers and browsers to expand. Server push provides rudimentary animation abilities and slide-show features to allow dynamic, near real-time HTML pages. Cookies allow the end users to store information sent to them by servers. This information allows servers to keep track of users' progress and keep state information alive during the users' visits to the site (and even long after they have left).

Some extensions, such as the everything-can-be-a-CGI feature of the WebServer/400, are narrowly specific. Others, such as the Apache and Jigsaw servers' capability to allow users to add their own extensions, have the most general appeal.

As browsers grow more powerful and flashy, the reports of the imminent death of CGI become more frequent. By adding new functionality and extending the power of the server side of the equation, CGI takes on new life and gains even more power to provide true interactivity on the Web.