UNIX Unleashed, Internet Edition
- 9 -
Formatting with Macro Packages
by David B. Horvath, CCP
This chapter introduces macros and macro packages. We will begin with a sample macro, and you'll see how and why it works. You'll then see how it evolves from simple to complex.
Macro packages are collections of macros. A macro is a collection of troff primitives or requests. In this chapter, the man macro package (used to format the man, or manual, pages in the UNIX system) is examined and used as an example for other macro packages.
What Is a Macro?
With embedded troff primitives, you can format a page just about any way you want. The trouble is that you have to reinvent the wheel every time you write a new document. For example, every time you format a first-level heading, you have to remember the sequence of primitives you used to produce a centered 14-point Helvetica bold heading. Then you have to type three or four troff requests, the heading itself, and another three or four requests to return to the normal body style. While this may result in many lines, it is not very productive. It is a laborious process that makes it difficult, if not impossible, to maintain consistency over a set of documents.
There is a solution: You can use macros to simplify formatting and ensure consistency. Macros take advantage of one of the UNIX system's distinguishing characteristics: the capability to build complex processes from basic, primitive units. A macro is nothing more than a series of troff requests, specified and named, that perform special formatting tasks.
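As a rough sketch of the difference, the first half of the fragment below types out the requests for one centered 14-point Helvetica bold heading, and the second half wraps the same requests in a macro. The macro name H1 is hypothetical, and the font name HB is an assumption; the name your output device uses for Helvetica Bold may differ.

```roff
.\" Raw requests: one centered 14-point Helvetica bold heading.
.\" HB is assumed to be the device's name for Helvetica Bold.
.sp
.ce 1
.ps 14
.ft HB
Introduction
.ps 10
.ft R
.sp
.\" The same requests packaged once as a macro (H1 is a hypothetical name).
.de H1
.sp
.ce 1
.ps 14
.ft HB
\\$1
.ps
.ft
.sp
..
.\" Every later heading is now a single line.
.H1 "Introduction"
```

Inside the macro, .ps and .ft without arguments return to the previous point size and font, so the macro does not need to know what the body style was.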
Chapter 10, "Writing Your Own Macros," explains how to write your own macros. You can use your own macros with the macro packages by embedding the macro definitions in your source document or by reading them in from a separate file with the .so request.
The man Macro Package
The man macro package is used to produce documents in a specific format. The format is used for UNIX system documentation manual pages--man pages, for short. In addition, information entered with the man macros is used to create the formidable permuted indexes so dear to the hearts of UNIX users.
There are only a few macros in the man package. Summary information is provided for the me and ms packages at the end of the chapter.
The man macros produce an 8.5 x 11-inch page with a text area of 6.5 x 10 inches. There is a troff--but not an nroff--option for producing a smaller 6 x 9-inch page with a text area of 4.75 x 8.375 inches. If you choose this option, point size and leading are reduced from 10/12 to 9/10.
The IN number register sets the indent relative to subheads. The default setting is 7.2 ens in troff and 5 ens in nroff.
The LL number register sets the line length, which includes the value of IN.
The footer produced by the man macros is an example of making the best of a bad deal. By default, a hard-coded date in the macro package is used instead of the current system date. The historical reasons for this behavior are not entirely clear, but it probably was a way of controlling updates to reference manuals.
The system administrator can change the date by modifying the macro package: the .TH (title heading) macro contains a definition for a string called [5, which holds the date printed in the footer. To redefine [5 for a single document instead, use the following at the top of your file:
.ds [5 "January 1, 2001
Now, what about that "Page 1"? Manpages are not numbered like ordinary document pages. The reason is that reference manuals are lengthy and are updated frequently. Furthermore, Bell Laboratories decided many years ago never to number replacement pages with letters, such as 101a, 101b, and so on. Because it was impractical to reprint a 2,000-page manual just because two pages had been inserted at the beginning, Bell Labs came up with another solution: Number the pages consecutively only within each entry; then start again with "Page 1."
You can change this, but you'll face the same dilemma that Bell Labs faced: What do you do about updates? Assuming this isn't a problem, how do you number reference manual pages consecutively?
You can achieve consecutive page numbering by using the register (-r) option to set the P register to 1 when you print your file:
troff -rP1 filename
Later in this chapter, Table 9.3 details the registers that can be set from the command line.
The man macros fall into two basic categories: headings and paragraph styles. Using these macros correctly is an art, not a science as it once was. Fonts are no longer as rigidly defined. For example, earlier UNIX reference manuals did not use a monospace--or constant width--font. Today, monospace is routinely used for file and directory names and for "computer voice," which is anything you see on the screen. Sometimes a distinction is made between monospace (\f(CW) and bold monospace (\f(CB). Bold monospace is used to indicate what the user types; it appears in the syntax section of a manpage.
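A small fragment may make the font conventions concrete. It is only a sketch: it assumes your output device provides fonts named CW and CB, and the file path is invented for illustration.

```roff
.SH SYNTAX
.\" Bold monospace (CB) marks what the user types.
\f(CBnamehim\fP [ \f(CB-t\fP type ]
.SH FILES
.\" Plain monospace (CW) marks file names and screen text.
.\" The path below is hypothetical.
\f(CW/usr/lib/namehim/names\fP
```

Each \fP returns to the font that was in effect before the change, so the surrounding text stays in the body font.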
The example in Figure 9.1 represents one way of using the man macros. Type styles are a matter of individual or company preference.
man recognizes three types of headings: .TH (title heading), .SH (subheading), and .SS (sub-subheading).
.TH and .SH are mandatory. A manpage must have a .TH and at least one .SH.
.TH takes up to four arguments. These are positional arguments. Therefore, if you don't use the third (and least common) argument but you want the fourth, you must insert a null argument ("") before the fourth argument. The syntax for .TH is
.TH <title> <section number> <commentary> <manual name>
title specifies the title of the manpage. This appears in the page header on the left and the right. It can be more than one word, so enclose it in quotation marks. The title of the manpage shown in Figure 9.1 is namehim.
section number indicates the section of the reference manual to which the entry belongs. The standard sections are broken down as shown in Table 9.1.
Table 9.1. Manual Section Numbers.
The section number appears in the header in parentheses after the title. Don't include parentheses; they are supplied automatically. The manpage shown in Figure 9.1 has 0 as the section number, even though 0 is not really a permissible section number.
commentary is an extra comment, such as Local. The argument appears in the header. It must be enclosed in quotation marks if there are embedded blanks. The manpage shown in Figure 9.1 doesn't have any commentary.
Listing 9.1 shows the man macros and text used to produce the sample manual page shown in Figure 9.1 (after processing by troff -man). It includes examples of .IP, .TP, .RS, and .RE.
Listing 9.1. Basic man source.
.TH namehim 0 "Novelist's Work Bench"
.SH NAME
namehim - supplies one or more names (first, last, or both) for fictional character
.SH SYNTAX
namehim [ -F | -L ] [ -t type ] [ -a age ] [ -y year ] ...
.SH OPTIONS
.IP "-F | -L" 3m
specifies first or last name; if neither F nor L is specified, both are produced.
.IP -t 3m
Specifies type of name: select from the following (may be combined):
.RS
.IP a 3m
all
.IP f 3m
fancy
.IP h 3m
hero
.IP l 3m
light
.RE
.TP 3m
-a
Specifies the character's age, the younger the character, the more likely it will be a nickname.
.TP 3m
-y
Specifies the year or era of the character. The older names will be more in the Medieval style.
manual name is the name of the manual--for example, UNIX System V or Documenter's Workbench. The name of the manual shown in Figure 9.1 is Novelist's Workbench.
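The positional rule for .TH is easiest to see side by side. The two lines below are alternatives, not lines to use together in one file; the argument values are illustrative only.

```roff
.\" All four arguments: title, section number, commentary, manual name.
.TH namehim 0 "Local" "Novelist's Workbench"
.\" No commentary: a null third argument keeps the manual name in fourth position.
.TH namehim 0 "" "Novelist's Workbench"
```

Without the "" placeholder, the manual name would be taken as the commentary.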
.TH is a shared macro name; it has one meaning in the man macro package and another for the tbl preprocessor. The .TH macro for the tbl preprocessor is used to specify column headings on a multiple-page table. A table is delimited by starting and ending macros--.TS and .TE. The shared name presents a potential problem. tbl is described in Chapter 11, "Tools for Writers."
The .TH table heading macro can appear only within a .TS and .TE pair. Supposedly, this insulates the macro and alerts the macro processor to rename the .TH man title macro whenever a .TS is encountered. However, you are not guaranteed that this will happen. Be sure to use tbl before nroff/troff if your document contains tables.
Some implementations of the man macros support automatic preprocessing by tbl and eqn/neqn by inspecting the first line of the file. To force tbl preprocessing, the first line should consist of '\" t.
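The fragment below sketches a manpage source that asks for tbl preprocessing. It assumes your man implementation honors the first-line convention; if it does not, running the file through tbl yourself before troff -man has the same effect. The tab(;) option is used only so that the column separator is visible in print.

```roff
'\" t
.TH namehim 0
.SH OPTIONS
.PP
Values accepted by -t:
.TS
tab(;);
l l.
a;all
f;fancy
h;hero
l;light
.TE
```

The '\" t line must be the very first line of the file, which is why no comment precedes it here.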
The .SH macro is a crucial one. Like .TH, it is mandatory for manpages. It is customarily followed by a keyword, although you can specify any word or words you want. The most common .SH keywords are
SYNTAX or SYNOPSIS
EXAMPLE or EXAMPLES
The .SH macros are used like this:
.SH NAME
namehim - brief description of entry
Text following .SH is indented, as shown in Figure 9.1.
.SH keywords are always printed in all caps, and you don't need to put quotation marks around a two-word keyword. If you do use quotation marks, they won't be printed.
The most crucial .SH is .SH NAME, which is mandatory. It is used to produce the permuted index, and the entry that follows it must be typed on a single line, no matter how long it is. No period is used at the end of the line. Naturally, it's a good idea to be as terse as possible.
The manpage shown in Figure 9.1 uses .SH OPTIONS after .SH SYNTAX. An alternate style sometimes seen in the reference manuals is the where form, which puts the word where on a line by itself and lists the options and arguments shown in the syntax section.
If a manpage needs headings under the .SHs, use .SS. Text following .SS is indented further.
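A minimal sketch of a subheading under .SH follows. The subheading text "Name Types" is invented for illustration.

```roff
.SH OPTIONS
General remarks about the options go here.
.\" A subheading under OPTIONS; the heading text is hypothetical.
.SS Name Types
Text that belongs to the subheading goes here.
```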
There are four ordinary paragraph macros:
To set the indentation for .PP (and .P), use number register PI. The default unit is ens, but you can use any unit you want as long as you specify it. Unlike ms, man provides the .PD macro to change the spacing between paragraphs.
The .PD macro is nothing more than ms's PD number register turned into a macro. Because the format of manpages is so exacting, writers need more control over spacing. The argument to .PD specifies interparagraph spacing. Remember, when using nroff, this argument is interpreted as whole lines; for troff you can specify .3v or something similar. .PD is most often used to suppress spacing between list items, which are paragraphs in man. This is done very simply: .PD 0. The default spacing for .PD is .4v in troff, one line in nroff.
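The usual pattern is to turn interparagraph spacing off before a list and restore it afterward. This sketch reuses the list items from Listing 9.1.

```roff
.\" Suppress the space between list items.
.PD 0
.IP a 3m
all
.IP f 3m
fancy
.IP h 3m
hero
.\" .PD with no argument restores the default spacing.
.PD
.PP
Normal paragraph spacing applies again from here.
```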
man has three hanging paragraph styles: .HP, .IP, and .TP. .HP is a simple hanging paragraph. The first line is flush with the margin. All subsequent lines in the paragraph are indented by the amount specified in the argument to .HP. .TP is more complex, and it is described later, following the discussion of .IP.
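As a brief illustration, the following hanging paragraph uses an indent of 5 ens, a value chosen arbitrarily for the example.

```roff
.\" First line flush with the margin; later lines indented 5 ens.
.HP 5n
namehim supplies one or more names for a fictional character; when this text is long enough to wrap, every output line after the first is indented.
```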
The .IP macro is similar to the ms .IP macro and is useful for formatting lists. .IP can take two arguments. The first argument is a label, or tag. It can be a word, a number, or even the troff code for a bullet. The second argument specifies how far in from the left margin to indent the rest of the first line and all the rest of the paragraph.
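The following sketch shows the three kinds of tag mentioned above. The 3n indent is an arbitrary choice for the example.

```roff
.\" A bullet as the tag.
.IP \(bu 3n
First point in a bulleted list.
.\" A number as the tag.
.IP 1. 3n
First step in a numbered list.
.\" A word as the tag; quotation marks are needed only if it contains blanks.
.IP -t 3n
Description of the -t option.
```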
The .RS and .RE pair is used to create relative indents. .RS (relative start) starts a 5-en indent from whatever the current indent is. .RE returns the indent to whatever it was before .RS was called. For every .RS in your file, you need a .RE to undo it. You can use this pair of macros to build nested lists.
.TP is similar to .IP. In fact, .TP produces virtually the same output. However, you specify it a little differently. Whereas .IP takes two arguments, .TP takes only the indentation. The line following the .TP macro call is called the tag. If the tag is wider than the specified indentation, the text following the tag starts on the next line. You can use .IP without a tag (actually, a null tag), but .TP requires a tag, although that tag can be a blank line.
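The next fragment sketches both cases: a tag that fits within the indent and one that does not. The long option name is invented purely to force the second case.

```roff
.\" The tag fits within 3m, so the text starts on the same line.
.TP 3m
-a
Specifies the character's age.
.\" The tag is wider than 3m, so the text starts on the next line.
.\" The option name below is hypothetical.
.TP 3m
-fancy-names
Prefers ornate names.
```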
Fonts and Point Size
man recognizes the .R (roman), .I (italics), and .B (bold) macros--all of which operate exactly as they do in ms and mm. man permits all six permutations of alternating roman, italic, and bold fonts: .RI, .RB, .IR, .IB, .BR, and .BI.
You may never have occasion to use these macros, but it's nice to know that they're available.
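For reference, here is a short sketch of three of the alternating-font macros in use. Each macro switches fonts from one argument to the next and joins the arguments without intervening spaces, so any space you want must be quoted into an argument.

```roff
.\" Bold, then italic: an option followed by its argument.
.BI -t " type"
.\" Bold, then roman: a command name followed by its section number.
.BR namehim (0)
.\" Italic, then roman: an emphasized word followed by punctuation.
.IR filename .
```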
In addition to the font change macros, there is one macro for changing point size: .SM. man needs .SM more than the other macro packages because manpages contain terms with long names that must be written in capital letters. To make these terms more readable and to conserve space, man includes a macro that produces a smaller point size--two points smaller.
.SM has another special use: printing the word UNIX in capital or small cap letters. Because UNIX is a registered trademark, it should be printed in a way that distinguishes it from ordinary text. Sometimes it appears in all capital letters. Another acceptable way is with a capital U and small capital N, I, and X, as in UNIX.
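Both uses can be sketched in a few lines. The second form relies on the troff \s size escape rather than on .SM, because .SM shrinks its whole argument.

```roff
.\" The whole word set two points smaller than the body text.
The
.SM UNIX
system is described in this manual.
.PP
.\" A full-size capital U followed by N, I, and X two points smaller.
The U\s-2NIX\s+2 system is described in this manual.
```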
The only preprocessor macros recognized by man are the .TS and .TE table macros. The table macro .TH can cause problems.
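The man package has three predefined strings. They are \*R (the registered trademark symbol), \*S (a change to the default type size), and \*(Tm (the trademark indicator).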
.TH resets tab stops whenever it is called. The default settings are every 7.2 ens in troff and every 5 ens in nroff. However, experimenting with various customized indents might affect tab settings. If you want to restore the tab settings and you can't wait for the next .TH, use the .DT macro.
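A minimal sketch of the pattern follows; the custom tab positions are arbitrary.

```roff
.\" Set custom tab stops for a small aligned block.
.ta 1.5i 3i
.\" The tab-separated lines of the block go here.
.\" Restore the default tab stops without waiting for the next .TH.
.DT
```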
The .PM (proprietary marking) macro is interesting for its history, but unless you change its text, it isn't really useful. It takes two arguments. The first argument identifies the type of marking, such as Proprietary or Restricted. The second argument is the year. If you omit the year, the default is the current year.
Using man Macros with troff and nroff
You can invoke the man macros with the troff or nroff command. Printing man files is covered in detail in the "Printing Files Formatted with ms, me, and man" section later in this chapter.
man Macro Summary
Table 9.2 lists the man macros and describes their functions.
Table 9.2. Summary of the man macros.
Printing Files Formatted with the Standard Macro Packages
You can use either nroff or troff to process files formatted with the standard macro packages, ms, me, and man.
Both nroff and troff expect to find a pointer to the appropriate macro package in the /usr/lib/tmac directory and to find the macro file in the /usr/lib/macros directory. Some versions look to the /usr/ucblib/doctools/tmac directory for the packages and files.
Printing Files Formatted with ms, me, and man
You can use either nroff or troff to process files that use the me, ms, or man macros. All of the options shown in Table 9.7 can be used; however, the -r option has limited use because all predefined number registers in me and ms have two-character names.
Most of man's predefined number registers also have two-character names. You can set register s to 1 to reduce the page size from 8.5 x 11 inches to 6 x 9 inches.
When you use nroff or troff to print files formatted with these macro packages, your command line takes this form:
nroff -ms options filenames
troff -ms options filenames
nroff -me options filenames
troff -me options filenames
nroff -man options filenames
troff -man options filenames
The options must precede the filename(s).
A complete listing of nroff and troff options can be found in Table 9.7.
Setting Number Registers from the Command Line
The -r option to nroff/troff lets you set certain number registers on the command line. This initializes the registers because it is done before the macro package is called. Only registers with one-character names can be initialized this way.
Table 9.3 lists the registers that can be initialized with the -r option to nroff/troff.
Table 9.3. Registers that can be initialized on the nroff/troff command line.
The -r option is useful if you have a file that will be printed somewhat differently over the course of its life. As an example, assume the first draft of your document has to be double spaced and have the word "DRAFT" at the bottom of every page. Set the C register to 4 on your command line:
troff -ms -rC4 docname
As the document nears completion, you have to print it single spaced, but you still want the word "DRAFT" at the bottom of every page:
troff -ms -rC3 docname
When the document is complete, you can use -rC1 to print "OFFICIAL FILE COPY" at the bottom of each page, or you can use -rC0 to omit that line entirely.
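You can apply the same idea in your own files by testing a register that is set only on the command line. The sketch below assumes a one-character register named z that the macro package does not already use; both the register name and the notice text are hypothetical.

```roff
.\" Print with: troff -man -rz1 filename
.\" z is a hypothetical register; it is 0 unless set with -r.
.if \nz \{\
.PP
DRAFT: not for distribution.
.\}
```

Because an unset register evaluates to 0, the paragraph is printed only when the file is formatted with -rz1.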
Error messages are largely self-explanatory. They can be generated by the system (if you type torff instead of troff), by nroff or troff, by the macro package, or by the preprocessors (the tbl, eqn/neqn, pic, and grap sections of Chapter 11, "Tools for Writers," contain information about preprocessor error messages).
It doesn't really matter whether troff or a macro package generates a message; you have to correct the error. Errors usually fall into one of the following categories:
The one thing to remember is that the line number, helpfully supplied by troff, is the troff output line number. So it's not uncommon to be told that you have an error in line 1500 when your text file is 600 lines long. Macro packages attempt to give you the source file line number. Don't wager a large amount on its accuracy.
me Macro Summary
The me macro package is generally used for formatting technical papers.
Table 9.4 lists the me macros and describes their functions.
Table 9.4. Summary of the me macros.
ms Macro Summary
The ms macro package is used for general formatting.
Table 9.5 lists the ms macros and describes their functions.
Table 9.5. Summary of the ms macros.
The ms macro package supports a number of numeric registers and predefined strings. Table 9.6 lists the ms macro package registers and describes their functions. Table 9.7 lists the ms macro package predefined strings and describes their functions.
Table 9.6. Summary of the ms macro package registers.
Table 9.7. Summary of the ms macro package predefined strings.
Using a macro package such as man, ms, or me eases the development of documents and standardizes their "look and feel." As you work with a particular package, you learn more about it, but you can start with only the basics and produce good-looking documents.