by Sue Charlesworth
Interest in and use of the World Wide Web have been expanding at a phenomenal rate. As the Web grows, so must its vehicle of communication, HTML. The HTML 3.0 draft specification expired on September 28, 1995, without becoming a recommendation, and the HTML 2.0 specification is dated November 1995. HTML 3.2 then became a W3C (World Wide Web Consortium) Recommendation on January 14, 1997. Now we have the public draft for HTML 4.0, announced on July 8, 1997. This draft is almost certain to undergo changes before being accepted by the W3C as a Proposed Recommendation, if it does, indeed, ever become a recommendation.
In addition to this official work on HTML, the browser vendors have been making their own additions to the language. Some changes were eventually adopted into W3C HTML Recommendations; others remain proprietary extensions that only individual browsers recognize. The browsers' versions of HTML changed, too, in a game of marketing and programming one-upmanship, with each vendor hoping to lock Web developers into using its browser exclusively.
Designing for the Web can be a confusing activity, indeed.
In order to keep up with (or try to) the rapidly changing world of HTML, we present here the changes between HTML 3.2 and HTML 4.0. HTML 4.0 introduces eight new elements, deprecates ten (more about deprecation in a bit), and makes obsolete three more. Frames, formerly only found in the browser versions of HTML, join the official fold. Tables provide better tabular presentation; forms more readily respond to the needs of the disabled; style sheets provide for better formatting and presentation; and multimedia, scripting, and printing are improved. And, as if that weren't enough, HTML 4.0 uses a different character-encoding format that expands the number of alphabets and languages able to implement Web documents.
Let's start with the changes to single tags first, then move on to the topics, like tables, that encompass more than an individual tag.
The W3C document "Changes between HTML 3.2 and HTML 4.0" lists eight new tags in HTML 4.0. A brief description of these tags follows.
<Q>...</Q> The <Q>...</Q> tag acts much the same as the <BLOCKQUOTE> tag, but applies to shorter quoted sections, ones that don't need paragraph breaks. Example:
According to the W3C, <Q>BLOCKQUOTE is for long quotations and Q is intended for short quotations that don't require paragraph breaks.</Q>
HTML 4.0 requires both the start tag and the end tag for <Q>.
<ACRONYM>...</ACRONYM> The <ACRONYM>...</ACRONYM> tag indicates an acronym in the text. <ACRONYM> is a "phrasal" tag, meaning that it helps define the structure of a text phrase. Make sure to use <ACRONYM> for the acronym itself, not the title that the letters stand for. <ACRONYM> behaves like <EM>, <STRONG>, and <CODE>. Example:
Working with the World Wide Web requires a good head for acronyms. <ACRONYM>HTML</ACRONYM>, <ACRONYM>WWW</ACRONYM>, and <ACRONYM>HTTP</ACRONYM> are but a few of the acronyms found around the Web.
HTML 4.0 requires both the start tag and the end tag for <ACRONYM>.
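If readers also need to know what the letters stand for, one option is the title attribute, which HTML 4.0 allows on most elements. The following is only a minimal sketch; how (or whether) a browser presents the title text, for example as a tool tip, is up to the browser.

```html
<!-- Sketch: the acronym itself stays in the element content;
     the expansion goes in the title attribute -->
<P>Pages written in <ACRONYM title="HyperText Markup Language">HTML</ACRONYM>
travel over <ACRONYM title="HyperText Transfer Protocol">HTTP</ACRONYM>.</P>
```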
<INS>...</INS> and <DEL>...</DEL> Use <INS>...</INS> to mark parts of a document that have been added since the document's last version. <DEL>...</DEL>, similarly, marks document text that has been deleted since a previous version. Example:
Welcome to our online personnel policy guide. <INS>In the spirit of relaxed living, our dress code now requires only that you meet TV's decency standard.</INS> <DEL>In the spirit of conservative virtues, we require every employee to wear a suit to work every day.</DEL>
HTML 4.0 requires both the start tag and the end tag for both <INS> and <DEL>.
<COLGROUP>...</COLGROUP> <COLGROUP>...</COLGROUP> gives you finer control over the formatting of tables by specifying groups of columns that share width and alignment properties. Every table has at least one column group: if you define no <COLGROUP>, HTML 4.0 assumes the table consists of a single column group that contains all the columns of the table. If you wanted, for example, to create a table that had a series of ten small check box columns followed by a single, wide description column, you would code:
<TABLE>
<COLGROUP span="10" width="30">
<COLGROUP span="1" width="0*">
<THEAD>
<TR>...
</TABLE>
This way, the first <COLGROUP> tag formats all ten check box columns at once, which is much nicer than typing in ten identical specifications for each row!
The start tag for <COLGROUP> is required; the end tag is optional.
<FIELDSET>...</FIELDSET> With the <FIELDSET>...</FIELDSET> tag, you can group related form fields, making your form easier to read and use. Human brains like to be able to classify information, and <FIELDSET> helps do just that. When you enclose a group of form elements in the <FIELDSET> tags, the browser will group the elements so you can easily tell they belong together. Figure A.1 shows how Internet Explorer 4.0 displays Listing A.1.
HTML 4.0 requires both the start tag and the end tag for <FIELDSET>.
<FIELDSET> groupings in Internet Explorer 4.0.
<HTML>
<HEAD>
<TITLE>Work preferences</TITLE>
</HEAD>
<BODY>
We'd like you to help us design your new personnel policies. Please give us your preferences for the areas below.
<FORM action="..." method="post">
<FIELDSET>
<LEGEND align="top">Work week preferences</LEGEND>
Number of days in work week:
<SELECT NAME="WorkWeek" SIZE="5">
<OPTION VALUE="3day">3
<OPTION VALUE="4day">4
<OPTION VALUE="5day">5
<OPTION VALUE="6day">6
<OPTION VALUE="7day">7
</SELECT>
Number of hours in work day:
<SELECT NAME="WorkDay" SIZE="5">
<OPTION VALUE="5hour">5
<OPTION VALUE="6hour">6
<OPTION VALUE="7hour">7
<OPTION VALUE="8hour">8
<OPTION VALUE="9hour">9
</SELECT>
</FIELDSET>
<P>
<FIELDSET>
<LEGEND>Boss preferences</LEGEND>
I want a boss who is:
<INPUT NAME="BossValues" TYPE="checkbox" VALUE="Fair">Fair
<INPUT NAME="BossValues" TYPE="checkbox" VALUE="Generous">Generous
<INPUT NAME="BossValues" TYPE="checkbox" VALUE="Easy">Easygoing
</FIELDSET>
</FORM>
</BODY>
</HTML>
<LEGEND>...</LEGEND> If you looked at the code for the <FIELDSET> example above, you saw the <LEGEND>...</LEGEND> tags in action. Use <LEGEND> with <FIELDSET> to attach a caption to the form grouping. Figure A.2 is the same as the <FIELDSET> example, except that the first <LEGEND> has been removed.
HTML 4.0 requires both the start tag and the end tag for <LEGEND>.
The <FIELDSET> example with the first <LEGEND> removed.
<BUTTON>...</BUTTON> The <BUTTON>...</BUTTON> tag, another addition to forms, allows you to have push buttons on forms that more closely resemble push buttons available in Windows and other applications. Many aspects of <BUTTON> are similar to those of <INPUT> elements of types submit and reset, but <BUTTON>, in the words of the W3C, "allows richer presentational possibilities." One example of a "richer presentational possibility" is the fact that a <BUTTON> has beveled, shadowed edges, looking 3-D rather than flat, and "moves" when clicked, giving the impression of being pushed in, then released. Listing A.2 and Figure A.3 show buttons at work.
The <BUTTON> tag at work.
<HTML>
<HEAD>
<TITLE> Moccasin Day </TITLE>
</HEAD>
<BODY>
<FORM action="http://somesite.com/prog/adduser" method="post">
Do you want to celebrate Wear Your Moccasins to Work Day?
<P>
<INPUT type="radio" name="vote" value="Yes"> Yes<BR>
<INPUT type="radio" name="vote" value="No"> No<BR>
<P>
<BUTTON name="submit" value="submit" type="submit"> Send</BUTTON>
<BUTTON name="reset" type="reset"> Reset</BUTTON>
</FORM>
</BODY>
</HTML>
HTML 4.0 requires both the start tag and the end tag for <BUTTON>.
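Because <BUTTON> has content of its own, unlike <INPUT>, the button face can hold markup as well as plain text. The fragment below is a minimal sketch of that idea; the image file send.gif and the button name are hypothetical and are not part of the listing above, and browser support for rich button content may vary.

```html
<!-- send.gif is a hypothetical file name used only for illustration -->
<BUTTON name="send" value="send" type="submit">
<IMG src="send.gif" alt=""> <EM>Send</EM> your vote
</BUTTON>
```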
Deprecated tags and attributes are those that have been replaced by other, newer, HTML constructs. Deprecated tags are still included in the HTML draft or recommendation but are clearly marked as deprecated. Once deprecated, tags may well become obsolete. The draft "strongly urges" the nonuse of deprecated tags.
<ISINDEX> <ISINDEX>, which takes no end tag, allowed a document to prompt for a simple, single-line string search. Replace it with an <INPUT> form element.
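One way the replacement might look is sketched below. The action URL and the field name "query" are hypothetical, chosen only for illustration; note also that a form submits its data as name=value pairs, so a server program written for <ISINDEX> queries would probably need adjusting.

```html
<!-- Sketch only: the action URL and the field name "query" are hypothetical -->
<FORM action="http://somesite.com/prog/search" method="get">
Search this index: <INPUT type="text" name="query">
<INPUT type="submit" value="Search">
</FORM>
```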
<APPLET>...</APPLET> The <APPLET>...</APPLET> tag enabled the running of a Java applet. This tag has been replaced by the more encompassing <OBJECT>...</OBJECT> tag.
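As a rough illustration of the change, the sketch below shows the same applet coded both ways. Bubbles.class is a hypothetical applet name, and the exact <OBJECT> attributes a given browser expects may differ from what is shown here, so treat this as a starting point rather than a guaranteed recipe.

```html
<!-- Deprecated form; Bubbles.class is a hypothetical applet name -->
<APPLET code="Bubbles.class" width="500" height="500">
Java applet that draws animated bubbles.
</APPLET>

<!-- A possible OBJECT replacement for the same hypothetical applet -->
<OBJECT codetype="application/java" classid="java:Bubbles.class"
        width="500" height="500">
Java applet that draws animated bubbles.
</OBJECT>
```

In both cases, the text between the tags is what a browser that cannot run the applet displays instead.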
<CENTER>...</CENTER> The <CENTER>...</CENTER> tag, oddly enough, centered text or graphics. <CENTER> is deprecated in favor of a <DIV> tag with the align attribute set to "center."
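A minimal before-and-after sketch of that substitution (the text being centered is only sample content):

```html
<!-- Deprecated -->
<CENTER>Welcome to our online personnel policy guide.</CENTER>

<!-- Replacement: a DIV with the align attribute set to "center" -->
<DIV align="center">Welcome to our online personnel policy guide.</DIV>
```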
<FONT>...</FONT> <FONT>...</FONT> allowed the specification of font sizes, colors, and faces. Style sheets, rather than HTML code, have taken over character formatting duties.
NOTE: Because HTML is based on SGML, purists have never been happy using markup, which describes a document's structure, to define presentation, or how a document appears. With the formal (pending) adoption of style sheets, character formatting can be taken out of HTML code.
<BASEFONT> <BASEFONT>, which takes no end tag, set a base font size that could then be referenced for size increases or decreases. Use style sheets instead to set and reference relative font sizes.
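To give a feel for how a style sheet can stand in for <FONT> and <BASEFONT>, here is a minimal sketch that assumes a browser with CSS1 support. The class name "notice" and the particular sizes, color, and typeface are hypothetical choices made for the example.

```html
<HTML>
<HEAD>
<TITLE>Font styling sketch</TITLE>
<STYLE type="text/css">
  /* Plays the role of BASEFONT: one base size for the whole page */
  BODY { font-size: 12pt }
  /* Plays the role of FONT size, color, and face;
     "notice" is a hypothetical class name */
  .notice { font-size: 150%; color: red; font-family: Arial, sans-serif }
</STYLE>
</HEAD>
<BODY>
<P>Regular text at the base size.</P>
<P class="notice">This paragraph is larger, red, and set in a sans-serif face.</P>
</BODY>
</HTML>
```

The percentage size is relative to the surrounding text, which is the style sheet counterpart of the relative size increases that <BASEFONT> made possible.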
<STRIKE>...</STRIKE> and <S>...</S> Both <STRIKE>...</STRIKE> and <S>...</S> created strikethrough characters. Replace these tags with style sheets.
<U>...</U> <U>...</U> created underlined characters. As with the tags above, use style sheets to create underlines.
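Strikethrough and underlining can both be expressed with the CSS1 text-decoration property. The fragment below is a sketch under that assumption; the class names "struck" and "underlined" are hypothetical.

```html
<!-- The STYLE element belongs in the document HEAD -->
<STYLE type="text/css">
  /* "struck" and "underlined" are hypothetical class names */
  .struck     { text-decoration: line-through }
  .underlined { text-decoration: underline }
</STYLE>

<!-- In the document BODY -->
<P>The old rule was <SPAN class="struck">a suit every day</SPAN>;
the new rule is <SPAN class="underlined">TV's decency standard</SPAN>.</P>
```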
<DIR>...</DIR> Moving away from fonts, we have the <DIR>...</DIR> tag. <DIR> describes a directory list. Although <DIR> was originally designed to output elements in horizontal columns like UNIX directory listings, browsers formatted <DIR> lists like unordered lists. As there is no difference between the two, use a <UL>...</UL> list instead of a <DIR>...</DIR> list.
<MENU>...</MENU> <MENU>...</MENU> lists have also fallen by the wayside. The <MENU> tag described single-column menu lists. As with <DIR> lists, browsers made no distinction between <MENU> and <UL> lists. Use <UL>...</UL> lists instead of <MENU> ones.
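For both <DIR> and <MENU>, the conversion amounts to swapping the list tags and leaving the items alone, as in this small sketch (the list items are sample content only):

```html
<!-- Deprecated -->
<MENU>
<LI>Dress code
<LI>Work week
</MENU>

<!-- Replacement -->
<UL>
<LI>Dress code
<LI>Work week
</UL>
```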
Obsolete tags have been removed from the HTML specification. While browsers may still support obsolete tags, there is no guarantee that this support will continue.
The three tags that become obsolete in HTML 4.0 are <XMP>, <PLAINTEXT>, and <LISTING>. In all cases, replace these tags with <PRE>.
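One thing to watch when converting: <XMP> displayed its contents literally, markup and all, whereas markup inside <PRE> is still interpreted. If the preformatted text contains < or & characters, they need to be written as character entities. A minimal sketch with sample content:

```html
<!-- Obsolete: XMP showed its content literally -->
<XMP>
<B>bold</B> & <I>italic</I>
</XMP>

<!-- Replacement: escape the characters that would otherwise be read as markup -->
<PRE>
&lt;B&gt;bold&lt;/B&gt; &amp; &lt;I&gt;italic&lt;/I&gt;
</PRE>
```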
NOTE: Despite the fact that the HTML 4.0 draft doesn't specifically mention frames as new, and despite the fact that you may have seen them in use for some time now, frames are new to the official HTML specification.
Despite its name, the World Wide Web has had some difficulty reaching out past the Western languages and alphabets. In general, character representation in HTML was largely confined to the use of the ISO 8859-1 (Latin-1) character set. This character set contains letters for English, French, Spanish, German, and the Scandinavian languages, but no Greek, Hebrew, Arabic, or Cyrillic characters, among others, and few scientific and mathematical symbols. Also, the Latin-1 character set contains no provisions for marking reading direction.
Part of the problem with Latin-1 is that it simply doesn't have room to handle all the alphabets and languages of the world. It is an 8-bit, single-byte coded graphic character set and, as such, can represent only up to 256 characters.
Enter Unicode. Unicode is a character-encoding standard that uses a 16-bit code, thereby increasing the number of characters that can be encoded to more than 65,000.
HTML 4.0 uses the Universal Character Set (UCS) as its character set. UCS is a character-by-character equivalent to Unicode 2.0.
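In practice, this means a document can refer to characters outside Latin-1 by their UCS code positions, using numeric character references. The fragment below is a small sketch; whether the characters actually display depends on the browser and the fonts installed, and the dir and lang attributes shown are the HTML 4.0 way of marking reading direction and language.

```html
<!-- Decimal numeric character references to UCS/Unicode code positions -->
<P>Greek small letter alpha: &#945;</P>
<P>Cyrillic capital letter zhe: &#1046;</P>
<!-- dir="rtl" marks right-to-left reading direction -->
<P dir="rtl" lang="he">Hebrew letter alef: &#1488;</P>
```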
© Copyright, Macmillan Computer Publishing. All rights reserved.