HTML (like most other SGML applications) helps you describe the structure of your document in a way that is portable from one machine to another. It does this by using tags to surround important elements of text so that the computer can recognize them.
Here's an example of a tag in use:
<title>How to make $1,000,000</title>
A few tags are defined as empty: they only have a start-tag and they don't enclose any content, for example <isindex>.
It is possible to omit some of the end-tags in some restricted circumstances, for example when one <li> element is followed directly by another, but it is good practice to be consistent and include every end-tag unless (a) you know how SGML works and when HTML allows you to omit end-tags or (b) you are using a conformant SGML editor which can handle this kind of minimization for you.
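As a minimal sketch of what end-tag omission looks like, the fragment below shows the same two-item list written both ways. The <ul> list container and the item text are assumptions brought in purely for illustration; lists themselves are covered in the next part.

```html
<!-- Minimized form: each new <li> start-tag implies the end of the previous item -->
<ul>
<li>First item
<li>Second item
</ul>

<!-- Fully tagged form: every item has its own end-tag -->
<ul>
<li>First item</li>
<li>Second item</li>
</ul>
```

Both forms should display identically; the second is simply easier to check by eye when you are typing the tags yourself.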
An HTML file should be self-documenting, so it should begin with a header which specifies that this is HTML, gives the title of the document and links it with the owner. The header is followed by the body, which is where your text goes:
<html>
<head>
<title>How to make $1,000,000</title>
<link rev="made" href="mailto:JillDoe@wunderkind.ulr.edu">
</head>
<body>
...
</body>
</html>
This structure should occur in every file so that you know what it is and who is responsible for keeping it up-to-date. There are a few other tags which can be included for special effects which we'll come on to later. A few important points to note:
The <html>...</html> tag surrounds everything in the file.
The <head>...</head> element surrounds both the <title>...</title> element and the <link> element.
The <title>...</title> element surrounds some text which you make up (here, `How to make $1,000,000').
The <link> element is empty (there's no </link>), but instead it includes some extra information inside the angle brackets which attributes the ownership. The rev="made" is required as shown, but you must substitute your own electronic mail address in the href attribute.
The <body>...</body> element follows straight after the header and doesn't finish until just before the end of the file, at the </body> end-tag.
The original HTML also defined some other elements, such as <plaintext>, but these are deprecated and will be dropped in HTML+.
The <isindex> tag mentioned earlier can go in the header if you want the document to be searchable. If you use it, put it somewhere like right after the <title>...</title> element. The behaviour of browser clients varies from implementation to implementation when they encounter this tag. Mosaic inserts a prompt and a panel for the user to type the search string; some others (Lynx, for example) need you to press a key (S) first.
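A header for a searchable document might look something like the sketch below. It reuses the title and address from the example above; the exact position of <isindex> relative to <link> is an assumption here, since the only guidance is to put it somewhere after the title.

```html
<head>
<title>How to make $1,000,000</title>
<!-- Empty tag: no end-tag and no content; it only marks the document as searchable -->
<isindex>
<link rev="made" href="mailto:JillDoe@wunderkind.ulr.edu">
</head>
```

Note that the tag only asks the browser to offer a search prompt; something on the server still has to handle the search itself.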
You can put comments in your file which you
can see when editing
it, but which won't get displayed to others. A comment looks like a
tag but has no name: instead there's a
!-- after the
opening angle bracket and a
-- before the closing one
(and no end-tag). The comment text in between can go over many lines:
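For instance, a comment spread over several lines could be written as in this small sketch (the wording of the comment and the paragraph after it are made up for illustration):

```html
<!-- This is a comment.
     It can run over several lines,
     and browsers will not display any of it. -->
<p>This paragraph is displayed as usual.</p>
```

It is safest to avoid a pair of hyphens inside the comment text itself, because that pair is what marks the end of the comment.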
Inside the body of the document, the most common element is probably the paragraph. In the original HTML, <p> is specified without a </p> end-tag, and is used to separate paragraphs, and most browsers still accept this usage. It is more normal SGML practice to use <p>...</p> to enclose paragraphs, like this:
<body>
<p>Try typing a paragraph of your document. Put it between the start-tag and end-tag for the body of the document, as this one is.</p>
</body>
and HTML+ defines <p> in this manner. You can have as many paragraphs as you want, one after another, each one inside its own <p>...</p>.
You cannot use blank lines on their own to separate paragraphs, as you do in a word processor: SGML pays no attention to multiple blanks, tabs and linebreaks (except in special circumstances). To make your text format correctly you use the tags.
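The following sketch shows the effect. The text is invented for illustration; the point is that the extra spaces and the blank line inside the first paragraph make no difference to how it is displayed.

```html
<body>
<p>This first    paragraph is typed with
extra   spaces and

a blank line in the middle, but it is still displayed as one paragraph.</p>
<!-- Only the tags below start a new paragraph, not the line breaks above -->
<p>This second paragraph is separate only because it has its own tags.</p>
</body>
```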
Most documents come divided into some form of sections, each with its own heading. HTML allows you up to six levels of section heading, <h1> to <h6>, and client programs (browsers) usually display different sizes, colors or positions of type for the headings. Here's a top-level heading:
<h1>How to make a million dollars</h1>
Section headings in HTML are section levels, not section numbers, so <h3> means `heading level 3', not `section number 3': there is no automated section-numbering in HTML. You can have up to six levels of headings, but each level can occur as many times as necessary.
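To illustrate levels as opposed to numbers, here is a rough outline in which the second and third levels each occur more than once. The first two headings come from the example document used in this section; the remaining heading texts are hypothetical.

```html
<h1>How to make a million dollars</h1>
<h2>How to make money</h2>
<!-- Two headings at level 3: the "3" is a depth, not a running count -->
<h3>Writing the book</h3>
<h3>Selling the book</h3>
<!-- Back up to level 2 for the next major section -->
<h2>How to keep money</h2>
```

If you want numbers such as 1.1 or 1.2 to appear, you have to type them into the heading text yourself.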
Most W3 clients (browsers) support the ISO Latin-1 character entities for accented letters defined by the International Organization for Standardization (there are lots of others but they are not defined in HTML). They have to be typed in a special form, so that they work on all computers, because each computer manufacturer has its own (usually non-standard) idea about how to do accents, and none of them match! Don't be tempted to use your computer's own idea of accented letters, because they may be complete garbage to other users on different computers.
The form they take is a mnemonic for the name of the accent enclosed between the two characters & and ; (ampersand and semicolon) like this: &eacute; so to get `Resumé' you type:
Resum&eacute;
It's a bit longwinded if you don't have an SGML editor, but it's the only way to make sure your accents work on other people's machines: see the full list for more of them.
There are a few additional character entities for doing other stuff:
&amp; gets you an ampersand (&)
&lt; gets you the less-than (<) sign
&gt; gets you the greater-than (>) sign
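Putting these entities to work, a sentence containing an accent, an ampersand and a less-than sign could be typed as in the sketch below (the sentence itself is invented for illustration):

```html
<!-- Displayed as: The café sells fish & chips for < $5. -->
<p>The caf&eacute; sells fish &amp; chips for &lt; $5.</p>
```

The &lt; entity matters most, because a bare less-than sign in your text could be mistaken for the start of a tag.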
Mosaic provides a Multi-Locality enhancement to make use of various national character sets in the X version (2.*) but this is not defined in HTML.
If you've grasped that lot, you should be able to write a simple sectioned document:
<html>
<head>
<title>How to make $1,000,000</title>
<link rev="made" href="mailto:JillDoe@wunderkind.ulr.edu">
</head>
<body>
<h1>How to make a million dollars</h1>
<p>When I sat down to try and tell the world how they too could become millionaires, everyone said `John, you must be crazy! How could you possibly give away a secret like that?'</p>
<p>Well, I decided that this kind of information was for sharing; it really wouldn't be right to keep it to myself. So here's my book, and I hope it makes you as much money as it's making me.</p>
<h2>How to make money</h2>
<p>Write a book that tells everyone how to do it, and make your money from the sales of the book. Like PT Barnum said, `there's a sucker born every minute' - <em>stultus omni momento nascitur</em> if you prefer!</p>
</body>
</html>
Remember, for normal text, formatting is irrelevant: browsers do the formatting for you, using the tags as guides. Give it a try, compare it with this one and come back for the next part when you're ready, where we'll look at lists, links and visual effects.