HTML

What is HTML?

HTML was the original “language” of the web; it was created to provide structure to documents.

The first “formal” definition was the HTML 2.0 specification,[1] published in 1995.

It is not a programming language in the traditional sense, as one does not write programs in HTML (though with HTML5 that line is much more blurred).

An HTML document is composed of tags, which can have attributes and contain text or other tags.

<html>
    <body>
        <div class="this-is-an-attribute">
            This div contains text and some more tags, like a <a href="#link">link</a>.
        </div>
    </body>
</html>

The browser converts these documents to what you see in front of you.

Most of the early HTML tags were focused on presentation:


<table>
<tr>
    <th>item</th><th>price</th>
</tr>
<tr>
    <td>pancake</td><td>3.99</td>
</tr>
<tr>
    <td>egg</td><td>1.25</td>
</tr>
<tr>
    <td>toast</td><td>1.99</td>
</tr>
</table>

<b>this creates bold text</b>
<marquee>there were even elements that animated text</marquee>

In the browser, this renders roughly as:

item      price
pancake   3.99
egg       1.25
toast     1.99

this creates bold text
there were even elements that animated text (now deprecated)

But a few special tags made HTML more than just a typesetting language.

Hypertext

The most important of these was the <a> tag, which created links, or hypertext references.

These inline references were critical to the key idea of the web: building a collaborative system of interlinked information.


<a href="https://en.wikipedia.org/wiki/Octopus">octopus</a>

octopus


We take this for granted today, but this is what makes the web a web.

HTML and HTTP take their names, Hypertext Markup Language and Hypertext Transfer Protocol, from this idea.

Forms

The other key idea that gave HTML a tremendous amount of power was the <form> tag.

MDN: The Form Element

This tag gave pages a way to prompt users for input, and send that input to the backend server when a button was clicked.

This made user logins possible and, with them, the idea of a web application as opposed to a simple page.
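
As a rough illustration of the idea, a minimal login form might look like the sketch below. The /login URL and the field names username and password are placeholders invented for this example, not anything a particular server expects.

```html
<!-- A minimal login form (sketch). The action URL and field names are hypothetical. -->
<form action="/login" method="post">
    <label for="username">Username</label>
    <input type="text" id="username" name="username" required>

    <label for="password">Password</label>
    <input type="password" id="password" name="password" required>

    <!-- Clicking this button submits the named fields to the URL in "action" -->
    <button type="submit">Log in</button>
</form>
```

When the button is clicked, the browser gathers each named field and sends the values to the server in an HTTP POST request; what happens next is up to the backend.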

Evolution of HTML

Semantic Markup & CSS

HTML evolved as a standard controlled by the W3C, an organization founded by Tim Berners-Lee, the inventor of the web.

In the late 1990s the W3C released several revisions of HTML under the HTML 4 name.

The goal was to try to standardize HTML since competing browsers were adding incompatible features.

There was also a push towards semantic HTML: the idea that markup should not specify how information is displayed. Instead, it should denote what the information is, and allow the browser (or another program, such as a screenreader for the visually impaired) to make that decision.

It had become popular to use the <table> tag to lay out entire webpages, a practice that still persists today but is frowned upon.

<b>mitochondria</b>: the powerhouse of the cell <br>

This HTML creates a bold term, followed by a definition, and ends with a <br> tag, which creates a line break in the page.

The markup therefore serves the purpose of denoting how it should be displayed.

<dl>
    <dt>mitochondria</dt>
    <dd>the powerhouse of the cell</dd>
</dl>

This uses the <dt> and <dd> tags, wrapped in a <dl> list. In HTML 4 these were the “definition term” and “definition description” tags, and they indicate that the relationship between the items is akin to a dictionary entry. (They can still be used more creatively for things that have a similar kind of semantic meaning.)

The browser would then apply default styling to these, which the user could override.

It also became possible to configure this styling via a companion language known as CSS.
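
For instance, a page could restyle the definition markup above without touching the markup itself. The sketch below puts a small stylesheet in a <style> element; the particular colors and spacing are arbitrary choices for illustration.

```html
<!-- Sketch: styling semantic markup with CSS. The style values are arbitrary. -->
<style>
    /* Terms are bold and colored; definitions are indented and italic */
    dt { font-weight: bold; color: darkslateblue; }
    dd { margin-left: 2em; font-style: italic; }
</style>

<dl>
    <dt>mitochondria</dt>
    <dd>the powerhouse of the cell</dd>
</dl>
```

The markup still only says what the content is; changing how every definition on the page looks means editing the stylesheet, not the content.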

JavaScript

JavaScript was the most revolutionary development in how HTML was used.

See the JavaScript page for more details.

Permissiveness

One of the most notable things about HTML compared to most languages is how it is permissive of omitted syntax or outright mistakes.

Most HTML was written by hand, and people would forget closing tags, use unescaped special characters, and nest tags in unorthodox ways.[2]
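
As a small illustration, the fragment below leaves out several closing tags and misnests two inline tags, yet browsers will still render it. The content is made up for this example. The omitted </li> and </p> tags are explicitly optional in HTML, while the misnested <b> and <i> are an outright error that the parser repairs.

```html
<!-- Sloppy markup that browsers still accept (illustrative sketch) -->
<ul>
    <li>first item
    <li>second item
</ul>
<p>A paragraph that is never closed
<p>Another paragraph with <b>bold <i>and italic</b> tags closed in the wrong order</i>
```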

HTML5

There were no formal revisions to the HTML spec from roughly 2000 to 2009.[3]

This led browser makers to form the Web Hypertext Application Technology Working Group (WHATWG) and begin work on what became the HTML5 spec, which was adopted roughly a decade after HTML 4.0.

HTML5 encouraged more semantic HTML and introduced dozens of new elements, like <canvas> for dynamic graphics, along with a suite of JavaScript APIs and extensions to the DOM. (See JavaScript for more details.)
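
A minimal sketch of what this looks like in practice: a page built from a few of the semantic elements HTML5 introduced, plus a <canvas> drawn on with a few lines of JavaScript. The page content, the id value chart, and the rectangle drawn are all invented for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>HTML5 sketch</title>
</head>
<body>
    <!-- Semantic elements describe what each region is, not how it looks -->
    <header><h1>Breakfast Menu</h1></header>
    <nav><a href="#menu">Menu</a></nav>
    <main>
        <article id="menu">
            <h2>Today's prices</h2>
            <!-- canvas is a blank drawing surface controlled from JavaScript -->
            <canvas id="chart" width="200" height="100">
                Fallback text for browsers without canvas support.
            </canvas>
        </article>
    </main>
    <footer>Prices subject to change.</footer>

    <script>
        // Draw a single filled rectangle on the canvas (hypothetical example)
        const canvas = document.getElementById("chart");
        const ctx = canvas.getContext("2d");
        ctx.fillStyle = "steelblue";
        ctx.fillRect(10, 10, 120, 60);
    </script>
</body>
</html>
```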

WHATWG & the HTML Living Standard

HTML is currently maintained by the WHATWG as the HTML Living Standard.

Resources

The published specifications are very technical and not the best way to learn HTML.

Instead I’d highly recommend the Mozilla Developer Network (MDN).

They have reference materials like https://developer.mozilla.org/en-US/docs/Web/HTML and learning paths for all key web technologies.


  1. This document is what is known as an RFC, a type of technical memo used to define internet protocols. ↩︎

  2. It isn’t fair to blame humans alone for this. Lots of tools that generated HTML, like MS Word, produced atrocious markup. It was not rare to see things like:

    <i>e</i>
    <i>v</i>
    <i>e</i>
    <i>r</i>
    <i>y</i>
    <i> </i>
    <i>l</i>
    <i>e</i>
    <i>t</i>
    <i>t</i>
    <i>e</i>
    <i>r</i>
    <i> </i>
    <i>i</i>
    <i>n</i>
    <i> </i>
    <i>a</i>
    <i> </i>
    <i>t</i>
    <i>a</i>
    <i>g</i>
    
     ↩︎
  3. Instead, the W3C had introduced a very unpopular revision, XHTML, a merger of XML and HTML. ↩︎


the written language of the web

Last Updated: 2024-04-09

Status: sprout

Length: ~900 words, 4 minutes

