A Touch of Class

f. rainne
2005-07-09

The Art of Semantics

A Romanian translation of this article is provided by Alexander Ovsov.

Horizontal Rules

Towards the end of May, the WHATWG mailing list dove into a debate over the purpose of the <hr> element: whether it is purely presentational and should be dropped, whether it indicates a break between sections and should be replaced by proper sectioning, or whether it has a different role to play:

Ian Hickson writes

On Sat, 21 May 2005, Anne van Kesteren wrote:
>
> Why doesn't SECTION suffice? They are sections separated by decoration.
> At least, that is how it appeals to me.

They're not really sections. The chapter is the section, these are 
paragraphs together in the same chapter, with a divider between some of 
the paragraphs.

I read a lot of fiction books and when I come across a "* * *" it reads to 
me like a paragraph, saying "Meanwhile, in a different part of the 
universe:"; it doesn't read as "end section. new section:".

IMHO.

Later on, Ian takes out his ASCII brushes to depict his point more clearly:

On Mon, 23 May 2005, Christoph Päper wrote:
> Anyhow you can still group paragraphs by wrapping them in a division 
> instead of dividing them by a separator. The latter is IMO not a very 
> markupish approach. MLs are usually about putting (informational) atoms 
> into bags and these into larger ones, iterated until you reach the top 
> one, the root.

The paragraphs are all part of the <section> (chapter). They're not 
further grouped together, IMHO. An <hr> is equivalent to a <p> with the 
content "Meanwhile, somewhere else..." or similar ("From someone else's 
point of view...", "At another time..."). In fact if a story was to be 
styled into comic form, <hr> elements would typically be presented as 
narrator-level text in the next paragraph's panel. For example:

   <p><q>No!</q> said Fred.</p>
   <hr>
   <p>The tree stood alone.</p>

In comic form:

   +----------+ +--------------+-------+
   |  _____   | | Meanwhile... |       |
   | < No! >  | +--------------+       |
   |  \/^^^   | |         /|\          |
   |  o       | |         /|\          |
   | -+-      | |          |           |
   | / \      | |          |           |
   +----------+ +----------------------+

Not that we have any user agents or stylesheet languages (short of actual 
humans) capable of that kind of interpretation and presentation today, but 
my point is that the <hr> here is a unit on par with a paragraph, it's not 
an artefact of an implied higher level grouping.

In text form:

   "No!" said Fred.

      *   *   *

   The tree stood alone.

In aural form:

   No!, said Fred. [pause] The tree stood alone.

If we didn't have <hr>, I would imagine the above example would be marked 
up as:

   <p><q>No!</q> said Fred.</p>
   <p>* * *</p>
   <p>The tree stood alone.</p>

...or some such. I really don't think:

    <p><q>No!</q> said Fred.</p>
   </plot>
   <plot>
    <p>The tree stood alone.</p>

...would be better than <hr>, in fact I think it would be unnatural from 
an authoring perspective. We mustn't fall into the trap of considering 
everything to be a hierarchy, just because that is what XML most easily 
marks up. Book authors have managed quite well for centuries without 
considering their documents to be formed of trees! (Notwithstanding what 
paper is made of, I mean.)
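
Ian's point that the <hr> is a unit on par with a paragraph can be sketched in code. What follows is my own minimal Python sketch, not anything from the thread: FlatBlocks, parse_blocks, render_text, and render_aural are names I made up, and the parser only understands <p>, <hr>, and <q>. It reads the markup as one flat list of blocks and renders that list roughly in the text and aural forms shown above.

```python
from html.parser import HTMLParser


class FlatBlocks(HTMLParser):
    """Collect <p> text and <hr> dividers as one flat list of blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # e.g. [("p", "some text"), ("hr", None)]
        self._text = None  # buffer for the paragraph being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._text = []
        elif tag == "hr":
            # The divider is appended like any other block, not as a group boundary.
            self.blocks.append(("hr", None))
        elif tag == "q" and self._text is not None:
            self._text.append('"')

    def handle_endtag(self, tag):
        if tag == "p" and self._text is not None:
            self.blocks.append(("p", "".join(self._text).strip()))
            self._text = None
        elif tag == "q" and self._text is not None:
            self._text.append('"')

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)


def parse_blocks(html):
    """Return the flat list of ("p", text) and ("hr", None) blocks."""
    parser = FlatBlocks()
    parser.feed(html)
    parser.close()
    return parser.blocks


def render_text(blocks):
    """Plain-text form: the divider becomes a line of asterisks."""
    return "\n\n".join("*   *   *" if kind == "hr" else text for kind, text in blocks)


def render_aural(blocks):
    """Aural form: the divider becomes a pause between paragraphs."""
    return " ".join("[pause]" if kind == "hr" else text for kind, text in blocks)


story = "<p><q>No!</q> said Fred.</p>\n<hr>\n<p>The tree stood alone.</p>"
blocks = parse_blocks(story)
print(render_text(blocks))
print(render_aural(blocks))
```

The divider is just another item in the list, and neither renderer needs to know about any enclosing group to present it.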

Meaningless Markup

Along the way, Christoph Päper and Ian also argued over the semantics of <div> and <span>. Proposing that <hr> be replaced by deliberate use of classed <div> elements, Christoph explained:

'div' is the proper HTML element type for subdivisions (of sections) 
that actually are not sections. (IMO sections always have a heading, 
'h'.) It, optionally, can be categorized with the 'class' attribute and 
be identified by an 'id' attribute.

But Ian didn't agree that the code in Christoph's example expressed his intended semantics:

Remember that class="" and <div> are meaningless. A document has
the same semantics after you strip out any class attributes and
<div> elements.

However, Christoph rejected such an extreme conception of <div>, <span>, and 'class', explaining that while such markup does not express the semantics of purpose, it still expresses the semantics of structure:

You say that and some others share that point of view, but I (and 
probably others) disagree. It is true that 'div' or 'class' don't 
provide semantics directly, but they do indirectly: Everything inside a 
'div' belongs together somehow and everything that shares a class 
(inside a document instance) is related to each other somehow. You 
cannot know /how/, but /that/.

FWIW, I share Christoph's point of view here.
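
To make the disagreement concrete, here is a minimal Python sketch of the transformation Ian describes: drop every class attribute and unwrap every <div> (I have included <span> as well, since the two were argued together). The names StripMeaningless and strip_meaningless are mine, the sample markup is my own invention rather than Christoph's actual example, and the sketch silently discards comments and doctypes.

```python
from html import escape
from html.parser import HTMLParser


class StripMeaningless(HTMLParser):
    """Re-emit markup without <div>/<span> tags and without class attributes."""

    DROPPED_TAGS = {"div", "span"}  # assumption: span is treated like div

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _open_tag(self, tag, attrs, closer=">"):
        if tag in self.DROPPED_TAGS:
            return  # unwrap: the tag goes, its contents stay
        parts = [tag]
        for name, value in attrs:
            if name == "class":
                continue
            parts.append(name if value is None else f'{name}="{escape(value)}"')
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, closer=" />")

    def handle_endtag(self, tag):
        if tag not in self.DROPPED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_meaningless(html):
    """Return html with div/span unwrapped and class attributes removed."""
    parser = StripMeaningless()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Hypothetical markup in the spirit of the classed-<div> proposal.
grouped = (
    '<div class="scene"><p><q>No!</q> said Fred.</p></div>'
    '<div class="scene"><p>The tree stood alone.</p></div>'
)
print(strip_meaningless(grouped))
```

On Ian's reading, the output says exactly what the input said. On Christoph's, something has been lost: the input recorded that each paragraph belonged to a group and that the two groups were of the same kind, and the output no longer does.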

x/HTML5

As Hixie reforms the Hypertext Markup Language through the WHATWG, he has subtly altered and refined the definitions of many seemingly simple elements. The new semantics are surprisingly elegant and more powerful than before. If anyone is capable of taking HTML to its ideal, I think it's him, and I believe the language will owe Ian a great debt some years from now.