The Art of Semantics
A Romanian translation of this article is provided by Alexander Ovsov.
Horizontal Rules
Towards the end of May 2005, the WHATWG mailing list dove into a debate over the purpose of the <hr> element: whether it is purely presentational and should be dropped, whether it indicates a break between sections and should be replaced by proper sectioning, or whether it has a different role to play:
Ian Hickson writes
On Sat, 21 May 2005, Anne van Kesteren wrote:
>
> Why doens't SECTION suffice? They are sections separated by decoration.
> At least, that is how it appeals to me.

They're not really sections. The chapter is the section, these are paragraphs together in the same chapter, with a divider between some of the paragraphs. I read a lot of fiction books and when I come across a "* * *" it reads to me like a paragraph, saying "Meanwhile, in a different part of the universe:"; it doesn't read as "end section. new section:". IMHO.
Later on, Ian takes out his ASCII brushes to depict his point more clearly:
On Mon, 23 May 2005, Christoph Päper wrote:
>
> Anyhow you can still group paragraphs by wrapping them in a division
> instead of dividing them by a separator. The latter is IMO not a very
> markupish approach. MLs are usually about putting (informational) atoms
> into bags and these into larger ones, iterated until you reach the top
> one, the root.

The paragraphs are all part of the <section> (chapter). They're not further grouped together, IMHO. An <hr> is equivalent to a <p> with the content "Meanwhile, somewhere else..." or similar ("From someone else's point of view...", "At another time...").

In fact if a story was to be styled into comic form, <hr> elements would typically be presented as narrator-level text in the next paragraph's panel. For example:

   <p><q>No!<q> said Fred.</p>
   <hr>
   <p>The tree stood alone.</p>

In comic form:

   +----------+ +--------------+-------+
   |  _____   | | Meanwhile... |       |
   | < No! >  | +--------------+       |
   |  \/^^^   | |         /|\          |
   |  o       | |         /|\          |
   | -+-      | |          |           |
   | / \      | |          |           |
   +----------+ +----------------------+

Not that we have any user agents or stylesheet languages (short of actual humans) capable of that kind of interpretation and presentation today, but my point is that the <hr> here is a unit on par with a paragraph, it's not an artefact of an implied higher level grouping.

In text form:

   "No!" said Fred.

   * * *

   The tree stood alone.

In aural form:

   No!, said Fred. [pause] The tree stood alone.

If we didn't have <hr>, I would imagine the above example would be marked up as:

   <p><q>No!</q> said Fred.</p>
   <p>* * *</p>
   <p>The tree stood alone.</p>

...or some such. I really don't think:

   <p><q>No!</q> said Fred.</p>
   </plot>
   <plot>
   <p>The tree stood alone.</p>

...would be better than <hr>, in fact I think it would be unnatural from an authoring perspective.

We mustn't fall into the trap of considering everything to be a hierarchy, just because that is what XML most easily marks up.
Book authors have managed quite well for centuries without considering their documents to be formed of trees! (Notwithstanding what paper is made of, I mean.)
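Ian's "text form" rendering can be sketched in a few lines of Python. This is only an illustration of the idea that <hr> is a unit on par with a paragraph, not something proposed on the list: the class name TextRenderer and the function render_text are hypothetical, and the sketch handles only <p>, <q>, and <hr>.

```python
from html.parser import HTMLParser


class TextRenderer(HTMLParser):
    """Render <p> and <hr> as sibling blocks of plain text (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # finished blocks, in document order
        self._current = None  # text pieces of the <p> being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current = []
        elif tag == "hr":
            # The divider becomes a block of its own, a peer of the paragraphs.
            self.blocks.append("* * *")
        elif tag == "q" and self._current is not None:
            self._current.append('"')

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            self.blocks.append("".join(self._current).strip())
            self._current = None
        elif tag == "q" and self._current is not None:
            self._current.append('"')

    def handle_data(self, data):
        # Text outside a paragraph (whitespace between blocks) is ignored.
        if self._current is not None:
            self._current.append(data)


def render_text(fragment):
    """Return the list of text blocks for an HTML fragment."""
    renderer = TextRenderer()
    renderer.feed(fragment)
    renderer.close()
    return renderer.blocks


story = "<p><q>No!</q> said Fred.</p> <hr> <p>The tree stood alone.</p>"
print("\n\n".join(render_text(story)))
```

The divider ends up in the same flat list as the two paragraphs, with no wrapper around either side of it, which is the reading Ian argues for.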
Meaningless Markup
Along the way, Christoph Päper and Ian also argued over the semantics of <div> and <span>.
As Christoph was proposing that <hr> should be replaced by deliberate use of classified <div>, he explained:
'div' is the proper HTML element type for subdivisions (of sections) that actually are not sections. (IMO sections always have a heading, 'h'.) It, optionally, can be categorized with the 'class' attribute and be identified by an 'id' attribute.
But Ian didn't agree that the code in Christoph's example expressed his intended semantics:
Remember that class="" and <div> are meaningless. A document has the same semantics after you strip out any class attributes and <div> elements.
However, Christoph rejected such an extreme conception of <div>, <span>, and 'class', explaining that while such markup does not express the semantics of purpose, it still expresses the semantics of structure:
You say that and some others share that point of view, but I (and probably others) disagree. It is true that 'div' or 'class' don't provide semantics directly, but they do indirectly: Everything inside a 'div' belongs together somehow and everything that shares a class (inside a document instance) is related to each other somehow. You cannot know /how/, but /that/.
FWIW, I share Christoph's point of view here.
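The two positions can be put side by side in a small Python sketch. It is my own illustration, not code from the thread: the class ParagraphGroups, the function read, and the class name "scene" are all hypothetical, since Christoph's actual example is not reproduced here. The parser records, for every paragraph, which <div> it sits in.

```python
from html.parser import HTMLParser


class ParagraphGroups(HTMLParser):
    """Collect each <p> with the ordinal of its enclosing <div>, if any."""

    def __init__(self):
        super().__init__()
        self._div_count = 0   # number of <div> elements opened so far
        self._open_divs = []  # ordinals of the <div> elements currently open
        self._text = None     # text pieces of the <p> being read, if any
        self.paragraphs = []  # (group, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self._div_count += 1
            self._open_divs.append(self._div_count)
        elif tag == "p":
            self._text = []

    def handle_endtag(self, tag):
        if tag == "div" and self._open_divs:
            self._open_divs.pop()
        elif tag == "p" and self._text is not None:
            group = self._open_divs[-1] if self._open_divs else None
            self.paragraphs.append((group, "".join(self._text).strip()))
            self._text = None

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)


def read(fragment):
    """Return (group, text) pairs for every paragraph in the fragment."""
    parser = ParagraphGroups()
    parser.feed(fragment)
    parser.close()
    return parser.paragraphs


# Hypothetical markup: "scene" is an invented class name.
grouped = (
    '<div class="scene"><p>No! said Fred.</p><p>He left.</p></div>'
    '<div class="scene"><p>The tree stood alone.</p></div>'
)
plain = "<p>No! said Fred.</p><p>He left.</p><p>The tree stood alone.</p>"

# Ian's reading: ignore <div> and class, and both documents say the same thing.
print([text for _group, text in read(grouped)] == [text for _group, text in read(plain)])

# Christoph's reading: the grouping itself is information that the plain version lacks.
print([group for group, _text in read(grouped)])
print([group for group, _text in read(plain)])
```

Stripping the wrappers leaves the same sequence of paragraphs, as Ian says, but it also discards which paragraphs belonged together, and that is the structural information Christoph is pointing at.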
x/HTML5
As Hixie reforms the Hypertext Markup Language through the WHATWG, he has subtly altered and refined the definitions of many seemingly simple elements. The new semantics are surprisingly elegant, and gracefully more powerful than before. I think if anyone is capable of taking HTML to its ideal, it's him, and I believe the language will owe Ian a great debt some years from now.