“Creative Commons, Trackback, HTML Comments, and Embedded RDF”

From: Seth Dillingham
Date Posted: Monday, March 7, 2005 6:57:14 PM

How many years do I have to work with HTML before I stop discovering important technical points of which I should have been aware all along?

Today's rather embarrassing example concerns the format of HTML comments. This completely took me by surprise. From the HTML 4.01 specification:

HTML comments have the following syntax:

    <!-- this is a comment -->
    <!-- and so is this one,
    which occupies more than one line -->

White space is not permitted between the markup declaration open delimiter("<!") and the comment open delimiter ("--"), but is permitted between the comment close delimiter ("--") and the markup declaration close delimiter (">"). A common error is to include a string of hyphens ("---") within a comment. Authors should avoid putting two or more adjacent hyphens inside comments.

Information that appears between comments has no special meaning (e.g., character references are not interpreted as such).

Note that comments are markup.

In other words, comments don't end with "-->". They end with "--" followed eventually by an occurrence of ">".

This explains problems I've seen for years but never bothered to dig into deeply enough. I thought that the HTML comment delimiters were simply a "balanced pair", similar to the < and > that mark the start and end of a tag. A "<!--" followed at some point by a "-->".

Practical implications? There are a few. The most obvious (in my world) is that one can't reliably "comment out" the results of some Conversant macros. For example, if the macro returns a user-defined string from a database (such as a message subject), that string might include a "--". If it does, then the very next '>' will close the comment.
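To make that failure concrete, here is a minimal Python sketch of the rule as I've described it: the first "--" after the opening delimiter ends the comment text, and the next ">" closes the declaration. It models only that description, not any particular browser's parser, and the function name and sample subject line are made up for illustration.

```python
def sgml_comment_end(html: str, start: int = 0) -> int:
    """Return the index just past the '>' that closes the first comment
    found at or after `start`, using the rule described in this post.
    Returns -1 if no complete comment is found."""
    open_at = html.find("<!--", start)
    if open_at == -1:
        return -1
    # The first "--" after the opening "<!--" ends the comment text.
    close_dashes = html.find("--", open_at + 4)
    if close_dashes == -1:
        return -1
    # The next ">" after that "--" closes the markup declaration.
    gt = html.find(">", close_dashes + 2)
    return -1 if gt == -1 else gt + 1


# Hypothetical macro output: a user-supplied subject containing "--".
subject = "Re: plans -- <b>updated</b>"
page = "<!-- " + subject + " -->after"

leaked = page[sgml_comment_end(page):]
print(leaked)  # everything the "commented out" macro leaks into the page
```

With this input, the comment closes at the ">" of the `<b>` tag, so the rest of the subject and the intended "-->" all land outside the comment.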

Trackback "autodiscovery" data is RDF embedded in HTML comments. It looks something like this:

<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
<rdf:Description
    rdf:about="http://www.example.com/example/page.html"
    dc:identifier="http://www.example.com/example/page.html"
    dc:title="HTML Comments -- They Don't Work How You Thunk"
    trackback:ping="http://www.example.com/4594/trackback" />
</rdf:RDF>
-->

See the problem? The dc:title attribute of the Description element contains a "--", and so the comment is closed by the very next '>'. That leaves the </rdf:RDF> and the --> outside of the comment, and in fact Firefox displays the --> on the web page!

Creative Commons Licenses work in a similar way: they embed licensing information in the HTML via comments. They're not bitten by this comment syntax problem, though, because they have more control over the attribute values of the tags, and can intentionally avoid the double-hyphen problem.
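For user-supplied values such as a Trackback title, one way to get the same kind of control is to break up runs of hyphens before the value goes into the comment. The sketch below is an assumption on my part, not something any Trackback tool is known to do: it replaces every hyphen that directly follows another hyphen with the character reference `&#45;`. That only helps if the consumer extracts the RDF and runs it through a real XML parser, which would decode the reference back to a hyphen. The function name is hypothetical.

```python
import re


def comment_safe(value: str) -> str:
    """Rewrite `value` so that it never contains two adjacent hyphens.

    Each hyphen that immediately follows another hyphen becomes the
    numeric character reference "&#45;". Apply this after any normal
    XML escaping, or the ampersand will be escaped a second time.
    """
    return re.sub(r"(?<=-)-", "&#45;", value)


title = "HTML Comments -- They Don't Work How You Thunk"
safe_title = comment_safe(title)
print(safe_title)
```

The result contains no "--", so under the comment rule above it can no longer end the comment early.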

See Kendall Grant Clark's Creative Comments: On the Uses and Abuses of Markup for a discussion of the semantic problems with this approach, and Phil Ringnalda's thorough review of the trackback problem (now years old).

I didn't see any mention, in my brief research, of using a Technorati-style link to solve the problem. Instead of embedding the RDF in the HTML, we could link to an RDF autodiscovery file with an invisible-to-people link like <a href="http://www.example.com/mt-trackback.cgi?type=autodiscover&id=4594" rel="autodiscovery"></a>. Any reason that wouldn't work?
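On the consuming side, finding such a link would be simple. Here is a rough Python sketch using the standard library's `html.parser`; the `rel="autodiscovery"` value is only my proposal above, not an existing convention, and the class name is made up. The sample page writes the ampersand in the URL as `&amp;`, as it should appear in an HTML attribute.

```python
from html.parser import HTMLParser


class AutodiscoveryLinkFinder(HTMLParser):
    """Collect the href of every <a> whose rel includes "autodiscovery"."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel is a space-separated list; a missing rel yields an empty list.
        rels = (attr.get("rel") or "").split()
        if "autodiscovery" in rels and attr.get("href"):
            self.urls.append(attr["href"])


page = (
    "<p>Post body</p>"
    '<a href="http://www.example.com/mt-trackback.cgi'
    '?type=autodiscover&amp;id=4594" rel="autodiscovery"></a>'
)

finder = AutodiscoveryLinkFinder()
finder.feed(page)
print(finder.urls)
```

Because nothing is hidden inside a comment, a "--" in a post title can't affect discovery at all; the title would live in the linked RDF file instead.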

Comments appreciated.





TruerWords
is Seth Dillingham's
personal web site.
Read'em and weep, baby.