
Topic: Creative Commons, Trackback, HTML Comments, and Embedded RDF



Author: Seth Dillingham

Date: 3/7/2005


# 4599

Creative Commons, Trackback, HTML Comments, and Embedded RDF

How many years do I have to work with HTML before I stop discovering important technical points of which I should have been aware all along?

Today's rather embarrassing example regards the format of HTML comments, which completely took me by surprise. Here is how the HTML 4.01 specification (section 3.2.4) describes them:

HTML comments have the following syntax:

    <!-- this is a comment -->
    <!-- and so is this one,
    which occupies more than one line -->

White space is not permitted between the markup declaration open delimiter("<!") and the comment open delimiter ("--"), but is permitted between the comment close delimiter ("--") and the markup declaration close delimiter (">"). A common error is to include a string of hyphens ("---") within a comment. Authors should avoid putting two or more adjacent hyphens inside comments.

Information that appears between comments has no special meaning (e.g., character references are not interpreted as such).

Note that comments are markup.

In other words, comments don't end with "-->". They end with "--" followed eventually by an occurrence of '>'.

This explains problems I've seen for years but never bothered to dig into deeply enough. I thought that the HTML comment delimiters were simply a "balanced pair", similar to the < and > that mark the start and end of a tag. A "<!--" followed at some point by a "-->".

Practical implications? There are a few. The most obvious (in my world) is that one can't reliably "comment out" the results of some Conversant macros. For example, if the macro returns a user-defined string from a database (such as a message subject), that string might include a "--". If it does, then the very next '>' will close the comment.
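The rule described above can be modeled in a few lines. This is a minimal Python sketch of the behavior as this post describes it (the first "--" after the opening "<!--", then the next '>'), not a full SGML or browser parser. The function name `comment_end` and the sample string are made up for illustration.

```python
def comment_end(html: str, start: int = 0) -> int:
    """Return the index just past the '>' that closes the comment at `start`.

    Models the rule as described in this post: the comment closes at the
    first '>' that follows the first "--" after the opening "<!--".
    Returns -1 if the comment is never closed.
    """
    if not html.startswith("<!--", start):
        raise ValueError("no comment starts at this position")
    dashes = html.find("--", start + 4)
    if dashes == -1:
        return -1
    close = html.find(">", dashes + 2)
    return -1 if close == -1 else close + 1


# Hypothetical macro output with a user-supplied "--" in the subject.
page = "<!-- subject: a -- b <b>bold</b> -->"
leaked = page[comment_end(page):]  # text that falls outside the comment
# leaked == "bold</b> -->"
```

Under this model the "--" in the subject ends the comment at the '>' of the `<b>` tag, so everything after it would be treated as ordinary page content.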

Trackback "autodiscovery" data is RDF embedded in HTML comments. It looks something like this:

<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
<rdf:Description
    rdf:about="http://www.example.com/example/page.html"
    dc:identifier="http://www.example.com/example/page.html"
    dc:title="HTML Comments -- They Don't Work How You Thunk"
    trackback:ping="http://www.example.com/4594/trackback" />
</rdf:RDF>
-->

See the problem? The dc:title attribute of the Description element contains a "--", and so the comment is closed by the very next '>'. That leaves the </rdf:RDF> and the --> outside of the comment, and in fact Firefox displays the --> on the web page!
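One possible workaround, sketched here under assumptions rather than taken from the Trackback spec: write every hyphen that directly follows another hyphen as the character reference `&#45;`. The HTML comment then never contains a "--", while a consumer that runs the RDF through a real XML parser should still read the original title. Clients that scrape the RDF with regular expressions may not decode the reference, so treat this as an illustration only. `comment_safe_attr` is a hypothetical helper name.

```python
import re
from xml.sax.saxutils import escape


def comment_safe_attr(value: str) -> str:
    """Escape `value` for a double-quoted XML attribute inside an HTML comment."""
    # Standard XML escaping first: &, <, > and the double quote.
    escaped = escape(value, {'"': "&quot;"})
    # Then replace any hyphen preceded by a hyphen, so "--" never appears.
    return re.sub(r"(?<=-)-", "&#45;", escaped)


title = "HTML Comments -- They Don't Work How You Thunk"
safe = comment_safe_attr(title)
# safe == "HTML Comments -&#45; They Don't Work How You Thunk"
```

Escaping '>' as `&gt;` at the same time means a title can no longer supply the '>' that closes the comment early, either.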

Creative Commons Licenses work in a similar way: they embed licensing information in the HTML via comments. They're not bitten by this comment syntax problem, though, because they have more control over the attribute values of the tags, and can intentionally avoid the double-hyphen problem.

See Kendall Grant Clark's Creative Comments: On the Uses and Abuses of Markup for a discussion of the semantic problems with this approach, and Phil Ringnalda's thorough review of the trackback problem (years old).

I didn't see any mention, in my brief research, of using a technorati-style link to solve the problem. Instead of embedding the RDF in the HTML, we could link to an RDF autodiscovery file with an invisible-to-people link like <a href="http://www.example.com/mt-trackback.cgi?type=autodiscover&id=4594" rel="autodiscovery"></a>. Any reason that wouldn't work?
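A client could pick up such links with very little code. The following is a rough sketch using only Python's standard library. The `rel="autodiscovery"` value comes from the example above and is this post's proposal, not an established convention; the class and function names are invented here.

```python
from html.parser import HTMLParser


class AutodiscoveryLinks(HTMLParser):
    """Collect href values of <a> elements whose rel includes "autodiscovery"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel is a space-separated list; a valueless attribute arrives as None.
        rels = (attributes.get("rel") or "").split()
        if "autodiscovery" in rels and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_autodiscovery_links(html: str) -> list:
    parser = AutodiscoveryLinks()
    parser.feed(html)
    parser.close()
    return parser.hrefs


page = (
    '<p>Post</p><a href="http://www.example.com/mt-trackback.cgi'
    '?type=autodiscover&amp;id=4594" rel="autodiscovery"></a>'
)
links = find_autodiscovery_links(page)
```

Because the link lives in ordinary markup rather than in a comment, a "--" in a post title never comes into play.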

Comments appreciated.



Author: Brian Carnell

Date: 3/7/2005


# 4600

Re: Creative Commons, Trackback, HTML Comments, and Embedded RDF

Seth Dillingham wrote:

>See the problem? The dc:title attribute of the Description element contains
>a "--", and so the comment is closed by the very next '>'. That leaves
>the </rdf:RDF> and the --> outside of the comment, and in fact
>Firefox displays the --> on the web page!
>
Oh, man, what a headache. I'd seen occasional -->'s on my sites and
figured I'd just had some screwy template. But I use a lot of headlines
that are " xxx -- yyy zzz".



Author: Terry Frazier

Date: 3/8/2005


# 4604

RE: Creative Commons, Trackback, HTML Comments, and Embedded RDF

Boy, this explains some things. I am forever using double-dashes as an emdash substitute because so many browsers don't honor the emdash character. Guess I'll have to stop that.



Author: Seth Dillingham

Date: 3/18/2005


# 4630

Phil Responds (sorta) re: HTML Comments and Linking Technologies

I thought Phil was ignoring me. I wrote to him on the 7th, after posting about HTML comments and embedded RDF, to ask what he thought of my suggestion regarding invisible links pointing to autodiscovery documents.

He wasn't ignoring me, he just couldn't put his brain into gear for this one. (I'm not picking on him, that's what he said!)

He asked people to comment on my idea. Unfortunately, most of the comments have nothing to do with the problem at hand, and many of the commenters apparently don't understand the issue. (They talk about everything from the usefulness of trackback to the second coming of <blink>.)

For the record, and perhaps to help bring the discussion back to the real point, here's the bulk of the email I sent to him:

[SNIP]

<http://www.truerwords.net/4599>

At the end of that post, I had an idea/suggestion for linking to autodiscovery (or other types of metadata) documents that would seem to work with pages having multiple items (which has always been one of the two big problems with using <link> elements in the head).

I realize it's not perfect, as the <a> tag doesn't actually have any identifying information that would allow the machine to associate it with a post on a page containing more than one (such as a weblog's home page). So, what about something like this?

<a rel="trackback:http://philringnalda.com/blog/2002/08/trackback_and_validation_summary.php" href="http://philringnalda.com/mt/mt-tb.cgi/55?_type=discover"></a>

The single value in the rel attribute manages to both identify it as a trackback-related URL and identify the server object it supports. It appears to be a legal value, as the spec only says that the value of rel must be a string with spaces separating multiple values (but we're only providing one long value).

It's logical, too, making it very easy to automate.

Any thoughts on this? I'm interested not just for the sake of trackback, but because other technologies will have a chance to "bloom" if this hurdle can be overcome.
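The rel convention proposed in the email above could be consumed roughly like this. It is a small Python sketch under the assumption that the value is exactly `trackback:` followed by the post's URL with no whitespace; `trackback_target` is a hypothetical name, not part of any Trackback tool.

```python
from typing import Optional

PREFIX = "trackback:"


def trackback_target(rel: str) -> Optional[str]:
    """Return the post URL encoded in a rel value like "trackback:http://...".

    Returns None when the value does not follow the proposed convention.
    """
    rel = rel.strip()
    # rel is space-separated, so a usable value must be one unbroken token.
    if any(ch.isspace() for ch in rel):
        return None
    if rel.startswith(PREFIX) and len(rel) > len(PREFIX):
        return rel[len(PREFIX):]
    return None


rel = (
    "trackback:http://philringnalda.com/blog/2002/08/"
    "trackback_and_validation_summary.php"
)
post_url = trackback_target(rel)
```

A client would pair `post_url` with the link's href to know which post on a multi-item page the autodiscovery document describes.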

He didn't publish that email, which is good, but without it the people being asked to comment would really have very little on which to comment.

Hopefully he'll post a follow-up with a link to this entry, so some more relevant discussion can take place.





TruerWords
is Seth Dillingham's
personal web site.
From now on, ending a sentence with a preposition is something up with which I will not put. - WC