¶ Tim Bray Looks Back On XML
Thursday, February 21, 2008, 8:54pm
XML is ten years old today. It feels like yesterday, or a lifetime. I wrote this that year (1998). It's really long.
The title was originally Good Luck and Internet Plumbing but the filename was "XML-People" and I decided I liked that better. I never got around to publishing it, so why not now?
Remember, it's ten years old; some of the people and companies are in different places now.
¶ KnitML For Knitters
Wednesday, December 12, 2007, 12:01pm
The KnitML Project's main goal is to develop and promote adoption of a standard content model for knitting patterns. By developing a community-supported specification (KnitML) and providing basic rendering and transformation tools, the KnitML Project aims to make KnitML easy to use and valuable to the knitter.
Now all I need is some sort of machine that I can feed this pattern to and I'll be able to knit at the press of a button, instead of sitting there for hours.
¶ Microformats Applied To Various Social Networking Hubs
Monday, October 8, 2007, 8:54pm
Why is it that every single social network community site makes you:
- re-enter all your personal profile info (name, email, birthday, URL etc.)?
- re-add all your friends?
In addition, why do you have to:
- re-turn off notifications?
- re-specify privacy preferences?
- re-block negative people?
AKA "social network fatigue problem" and "social network update/maintenance problem".
I love the mockup.
¶ Drupal and XML
Saturday, March 13, 2004, 5:08pm
Everyone has heard of the latest standard, XHTML. Pretty much every blogger proudly displays his or her tag of validation, maybe even a valid CSS tag as well. That's good and fine, but are you serving the document as an XML document or just as tag soup?
First, for almost everyone, there is no reason to serve documents as anything but tag soup. The XHTML specification states that HTML-compatible XHTML 1.0 should be served as application/xhtml+xml but may be served as text/html. Other document types, such as XHTML with embedded MathML, should not be served as text/html and should be served as application/xhtml+xml.
Why would you serve a document as XML?
Usually you wouldn't. HTML is fine for the average website. But what if you need to include another XML vocabulary, such as MathML, in your document? The Guide For Authoring MathML For Mozilla says this about authoring MathML pages:
For a MathML fragment to be rendered in Mozilla, it should be a well-formed XML fragment within a well-formed XML document.
Mozilla is XML 1.0 compliant and can render XML pages that are styled with CSS or XSLT. Different XML vocabularies can be mixed in the same XML document, and each vocabulary will be distinguished by its namespace. Mozilla supports XHTML documents, the format of choice when serving MathML fragments embedded in other text.
Because of the limitations of HTML, the W3C is now backing XHTML and is not considering doing any major work on HTML anymore. XHTML (eXtensible HTML) is a reformulation of HTML in XML. Contrary to HTML where sloppy markups are tolerated, ill-formed XML (and thus ill-formed XHTML) documents are not rendered.
There are a few parts in there that jump out. The first is that a MathML fragment must sit inside a well-formed XML document. That means none of my pages can be tag soup; they must all be well-formed XML. Pushbutton, the latest theme for Drupal, comes very close. I've only seen a handful of places where the markup wasn't well-formed XML, and I fixed those in about half an hour.
The other part is that ill-formed XML documents aren't rendered. Unlike HTML rendering, XML rendering isn't even attempted if there is one tiny error. An unclosed tag, a bare ampersand instead of the proper entity, anything that breaks well-formedness, and the entire page is replaced by an error message. That's not good if Joe Internet-User will be leaving comments here: he could break any page at will just by posting an ill-formed comment.
mezzoblue goes into detail about this in his entry, Bulletproof XHTML:
However, Jacques soon discovered that he was Living On A Knife's Edge. Because he was using Mozilla's unforgiving XML parser, one little mistake, a mismatched tag, an unescaped entity, would choke his visitor's browser. And to his consternation, Jacques found that even if he wrote perfectly well-formed XHTML, other people were conspiring to mess up his web pages. By allowing comments, opening up trackbacks, and displaying snippets from alien RSS feeds, Jacques had opened up his site for any random visitor to crash that page with garbage markup. In order to produce 100% valid XHTML, Jacques realized that he had to "bulletproof" his site. Strip control characters. Validate comments. Batten the hatches. If he was going to take advantage of the power of XHTML, he would have to protect his site from his own mistakes and everyone else's.
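The "strip control characters" step in that quote could look something like the sketch below. It is only an illustration: strip_invalid_xml_chars() is a made-up helper name, not something from the quoted entry or from Drupal, and it assumes the input is meant to be UTF-8. It removes every character that the XML 1.0 "Char" production does not allow, which is what trips up an XML parser even when the tags themselves are fine.

```php
<?php
// Hypothetical helper (not from the quoted entry or from Drupal): drop every
// character that the XML 1.0 "Char" production does not allow.
function strip_invalid_xml_chars($text) {
  $clean = preg_replace(
    '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
    '',
    $text
  );
  // preg_replace() returns null when $text is not valid UTF-8; this sketch
  // treats that case as "nothing safe to keep".
  return $clean === null ? '' : $clean;
}
```

Tabs, newlines and carriage returns survive, since XML allows them. Other control characters, such as NUL or vertical tab, are removed.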
OK, so all I've got to do is change how the documents are served and I'm good to go?
Nope. Internet Explorer craps out on XML. Just doesn't do what it's supposed to. Any IE users can go to MozillaQuestQuest and they won't see a website, just a DOM tree. It's not the site that's broken, it's the browser. So we need to do some browser sniffing. I made use of the example found at Keystone Websites and changed Drupal's /includes/common.inc around a bit. This is what it had:
// spit out the correct charset http header
header("Content-Type: text/html; charset=utf-8");
And this is what I changed it to:
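// spit out the correct charset http header
$mime = "text/html";
$charset = "UTF-8";
// Either header may be absent from a request, so default to an empty string.
$accept = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";
$agent = isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : "";
// Serve real XHTML only to clients that advertise support for it, and to validators.
if (stristr($accept, "application/xhtml+xml") ||
    stristr($agent, "Validator")) {
  $mime = "application/xhtml+xml";
}
header("Content-Type: $mime; charset=$charset");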
Basically, if your user agent accepts XHTML or if you are a validator, I give you an application/xhtml+xml document, but if you're anyone else, such as IE, I give you a text/html document so you don't flip out and act all stupid.
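One gap in a plain substring test is that a client can list application/xhtml+xml in its Accept header with q=0, which means "don't send me this". A slightly more careful version might read the q-value before deciding. The sketch below assumes that approach. accept_quality() and choose_mime() are hypothetical helper names, not part of Drupal, and the parsing is deliberately simple (no wildcard handling).

```php
<?php
// Hypothetical helper (not part of Drupal): return the q-value the client
// gives $type in an Accept header, or 0.0 if the type is not listed.
function accept_quality($accept, $type) {
  foreach (explode(',', $accept) as $range) {
    $params = array_map('trim', explode(';', $range));
    if (strcasecmp(array_shift($params), $type) !== 0) {
      continue;
    }
    $q = 1.0; // a listed type with no q parameter counts as q=1
    foreach ($params as $param) {
      if (stripos($param, 'q=') === 0) {
        $q = (float) substr($param, 2);
      }
    }
    return $q;
  }
  return 0.0;
}

// Hypothetical helper: pick the MIME type using the same rule as above,
// except that an explicit q=0 is treated as a refusal.
function choose_mime($accept, $user_agent) {
  if (accept_quality($accept, 'application/xhtml+xml') > 0
      || stripos($user_agent, 'Validator') !== false) {
    return 'application/xhtml+xml';
  }
  return 'text/html';
}

$accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
header('Content-Type: ' . choose_mime($accept, $agent) . '; charset=utf-8');
```

With this, a browser that sends only "*/*" still gets text/html, which matches the cautious behaviour of the original check.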
But didn't you say that for something like MathML to be rendered, it must be included in a well-formed XML document? What about IE users who want to see the included MathML?
Well, IE users can either get a real browser, such as Firefox, or they can get the MathPlayer plugin, which lets IE render MathML code fragments inside a plain HTML document.
OK, are you done now?
Nope. People here have to preview their comments before they go live. IE users will preview their ill-formed tag soup and go on with their lives, leaving the Mozilla users to see nothing but an error message when they try to view the same page. I need to work on a way to validate comments before letting them be submitted. If Movable Type can do it at Musings, then Drupal should be able to as well. With Drupal's community plumbing, it could easily become courseware (with math, engineering and physics students in mind), making use of features that Movable Type lacks, such as the powerful taxonomy and book features.
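A first stab at that comment check might look like the sketch below. It is not how Movable Type or Musings does it, and comment_is_well_formed() is a made-up name rather than a Drupal API. It assumes PHP's DOM extension is available. The comment is wrapped in a single root element and handed to an XML parser, which either accepts it or doesn't.

```php
<?php
// Hypothetical helper (not a Drupal API): true if $comment parses as
// well-formed XML once wrapped in a single root element.
function comment_is_well_formed($comment) {
  // Collect parser errors quietly instead of emitting PHP warnings.
  $previous = libxml_use_internal_errors(true);
  $doc = new DOMDocument();
  $ok = $doc->loadXML(
    '<div xmlns="http://www.w3.org/1999/xhtml">' . $comment . '</div>'
  );
  libxml_clear_errors();
  libxml_use_internal_errors($previous);
  return $ok;
}
```

This only checks well-formedness, not validity against the XHTML DTD. Because no DTD is loaded, named entities such as &nbsp; would also be rejected, so a real version would need to decide how to handle those.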