What's This "XML," Anyhow?

 

 

At a bare minimum, "XML" is an acronym for "Extensible [or, as it's usually styled, eXtensible] Markup Language."

Starting at the end...

The ML part of the name puts XML in a category both with its parent, the Standard Generalized Markup Language (SGML), and with its, uh, niece/nephew, Hypertext Markup Language (HTML). All three are concerned with marking up text and documents in such a way as to indicate something about their contents -- particularly the structure of the contents.

All three languages use special "markers" to indicate to software that a particular chunk of text occupies a particular place in the document's structure. These markers, which themselves are plain old text labels, are called tags, and are distinguished from the actual text by being wrapped in special symbols: the less-than and greater-than symbols, < and > respectively.

Here's some sample HTML to give you the general idea:

<body>
<h1>Just XML</h1>
<p>"Why <i>Just XML</i>?" It's a good question....</p>
</body>

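The tags -- <body>, <h1>, <p>, and <i> -- tell software that different pieces of this text are to be treated (or "understood") in particular ways. The <h1> tag, for instance, marks a "primary heading"; <p> means "paragraph"; <i> means "italic"; and so on. Each opening tag is paired with a closing tag, such as </h1>, that marks where its chunk of text ends.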
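To see how software might "understand" those tags, here is a minimal sketch that feeds the sample HTML above to the HTMLParser class in Python's standard library and records each opening tag along with how deeply it is nested. The names SAMPLE, TagLister, and list_tags are made up for this illustration; treat it as one possible way to walk the structure, not as how any particular browser does it.

```python
from html.parser import HTMLParser

# The sample HTML from the text above.
SAMPLE = """<body>
<h1>Just XML</h1>
<p>"Why <i>Just XML</i>?" It's a good question....</p>
</body>"""


class TagLister(HTMLParser):
    """Hypothetical helper: records each opening tag and its nesting depth."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags that are currently open
        self.found = []      # (tag name, depth) pairs, in document order

    def handle_starttag(self, tag, attrs):
        # Depth is the number of tags already open when this one starts.
        self.found.append((tag, len(self.open_tags)))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost open tag if it matches.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()


def list_tags(markup):
    """Return the (tag, depth) pairs found in a string of markup."""
    lister = TagLister()
    lister.feed(markup)
    lister.close()
    return lister.found


for tag, depth in list_tags(SAMPLE):
    print("  " * depth + "<" + tag + ">")
```

Run against the sample, this prints <body> at the left margin, <h1> and <p> indented one level, and <i> indented two levels: the structure the tags describe, independent of how the text happens to look on screen.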

...but the beginning is really important

You have probably heard, and maybe even know, about HTML already. You're looking at HTML right now: it's the markup language that enables stuff to be displayed on the Web. (If you're using the Netscape browser, pull down the View menu and select Page Source in order to see the current page, HTML and all; if you're using Microsoft's Internet Explorer, pull down View and select Source.)

Although HTML has enabled the Web to explode, it's got some severe limitations. The main one may not even have occurred to you: It forces all Web documents into the same sort of structure. This goes against the grain of experience. For instance, open up any novel to a random page, and place alongside it any random page from a scholarly dissertation. The "paragraphs" look sort of the same, but they're also different. Suppose you were writing a novel about a scholar -- would the word "paragraph" truly mean the same thing regardless of whether it referred to a chunk of narrative, one of description, one of dialogue, or one from the protagonist's academic writings?

No. Among other effects, this one-size-fits-all structure is why Web search engines do such a cruddy job of telling you what you want to know, as opposed to what they think you asked for: They don't have any way of distinguishing one kind of content from another.

That's why the "extensible" in XML's name is so important. XML lets authors extend the tags used in their documents; they can create whole new tags, or define new "meanings" for existing ones. In theory, a smart enough search engine will let astronomers and groupies retrieve two entirely different sets of documents by using the keyword "star."
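Here is a small sketch of the "star" idea, using Python's standard xml.etree.ElementTree module. Everything specific in it is invented for illustration: the two tiny documents, their file names, and the tags <celestialbody> and <celebrity> are assumptions, not part of any real vocabulary, and the search function is a toy rather than a real search engine.

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML documents. The tag names are invented for this sketch;
# XML itself does not define them, which is the whole point.
DOCUMENTS = {
    "sky-survey.xml": (
        "<observation>A new <celestialbody>star</celestialbody> "
        "was catalogued last night.</observation>"
    ),
    "fan-mail.xml": (
        "<fanletter>I waited hours to meet the <celebrity>star</celebrity> "
        "of the show.</fanletter>"
    ),
}


def search(keyword, tag):
    """Return the names of documents where keyword appears inside a <tag> element."""
    wanted = keyword.lower()
    hits = []
    for name, markup in DOCUMENTS.items():
        root = ET.fromstring(markup)
        # iter(tag) visits every element with that tag name, at any depth.
        for element in root.iter(tag):
            if wanted in (element.text or "").lower():
                hits.append(name)
                break  # one match per document is enough
    return hits


print(search("star", "celestialbody"))  # ['sky-survey.xml']
print(search("star", "celebrity"))      # ['fan-mail.xml']
```

The keyword is the same in both searches; only the tag differs. A plain-text search for "star" would return both documents, because nothing in the text alone says which kind of star is meant.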

 
   




