What Is XML?
By Steve Hoenisch
Last updated on May 25, 2002
Extensible Markup Language (XML) is a metalanguage for describing and structuring data with tags. A metalanguage is a language used to describe other languages. Like HTML, XML uses tags (words bracketed by "<" and ">"), but unlike HTML, XML has neither a predefined set of tags nor rules for how to use them (though XML does have generic rules governing markup; for instance, tags may not overlap). In XML, the tags and the rules for them, or grammar, are defined by the users themselves.
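The no-overlap rule can be checked with any XML parser. The following is a minimal sketch, assuming Python and its standard-library xml.etree.ElementTree module; the function name is_well_formed and the sample tags are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested tags: <title> closes before <book> does.
nested = "<book><title>Moby-Dick</title></book>"

# Overlapping tags: <book> closes while <title> is still open.
overlapping = "<book><title>Moby-Dick</book></title>"

print(is_well_formed(nested))
print(is_well_formed(overlapping))
```

The parser accepts the nested version and rejects the overlapping one, regardless of what the tags are called, because nesting is one of XML's generic markup rules rather than part of any user-defined grammar.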
XML programmers use tags and their corresponding grammar to describe and structure data in text files (as opposed to a binary format), making the data easy to reuse, manipulate, and search. Because users can define their own tags, XML is, as its full name implies, extensible: unlike HTML, its set of tags can be extended.
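As a small illustration of user-defined tags and searchable text data, the sketch below assumes a made-up vocabulary (catalog, book, title, price) and uses Python's standard-library xml.etree.ElementTree to build, serialize, and search it. The tag names and sample values are hypothetical and do not come from any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: these tag names are invented for this example.
catalog = ET.Element("catalog")
for title, price in [("XML Basics", "19.95"), ("Advanced XSLT", "34.50")]:
    book = ET.SubElement(catalog, "book")
    ET.SubElement(book, "title").text = title
    ET.SubElement(book, "price").text = price

# The data is plain text, so it can be stored or sent as an ordinary string.
xml_text = ET.tostring(catalog, encoding="unicode")

# Reuse: parse the text again and search it by tag name.
parsed = ET.fromstring(xml_text)
titles = [t.text for t in parsed.iter("title")]
cheap = [b.findtext("title") for b in parsed.findall("book")
         if float(b.findtext("price")) < 20]

print(xml_text)
print(titles)
print(cheap)
```

Because the structure is carried by the tags, the same text can be searched by title, filtered by price, or handed to another program without any knowledge of how it was produced.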
The hodgepodge of abbreviations and acronyms surrounding XML points up another of its characteristics: It is a family of technologies designed to be used over the Internet. Some of XML's family members and their main functions are as follows:
Extensible Stylesheet Language (XSL), which in turn comprises Extensible Stylesheet Language Transformations (XSLT) and Extensible Stylesheet Language Formatting Objects (XSL-FO). XSL is an XML-based language for expressing stylesheets. XSLT is a language for transforming XML documents into (typically) other XML documents, plain text, HTML, and WML. XSL-FO is used to format XML documents for printing.
Document Type Definition (DTD). Inherited from Standard Generalized Markup Language (SGML), the progenitor of both HTML and XML, a DTD defines the rules that constrain an XML document or a set of XML documents.
XML Schema, which may eventually supplant DTDs as the primary mechanism for constraining XML data. An XML Schema, which is itself written as an XML document, serves the same function as a DTD while correcting some of its limitations, such as the lack of data types and of namespace support.
XLink, XPath, and XPointer, which together provide the foundation for creating links and asserting relationships within and among XML documents. XLink describes how to create hyperlinks among XML documents and provides mechanisms for advanced capabilities like multidirectional linking. XPath, heavily used by XSLT, allows users to locate specific nodes or node sets within XML documents. XPointer specifies how to describe and point to locations of various kinds within an XML document. For a quick description of linking in XML, check out XML Linking: An Introduction at http://www.stg.brown.edu/~sjd/xlinkintro.html.
Namespaces, a specification that describes how to associate a URI with the element and attribute names in an XML document to ensure that they are unique, thereby eliminating ambiguity when different XML authors use the same tag name.
The Document Object Model (DOM) and the Simple API for XML (SAX). The Document Object Model, itself an Application Programming Interface, represents the structure and content of XML (and other) documents as a tree and provides an interface that allows programs and scripts to manipulate them. SAX also allows applications to get information from an XML document and do something with it, but instead of building a tree it reports the document's contents as a stream of events while the parser reads it.
XML Query (XQuery). A language for querying XML data; XML's answer to SQL.
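Several of these family members can be tried out directly. The sketch below, which assumes Python and its standard library (xml.etree.ElementTree, xml.dom.minidom, and xml.sax), reads one small document three ways: through a namespace-qualified, XPath-style lookup, through the DOM, and through SAX. The namespace URI http://example.com/ns/books, the tag names, and the TitleCollector class are hypothetical, and ElementTree supports only a subset of XPath.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET
import xml.sax

# Hypothetical document; the namespace URI only has to be unique, not resolvable.
DOC = (
    '<catalog xmlns="http://example.com/ns/books">'
    '<book id="b1"><title>XML Basics</title></book>'
    '<book id="b2"><title>Advanced XSLT</title></book>'
    '</catalog>'
)
NS = {"bk": "http://example.com/ns/books"}

# Namespaces and XPath subset: select the title of the book whose id is "b2".
root = ET.fromstring(DOC)
xpath_title = root.find(".//bk:book[@id='b2']/bk:title", NS).text

# DOM: load the whole document into a tree, then walk it.
dom = xml.dom.minidom.parseString(DOC)
dom_titles = [node.firstChild.data for node in dom.getElementsByTagName("title")]


# SAX: receive events as the parser streams through the document.
class TitleCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self.titles.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so append to the current title.
        if self._in_title:
            self.titles[-1] += content

    def endElement(self, name):
        if name == "title":
            self._in_title = False


handler = TitleCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_titles = handler.titles

print(xpath_title)
print(dom_titles)
print(sax_titles)
```

The DOM version holds the entire document in memory, which makes it convenient to navigate and modify; the SAX version keeps only what the handler chooses to remember, which suits large documents read once from start to finish.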
You can visit the World Wide Web Consortium's (W3C) web site, http://www.w3c.org, for the rundown on most of these initiatives.
For another description of what XML is and what its core related technologies are, take a look at "XML in 10 Points," at http://www.w3.org/XML/1999/XML-in-10-points.
An article called "Converting Unstructured Documents to XML," at http://xml.oreilly.com/news/xmlnut3_0301.html, demonstrates how to isolate atomic elements to reveal a document's underlying structure and convert it to XML.
The tutorials in this series proceed as follows:
An Introduction to XML
Structuring Documents in XML
Developing a Document Type Definition
Attributes and Entities in DTDs
An Introduction to XSL
Using XSLT to Separate Content from Presentation