XML Interviews Question page8
How do I write my own DTD?
You need to use the XML Declaration Syntax (very simple: declaration keywords begin with
<!ELEMENT Shopping-List (Item)+>
<!ELEMENT Item (#PCDATA)>
It says that there shall be an element called Shopping-List and that it shall contain elements called Item: there must be at least one Item (that's the plus sign) but there may be more than one. It also says that the Item element may contain only parsed character data (PCDATA, ie text: no further markup). Because there is no other element which contains Shopping-List, that element is assumed to be the ‘root’ element, which encloses everything else in the document. You can now use it to create an XML file: give your editor the declarations:
<!DOCTYPE Shopping-List SYSTEM "shoplist.dtd"> (assuming you put the DTD in that file). Now your editor will let you create files according to the pattern:
It is possible to develop complex and powerful DTDs of great subtlety, but for any significant use you should learn more about document systems analysis and document type design. See for example Developing SGML DTDs: From Text to Model to Markup (Maler and el Andaloussi, 1995): this was written for SGML but perhaps 95% of it applies to XML as well, as XML is much simpler than full SGML—see the list of restrictions which shows what has been cut out.
Incidentally, a DTD file never has a DOCTYPE Declaration in it: that only occurs in an XML document instance (it's what references the DTD). And a DTD file also never has an XML Declaration at the top either. Unfortunately there is still software around which inserts one or both of these.
Can a root element type be explicitly declared in the DTD?
No. This is done in the document's Document Type Declaration, not in the DTD.
I keep hearing about alternatives to DTDs. What's a Schema?
The W3C XML Schema recommendation provides a means of specifying formal data typing and validation of element content in terms of data types, so that document type designers can provide criteria for checking the data content of elements as well as the markup itself. Schemas are written in XML Document Syntax, like XML documents are, avoiding the need for processing software to be able to read XML Declaration Syntax (used for DTDs). There is a separate Schema FAQ at http://www.schemavalid.comFAQ. The term ‘vocabulary’ is sometimes used to refer to DTDs and Schemas together. Schemas are aimed at e-commerce, data control, and database-style applications where character data content requires validation and where stricter data control is needed than is possible with DTDs; or where strong data typing is required. They are usually unnecessary for traditional text document publishing applications.
Unlike DTDs, Schemas cannot be specified in an XML Document Type Declaration. They can be specified in a Namespace, where Schema-aware software should pick it up, but this is optional:
More commonly, you specify the Schema in your processing software, which should record separately which Schema is used by which XML document instance. In contrast to the complexity of the W3C Schema model, Relax NG is a lightweight, easy-to-use XML schema language devised by James Clark (see http://relaxng.org/) with development hosted by OASIS. It allows similar richness of expression and the use of XML as its syntax, but it provides an additional, simplified, syntax which is easier to use for those accustomed to DTDs.
How do I get XML into or out of a database?
Ask your database manufacturer: they all provide XML import and export modules to connect XML applications with databases. In some trivial cases there will be a 1:1 match between field names in the database table and element type names in the XML Schema or DTD, but in most cases some programming will be required to establish the desired match. This can usually be stored as a procedure so that subsequent uses are simply commands or calls with the relevant parameters. In less trivial, but still simple, cases, you could export by writing a report routine that formats the output as an XML document, and you could import by writing an XSLT transformation that formatted the XML data as a load file.