XML Interviews Question page12

When should I use a CDATA Marked Section? You should almost never need to use CDATA Sections. The CDATA mechanism was designed to let an author quote fragments of text containing markup characters (the open-angle-bracket and the ampersand), for example w

XML Interviews Question page12

XML Interviews Question page12

     

  1. When should I use a CDATA Marked Section?
    You should almost never need to use CDATA Sections. The CDATA mechanism was designed to let an author quote fragments of text containing markup characters (the open-angle-bracket and the ampersand), for example when documenting XML (this FAQ uses CDATA Sections quite a lot, for obvious reasons). A CDATA Section turns off markup recognition for the duration of the section (it gets turned on again only by the closing sequence of double end-square-brackets and a close-angle-bracket). Consequently, nothing in a CDATA section can ever be recognised as anything to do with markup: it's just a string of opaque characters, and if you use an XML transformation language like XSLT, any markup characters in it will get turned into their character entity equivalent.
    If you try, for example, to use:
    some text with <![CDATA[markup]]> in it. in the expectation that the embedded markup would remain untouched, it won't: it will just output some text with <em>markup</em> in it.

    In other words, CDATA Sections cannot preserve the embedded markup as markup. Normally this is exactly what you want because this technique was designed to let people do things like write documentation about markup. It was not designed to allow the passing of little chunks of (possibly invalid) unparsed HTML embedded inside your own XML through to a subsequent process—because that would risk invalidating the output.

    As a result you cannot expect to keep markup untouched simply because it looked as if it was safely ‘hidden’ inside a CDATA section: it can't be used as a magic shield to preserve HTML markup for future use as markup, only as characters.
  2. How can I handle embedded HTML in my XML?
    Apart from using CDATA Sections, there are two common occasions when people want to handle embedded HTML inside an XML element:
    1. when they have received (possibly poorly-designed) XML from somewhere else which they must find a way to handle;
    2. when they have an application which has been explicitly designed to store a string of characters containing < and & character entity references with the objective of turning them back into markup in a later process (eg FreeMind, Atom).
    Generally, you want to avoid this kind of trick, as it usually indicates that the document structure and design has been insufficiently thought out. However, there are occasions when it becomes unavoidable, so if you really need or want to use embedded HTML markup inside XML, and have it processable later as markup, there are a couple of techniques you may be able to use:
    * Provide templates for the handling of that markup in your XSLT transformation or whatever software you use which simply replicates what was there, eg
    <xsl:template match="b">
    <b>
    <xsl:apply-templates/>
    </b>
    </xsl:template/>
    * Use XSLT's ‘deep copy’ instruction, which outputs nested well-formed markup verbatim, eg
    <xsl:template match="ol">
    <xsl:copy-of select="."/>
    </xsl:template/>
    * As a last resort, use the disable-output-escaping attribute on the xsl:text element of XSL[T] which is available in some processors, eg
    <xsl:text disable-output-escaping="yes"><![CDATA[<b>Now!</b>]]></xsl:text>
    * Some processors (eg JX) are now providing their own equivalents for disabling output escaping. Their proponents claim it is ‘highly desirable’ or ‘what most people want’, but it still needs to be treated with care to prevent unwanted (possibly dangerous) arbitrary code from being passed untouched through your system. 
  3. What are the special characters in XML ?
    For normal text (not markup), there are no special characters: just make sure your document refers to the correct encoding scheme for the language and/or writing system you want to use, and that your computer correctly stores the file using that encoding scheme. See the question on non-Latin characters for a longer explanation. If your keyboard will not allow you to type the characters you want, or if you want to use characters outside the limits of the encoding scheme you have chosen, you can use a symbolic notation called ‘entity referencing’. Entity references can either be numeric, using the decimal or hexadecimal Unicode code point for the character (eg if your keyboard has no Euro symbol (€) you can type €); or they can be character, using an established name which you declare in your DTD (eg ) and then use as € in your document. If you are using a Schema, you must use the numeric form for all except the five below because Schemas have no way to make character entity declarations. If you use XML with no DTD, then these five character entities are assumed to be predeclared, and you can use them without declaring them:
    &lt;
    The less-than character (<) starts element markup (the first character of a start-tag or an end-tag).
    &amp;
    The ampersand character (>) starts entity markup (the first character of a character entity reference).
    &gt;
    The greater-than character (>) ends a start-tag or an end-tag.
    &quot;
    The double-quote character (") can be symbolised with this character entity reference when you need to embed a double-quote inside a string which is already double-quoted.

    The apostrophe or single-quote character (') can be symbolised with this character entity reference when you need to embed a single-quote or apostrophe inside a string which is already single-quoted. If you are using a DTD then you must declare all the character entities you need to use (if any), including any of the five above that you plan on using (they cease to be predeclared if you use a DTD). If you are using a Schema, you must use the numeric form for all except the five above because Schemas have no way to make character entity declarations.

Tutorials

  1. XML Interviews Question
  2. XML Interviews Question page3
  3. XML Interviews Question page9
  4. XML Interviews Question page10
  5. XML Interviews Question page12
  6. XML Interviews Question page19
  7. XML Interviews Question page21
  8. XML Interviews Question page1,xml Interviews Guide,xml Interviews
  9. XML Interviews Question page11,xml Interviews Guide,xml Interviews
  10. XML Interviews Question page13,xml Interviews Guide,xml Interviews
  11. XML Interviews Question page14,xml Interviews Guide,xml Interviews
  12. XML Interviews Question page15,xml Interviews Guide,xml Interviews
  13. XML Interviews Question page16,xml Interviews Guide,xml Interviews
  14. XML Interviews Question page17,xml Interviews Guide,xml Interviews
  15. XML Interviews Question page18,xml Interviews Guide,xml Interviews
  16. XML Interviews Question page1,xml Interviews Guide,xml Interviews
  17. XML Interviews Question page21,xml Interviews Guide,xml Interviews
  18. XML Interviews Question page22,xml Interviews Guide,xml Interviews
  19. XML Interviews Question page23,xml Interviews Guide,xml Interviews
  20. XML Interviews Question page24,xml Interviews Guide,xml Interviews
  21. XML Interviews Question page25,xml Interviews Guide,xml Interviews
  22. XML Interviews Question page26,xml Interviews Guide,xml Interviews
  23. XML Interviews Question page27,xml Interviews Guide,xml Interviews
  24. XML Interviews Question page28,xml Interviews Guide,xml Interviews
  25. XML Interviews Question page29,xml Interviews Guide,xml Interviews
  26. XML Interviews Question page30,xml Interviews Guide,xml Interviews
  27. XML Interviews Question page1,xml Interviews Guide,xml Interviews
  28. XML Interviews Question page5,xml Interviews Guide,xml Interviews
  29. XML Interviews Question page6,xml Interviews Guide,xml Interviews
  30. XML Interviews Question page7,xml Interviews Guide,xml Interviews
  31. XML Interviews Question page8,xml Interviews Guide,xml Interviews