Changes between Initial Version and Version 1 of CDATA Syntax in XML


Timestamp:
06/30/11 21:26:29 (15 years ago)
Author:
stachnik

  • CDATA Syntax in XML

Elements that contain data of type xs:string may need to be escaped so that the XML parser does not mistake characters in the string for markup. How best to do this depends on the encoding of the string.

== ASCII ==

If the string is guaranteed to contain only characters from the ASCII character set, then it needs to be escaped only if it contains the characters <, & or >. (A > is normally fine; it must be escaped only when it follows ]] in character data, but it is probably simplest to escape it anyway.) One way to escape such a string is to wrap it in a CDATA section, which has the form:

{{{
<![CDATA[some_string_goes_here]]>
}}}

This will escape any ASCII string except one that contains the substring ]]>, since the parser would read that as the closing delimiter. To escape such strings, one can perform a global search and replace:

{{{
s/]]>/]]]]><![CDATA[>/g
}}}

This splits the substring across two adjacent CDATA sections, so that neither section contains the full ]]> sequence. Conforming XML parsers strip the CDATA delimiters and pass the enclosed text through as character data, so the transformation won't affect what the parser reads in.
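
The wrapping and splitting steps above can be sketched in a few lines of Python. This is only an illustration: the page does not name an implementation language, and the function name `wrap_cdata` and the `<value>` element are hypothetical. The parse at the end uses the standard `xml.etree.ElementTree` module to check that the original string comes back unchanged.

```python
import xml.etree.ElementTree as ET

CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def wrap_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any embedded ']]>' across two sections."""
    # Same substitution as the s/]]>/]]]]><![CDATA[>/g rule above:
    # end the current section after "]]", then open a new one before ">".
    safe = text.replace(CDATA_CLOSE, "]]" + CDATA_CLOSE + CDATA_OPEN + ">")
    return CDATA_OPEN + safe + CDATA_CLOSE


if __name__ == "__main__":
    original = "if (a[b[0]]>1 && x<y) ..."
    element = "<value>" + wrap_cdata(original) + "</value>"
    print(element)
    # A conforming parser should hand back the original string.
    print(ET.fromstring(element).text == original)
```

Note that an empty string is not covered by the check above, since ElementTree reports empty element content as `None` rather than `""`.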

== Unicode ==

Unfortunately, Unicode is more complicated to handle. If we need to deal with it, we might want to consider using an external library.
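
As one illustration of the library route, here is a minimal sketch that leaves both escaping and encoding to a library instead of doing them by hand. It assumes Python and its standard `xml.etree.ElementTree` module, which is a choice made for this example, not something the page specifies. The helper name `serialize_string` and the `value` tag are hypothetical.

```python
import xml.etree.ElementTree as ET


def serialize_string(tag: str, value: str) -> bytes:
    """Return a UTF-8 encoded XML element whose text content is value."""
    elem = ET.Element(tag)
    elem.text = value  # the library escapes &, < and > when serializing
    return ET.tostring(elem, encoding="utf-8")


if __name__ == "__main__":
    original = "Zoë & Jürgen: 3 < 4, naïve ]]> café"
    data = serialize_string("value", original)
    print(data)
    # Parsing the bytes back should recover the original string.
    print(ET.fromstring(data).text == original)
```

This sketch uses entity escaping, not CDATA sections, and it does not check for characters that XML forbids outright (most control characters), so those would still need separate handling.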