From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Mon Sep 25 2000 - 09:06:01 EDT
On Mon, Sep 25, 2000 at 02:30:19PM +0200, Helge Hess wrote:
[Please subscribe to post to the list]
> Hi,
>
> is it correct behaviour that
>
> <string></string>
>
> doesn't work (CharRef: invalid xmlChar value 1) ? I would expect that I
> can encode/escape any unichar that way ?
I don't think so !!!
http://www.w3.org/TR/REC-xml#charsets explicitly says:
-----------------------------------------
A parsed entity contains text, a sequence of characters, which may
represent markup or character data. A character is an atomic unit of text
as specified by ISO/IEC 10646 [ISO/IEC 10646]. Legal characters are tab,
carriage return, line feed, and the legal graphic characters of Unicode
and ISO/IEC 10646.
-----------------------------------------
And the following production is quite precise about this:
-----------------------------------------
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] |
[#xE000-#xFFFD] | [#x10000-#x10FFFF]
/* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
-----------------------------------------
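As a rough illustration of that production, here is a minimal Python sketch of the range check. The function name is_xml_char is made up for this example and is not part of libxml or any other library; it only mirrors the four alternatives of production [2].

```python
def is_xml_char(code_point: int) -> bool:
    """Return True if code_point matches production [2] Char of XML 1.0.

    Hypothetical helper written for illustration; it is a direct
    transcription of the ranges quoted above, nothing more.
    """
    return (
        code_point in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= code_point <= 0xD7FF        # up to the surrogate blocks
        or 0xE000 <= code_point <= 0xFFFD      # excludes FFFE and FFFF
        or 0x10000 <= code_point <= 0x10FFFF   # supplementary planes
    )


if __name__ == "__main__":
    # Report a few sample code points, including the control characters
    # discussed in this thread.
    for cp in (0x1, 0x7, 0x9, 0x20, 0xFFFE):
        print(hex(cp), is_xml_char(cp))
```

Under this check, control characters below 0x20 other than tab, line feed and carriage return are rejected, whether they appear literally or through a character reference.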
And 0x7 is clearly not within that range. It means you cannot
embed binary data within XML documents without escaping (uuencode
is one of the methods which should work). Using a character reference
is not an escaping at the XML level; it is an escaping at the encoding
level, IMHO.
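To make the escaping idea concrete, here is a small Python sketch that encodes the binary payload before it goes into the document. It uses base64 rather than the uuencode mentioned above, purely because base64 is readily available in the Python standard library; the principle is the same. The helper names wrap_binary and unwrap_binary are invented for this example.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(tag: str, payload: bytes) -> str:
    """Serialize payload as base64 text inside a single XML element.

    Hypothetical helper: base64 output uses only ASCII letters, digits,
    '+', '/' and '=', all of which are legal XML characters.
    """
    elem = ET.Element(tag)
    elem.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def unwrap_binary(document: str) -> bytes:
    """Parse the element produced by wrap_binary and decode its text."""
    elem = ET.fromstring(document)
    return base64.b64decode(elem.text or "")


if __name__ == "__main__":
    raw = b"\x01\x07"            # bytes that are not legal XML characters
    doc = wrap_binary("string", raw)
    print(doc)
    print(unwrap_binary(doc) == raw)
```

The receiving side has to know that the element content is encoded and decode it after parsing; the XML parser itself only ever sees legal characters.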
Daniel
--
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe
----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail majordomo@rpmfind.net