Re: [xml] Possible XML Entity encoding bug.


From: Daniel Veillard (Daniel.Veillard@imag.fr)
Date: Wed Jan 10 2001 - 13:05:03 EST


On Wed, Jan 10, 2001 at 05:47:21PM +0000, steve woolfries wrote:
>
> Dear Sir/Madam,
>
> I am using libxml2-2.2.4 parser on SunOS that seems to convert the
> &#163; (or &#xA3;) sequence into Â£ (ie a capital A-circumflex followed
> by a pound-sterling sign) instead of simply a pound-sterling sign.
>
> Being a bit unfamiliar with libxml, it's quite possible that I'm missing
> something or doing something stupid. Any help or redirection to
> somewhere where I could get help more on my level would be much
> appreciated.

  Quite possible :-)
Libxml keeps its internal representation as sequences of UTF-8 characters.
Here is how it is represented:

orchis:~/XML -> ./xmllint tst.xml
<?xml version="1.0"?>
<tst>&#xA3;</tst>
orchis:~/XML -> ./xmllint --debug tst.xml
DOCUMENT
version=1.0
URL=tst.xml
standalone=true
  ELEMENT tst
      TEXT
        content=#C2#A3

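  The UTF-8 representation of character 163 (U+00A3) is a sequence of 2
bytes, 0xC2 and 0xA3. Displayed as ISO-8859-1, those two bytes show up as
a capital A-circumflex followed by a pound-sterling sign.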
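
  To see the same two bytes outside libxml, here is a minimal Python
sketch. It is only an illustration: the variable names are made up for
this example, and it does not call libxml at all. It encodes character 163
as UTF-8 and then decodes the result both ways.

```python
# Character 163 (U+00A3, POUND SIGN), i.e. what &#xA3; or &#163; refers to.
pound = chr(0xA3)

# UTF-8 needs two bytes for any code point in the range U+0080..U+07FF.
utf8_bytes = pound.encode("utf-8")
print(utf8_bytes.hex())  # c2a3

# Reading those bytes as ISO-8859-1 treats each byte as one character:
# 0xC2 is capital A-circumflex, 0xA3 is the pound sign.
misread = utf8_bytes.decode("latin-1")
print(misread)  # Â£

# Reading them as UTF-8 gives back the single intended character.
recovered = utf8_bytes.decode("utf-8")
print(recovered)  # £
```

  So the content is correct as UTF-8; it only looks wrong when the bytes
are displayed as if they were ISO-8859-1.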

  Please check the doc on internationalization support
    http://xmlsoft.org/encoding.html
for further information on this topic.
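
  If the application consuming the content really needs ISO-8859-1, the
UTF-8 bytes have to be converted on the way out. The sketch below shows
the idea in Python; utf8_to_latin1 is a hypothetical helper written for
this example, not a libxml function, and it assumes the input is valid
UTF-8.

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Hypothetical helper: re-encode UTF-8 bytes as ISO-8859-1.

    Raises UnicodeEncodeError if the text contains a character above
    U+00FF, since ISO-8859-1 cannot represent it.
    """
    text = data.decode("utf-8")
    return text.encode("latin-1")


# The two bytes shown in the debug output above (#C2#A3).
content = b"\xc2\xa3"
latin1 = utf8_to_latin1(content)
print(latin1.hex())  # a3
```

  Characters outside ISO-8859-1 (the euro sign, for instance) make the
helper raise an error instead of silently producing wrong bytes.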

Daniel

-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
daniel@veillard.com  | libxml Gnome XML toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail  majordomo@rpmfind.net



This archive was generated by hypermail 2b29 : Wed Jan 10 2001 - 13:43:50 EST