Re: [xml] Bad illegal char handling


From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Tue Aug 29 2000 - 14:17:05 EDT


On Tue, Aug 29, 2000 at 02:12:58PM -0300, Nilo S. Mismetti wrote:
> Sirs,
>
> Libxml ("20202") loops trying to parse a file that contains low ASCII
> characters, example:
>
> <?xml version="1.0" encoding="ISO-8859-1" ?>
> <NewHUB>
> <MFYtext>Xabdef</MFYtext>
> </NewHUB>
>
> where X is "low ASCII", for example X=0x1b.
>
> Tracing through the code, I noticed that the routine "xmlParseTryOrFinish"
> loops, looking at the char 0x1b over and over.
>
> I'm using Libxml on a Win98 machine, but I did not notice ANY dependency on
> Win32.
>
> Example attached.

  Oops, right. The given .xml file is not a well-formed XML document, and
libxml rightly complains about it:

--------------------------------
~/gnome-xml -> ./xmllint ~/newkvs.xml
/u/veillard/newkvs.xml:12: error: detected an error in element content
                <Dll>
                     SPWIN.DLL</Dll>
       ^
/u/veillard/newkvs.xml:12: error: Premature end of data in tag <Dll>
                                                                    SPWIN.DLL</Dll>
                <Seri
                <Dll>
                     SPWIN.DLL</Dll>
       ^
/u/veillard/newkvs.xml:12: error: detected an error in element content
                <Dll>
                     SPWIN.DLL</Dll>
       ^
/u/veillard/newkvs.xml:12: error: Premature end of data in tag <Configuration>
                <Dll>
                     SPWIN
                <Dll>
                     SPWIN.DLL</Dll>
       ^
/u/veillard/newkvs.xml:12: error: detected an error in element content
                <Dll>
                     SPWIN.DLL</Dll>
       ^
/u/veillard/newkvs.xml:12: error: Premature end of data in tag <PosDB>
<!-- * -->
<!-- Confi
                <Dll>
                     SPWIN.DLL</Dll>
       ^
/u/veillard/newkvs.xml:12: error: Extra content at the end of the document
                <Dll>
                     SPWIN.DLL</Dll>
       ^
~/gnome-xml ->
--------------------------------

Except in push mode, where it was looping because I forgot to add
one of the tests:
  
--------------------------------
~/gnome-xml -> ./xmllint --push ~/newkvs.xml

~/gnome-xml ->
--------------------------------
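
For comparison, here is a minimal sketch of what push-mode (incremental)
parsing is expected to do with such input: stop with an error instead of
spinning. It uses Python's standard expat binding, not libxml, and the helper
name push_parse and the chunk size are assumptions made for illustration. The
sample document is the one from the report, with X=0x1b.

```python
import xml.parsers.expat


def push_parse(data: bytes, chunk_size: int = 16):
    """Feed `data` to an incremental parser chunk by chunk.

    Returns None when the document is well-formed, otherwise the
    ExpatError raised by the parser. (Hypothetical helper; this is
    expat, not libxml's push parser.)
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        for start in range(0, len(data), chunk_size):
            # isfinal=False: more data may follow this chunk
            parser.Parse(data[start:start + chunk_size], False)
        # An empty final chunk tells the parser the input is complete
        parser.Parse(b"", True)
    except xml.parsers.expat.ExpatError as err:
        return err
    return None


bad = (b'<?xml version="1.0" encoding="ISO-8859-1" ?>\n'
       b'<NewHUB>\n'
       b'<MFYtext>\x1babdef</MFYtext>\n'
       b'</NewHUB>\n')
good = bad.replace(b"\x1b", b"X")

for name, doc in (("bad", bad), ("good", good)):
    err = push_parse(doc)
    print(name, "->", "well-formed" if err is None else f"error: {err}")
```

The point is only that every call either consumes input or reports an error,
so the loop over chunks always terminates.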

 So right, this is a bug in libxml, which I fixed (patch enclosed), but you
will still have to fix the document because it's not XML anyway. The range of
Unicode characters allowed in XML is defined in section 2.2 of the
specification:

     http://www.w3.org/TR/REC-xml#charsets

The character whose value is 0x1b is definitely not part of the accepted
set. You have every right to complain to whoever provided this "pseudo XML"
file to you; it's plain wrong :-\
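
To locate such characters before handing a file to a parser, a small check
against the Char production of XML 1.0 section 2.2 can help. This is a sketch
only: the function names is_xml_char and find_illegal_chars are made up for
illustration, and the input is assumed to be already decoded to a string.

```python
def is_xml_char(cp: int) -> bool:
    """True if code point `cp` matches the XML 1.0 Char production:
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_illegal_chars(text: str):
    """Return a list of (line, column, code point) tuples, 1-based,
    for every character in `text` that XML 1.0 does not allow."""
    hits = []
    line, col = 1, 1
    for ch in text:
        if not is_xml_char(ord(ch)):
            hits.append((line, col, ord(ch)))
        if ch == "\n":
            line += 1
            col = 1
        else:
            col += 1
    return hits


sample = ('<?xml version="1.0" encoding="ISO-8859-1" ?>\n'
          '<NewHUB>\n'
          '<MFYtext>\x1babdef</MFYtext>\n'
          '</NewHUB>\n')

for line, col, cp in find_illegal_chars(sample):
    print(f"line {line}, column {col}: illegal character 0x{cp:02x}")
# prints: line 3, column 10: illegal character 0x1b
```

Tab, line feed and carriage return are the only characters below 0x20 that
pass the check.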

   thanks for the report,

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe




This archive was generated by hypermail 2b29 : Tue Aug 29 2000 - 11:43:19 EDT