Re: [xml] libxml-2.2.3 is released.


From: Wayne Davison (wayned@blorf.net)
Date: Sun Sep 17 2000 - 16:37:27 EDT


On Sun, 17 Sep 2000, Daniel Veillard wrote:
> If there is fixes which seems to be missing, send them again

I've been expecting you to comment on (or perhaps accept) my recent patch
to fix the parsing of UTF-8 characters in HTML tag-attribute values.

Here's the patch:

Index: HTMLparser.c
@@ -1970,7 +1970,7 @@
             }
         } else {
             unsigned int c;
-            int bits;
+            int bits, l;
 
             if (out - buffer > buffer_size - 100) {
                 int index = out - buffer;
@@ -1978,7 +1978,7 @@
                 growBuffer(buffer);
                 out = &buffer[index];
             }
-            c = CUR;
+            c = CUR_CHAR(l);
             if (c < 0x80)
                     { *out++ = c; bits= -6; }
             else if (c < 0x800)

Attached is a test file that demonstrates the problem when it is run like
this:

    ./testHTML -sax test.html

Both HREF attributes contain the same high-bit character (&#145;), but the
second instance is output as "&Acirc;" instead.
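
To illustrate why the one-line change matters, here is a minimal standalone
sketch. It is not libxml code: decode_utf8 and encode_utf8 are hypothetical
helpers written for this example, and it assumes the input is already valid
UTF-8. If the input is read one byte at a time and each byte is treated as a
code point, the lead byte of a multi-byte sequence gets re-encoded on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a libxml API): decode one UTF-8 sequence
 * starting at s. Stores the number of bytes consumed in *len and returns
 * the code point. Assumes well-formed input; no validation is done. */
static unsigned int decode_utf8(const unsigned char *s, int *len)
{
    assert(s != NULL && len != NULL);
    if (s[0] < 0x80) {                      /* 1 byte: 0xxxxxxx */
        *len = 1;
        return s[0];
    }
    if ((s[0] & 0xE0) == 0xC0) {            /* 2 bytes: 110xxxxx 10xxxxxx */
        *len = 2;
        return ((unsigned int)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
    }
    if ((s[0] & 0xF0) == 0xE0) {            /* 3 bytes */
        *len = 3;
        return ((unsigned int)(s[0] & 0x0F) << 12) |
               ((unsigned int)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
    }
    *len = 4;                               /* 4 bytes */
    return ((unsigned int)(s[0] & 0x07) << 18) |
           ((unsigned int)(s[1] & 0x3F) << 12) |
           ((unsigned int)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
}

/* Hypothetical helper: encode code point c as UTF-8 into out (at least
 * 4 bytes). Returns the number of bytes written. */
static int encode_utf8(unsigned int c, unsigned char *out)
{
    assert(out != NULL);
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (c >> 18));
    out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (c & 0x3F));
    return 4;
}
```

With the bytes 0xC2 0x91 (the UTF-8 form of U+0091), decode_utf8 returns 0x91
and consumes both bytes, so encode_utf8 reproduces 0xC2 0x91. Passing the
lead byte 0xC2 to encode_utf8 as if it were a code point yields 0xC3 0x82,
which is U+00C2 (A with circumflex), consistent with the "&Acirc;" seen in
the output.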

..wayne..


----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail  majordomo@rpmfind.net

