From: Marc Sanfacon (sanm@copernic.com)
Date: Mon Oct 30 2000 - 08:31:45 EST
Hi Daniel,
I applied the patch this morning, but when I run 'testHTML' I still see
the same problem. Here is the output:
SAX.setDocumentLocator()
SAX.startDocument()
SAX.startElement(html)
SAX.startElement(body)
SAX.ignorableWhitespace(
, 2)
SAX.startElement(b)
SAX.characters(bbbbbbbbbb, 10)
SAX.endElement(b)
SAX.ignorableWhitespace( , 1)
SAX.startElement(b)
SAX.characters(ccccccccccccccc, 15)
SAX.endElement(b)
SAX.ignorableWhitespace(
, 2)
SAX.endElement(body)
SAX.endElement(html)
SAX.ignorableWhitespace(
, 2)
SAX.endDocument()
I will try to pinpoint the problem today.
Thank you,
Marc.
-----Original Message-----
From: xml-request@rufus.w3.org [mailto:xml-request@rufus.w3.org]On
Behalf Of Daniel Veillard
Sent: October 27, 2000 18:49 PM
To: xml@rpmfind.net
Subject: Re: [xml] Bug in parser (HTML)
On Fri, Oct 27, 2000 at 03:18:16PM -0700, Wayne Davison wrote:
> I don't see how that follows. Any whitespace inside a paragraph-like
> container is significant,
That's not how it's done now :-)
> with the possible exception of leading and
> trailing whitespace (which occurs at paragraph boundaries). So,
> whitespace inside of <p>, <h1>, <td>, etc. is all significant, but
> whitespace directly inside something like <table> or <body> is not.
Hum, currently no distinction is made between
+ mixed content
+ element child only
in the HTML parser. This could be added.
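To make the distinction concrete, here is a minimal sketch of how a parser might separate the two content models. It is only an illustration: the two sets and the function name are hypothetical, the element lists cover just the tags mentioned in this thread, and none of it reflects libxml's actual tables or code.

```python
# Hypothetical classification for illustration only; these sets are
# assumptions based on the elements named in this thread, not libxml data.
ELEMENT_ONLY = {"html", "body", "table"}   # children are elements only
MIXED_CONTENT = {"p", "h1", "td"}          # children may be text and elements


def whitespace_is_significant(parent: str) -> bool:
    """Return True if whitespace-only text directly inside `parent`
    should be reported as character data rather than ignored.

    Unknown elements are treated as mixed content, which is the
    conservative choice: no whitespace is silently dropped.
    """
    return parent.lower() not in ELEMENT_ONLY


print(whitespace_is_significant("p"))      # whitespace inside <p> is kept
print(whitespace_is_significant("body"))   # whitespace directly in <body> is ignorable
```

Defaulting unknown elements to mixed content is a design choice: dropping significant whitespace changes the rendering, while keeping ignorable whitespace only costs a few extra text nodes.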
> So, your "that ain't true for <b>" example confused me (but maybe I'm
> missing something). Are you saying that these are somehow different?
>
> <html><body><p>
> <b>bbbbbbbbbbbb</b>
> <b>cccccccccccc</b>
> </p></body></html>
>
> and
>
> <html><body><p>
> <b>bbbbbbbbbbbb</b> <b>cccccccccccc</b>
> </p></body></html>
I think they are different. At the rendering level, b is a text node
that may just happen to be rendered with a different font. That's
why I suggest handling spaces encountered after it as significant.
I may not be right, but the enclosed patch does this. If you prefer
something better, see areBlanks(), modify it and send me the patch :-)
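As a rough illustration of the rule described above (whitespace that follows an inline element such as b is treated as significant), here is a small sketch. It is not the real areBlanks() from libxml and not the enclosed patch; the function name, its parameters, and the list of inline elements are all assumptions made for this example.

```python
from typing import Optional

# Hypothetical list of inline (text-level) elements; an assumption for
# this sketch, not taken from libxml.
INLINE_ELEMENTS = {"b", "i", "em", "strong", "tt", "a", "font", "span"}


def are_blanks(text: str, last_child: Optional[str]) -> bool:
    """Return True if `text` should be reported as ignorable whitespace.

    `last_child` is the name of the element that was closed just before
    this text, or None if the text is the first child of its parent.
    """
    if text.strip():
        # Contains non-whitespace characters: always real character data.
        return False
    if last_child is not None and last_child.lower() in INLINE_ELEMENTS:
        # Whitespace right after an inline element separates words.
        return False
    return True


# The space between "</b>" and "<b>" is significant under this rule,
# while the newline right after "<body>" is ignorable.
print(are_blanks(" ", "b"))
print(are_blanks("\n", None))
```

With a rule like this, the single space between the two b elements in the test document would be reported through characters() instead of ignorableWhitespace().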
Daniel
--
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | libxml Gnome XML toolkit
Tel : +33 476 615 257  | 655, avenue de l'Europe | http://xmlsoft.org/
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Rpmfind search site
http://www.w3.org/People/all#veillard%40w3.org   | http://rpmfind.net/
----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail majordomo@rpmfind.net