Re: [xml] Bug in parser (HTML)


From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Fri Oct 27 2000 - 18:48:42 EDT


On Fri, Oct 27, 2000 at 03:18:16PM -0700, Wayne Davison wrote:
> I don't see how that follows. Any whitespace inside a paragraph-like
> container is significant,

  That's not how it's done now :-)

> with the possible exception of leading and
> trailing whitespace (which occurs at paragraph boundaries). So,
> whitespace inside of <p>, <h1>, <td>, etc. is all significant, but
> whitespace directly inside something like <table> or <body> is not.

  Hmm, the HTML parser currently makes no distinction between
    + mixed content
    + element children only
This could be added.
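
  A minimal sketch of what such a distinction might look like, assuming
a hand-written table of element-only containers. The table contents and
the names element_only, name_eq() and blank_is_ignorable() are
hypothetical illustrations, not existing libxml code:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, incomplete list of elements assumed to hold element
 * children only, so blanks directly inside them carry no meaning. */
static const char *const element_only[] = {
    "html", "head", "body", "table", "thead", "tbody", "tfoot",
    "tr", "ul", "ol", "dl", NULL
};

/* Case-insensitive comparison of two element names. */
static int name_eq(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Returns 1 if a whitespace-only text run directly inside `parent`
 * may be dropped, 0 if `parent` is treated as mixed content. */
static int blank_is_ignorable(const char *parent)
{
    size_t i;

    assert(parent != NULL);
    for (i = 0; element_only[i] != NULL; i++) {
        if (name_eq(parent, element_only[i]))
            return 1;
    }
    return 0;
}
```

With this table, blanks directly inside <table> or <body> would be
dropped, while blanks inside <p>, <h1> or <td> would be kept.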

> So, your "that ain't true for <b>" example confused me (but maybe I'm
> missing something). Are you saying that these are somehow different?
>
> <html><body><p>
> <b>bbbbbbbbbbbb</b>
> <b>cccccccccccc</b>
> </p></body></html>
>
> and
>
> <html><body><p>
> <b>bbbbbbbbbbbb</b> <b>cccccccccccc</b>
> </p></body></html>

  I think they are different. At the rendering level, <b> is a text node
that just happens to be rendered, possibly, with a different font. That's
why I suggest treating whitespace encountered after it as significant.
  I may not be right, but the enclosed patch does this. If you prefer
something better, see areBlanks(), modify it, and send me the patch :-)
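
  To illustrate the rule only (this is neither the enclosed patch nor
the real areBlanks() code), here is a rough sketch that keeps a blank
run when the element just closed before it is an inline one. The
inline_tags list and the function names are assumptions made for the
example, as is the choice to drop a blank run that has no preceding
sibling:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, incomplete list of inline (text-level) elements. */
static const char *const inline_tags[] = {
    "b", "i", "u", "tt", "em", "strong", "font", "a", "span", NULL
};

/* Case-insensitive comparison of two element names. */
static int tag_eq(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* prev_sibling: name of the element closed just before the blank run,
 * or NULL when the blank run is the first child of its parent.
 * Returns 1 if the blank run should be kept as significant text. */
static int blank_after_is_significant(const char *prev_sibling)
{
    size_t i;

    if (prev_sibling == NULL)
        return 0;      /* assumption: leading blanks are dropped */
    for (i = 0; inline_tags[i] != NULL; i++) {
        if (tag_eq(prev_sibling, inline_tags[i]))
            return 1;
    }
    return 0;
}
```

Under this sketch, the blank run between </b> and the next <b> is kept
in both of the quoted documents, whether it is a newline or a space.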

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | libxml Gnome XML toolkit
Tel : +33 476 615 257  | 655, avenue de l'Europe | http://xmlsoft.org/
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Rpmfind search site
 http://www.w3.org/People/all#veillard%40w3.org  | http://rpmfind.net/


----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail  majordomo@rpmfind.net



This archive was generated by hypermail 2b29 : Fri Oct 27 2000 - 19:45:57 EDT