From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Fri Oct 27 2000 - 17:09:54 EDT
On Fri, Oct 27, 2000 at 12:02:31PM -0700, Wayne Davison wrote:
>
> On Fri, 27 Oct 2000, Daniel Veillard wrote:
> > the heuristic concludes it's an ignorable white space.
>
> I think that the root of the problem is that <B> didn't trigger an implied
> <P> tag. If it had added the missing <P> tag, the space would not have
> been considered to be ignorable.
Of course that's the first thing I tried :-)
~/XML -> cat tst.html
<html><body>
<p><b>bbbbbbbbbb</b> <b>ccccccccccccccc</b>
</body></html>
~/XML -> ./testHTML tst.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>
<b>bbbbbbbbbb</b>
<b>ccccccccccccccc</b>
</p></body></html>
~/XML ->
The interesting point is that Wayne is right in the sense that
this generates a characters() SAX callback instead of
ignorableWhitespace() ...
SAX.startElement(b)
Start of element b, was p
SAX.characters(bbbbbbbbbb, 10)
Close of b stack: 4 elements
0 : html
1 : body
2 : p
3 : b
SAX.endElement(b)
End of tag b: popping out b
SAX.characters( , 1)
One can consider libxml broken there but I really do think
<html><body>
<p><a href="xxx">bbbbbbbbbb</a> <a href="yyy">ccccccccccccccc</a>
</p></body></html>
is equivalent to
<html><body><p>
<a href="xxx">bbbbbbbbbb</a>
<a href="yyy">ccccccccccccccc</a>
</p></body></html>
But that ain't true for <b>. My understanding is that this is due to
<b> actually being a text node at a semantic level. It's just an
artifact of adding style on top of structure.
So I'm inclined to fix only <b>, <em>, <strong> and the like (did I
forget one?). But if someone wants to fix this more thoroughly I will
take the patch :-)
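
To make the proposed rule concrete, here is a minimal sketch of the heuristic as described above. It is not libxml's actual code: the function name, its arguments and the element list beyond <b>, <em> and <strong> are assumptions made for illustration only.

```python
# Hypothetical sketch of the whitespace heuristic discussed above.
# Nothing here is taken from libxml; names and the element list are assumed.

# Elements that only add style to text. <b>, <em> and <strong> come from the
# message; the others are guesses at what "and the like" could cover.
INLINE_STYLE_ELEMENTS = {"b", "em", "strong", "i", "tt", "u"}


def is_ignorable_whitespace(text, previous_sibling):
    """Decide whether a text chunk may be reported as ignorable whitespace.

    text             -- the character data found between two tags
    previous_sibling -- name of the element that was just closed, or None
    """
    # Anything containing non-blank characters is real content.
    if text.strip(" \t\r\n") != "":
        return False
    # Whitespace right after a style-only element separates two words,
    # so it has to be kept.
    if previous_sibling is not None and previous_sibling.lower() in INLINE_STYLE_ELEMENTS:
        return False
    # Everything else keeps the existing behaviour: treat it as ignorable.
    return True
```

With this rule the single space after </b> in tst.html is kept, while the space between the two <a> elements is still dropped:

```python
print(is_ignorable_whitespace(" ", "b"))  # False
print(is_ignorable_whitespace(" ", "a"))  # True
```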
Daniel
-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | libxml Gnome XML toolkit
Tel : +33 476 615 257  | 655, avenue de l'Europe | http://xmlsoft.org/
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Rpmfind search site
http://www.w3.org/People/all#veillard%40w3.org  | http://rpmfind.net/
----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail majordomo@rpmfind.net