From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Fri Nov 17 2000 - 16:54:57 EST
On Fri, Nov 17, 2000 at 03:54:19PM -0500, Marc Sanfacon wrote:
>
> Oops, sorry...
>
> Just in case it doesn't work, here is the code:
>
> <center>
> <html><head><TITLE>Classifieds</TITLE>
> </head><body>
> <center>
> </center><a name=rsearch"></form></BODY></HTML><!-- END PAGE FOOTER
> --></center>
>
> One of the files contains 5 lines, no CR at the end. This one is causing
> the bug. The other ones contain 6 lines, with a CR at the end. No bug.
Okay I see,
the enclosed patch tries to clean up the mess introduced by auto-opening
body and head (related to one of the cases you sent earlier this week)
and also the following.
If you could try it for a little while and report whether it cleans things up
I would feel better. The problem is that at that point we are starting to
play heuristics at the parser level and I don't really like this. This has
the potential to clean up a number of problems but may also raise new
ones :-\, so feedback on heavy duty HTML parsing tasks would be welcome.
Daniel
--
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | libxml Gnome XML toolkit
Tel : +33 476 615 257  | 655, avenue de l'Europe | http://xmlsoft.org/
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Rpmfind search site
http://www.w3.org/People/all#veillard%40w3.org  | http://rpmfind.net/
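
For anyone wanting to send the requested feedback, the two inputs described
in the quoted report can be regenerated with a small script. This is only a
sketch and not part of the patch: it assumes that "5 lines, no CR" and
"6 lines, with a CR" refer to the same six lines of markup, differing only in
whether a final newline is present. The function names and file names are
made up for illustration.

```python
from pathlib import Path
import tempfile

# Sample markup as quoted in the report (left malformed on purpose).
SAMPLE_LINES = [
    "<center>",
    "<html><head><TITLE>Classifieds</TITLE>",
    "</head><body>",
    "<center>",
    '</center><a name=rsearch"></form></BODY></HTML><!-- END PAGE FOOTER',
    "--></center>",
]


def build_variants(lines):
    """Return (without_trailing_newline, with_trailing_newline) as bytes."""
    body = "\n".join(lines).encode("ascii")
    return body, body + b"\n"


def write_variants(directory, lines=SAMPLE_LINES):
    """Write both variants into `directory` and return their paths.

    The file names are hypothetical, chosen only for this sketch.
    """
    no_nl, with_nl = build_variants(lines)
    directory = Path(directory)
    paths = (
        directory / "no_trailing_newline.html",
        directory / "trailing_newline.html",
    )
    paths[0].write_bytes(no_nl)
    paths[1].write_bytes(with_nl)
    return paths


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp(prefix="htmlparse-")
    for path in write_variants(out_dir):
        data = path.read_bytes()
        print(path, "newlines:", data.count(b"\n"),
              "ends with newline:", data.endswith(b"\n"))
```

Each file can then be fed to the HTML parser build under test, before and
after applying the patch, to see whether only the variant without the final
newline misbehaves.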