Re: [xml] bug in libxml-1.8.10


From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Thu Sep 07 2000 - 11:51:33 EDT


On Thu, Sep 07, 2000 at 11:33:29AM -0400, Jordan Henderson wrote:
>
> Can anyone point me to the justification for making white space significant in
> XML?
>
> When I heard about this, I couldn't hardly believe it. I know this is going to
> cause a million niggling problems like this one.

 I don't have a pointer handy, but there was quite a lot of discussion
about this problem in the markup community. Here is what I understand:

  Take the following document and try to design a rule that tells you
for sure which spaces are ignorable and which ones are really the
author's intent (remember that carriage returns count as white space):
---------------------
<p>
<a></a>
<a> </a>
<a> abc </a>
<a> <b>abc</b> </a>
<a>
<b>abc</b>
</a>
<a>
   <b>abc</b>
</a>
<a><b>abc
   </b></a>
<a>
   <b>abc
   </b>
</a>
</p>
---------------------
  If you don't have a DTD to check against a content model, this is
impossible. Even with a DTD it's not always possible to tell.
  More than 10 years of work on SGML did not produce a decent
solution to this problem. As a result, all white space is significant
(and even if you validate, and are guaranteed that an element doesn't
have a mixed content model, most XML DOM parsers will still hand you
the white space).
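
  To see what this looks like from the application side, here is a
minimal sketch. It assumes Python's standard xml.dom.minidom as a
stand-in for a typical non-validating DOM parser (it is not libxml),
and it parses the last <a> element of the sample above:

```python
from xml.dom import minidom

# The last <a> element of the sample document, line breaks included.
doc = minidom.parseString("<a>\n   <b>abc\n   </b>\n</a>")
a = doc.documentElement

# Every child of <a>, including the white-space-only text nodes
# that surround <b>.
children = [(n.nodeName, n.nodeValue) for n in a.childNodes]
print(children)
# [('#text', '\n   '), ('b', None), ('#text', '\n')]

# The trailing white space inside <b> is kept as part of its text.
b_text = a.getElementsByTagName("b")[0].firstChild.data
print(repr(b_text))
# 'abc\n   '
```

  The parser cannot know whether those text nodes are indentation or
content, so it reports all of them.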
  It's better to have a specific policy enforced at the application
level than a broken policy at the parser level.
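
  As an illustration of such an application-level policy, here is a
rough sketch, again assuming Python's xml.dom.minidom rather than
libxml. The helper strip_ignorable and its element_only argument are
hypothetical names; the application, not the parser, states which
elements it knows to have element-only content:

```python
from xml.dom import minidom, Node


def strip_ignorable(node, element_only):
    """Drop white-space-only text children of the elements named in
    element_only; text inside every other element is left untouched."""
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE:
            if node.nodeName in element_only and child.data.strip() == "":
                node.removeChild(child)
        elif child.nodeType == Node.ELEMENT_NODE:
            strip_ignorable(child, element_only)


doc = minidom.parseString("<a>\n   <b>abc\n   </b>\n</a>")

# Assumption made by this application: <a> only ever contains elements,
# while <b> holds text whose white space may matter.
strip_ignorable(doc, {"a"})

result = doc.documentElement.toxml()
print(repr(result))
# '<a><b>abc\n   </b></a>'
```

  The indentation around <b> is gone, but the white space inside <b>
survives, because only the application knows it might be intended.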

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe
----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail  majordomo@rpmfind.net


