Re: [xml] Loss of whitespace

From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Sat Mar 04 2000 - 10:13:10 EST


On Sat, Mar 04, 2000 at 08:04:21AM -0600, Paul DuBois wrote:
>
> > Following the feedback received I have made version 2.x comply with
> >the XML specification. All blank spaces are passed to SAX as characters(),
> >the only exception being when doing validating parsing and the element
> >is not defined as CDATA nor mixed content. The xml:space="preserve"
> >attribute is looked for in the 2.x code, and in that case blanks which
> >could be ignored according to the DTD are passed anyway through
> >characters(), and will generate nodes in the DOM tree.
>
>
> Does this apply to newlines as well? In particular, will a text segment
> consisting only of a newline generate an XML_TEXT_NODE node? (I ask
> because of the behavior I posted a note to the list about yesterday.)

Yes, by default in version 2.0 no blank is ever ignored unless one is
using a DTD and the element is neither mixed content nor CDATA. Note also
that this version does end-of-line normalization:

~/XML -> cat tst.xml
<a>
   <b> </b>
</a>
~/XML ->

  The default behaviour:

~/XML -> ./tester --debug tst.xml
DOCUMENT
version=1.0
standalone=true
  ELEMENT a
    TEXT
    content=
    ELEMENT b
      TEXT
      content=
    TEXT
    content=

  The old behaviour using a compatibility switch:

~/XML -> ./tester --noblanks --debug tst.xml
DOCUMENT
version=1.0
standalone=true
  ELEMENT a
    ELEMENT b
      TEXT
      content=
~/XML ->

  The behaviour when a DTD declaration is present for the
parent element:

~/XML -> cat tst2.xml
<!DOCTYPE a [
<!ELEMENT a (b*)>
<!ELEMENT b EMPTY>
]>
<a>
   <b/> <b></b>
</a>
~/XML -> ./tester --debug --valid tst2.xml
DOCUMENT
version=1.0
standalone=true
  DTD(a)
    ELEMDECL(a), MIXED (b)*
    ELEMDECL(b), EMPTY
  ELEMENT a
    ELEMENT b
    ELEMENT b
~/XML ->

  And the edge case where the document is well-formed but not
valid:

~/XML -> cat tst3.xml
<!DOCTYPE a [
<!ELEMENT a (b*)>
<!ELEMENT b EMPTY>
]>
<a>
   <b/> validity error <b></b>
</a>
~/XML -> ./tester --debug tst3.xml
DOCUMENT
version=1.0
standalone=true
  DTD(a)
    ELEMDECL(a), MIXED (b)*
    ELEMDECL(b), EMPTY
  ELEMENT a
    ELEMENT b
    TEXT
    content= validity error
    ELEMENT b
~/XML ->

  Note that this version also supports xml:space:

~/XML -> cat tst4.xml
<!DOCTYPE a [
<!ELEMENT a (b*)>
<!ELEMENT b EMPTY>
]>
<a xml:space="preserve">
   <b/> validity error <b></b>
</a>
~/XML -> ./tester --debug tst4.xml
DOCUMENT
version=1.0
standalone=true
  DTD(a)
    ELEMDECL(a), MIXED (b)*
    ELEMDECL(b), EMPTY
  ELEMENT a
    ATTRIBUTE xml:space
      TEXT
      content=preserve
    TEXT
    content=
    ELEMENT b
    TEXT
    content= validity error
    ELEMENT b
    TEXT
    content=
~/XML ->
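
  The default 2.x rule shown in the transcripts above can be summarised
as a small decision function. This is only a sketch in plain C, not
libxml's code: the names is_xml_blank() and text_reaches_tree() and both
boolean parameters are made up for illustration, and the old --noblanks
heuristic is deliberately not modelled.

```c
#include <stdbool.h>

/* True when the text holds only XML white space
 * (space, tab, carriage return, line feed). */
static bool is_xml_blank(const char *text)
{
    for (; *text != '\0'; text++) {
        if (*text != ' ' && *text != '\t' &&
            *text != '\r' && *text != '\n')
            return false;
    }
    return true;
}

/* Hypothetical summary of the default 2.x behaviour:
 *   element_content_in_dtd - the DTD declares the parent element with
 *                            element-only content, e.g. (b*)
 *   space_preserve         - xml:space="preserve" is in effect
 * Returns true when the text would show up as a TEXT node. */
static bool text_reaches_tree(const char *text,
                              bool element_content_in_dtd,
                              bool space_preserve)
{
    if (!is_xml_blank(text))
        return true;   /* non-blank text is never dropped (tst3.xml) */
    if (space_preserve)
        return true;   /* xml:space="preserve" wins (tst4.xml) */
    /* Blanks are dropped only when the DTD says they are ignorable. */
    return !element_content_in_dtd;
}
```

  With these assumptions, text_reaches_tree("\n   ", false, false) is
true (tst.xml, no DTD), while text_reaches_tree("\n   ", true, false)
is false (tst2.xml).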

For version 1.x the gnome CVS tree has the default mapped to the old
behaviour and one needs to call xmlKeepBlanksDefault(1); to activate
the normal blanks handling. I will probably release 1.8.7 with this over
the week-end (or Monday).
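
On the end-of-line normalization mentioned near the top of this message:
the XML 1.0 rule is that a CR LF pair, and any CR not followed by LF,
reaches the application as a single LF. The function below is a minimal
sketch of that rule in plain C. It is not taken from the libxml sources,
and the name normalize_eol() is invented here.

```c
#include <stddef.h>

/* Rewrites buf in place so that "\r\n" and lone "\r" become "\n".
 * Returns the length of the normalized string. */
static size_t normalize_eol(char *buf)
{
    size_t r = 0;  /* read position */
    size_t w = 0;  /* write position, never ahead of r */

    while (buf[r] != '\0') {
        if (buf[r] == '\r') {
            buf[w++] = '\n';
            r++;
            if (buf[r] == '\n')
                r++;           /* swallow the LF of a CR LF pair */
        } else {
            buf[w++] = buf[r++];
        }
    }
    buf[w] = '\0';
    return w;
}
```

Applied to a writable buffer holding "a\r\nb\rc\n", this leaves
"a\nb\nc\n" in the buffer.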

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe
----
Message from the list xml@xmlsoft.org
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail  majordomo@xmlsoft.org

