Re: [xml] Progressive parsing

From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Mon Jul 26 1999 - 11:48:54 EDT


On Fri, Jul 23, 1999 at 08:00:56PM -0400, Daniel Veillard wrote:
>
> Hi Michael,
>
> > Does libxml provide any support for progressive parsing, or will it in the
> > near future? I believe this issue has arisen before, but I am unsure of
> > its current status. What I am looking for is the capability to stream in
> > XML from a socket or other "slow" medium, hand each chunk to libxml as
> > they come in, and receive SAX callbacks as the data is parsed. Currently
> > the entrypoints appear to require an entire document to parse at once, and
> > the only C or C++ library I can find with this kind of functionality is
> > expat, which is awkward for other reasons. I'm assuming others would also
> > desire such a feature, would it be unwise to begin using libxml on the
> > assumption of progressive parsing being implemented over the next few
> > months?
>
> Yes, that's definitely in my TODO list, but it has been there for some
> time now <grin/>
> I cannot promise any hard deadline on libxml coding, unfortunately,
> but I definitely support it. Switching to progressive parsing needs a bit
> of rework of the parser internals (to move some state from the call stack to
> the parser context). It's definitely in scope, and I have already worked
> toward this in the past.

  OK, I spent some time this weekend on this issue and committed a version
of libxml that supports progressive parsing. It is currently in the W3C CVS
repository at
  http://dev.w3.org/cgi-bin/cvsweb/XML/

To get it:

  CVSROOT=:pserver:anonymous@dev.w3.org:/sources/public
  export CVSROOT
  cvs login
  (when prompted for the password, enter: anonymous)
  cvs -z3 get XML

I will commit it to the GNOME CVS as soon as I have done a bit more
testing on it.
  Note that a single piece of text may now generate multiple SAX characters()
callbacks; this is inherent to having a progressive parser ...
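
To illustrate why text gets split, here is a minimal sketch in C. It is not
libxml code: toy_ctxt, toy_parse_chunk, collect_characters and demo_calls are
hypothetical names invented for this example, and the "parser" only tells
tags from text. It assumes a SAX-style handler of the form
(user_data, ch, len). The point is that the parser state lives in a context
struct rather than on the call stack, so each chunk is processed as it
arrives and any text it contains is reported at once. The application
therefore has to accumulate the pieces itself.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical handler type, modelled on a SAX-style characters callback. */
typedef void (*characters_fn)(void *user_data, const char *ch, int len);

/* Parser state is kept in a context struct, not on the call stack, so
 * parsing can stop at the end of one chunk and resume with the next. */
typedef struct {
    int in_tag;               /* 1 while between '<' and '>' */
    characters_fn characters; /* application callback */
    void *user_data;          /* passed back to the callback */
} toy_ctxt;

/* Feed one chunk. Every run of text found in the chunk is reported
 * immediately, so text spanning two chunks yields two callbacks. */
static void toy_parse_chunk(toy_ctxt *ctxt, const char *chunk, int size)
{
    int start = -1; /* index where the current text run began, -1 if none */

    for (int i = 0; i < size; i++) {
        char c = chunk[i];
        if (ctxt->in_tag) {
            if (c == '>')
                ctxt->in_tag = 0;
        } else if (c == '<') {
            if (start >= 0) {
                ctxt->characters(ctxt->user_data, chunk + start, i - start);
                start = -1;
            }
            ctxt->in_tag = 1;
        } else if (start < 0) {
            start = i;
        }
    }
    /* Flush text left at the end of the chunk; more may follow later. */
    if (start >= 0)
        ctxt->characters(ctxt->user_data, chunk + start, size - start);
}

/* Application side: append each piece to a buffer and count the calls. */
typedef struct {
    char text[256];
    size_t used;
    int calls;
} text_buf;

static void collect_characters(void *user_data, const char *ch, int len)
{
    text_buf *buf = user_data;

    buf->calls++;
    if (len <= 0 || buf->used + (size_t)len >= sizeof buf->text)
        return; /* this sketch simply drops text that does not fit */
    memcpy(buf->text + buf->used, ch, (size_t)len);
    buf->used += (size_t)len;
    buf->text[buf->used] = '\0';
}

/* Feed "<a>hello world</a>" in three chunks, as if read from a socket.
 * Copies the accumulated text to out and returns the callback count. */
int demo_calls(char *out, size_t out_size)
{
    static const char *chunks[] = { "<a>hel", "lo wor", "ld</a>" };
    text_buf buf = { "", 0, 0 };
    toy_ctxt ctxt = { 0, collect_characters, &buf };

    for (size_t i = 0; i < sizeof chunks / sizeof chunks[0]; i++)
        toy_parse_chunk(&ctxt, chunks[i], (int)strlen(chunks[i]));

    assert(out_size > buf.used);
    memcpy(out, buf.text, buf.used + 1);
    return buf.calls;
}
```

Calling demo_calls() with a 256-byte buffer delivers the single text node in
three pieces, one per chunk, and the accumulated buffer holds the complete
string. A handler that assumes one callback per text node would only see a
fragment.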

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux, WWW, rpmfind,
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | rpm2html, XML,
http://www.w3.org/People/W3Cpeople.html#Veillard | badminton, and Kaffe.
----
Message from the list xml@rufus.w3.org
Archived at : http://rufus.w3.org/veillard/XML/messages
to unsubscribe: echo "unsubscribe xml" | mail  majordomo@rufus.w3.org


This archive was generated by hypermail 2b29 : Wed Aug 02 2000 - 12:29:40 EDT