Re: [xml] streaming output


From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Fri Jun 04 1999 - 14:13:48 EDT


  Hi Havoc,

> I have the following problem: when writing out a Guppi file with an XML
> node for each value in a dataset, I exhausted 128 megs of real memory and
> 256 megs of swap (because the in-memory XML tree was huge).

  Ouch, Real World apps are coming ...

> You'll be glad to know that libxml didn't crash, even after repeated
> malloc failures, but the save wasn't successful either. :-)

  Interesting, incidentally my 2.2.5 kernel crashed last week during
a "make test" that malloc'ed all my virtual memory. Good to know that if the
kernel doesn't fail, the app stays up :-)

> Any suggestions on how to handle this? Some kind of streaming output? Does
> the SAX interface allow that?

  Actually SAX is only for reading; it's just a predefined set of callbacks
invoked by the parser, so it happens at a different point in the processing.
I understand that in your case you build an in-memory tree which grows
larger than your VM. If I understand correctly, I see only 2 ways to
try to solve the problem:
   1/ manage to split the document, either into multiple XML files or into
      multiple XML entities (loaded from the main one).
   2/ avoid building the in-memory tree on output: if the tree is built
      only to be saved, bypassing the libxml processing and writing
      directly to disk may be more realistic (sketched below).
 
  Here is a small rant against DOM: it eats too much memory, and it only
really makes sense when one actually modifies the data in memory, for example
when loading a document, manipulating its content and saving it back.

> Another issue that comes up with large files is a progress display during
> load/save. It would be nice to be able to provide a callback like:
>
> typedef int (*xmlPulseFunc) (void* user_data);
>
> where libxml would invoke the callback every second or so, and if the
> callback returned FALSE the load/save would end (to allow a Cancel button
> on the progress dialog).

  Hum, I have never used test cases large enough to "feel" the time spent
processing files. Maybe I need a demo of Guppi :-). OK, I'll try to
think about that.
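  As a strawman for the progress-callback idea, here is a sketch of how a
pulse callback of the proposed shape could be driven from an
application-level save loop. Nothing here is an existing libxml API: the
xmlPulseFunc type comes from the quoted proposal, and save_with_progress,
the once-per-second throttling and the element names are assumptions for
illustration only.

#include <stdio.h>
#include <time.h>

/* The callback type as proposed above: return FALSE (0) to cancel. */
typedef int (*xmlPulseFunc) (void* user_data);

/* Hypothetical wrapper: invoke the pulse callback about once per second
 * while streaming a dataset out; abort the save if it returns 0. */
int save_with_progress(FILE *out, const double *values, long count,
                       xmlPulseFunc pulse, void *user_data)
{
    time_t last = time(NULL);
    time_t now;
    long i;

    fputs("<?xml version=\"1.0\"?>\n<dataset>\n", out);
    for (i = 0; i < count; i++) {
        fprintf(out, "  <point>%g</point>\n", values[i]);
        now = time(NULL);
        if (pulse != NULL && difftime(now, last) >= 1.0) {
            last = now;
            if (!pulse(user_data))
                return -1;          /* user hit Cancel in the progress dialog */
        }
    }
    fputs("</dataset>\n", out);
    return 0;
}

The same pattern would work on the load side by checking the clock between
parser callbacks.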

Daniel

-- 
	    [Yes, I have moved back to France !]
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux, WWW, rpmfind,
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | rpm2html, XML,
http://www.w3.org/People/W3Cpeople.html#Veillard | badminton, and Kaffe.

