From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Fri Jun 04 1999 - 14:13:48 EDT
  Hi Havoc,
> I have the following problem: when writing out a Guppi file with an XML
> node for each value in a dataset, I exhausted 128 megs of real memory and
> 256 megs of swap (because the in-memory XML tree was huge). 
  Ouch, Real World apps are coming ...
> You'll be glad to know that libxml didn't crash, even after repeated
> malloc failures, but the save wasn't successful either. :-)
  Interesting. Incidentally, my 2.2.5 kernel crashed last week during
a "make test" that malloc'ed all my virtual memory. Good to know that if
the kernel doesn't fail, the app stays up :-)
> Any suggestions on how to handle this? Some kind of streaming output? Does
> the SAX interface allow that?
  Actually SAX is only for reading: it's just a predefined set of callbacks
from the parser. I understand that in your case you are building an
in-memory tree for output, and that tree got larger than your VM. That is
a different stage of the processing than parsing. If I understand
correctly, I see only 2 ways to try to solve the problem:
   1/ manage to split the document, either in multiple XML files or in
      multiple XML entities (loaded from the main one).
   2/ avoid building the in-memory tree on output: if the tree is built
      only to be saved, bypassing the libxml processing and writing
      directly to disk may be more realistic.
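
  Option 2/ could look roughly like the sketch below. It uses only plain
stdio, no libxml calls, and writes each value as soon as it is known, so
memory use does not grow with the dataset. The function names and the
<dataset>/<value> element layout are made up for illustration; they are
not Guppi's actual file format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write text to out, escaping the characters that are special in XML
 * content and in double-quoted attribute values.
 * Returns 0 on success, -1 on a write error. */
int stream_escaped(FILE *out, const char *text)
{
    assert(out != NULL && text != NULL);
    while (*text != '\0') {
        /* Copy the longest run that needs no escaping in one call. */
        size_t run = strcspn(text, "<>&\"");
        if (run > 0 && fwrite(text, 1, run, out) != run)
            return -1;
        text += run;
        if (*text == '\0')
            break;
        const char *ent = (*text == '<') ? "&lt;" :
                          (*text == '>') ? "&gt;" :
                          (*text == '&') ? "&amp;" : "&quot;";
        if (fputs(ent, out) == EOF)
            return -1;
        text++;
    }
    return 0;
}

/* Hypothetical layout: one <value> element per data point.
 * Nothing is kept in memory beyond the caller's own array. */
int stream_dataset(FILE *out, const char *name,
                   const double *values, size_t n)
{
    if (fputs("<?xml version=\"1.0\"?>\n<dataset name=\"", out) == EOF)
        return -1;
    if (stream_escaped(out, name) != 0)
        return -1;
    if (fputs("\">\n", out) == EOF)
        return -1;
    for (size_t i = 0; i < n; i++) {
        /* %.17g is enough digits to round-trip an IEEE double. */
        if (fprintf(out, "  <value>%.17g</value>\n", values[i]) < 0)
            return -1;
    }
    if (fputs("</dataset>\n", out) == EOF)
        return -1;
    return 0;
}
```

  The price is that the application becomes responsible for producing
well-formed output itself (escaping, matching close tags), which the tree
API otherwise takes care of.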
 
  Here is a small rant against DOM: it eats too much memory, and really
makes sense only when one actually modifies the data in memory, for example
when loading a document, manipulating its content and saving it back.
> Another issue that comes up with large files is a progress display during
> load/save. It would be nice to be able to provide a callback like:
> 
> typedef int (*xmlPulseFunc) (void* user_data);
> 
> where libxml would invoke the callback every second or so, and if the
> callback returned FALSE the load/save would end (to allow a Cancel button
> on the progress dialog).
  Hum, I never used test cases large enough to "feel" the time spent
processing files. Maybe I need a demo of Guppi :-). OK, I'll try to
think about that.
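
  For reference, here is how a callback of the shape proposed in the quoted
message might be driven from an application-side save loop. This is only a
sketch: pulse_func, save_with_pulse and count_pulse are hypothetical names,
not libxml types or functions, and the callback fires every `interval`
items rather than every second, to keep the example deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same shape as the callback proposed in the quoted message: return
 * nonzero to continue, zero to cancel. Not an existing libxml type. */
typedef int (*pulse_func)(void *user_data);

/* Write n values, calling pulse before every `interval`-th item.
 * Returns the number of values written; a result smaller than n
 * means the save was cancelled or a write failed. */
size_t save_with_pulse(FILE *out, const double *values, size_t n,
                       size_t interval, pulse_func pulse, void *user_data)
{
    size_t i;

    assert(out != NULL);
    for (i = 0; i < n; i++) {
        if (pulse != NULL && interval > 0 && i > 0 && i % interval == 0) {
            if (!pulse(user_data))
                break;              /* caller asked to cancel */
        }
        if (fprintf(out, "<value>%.17g</value>\n", values[i]) < 0)
            break;                  /* write error */
    }
    return i;
}

/* Example callback: counts pulses and cancels once a fixed limit is
 * reached, standing in for a Cancel button on a progress dialog. */
struct pulse_state {
    int calls;
    int cancel_after;
};

int count_pulse(void *user_data)
{
    struct pulse_state *st = user_data;

    st->calls++;
    return st->calls < st->cancel_after;
}
```

  A real GUI callback would typically update the progress bar and let the
toolkit process pending events before returning.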
Daniel
-- 
[Yes, I have moved back to France !]
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux, WWW, rpmfind,
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | rpm2html, XML,
http://www.w3.org/People/W3Cpeople.html#Veillard | badminton, and Kaffe.
----
Message from the list xml@rufus.w3.org
Archived at : http://rufus.w3.org/veillard/XML/messages
to unsubscribe: echo "unsubscribe xml" | mail majordomo@rufus.w3.org