Re: [xml] parsing multiple files

From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Thu Mar 30 2000 - 22:50:03 EST


On Thu, Mar 30, 2000 at 01:29:37PM -0800, Alice Tull wrote:
>
> hi,
>
> I'm using the SAX parser to parse a set of XML files.
> I used xmlCreatePushParserCtxt to create the parser
> context each time, and xmlFreeParserCtxt after
> parsing every document. This seems quite expensive.

  There is nothing really expensive in there. A couple of
structures are allocated and filled in with default values.
I think this cost should be quite reasonable for most uses
compared to actually fetching and processing the data.

> Is there a way to create the parser context only once,
> and reset it before parsing each xml file? I tried to
> use xmlClearParserCtxt instead of freeing it, but it
> seems to clear my SAXHandler and userData as well ...

  xmlInitParserCtxt() does that job of initializing a
parser context. But it will overwrite the sax field too,
and if you are not careful you may end up with serious
memory leaks by not freeing previously allocated memory.
  If you look at that function (in parser.c) you can see
which fields are reset; they all represent, in one way or
another, part of the parser state. Doing that
initialization in a separate function is possible, but I
don't recommend it: it is not guaranteed to work well in
future versions, and an inadequate initialization of one of
the input, name or space information stacks may just lead
to unreliable parsing.

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe