[xml] Lower speed with greater xmlParseChunk() chunks?

From: rolf@pointsman.de
Date: Sat Oct 14 2000 - 21:23:05 EDT


I'm using the libxml SAX Interface (without validation). I'm doing
things very close to the way shown in SAXtest.c.

In more detail: I use xmlCreatePushParserCtxt() to create the parser
(feeding in the first 4 bytes of the input, as mentioned in the
documentation and shown in SAXtest.c).

Then I use xmlParseChunk to feed in the rest of my XML data chunk by
chunk. Everything seems to work very well.

Playing around, I found one very strange behavior: parsing slows down
dramatically if the chunks of data are big.

Parsing the same medium-sized XML data (around 11 MB) each time, I got:

  Chunk size   Time
  1 MB         360 s (!)
  100 kB       28 s
  1 kB         12.8 s

From test to test I changed nothing but the buffer size within the
parsing loop. The optimum seems to be around 1 kB: chunks of 512 bytes
are as fast as 1 kB chunks, and 128-byte chunks are slightly slower.
Memory consumption seems to be the same regardless of chunk size. All
this is on Linux 2.2.13 with libc 2.1.2-24, egcs-2.91 and libxml 2-2.2.4.

Of course, nobody reads a file in 1 MB chunks. I discovered this
behavior while parsing XML data that was already in memory, coming
from elsewhere. In that situation the easiest way seems to be to feed
the whole XML string into the parser with a single xmlParseChunk()
call.

Is somebody able to reproduce this behavior?

Of course, it's easy to use a small chunk size even for in-memory XML
data. But at least a short hint in the documentation would be helpful,
if all this is true (and not a mistake on my part). Perhaps it would
be best if the parsing engine split the input into the most suitable
chunks automatically. (xmlSAXUserParseMemory() doesn't seem to do
this.)
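
For illustration, here is a rough sketch of the workaround: an
in-memory buffer is handed on in pieces of at most chunk_size bytes,
with the terminate flag set only on the last piece. The names
feed_in_chunks, chunk_feed_fn, chunk_stats and count_chunk are made up
for this example and are not part of libxml; the only libxml call
assumed is xmlParseChunk(ctxt, chunk, size, terminate), and it appears
only in a comment so that the sketch compiles without the library.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical callback type: receives one chunk of the input.
 * `terminate` is nonzero only for the final chunk. Returns 0 on success. */
typedef int (*chunk_feed_fn)(void *user, const char *chunk, int size,
                             int terminate);

/* Feed `len` bytes of `data` to `feed` in pieces of at most `chunk_size`
 * bytes. Stops at the first nonzero return value and passes it on. */
static int feed_in_chunks(const char *data, size_t len, size_t chunk_size,
                          chunk_feed_fn feed, void *user)
{
    size_t off = 0;

    assert(feed != NULL);
    if (chunk_size == 0)
        return -1;
    if (chunk_size > (size_t)INT_MAX)
        chunk_size = (size_t)INT_MAX;   /* the callback takes an int size */

    if (len == 0)
        return feed(user, data, 0, 1);  /* empty input: just terminate */

    while (off < len) {
        size_t n = len - off;
        int last, rc;

        if (n > chunk_size)
            n = chunk_size;
        last = (off + n == len);
        rc = feed(user, data + off, (int)n, last);
        if (rc != 0)
            return rc;
        off += n;
    }
    return 0;
}

/* With libxml, an adapter could look roughly like this (assumption, not
 * compiled here; `user` would be the push parser context):
 *
 *   static int feed_libxml(void *user, const char *chunk, int size,
 *                          int terminate)
 *   {
 *       return xmlParseChunk((xmlParserCtxtPtr) user, chunk, size,
 *                            terminate);
 *   }
 */

/* Demo callback that only counts what it is given. */
struct chunk_stats { int calls; size_t bytes; int terminated; };

static int count_chunk(void *user, const char *chunk, int size,
                       int terminate)
{
    struct chunk_stats *s = user;

    (void)chunk;
    s->calls++;
    s->bytes += (size_t)size;
    if (terminate)
        s->terminated++;
    return 0;
}
```

With a chunk_size of 1024 this corresponds to the fastest setting in
the measurements above; whether that value is the best one on other
systems or libxml versions is not something this sketch can tell.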

rolf
