Re: [xml] [Q] libxml-2.2.2.3 Thread Safe ?

From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Thu Sep 28 2000 - 16:37:12 EDT


On Thu, Sep 28, 2000 at 04:18:02PM -0400, Marc Sanfacon wrote:
> Hi Daniel,
> I continued my work on that today. I identified the difference
> between the two generated XML trees, one in ST and the other in MT. The
> difference I can see is that in the MT tree, some of the nodes are not
> positioned under the same node as in the ST tree. They are nested one
> level deeper in the MT trees. Not all the nodes, sometimes just one.
>
> So I found this problem in HTMLParser.c
>
> void
> htmlInitAutoClose(void) {
>     int index, i = 0;
>
>     if (htmlStartCloseIndexinitialized) return;

  Right, from a theoretical point of view there should be a synchronization
barrier here ... This code is not reentrant, and the initialization has
to take place before any further processing ...
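
  One way such a barrier could look, as a minimal sketch only: it assumes
POSIX threads are available and uses pthread_once() so that the body runs
exactly once even if several threads arrive together. The names
safeInitAutoClose, doInit and initFromThreads are hypothetical and are not
part of libxml.

```c
#include <pthread.h>

/* Hypothetical stand-in for the library's "already initialized" flag. */
static pthread_once_t initOnce = PTHREAD_ONCE_INIT;
static int initCount = 0;   /* how many times the real init body ran */

/* The non-reentrant table construction would go here. */
static void doInit(void) {
    initCount++;
}

/* Safe to call from any thread; doInit runs exactly once. */
void safeInitAutoClose(void) {
    pthread_once(&initOnce, doInit);
}

static void *worker(void *arg) {
    (void)arg;
    safeInitAutoClose();
    return NULL;
}

/* Start up to 16 threads that all try to initialize, wait for them,
 * and report how many times the init body actually executed. */
int initFromThreads(int n) {
    pthread_t tids[16];
    int i, started = 0;

    if (n > 16) n = 16;
    for (i = 0; i < n; i++) {
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0) break;
        started++;
    }
    for (i = 0; i < started; i++) pthread_join(tids[i], NULL);
    return initCount;
}
```

  Compared with a plain flag test, pthread_once() also makes late callers
wait until the first caller has finished, so nobody sees a half-built table.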

>     for (index = 0; index < 100; index++) htmlStartCloseIndex[index] = NULL;
>     index = 0;
>     while ((htmlStartClose[i] != NULL) && (index < 100 - 1)) {
>         htmlStartCloseIndex[index++] = &htmlStartClose[i];
>         while (htmlStartClose[i] != NULL) i++;
>         i++;
>     }
>
>     /* NEW LINE */
>     htmlStartCloseIndexinitialized = 1;
> }
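
  To make the effect of that loop concrete, here is a small standalone
sketch of the same indexing scheme. The table rows are invented for
illustration and are not libxml's real data; startClose, startCloseIndex
and buildIndex are hypothetical names. It assumes the layout implied by the
quoted loop: NULL-terminated groups, with one extra NULL closing the table.

```c
#include <stddef.h>

#define INDEX_SIZE 100

/* Toy table: each group ends with NULL, the table ends with an extra NULL. */
static const char *startClose[] = {
    "head",  "p", NULL,
    "title", "p", NULL,
    "li",    "p", "h1", NULL,
    NULL
};

/* One pointer per group, pointing at the group's first entry. */
static const char **startCloseIndex[INDEX_SIZE];

/* Build the index and return the number of groups found. */
int buildIndex(void) {
    int index, i = 0;

    for (index = 0; index < INDEX_SIZE; index++) startCloseIndex[index] = NULL;
    index = 0;
    while ((startClose[i] != NULL) && (index < INDEX_SIZE - 1)) {
        startCloseIndex[index++] = &startClose[i];
        while (startClose[i] != NULL) i++;   /* skip to the group terminator */
        i++;                                 /* step past it */
    }
    return index;
}
```

  The first loop clears the index while other threads may already be
reading it, which is why an unsynchronized second run of this routine is
dangerous.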

[...]

> int htmlParseDocument(htmlParserCtxtPtr ctxt) {
>     xmlDtdPtr dtd;
>
>     htmlDefaultSAXHandlerInit();
>
> is another place. Why do we need to initialize 'htmlDefaultSAXHandler' here?
> It was already initialized when instantiated.

  Well, since htmlParserCtxtPtr is a public structure, it is not
formally forbidden to enter here without having called any of the context
initialization routines.

> Am I right by saying that
> it should not (must not) be modified by any function ?

  Well, some people might prefer to change the default structure directly
so that their own SAX functions get called instead of the normal ones.

> To be thread safe,
> actually, every parsing should have its own context (which is true) and its
> own sax handler, which in some cases seems not to be true.

  Possible, but impossible to guarantee in a formal way: the libxml APIs
are too "open" to allow many assumptions. That makes writing the inner code
a bit more complex, I agree ...
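
  The "own handler per parse" idea can be sketched as follows. This is an
illustration under assumptions, not libxml's actual structures: SaxHandler,
ParserCtxt, initCtxt and the callbacks are all hypothetical, and the handler
is reduced to a single callback. Each context receives a private copy of the
default handler, so overriding a callback in one context leaves the shared
default and the other contexts untouched.

```c
#include <string.h>

typedef void (*startElementFn)(void *userData, const char *name);

/* Hypothetical, heavily reduced handler and context types. */
typedef struct {
    startElementFn startElement;
} SaxHandler;

typedef struct {
    SaxHandler sax;      /* private copy, not a pointer to the shared default */
    void *userData;
} ParserCtxt;

static void defaultStart(void *userData, const char *name) {
    (void)userData;
    (void)name;
}

/* Example override: counts elements in the int that userData points to. */
static void countingStart(void *userData, const char *name) {
    (void)name;
    ++*(int *)userData;
}

/* Shared default; only read after startup in this sketch. */
static SaxHandler defaultHandler = { defaultStart };

/* Give the context its own copy of the default handler. */
void initCtxt(ParserCtxt *ctxt) {
    memcpy(&ctxt->sax, &defaultHandler, sizeof(SaxHandler));
    ctxt->userData = NULL;
}
```

  The price is one structure copy per context; the gain is that no thread
ever writes to data another parse is reading.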

> I will continue my testing to be sure this does the trick. But for now, the
> test has been running for 1 hour without a crash.

  I think the problem will either show up early or not at all ...
unless there is a race condition occurring only in a very short sequence
of code that only a long-running test could trigger ...

> Of course, as I said, since I call the parser in ST first, some of the bugs
> might be there also. It would be a good idea to have an initialization
> routine available, so I could call it when starting the system and ensure
> all the global variables are initialized.

   Yes, this is the best way to handle this in threaded contexts.
We just need to define xmlInitParser() (as was done for cleanup with
xmlCleanupParser()) and probably an HTML counterpart.
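
   The calling convention this implies could look roughly like the sketch
below. The functions myInitParser, myCleanupParser and myParse are
hypothetical stand-ins, not the library routines themselves. The assumption
is that the application calls the init routine once from its main thread
before any worker thread is started, and the cleanup routine only after all
workers have finished.

```c
#include <stddef.h>

/* Hypothetical global state that must exist before any parsing starts. */
static int parserInitialized = 0;

/* Call once from the main thread, before starting worker threads. */
void myInitParser(void) {
    if (parserInitialized) return;
    /* building the global tables and default handlers would go here */
    parserInitialized = 1;
}

/* Call once from the main thread, after all worker threads have ended. */
void myCleanupParser(void) {
    parserInitialized = 0;
}

/* Worker-side entry point: refuses to run on uninitialized globals
 * instead of lazily (and unsafely) initializing them itself. */
int myParse(const char *doc) {
    if (doc == NULL || !parserInitialized) return -1;
    /* the actual parsing, which only reads the globals, would go here */
    return 0;
}
```

   Because the globals are written before the threads exist and only read
afterwards, the workers need no locking around them.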

   I will look at this, either tonight or during the weekend.

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe


This archive was generated by hypermail 2b29 : Thu Sep 28 2000 - 16:43:33 EDT