From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Thu Sep 28 2000 - 16:37:12 EDT
On Thu, Sep 28, 2000 at 04:18:02PM -0400, Marc Sanfacon wrote:
> Hi Daniel,
> I continued my work on that today. I identified the difference
> between the two generated XML trees, one built single-threaded (ST) and
> the other multi-threaded (MT). The difference I can see is that in the MT
> tree, some of the nodes are not positioned under the same node as in the
> ST tree. They are nested one level deeper in the MT trees. Not all the
> nodes, sometimes just one.
>
> So I found this problem in HTMLparser.c
>
> void
> htmlInitAutoClose(void) {
>     int index, i = 0;
>
>     if (htmlStartCloseIndexinitialized) return;
Right, from a theoretical point of view there should be a synchronization
barrier here ... This code is not reentrant and the initialization has
to take place before any further processing ...
>     for (index = 0; index < 100; index++) htmlStartCloseIndex[index] = NULL;
>     index = 0;
>     while ((htmlStartClose[i] != NULL) && (index < 100 - 1)) {
>         htmlStartCloseIndex[index++] = &htmlStartClose[i];
>         while (htmlStartClose[i] != NULL) i++;
>         i++;
>     }
>
>     /* NEW LINE */
>     htmlStartCloseIndexinitialized = 1;
> }
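A minimal sketch of the kind of synchronization barrier meant above,
assuming POSIX threads are available. It uses pthread_once() so that the
index is built exactly once even if several threads arrive together. All
demo_* names and the small table are hypothetical stand-ins, not the real
libxml data or code:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for htmlStartClose: NULL-terminated groups,
 * with one extra NULL marking the end of the table. */
static const char *demo_start_close[] = {
    "form", "form", "p", NULL,
    "head", "p", NULL,
    NULL
};

#define DEMO_INDEX_SIZE 100
static const char **demo_index[DEMO_INDEX_SIZE];
static pthread_once_t demo_index_once = PTHREAD_ONCE_INIT;

/* Same walk as in the quoted function: record the start of each group. */
static void demo_build_index(void) {
    int index, i = 0;

    for (index = 0; index < DEMO_INDEX_SIZE; index++) demo_index[index] = NULL;
    index = 0;
    while ((demo_start_close[i] != NULL) && (index < DEMO_INDEX_SIZE - 1)) {
        demo_index[index++] = &demo_start_close[i];
        while (demo_start_close[i] != NULL) i++;
        i++;
    }
}

/* Safe to call from any thread: pthread_once() runs demo_build_index()
 * once and makes later callers wait until it has completed. */
void demo_init_auto_close(void) {
    pthread_once(&demo_index_once, demo_build_index);
}
```

Compared with a plain "initialized" flag, this closes the window where a
second thread sees a partially built index, at the cost of depending on a
thread library.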
[...]
> int htmlParseDocument(htmlParserCtxtPtr ctxt) {
>     xmlDtdPtr dtd;
>
>     htmlDefaultSAXHandlerInit();
>
> This is another place. Why do we need to initialize 'htmlDefaultSAXHandler'
> here? It was already initialized when instantiated.
Well, since htmlParserCtxtPtr is a public structure, it is not
formally forbidden to enter here without having called any of the context
initialization routines.
> Am I right in saying that
> it should not (must not) be modified by any function?
Well, some people might prefer to modify the default structure directly,
so that their own SAX functions are called instead of the normal ones.
> To be thread safe,
> actually, every parse should have its own context (which is true) and its
> own SAX handler, which in some cases seems not to be true.
Possible, but impossible to guarantee in a formal way: libxml APIs
are too "open" to allow many assumptions. That makes writing the inner code
a bit more complex, I agree ...
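To illustrate the "own context, own SAX handler" idea, here is a small
sketch in which each context takes a private copy of a default handler at
initialization time. The types and demo_* names are hypothetical and much
reduced; the real handler structure has many more callbacks:

```c
#include <string.h>

/* Hypothetical two-callback handler, for illustration only. */
typedef struct {
    void (*startElement)(void *ctx, const char *name);
    void (*endElement)(void *ctx, const char *name);
} DemoSAXHandler;

typedef struct {
    DemoSAXHandler sax;   /* private copy, owned by this context */
    int depth;            /* per-parse state */
} DemoParserCtxt;

static void demo_start(void *ctx, const char *name) {
    (void)name;
    ((DemoParserCtxt *)ctx)->depth++;
}

static void demo_end(void *ctx, const char *name) {
    (void)name;
    ((DemoParserCtxt *)ctx)->depth--;
}

/* The shared default is const, so no caller can modify it in place. */
static const DemoSAXHandler demo_default_sax = { demo_start, demo_end };

/* Each context gets its own handler by value, not a pointer to the default. */
void demo_init_ctxt(DemoParserCtxt *ctxt) {
    memset(ctxt, 0, sizeof(*ctxt));
    ctxt->sax = demo_default_sax;
}
```

With this layout a caller can still replace callbacks, but only in its own
context, so one thread's customization cannot leak into another parse.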
> I will continue my testing to be sure this does the trick. But for now, the
> test has been running for 1 hour without a crash.
I think the problem will either show up early or not at all ...
unless there is a race condition occurring only in a very short sequence
of code that only a long-running test could trigger ...
> Of course, as I said, since I call the parser in ST first, some of the bugs
> might be there also. It would be a good idea to have an initialization
> routine available, so I could call it when starting the system and ensure
> all the global variables are initialized.
Yes, this is the best way to handle this in threaded contexts.
We just need to define xmlInitParser() (as was done for cleanup with
xmlCleanupParser()) and probably an HTML counterpart.
I will look at this, either tonight or during this weekend.
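As a rough sketch of the intended usage, assuming POSIX threads: the
application calls one global init routine while it is still
single-threaded, and the workers then only read the globals.
demo_init_parser() is a hypothetical stand-in for the proposed
xmlInitParser(), and the flag stands in for the library's global tables:

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 16

static int demo_globals_ready = 0;   /* stand-in for the global tables */

/* Hypothetical init entry point; meant to be called before any thread starts. */
void demo_init_parser(void) {
    if (demo_globals_ready) return;
    /* ... build lookup tables, default handlers, etc. ... */
    demo_globals_ready = 1;
}

/* Worker: only reads the globals, never initializes them. */
static void *demo_worker(void *arg) {
    int *saw_ready = arg;
    *saw_ready = demo_globals_ready;
    return NULL;
}

/* Returns how many workers observed fully initialized globals. */
int demo_run_workers(int nthreads) {
    pthread_t tids[DEMO_MAX_THREADS];
    int seen[DEMO_MAX_THREADS];
    int started = 0, count = 0, i;

    if (nthreads > DEMO_MAX_THREADS) nthreads = DEMO_MAX_THREADS;

    demo_init_parser();   /* still single-threaded at this point */

    for (i = 0; i < nthreads; i++) {
        seen[i] = 0;
        if (pthread_create(&tids[i], NULL, demo_worker, &seen[i]) != 0) break;
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        count += seen[i];
    }
    return count;
}
```

Because everything written before pthread_create() is visible to the new
thread, no locking is needed inside the workers in this arrangement.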
Daniel
--
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe
----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail majordomo@rpmfind.net