[xml] Conditional Sections and the X3D DTD


From: Jonathan P Springer (jonathan.springer2@gte.net)
Date: Sun Nov 12 2000 - 12:01:40 EST


I've been fiddling with using libxml to parse Web3D X3D files. I'd had no
problems parsing them until I turned on validation, at which point all heck
broke loose, primarily around conditional sections.

Basically, the X3D DTD (see http://www.web3d.org/TaskGroups/x3d/translation/x3d-compromise.dtd)
uses several parameter entities to define which sections of the X3D
specification should be included and which should be excluded. The idea is
that the DOCTYPE in the X3D file can be used to switch certain features of
the specification on and off.

OK, so once Daniel put in the Conditional Section coding (was that version
2.2.3?), I thought I'd be home free. Think again...

Running my test file (see http://members.theglobe.com/springjp/try2.txt )
yielded the following warning:

--> warning: PEReference: %ChildrenNodes; not found
--> <!ENTITY % SceneNodes " ( %ChildrenNodes; | %WildcardNodes; )*, ROUTE* "

I did a little investigation in the DTD file and found out that what was
happening was this:

--> <!ENTITY % CoreProfile "IGNORE">
...
--> <![%CoreProfile;[
--> <!ENTITY % ChildrenNodes " %BehaviorLeafNodes; | %BindableNodes; | %GroupingNodes; | %SceneLeafNodes; " >
--> <!ENTITY % SceneNodes " ( %ChildrenNodes; | %WildcardNodes; )*, ROUTE* " >
--> ]]>

A bit of nosing around in parser.c showed that although the results of
parsing weren't being applied for the IGNOREd section, the parameter entity
references inside it (in this case %ChildrenNodes;) were still being passed
through the parser.

Next stop, the Gospel according to XML 1.0 Second Edition. The grammar spec
for an IGNOREd section is:

--> [61] conditionalSect ::= includeSect | ignoreSect
--> [62] includeSect ::= '<![' S? 'INCLUDE' S? '[' extSubsetDecl ']]>'
--> [63] ignoreSect ::= '<![' S? 'IGNORE' S? '[' ignoreSectContents* ']]>'
--> [64] ignoreSectContents ::= Ignore ('<![' ignoreSectContents ']]>' Ignore)*
--> [65] Ignore ::= Char* - (Char* ('<![' | ']]>') Char*)

Or, in English, take anything that comes in the section, so long as the
'<!['s and ']]>'s balance.
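
To make the balancing rule concrete, here is a minimal standalone sketch in C. It is not libxml code: skip_ignore_section is a hypothetical helper name, and it assumes the whole DTD text is already in one NUL-terminated buffer, with the pointer positioned just after the '[' that follows the IGNORE keyword. It treats everything else as opaque characters, so a '%name;' inside the section is never looked up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxml): p points just past the '['
 * that opens the contents of an IGNORE section. Returns a pointer just
 * past the matching ']]>', or NULL if the input ends first.
 */
const char *skip_ignore_section(const char *p)
{
    int depth = 0;                  /* nested '<![' opens still unclosed */

    assert(p != NULL);
    while (*p != '\0') {
        if (strncmp(p, "<![", 3) == 0) {
            depth++;                /* a nested conditional section opens */
            p += 3;
        } else if (strncmp(p, "]]>", 3) == 0) {
            p += 3;
            if (depth == 0)
                return p;           /* this ']]>' closes the outer section */
            depth--;                /* it closed a nested section instead */
        } else {
            p++;                    /* any other character is just skipped */
        }
    }
    return NULL;                    /* ran out of input: unbalanced section */
}
```

Returning NULL on end of input is a choice made for this sketch so that an unterminated section cannot loop forever; how a real parser should report that case is a separate question.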

If you've had the spare time to stick with me through this, I've attached my
recommended patches (as of v2.2.4) below. In essence, I've added a state for
IGNORE sections and changed the code in xmlParseConditionalSections to count
opens and closes. I hope that will be enough; additional testing or
alternative approaches would be welcome.
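
As a rough illustration of the "state for IGNORE sections" half of the change, the following toy model shows the idea in isolation. Everything in it is an assumption made for illustration: the toy_ names, the two-value state enum, and the counter are stand-ins and do not correspond to real libxml types or functions.

```c
#include <stdio.h>

/* Toy stand-in for the parser input state; not the libxml enum. */
typedef enum { TOY_STATE_DTD, TOY_STATE_IGNORE } toy_state;

typedef struct {
    toy_state instate;      /* what the parser is currently inside */
    int pe_lookups;         /* how many '%name;' references were resolved */
} toy_ctxt;

/* Called whenever the scanner meets a '%name;' reference. */
void toy_handle_pe_reference(toy_ctxt *ctxt, const char *name)
{
    if (ctxt->instate == TOY_STATE_IGNORE)
        return;             /* inside an IGNOREd section: do nothing */
    ctxt->pe_lookups++;
    printf("resolving %%%s;\n", name);
}

/* Save the state, mark the section as ignored, then restore the state. */
void toy_visit_ignored_reference(toy_ctxt *ctxt, const char *name)
{
    toy_state saved = ctxt->instate;

    ctxt->instate = TOY_STATE_IGNORE;
    toy_handle_pe_reference(ctxt, name);    /* returns early, no lookup */
    ctxt->instate = saved;
}
```

The point of the guard is that an undeclared entity such as ChildrenNodes can no longer trigger a "not found" warning while the surrounding section is being ignored.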

I also welcome other input or recommendations.

Regards,
-jonathan springer

----

diff -cr old-libxml2-2.2.4/parser.c libxml2-2.2.4/parser.c
*** old-libxml2-2.2.4/parser.c	Sun Oct  1 16:29:52 2000
--- libxml2-2.2.4/parser.c	Sun Nov  5 09:50:26 2000
***************
*** 647,652 ****
--- 647,655 ----
  	     */
  	    if ((ctxt->external == 0) && (ctxt->inputNr == 1))
  		return;
+ 	    break;
+ 	case XML_PARSER_IGNORE:
+ 	    return;
      }
  
      NEXT;
***************
*** 4469,4474 ****
--- 4472,4479 ----
      } else if ((RAW == 'I') && (NXT(1) == 'G') && (NXT(2) == 'N') &&
              (NXT(3) == 'O') && (NXT(4) == 'R') && (NXT(5) == 'E')) {
  	int state;
+ 	int instate;
+ 	int depth = 0;
  
  	SKIP(6);
  	SKIP_BLANKS;
***************
*** 4494,4533 ****
  	 * But disable SAX event generating DTD building in the meantime
  	 */
  	state = ctxt->disableSAX;
  	ctxt->disableSAX = 1;
! 	while ((RAW != 0) && ((RAW != ']') || (NXT(1) != ']') ||
! 	       (NXT(2) != '>'))) {
! 	    const xmlChar *check = CUR_PTR;
! 	    int cons = ctxt->input->consumed;
! 	    int tok = ctxt->token;
! 
! 	    if ((RAW == '<') && (NXT(1) == '!') && (NXT(2) == '[')) {
! 		xmlParseConditionalSections(ctxt);
! 	    } else if (IS_BLANK(CUR)) {
! 		NEXT;
! 	    } else if (RAW == '%') {
! 		xmlParsePEReference(ctxt);
! 	    } else
! 		xmlParseMarkupDecl(ctxt);
! 
! 	    /*
! 	     * Pop-up of finished entities.
! 	     */
! 	    while ((RAW == 0) && (ctxt->inputNr > 1))
! 		xmlPopInput(ctxt);
! 	    if ((CUR_PTR == check) && (cons == ctxt->input->consumed) &&
! 		(tok == ctxt->token)) {
! 		ctxt->errNo = XML_ERR_EXT_SUBSET_NOT_FINISHED;
! 		if ((ctxt->sax != NULL) && (ctxt->sax->error != NULL))
! 		    ctxt->sax->error(ctxt->userData,
! 			"Content error in the external subset\n");
! 		ctxt->wellFormed = 0;
! 		ctxt->disableSAX = 1;
! 		break;
! 	    }
  	}
  	ctxt->disableSAX = state;
  
          if (xmlParserDebugEntities) {
  	    if ((ctxt->input != NULL) && (ctxt->input->filename))
  		fprintf(stderr, "%s(%d): ", ctxt->input->filename,
--- 4499,4525 ----
  	 * But disable SAX event generating DTD building in the meantime
  	 */
  	state = ctxt->disableSAX;
+ 	instate = ctxt->instate;
  	ctxt->disableSAX = 1;
! 	ctxt->instate = XML_PARSER_IGNORE;
! 	while (depth >= 0) {
! 	  if ((RAW == '<') && (NXT(1) == '!') && (NXT(2) == '[')) {
! 	    depth++;
! 	    SKIP(3);
! 	    continue;
! 	  }
! 	  if ((RAW == ']') && (NXT(1) == ']') && (NXT(2) == '>')) {
! 	    if (--depth >= 0) SKIP(3);
! 	    continue;
! 	  }
! 	  NEXT;
! 	  continue;
  	}
+ 
  	ctxt->disableSAX = state;
+ 	ctxt->instate = instate;
+ 
  
          if (xmlParserDebugEntities) {
  	    if ((ctxt->input != NULL) && (ctxt->input->filename))
  		fprintf(stderr, "%s(%d): ", ctxt->input->filename,
***************
*** 7314,7319 ****
--- 7306,7313 ----
  		    fprintf(stderr, "PP: try EPILOG\n");break;
  	        case XML_PARSER_PI:
  		    fprintf(stderr, "PP: try PI\n");break;
+ 	        case XML_PARSER_IGNORE:
+ 		    fprintf(stderr, "PP: try IGNORE\n");break;
  	}
  #endif
  
***************
*** 8009,8014 ****
--- 8003,8015 ----
  		fprintf(stderr, "PP: entering START_TAG\n");
  #endif
  		break;
+ 	    case XML_PARSER_IGNORE:
+ 		fprintf(stderr, "PP: internal error, state == IGNORE");
+ 		ctxt->instate = XML_PARSER_DTD;
+ #ifdef DEBUG_PUSH
+ 		fprintf(stderr, "PP: entering DTD\n");
+ #endif
+ 		break;
  	}
      }
  done:
diff -cr old-libxml2-2.2.4/parser.h libxml2-2.2.4/parser.h
*** old-libxml2-2.2.4/parser.h	Sun Oct  1 16:29:52 2000
--- libxml2-2.2.4/parser.h	Sun Oct 22 11:21:22 2000
***************
*** 99,105 ****
      XML_PARSER_ENTITY_VALUE,	/* within an entity value in a decl */
      XML_PARSER_ATTRIBUTE_VALUE,	/* within an attribute value */
      XML_PARSER_SYSTEM_LITERAL,	/* within a SYSTEM value */
!     XML_PARSER_EPILOG		/* the Misc* after the last end tag */
  } xmlParserInputState;
  
  /**
--- 99,106 ----
      XML_PARSER_ENTITY_VALUE,	/* within an entity value in a decl */
      XML_PARSER_ATTRIBUTE_VALUE,	/* within an attribute value */
      XML_PARSER_SYSTEM_LITERAL,	/* within a SYSTEM value */
!     XML_PARSER_EPILOG,		/* the Misc* after the last end tag */
!     XML_PARSER_IGNORE		/* within an IGNORED section */
  } xmlParserInputState;
  
  /**
diff -cr old-libxml2-2.2.4/parserInternals.c libxml2-2.2.4/parserInternals.c
*** old-libxml2-2.2.4/parserInternals.c	Tue Sep 19 08:25:59 2000
--- libxml2-2.2.4/parserInternals.c	Sun Oct 22 17:43:18 2000
***************
*** 3156,3161 ****
--- 3156,3163 ----
          case XML_PARSER_ATTRIBUTE_VALUE:
  	    /* ctxt->token = xmlParseCharRef(ctxt); */
  	    return;
+         case XML_PARSER_IGNORE:
+ 	    return;
      }
      return;
  }
***************
*** 3227,3232 ****
--- 3229,3236 ----
  		       "Entity references are forbiden in DTDs!\n");
  	    ctxt->wellFormed = 0;
  	    ctxt->disableSAX = 1;
+ 	    return;
+         case XML_PARSER_IGNORE:
  	    return;
      }

---- Message from the list xml@rpmfind.net Archived at : http://xmlsoft.org/messages/ to unsubscribe: echo "unsubscribe xml" | mail majordomo@rpmfind.net



This archive was generated by hypermail 2b29 : Sun Nov 12 2000 - 12:44:07 EST