Re[4]: [xml] Encoding Problems with libxml 2.2.2


From: Stefan Bambach (bambach@triplex.de)
Date: Wed Aug 30 2000 - 09:23:02 EDT


Hello Daniel,

Wednesday, August 30, 2000, 12:10:19 PM, you wrote:

DV> On Wed, Aug 30, 2000 at 11:31:47AM +0200, Stefan Bambach wrote:
>> DV> Or EUC-JP if you're Japanese or ISO-8859-2 if you're Russian, etc ...
>> DV> Yes, libxml is now consistent regardless of the kind of input.
>> I have to store some data from the XML file in MySQL. It's enough to
>> store it as ISO-8859-1, because I know that the system is intended for
>> German users only :-) So I don't need all the features.

DV> Just to give you an example, I'm French and we are supposed to use
DV> ISO-8859-1, except it lacks the o-e ligature œ character. You're
DV> lucky if all your data can be expressed with ISO-8859-1 :-)
Yes I am :-)

>> Yesterday, I used the UTF8toisolat1() function to do the job for me (I
>> have to convert each value I read from the DOM tree myself). Is there a
>> function like xmlNodeListGetString() with an additional parameter, the
>> encoding name, so that I get the string in the encoding I need?

DV> Hum ... it's not that simple. This would work well with ISO-8859-1
DV> but a lot of character encoders need to maintain a state, which means
DV> that if we fall back to iconv we should try to keep the same encoder
DV> and not open/convert/close for each operation (iconv potentially
DV> ends up looking for a shared lib and loading it when requesting a new
DV> encoder, this may become very heavy, very fast). So the function should
DV> rather have an xmlCharEncodingHandlerPtr as the second argument.
DV> I will look into this,
Fine.
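
To make the ISO-8859-1 case concrete: that target needs no encoder state, so a plain per-call conversion is workable. The following is only a standalone sketch in C; utf8_to_latin1() is a hypothetical helper written for this example, not libxml's UTF8toisolat1() and not the handler-based function discussed above. It rejects anything above U+00FF, which is exactly what happens to the œ ligature mentioned earlier.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a libxml function): convert UTF-8 to ISO-8859-1.
 * Returns the number of bytes written to out, or -1 if the input is
 * malformed, contains a character above U+00FF, or out is too small.
 */
long utf8_to_latin1(const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t outlen)
{
    size_t i = 0, o = 0;

    while (i < inlen) {
        unsigned char c = in[i];
        unsigned char ch;

        if (c < 0x80) {
            ch = c;                 /* plain ASCII, copied as-is */
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < inlen &&
                   (in[i + 1] & 0xC0) == 0x80) {
            /* two-byte sequence encoding U+0080..U+00FF */
            ch = (unsigned char)(((c & 0x03) << 6) | (in[i + 1] & 0x3F));
            i += 2;
        } else {
            return -1;              /* malformed or not representable */
        }
        if (o >= outlen)
            return -1;              /* output buffer too small */
        out[o++] = ch;
    }
    return (long)o;
}
```

A stateful target encoding could not be handled this way, which is why passing an already opened handler, as suggested above, looks like the better general interface.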

Are there similar functions for reading element names, element
content, attribute names, attribute values, and so on? Most of these
I currently get by reading struct fields directly, for example
'node->name'. That is not a nice way to write such critical code :-(

A personal question:
Where are you from?
You answer my questions very quickly. Is libxml your day job,
or do you work on it purely for pleasure?

Here are some other changes I made in my copy of libxml 2.2.2.

Please review them if you would like to put them into your CVS!

=====================================================================
1.) MEMORY LEAK:
        There is a memory leak if the line read is empty (strlen(buf) == 0):
        xmlMemStrdup(buf) still allocates at least 1 byte for the '\0',
        and that byte is never freed.
        Please check whether this is right. I found this error some time
        ago while checking for memory leaks in an older version of libxml,
        and it is still there.
        
        nanohttp.c: xmlNanoHTTPReadLine(), lines 463-466
        ------------------------------------------------

            if(*bp == '\n') {
                *bp = 0;
                return(xmlMemStrdup(buf));
            }

        REPLACED BY:

            if(*bp == '\n') {
                if (bp == buf)
                    return(NULL);
                else
                    *bp = 0;
                return(xmlMemStrdup(buf));
            }
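
        The idea behind the change can be tried outside libxml. The
        following is only a standalone sketch: read_line_dup() is a
        hypothetical stand-in written for this example, it uses
        malloc() instead of xmlMemStrdup(), and it reads from a string
        rather than a socket. It returns NULL for an empty line, so no
        1-byte string is handed to a caller who might forget to free it.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a line reader: copy one line from *cursor
 * into a fresh heap string and advance *cursor past the '\n'.
 * Returns NULL for an empty line or when no complete line is left,
 * so no 1-byte "" allocation is handed to the caller.
 */
char *read_line_dup(const char **cursor)
{
    const char *start = *cursor;
    const char *nl = strchr(start, '\n');
    size_t len;
    char *copy;

    if (nl == NULL)
        return NULL;            /* no complete line left */
    *cursor = nl + 1;           /* consume the line, including '\n' */
    len = (size_t)(nl - start);
    if (len == 0)
        return NULL;            /* empty line: nothing allocated */
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;            /* out of memory */
    memcpy(copy, start, len);
    copy[len] = '\0';
    return copy;
}
```

        The trade-off is that NULL no longer distinguishes an empty
        line from a failure. If the caller needs that distinction, the
        alternative is to keep returning "" and free it on the caller's
        side.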

=====================================================================
2.) ERROR HANDLING:
        I need better error handling, in which all errors and warnings,
        including validation errors and warnings, are counted per document.
        
        tree.c: xmlNewDoc() (518 ff)
        ----------------------------
        INSERTED AT:
            cur->verr = 0;   /* Validator Error Counter */
            cur->vwarn = 0;  /* Validator Warning Counter */
            cur->perr = 0;   /* Parser Error Counter */
            cur->pwarn = 0;  /* Parser Warning Counter */

        tree.h: struct _xmlDoc (354 ff)
        -------------------------------
        INSERTED AT:
            int verr;   /* Validator Error Counter */
            int vwarn;  /* Validator Warning Counter */
            int perr;   /* Parser Error Counter */
            int pwarn;  /* Parser Warning Counter */

        error.c: xmlParserError()
        error.c: xmlParserWarning()
        error.c: xmlParserValidityError()
        error.c: xmlParserValidityWarning()
        ----------------------------------

        ADDED LINE "ctxt->myDoc->...++; /* increase Error or Warning Counter */":
        void
        xmlParserError(void *ctx, const char *msg, ...)
        {
            xmlParserCtxtPtr ctxt = (xmlParserCtxtPtr) ctx;
            xmlParserInputPtr input;
            xmlParserInputPtr cur = NULL;
            va_list args;
        
                if (ctxt->myDoc!=NULL)
                        ctxt->myDoc->perr++; /* increase Parser Error Counter */
        
                ...
        
        void
        xmlParserWarning(void *ctx, const char *msg, ...)
        {
            xmlParserCtxtPtr ctxt = (xmlParserCtxtPtr) ctx;
            xmlParserInputPtr input;
            xmlParserInputPtr cur = NULL;
            va_list args;
        
                if (ctxt->myDoc!=NULL)
                        ctxt->myDoc->pwarn++; /* increase Parser Warning Counter */

                ...
                
        void
        xmlParserValidityError(void *ctx, const char *msg, ...)
        {
            xmlParserCtxtPtr ctxt = (xmlParserCtxtPtr) ctx;
            xmlParserInputPtr input;
            va_list args;
        
                if (ctxt->myDoc!=NULL)
                        ctxt->myDoc->verr++; /* increase Validator Error Counter */

                ...
                
           
        void
        xmlParserValidityWarning(void *ctx, const char *msg, ...)
        {
            xmlParserCtxtPtr ctxt = (xmlParserCtxtPtr) ctx;
            xmlParserInputPtr input;
            va_list args;
        
                if (ctxt->myDoc!=NULL)
                        ctxt->myDoc->vwarn++; /* increase Validator Warning Counter */

                ...
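
        The same counting could perhaps also be done on the
        application side, without touching the library. The following
        is only a standalone sketch under that assumption:
        struct diag_counters, diag_report(), count_parser_error() and
        count_validity_warning() are hypothetical names for this
        example. The sketch relies only on callbacks having the
        (void *ctx, const char *msg, ...) shape shown above, and it
        assumes the caller can choose what ctx points to.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical per-document counters, mirroring perr/pwarn/verr/vwarn. */
struct diag_counters {
    int perr, pwarn, verr, vwarn;
};

/* Shared helper: bump one counter (if any), then print the message. */
static void diag_report(int *counter, const char *prefix,
                        const char *msg, va_list args)
{
    if (counter != NULL)
        (*counter)++;
    fputs(prefix, stderr);
    vfprintf(stderr, msg, args);
}

/* Counts a parser error; ctx is assumed to be a struct diag_counters *. */
void count_parser_error(void *ctx, const char *msg, ...)
{
    struct diag_counters *c = ctx;
    va_list args;

    va_start(args, msg);
    diag_report(c != NULL ? &c->perr : NULL, "error: ", msg, args);
    va_end(args);
}

/* Counts a validity warning; same assumption about ctx. */
void count_validity_warning(void *ctx, const char *msg, ...)
{
    struct diag_counters *c = ctx;
    va_list args;

    va_start(args, msg);
    diag_report(c != NULL ? &c->vwarn : NULL, "validity warning: ", msg, args);
    va_end(args);
}
```

        Handlers for parser warnings and validity errors would follow
        the same pattern. Unlike the patch above, this keeps the
        counters out of struct _xmlDoc.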

=====================================================================
3.) IGNORE COMMENTS IN DOM TREE:
        I don't need comments in my DOM tree, so I added a global
        variable that switches the comment handler off.
        
        parser.c:
        ---------
        
        ADDED AT line 51:
            int xmlGetCommentsDefaultValue = 1;

        parser.h:
        ---------
        
        ADDED AT line 307:
            extern int xmlGetCommentsDefaultValue;

        SAX.c: xmlDefaultSAXHandlerInit() (1512 ff)
        -------------------------------------------
        
        ADDED AT:
            if (xmlGetCommentsDefaultValue == 0)
                xmlDefaultSAXHandler.comment = NULL;
            else
                xmlDefaultSAXHandler.comment = comment;
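
        The pattern used here, a handler slot that is left NULL when
        the event is not wanted, can be shown in isolation. The
        following is only a standalone sketch: struct mini_sax_handler,
        keep_comments and the mini_sax_* functions are hypothetical
        names for this example and are much smaller than the real SAX
        handler.

```c
#include <stddef.h>

/* Hypothetical, reduced handler table with a single callback slot. */
typedef void (*comment_fn)(void *ctx, const char *value);

struct mini_sax_handler {
    comment_fn comment;     /* NULL means: comments are dropped */
};

/* Hypothetical global switch, analogous to xmlGetCommentsDefaultValue. */
int keep_comments = 1;

/* Example callback: counts the comments it sees via ctx (an int *). */
static void store_comment(void *ctx, const char *value)
{
    int *seen = ctx;

    (void)value;            /* the text itself is not needed here */
    (*seen)++;
}

/* Fill the table according to the global switch. */
void mini_sax_init(struct mini_sax_handler *h)
{
    h->comment = keep_comments ? store_comment : NULL;
}

/* Parser side: the slot is only called when it is set. */
void mini_sax_emit_comment(const struct mini_sax_handler *h,
                           void *ctx, const char *value)
{
    if (h->comment != NULL)
        h->comment(ctx, value);
}
```

        As in the patch, the switch only takes effect when the handler
        table is initialised, so it has to be set before that happens.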

Best regards,
Stefan Bambach

-- 
Stefan Bambach

triplex-agentur fuer neue medien GmbH herzog-heinrich-strasse 11-13 80336 muenchen

tel: 089-209 138 29 fax: 089-209 138 10

mailto:bambach@triplex.de http://www.triplex.de

---- Message from the list xml@rpmfind.net Archived at : http://xmlsoft.org/messages/ to unsubscribe: echo "unsubscribe xml" | mail majordomo@rpmfind.net



This archive was generated by hypermail 2b29 : Wed Aug 30 2000 - 09:43:50 EDT