From: Stefan Bambach (bambach@triplex.de)
Date: Wed Aug 30 2000 - 09:23:02 EDT
Hello Daniel,
Wednesday, August 30, 2000, 12:10:19 PM, you wrote:
DV> On Wed, Aug 30, 2000 at 11:31:47AM +0200, Stefan Bambach wrote:
>> DV> Or EUC-JP if you're japanese or ISO-8859-2 if you're russian, etc ...
>> DV> Yes, libxml is now consistent independently of the kind of input.
>> I have to store some data from the XML file in MySQL. It's enough to
>> store it as ISO-8859-1, because I know that the system is intended for
>> Germans only :-) So I don't need all the features.
DV> Just to give you an example, I'm French and we are supposed to use
DV> ISO-8859-1, except it lacks the o-e ligature character œ. You're
DV> lucky if all your data can be expressed with ISO-8859-1 :-)
Yes I am :-)
>> Yesterday, I used the UTF8Toisolat1() function to do the job for me (I
>> have to convert each value I read from the DOM tree myself). Is there a
>> function like xmlNodeListGetString() with an additional parameter, the
>> encoding string, so that I get the string in the encoding I need?
DV> Hmm ... it's not that simple. This would work well with ISO-8859-1,
DV> but a lot of character encoders need to maintain state, which means
DV> that if we fall back to iconv we should try to keep the same encoder
DV> and not open/convert/close for each operation (iconv potentially
DV> ends up looking for a shared lib and loading it when a new encoder
DV> is requested, and this may become very heavy, very fast). So the
DV> function should rather have an xmlCharEncodingHandlerPtr as the
DV> second argument.
DV> I will look into this,
Fine.
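
To illustrate why the ISO-8859-1 case is the simple one: every character it can hold is either one ASCII byte or one two-byte UTF-8 sequence, so the conversion needs no state between calls. The following is only a sketch under that assumption. The function name utf8_to_latin1 and its signature are hypothetical; it is not libxml's UTF8Toisolat1().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml): convert a NUL-terminated UTF-8
 * string to ISO-8859-1. Returns the number of bytes written to out
 * (excluding the terminating NUL), or -1 if the input is malformed,
 * contains a character above U+00FF, or does not fit in outlen bytes.
 */
static int
utf8_to_latin1(const unsigned char *in, unsigned char *out, size_t outlen)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (outlen == 0)
        return -1;                      /* no room even for the NUL */
    while (*in != 0) {
        unsigned int c;

        if (*in < 0x80) {
            c = *in++;                  /* plain ASCII byte */
        } else if ((*in == 0xC2 || *in == 0xC3) && (in[1] & 0xC0) == 0x80) {
            /* two-byte sequence encoding U+0080..U+00FF */
            c = ((unsigned int) (*in & 0x1F) << 6) | (in[1] & 0x3F);
            in += 2;
        } else {
            return -1;                  /* not representable in ISO-8859-1 */
        }
        if (n + 1 >= outlen)
            return -1;                  /* no room for this byte plus NUL */
        out[n++] = (unsigned char) c;
    }
    out[n] = 0;
    return (int) n;
}
```

German text such as "Grüße" converts cleanly, while the œ ligature (U+0153) is rejected, which matches the limitation mentioned above.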
Are there similar functions for reading the content of
tag names, tag values, attribute names, attribute values, ...? Most of
these I get with 'node->name' or something like it. Reaching into the
structs directly is not a nice way to program such critical code :-(
Personal question:
Where are you from?
You answer my questions very quickly. Is libxml your job,
or is it only your personal pleasure?
Some other changes in my copy of libxml 2.2.2:
Please check them, in case you want to put them into your CVS!
=====================================================================
1.) MEMORY LEAK:
There is a memory leak when the line read is empty (strlen(buf) == 0):
xmlMemStrdup(buf) still allocates at least 1 byte for the '\0', and
that byte is never freed.
Please check whether this is right. I found this error some time ago while
checking for memory leaks in an older version of libxml, but
it is still there.
nanohttp.c: xmlNanoHTTPReadLine() (463 ff)
------------------------------------------
    if(*bp == '\n') {
        *bp = 0;
        return(xmlMemStrdup(buf));
    }
REPLACED BY:
    if(*bp == '\n') {
        if (bp == buf)
            return(NULL);
        else
            *bp = 0;
        return(xmlMemStrdup(buf));
    }
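
The idea of the change, returning NULL for an empty line instead of a 1-byte copy, can be sketched outside libxml as follows. This is only an illustration: read_line_dup is a hypothetical stand-in that reads from an in-memory string rather than a socket, and it uses plain malloc() in place of xmlMemStrdup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a line reader: copies one line from *cursor
 * and advances *cursor past it. Returns a malloc'ed copy the caller must
 * free(), or NULL for an empty line, end of input or allocation failure,
 * so no 1-byte string is ever allocated for "".
 */
static char *
read_line_dup(const char **cursor)
{
    const char *start, *end;
    size_t len;
    char *copy;

    assert(cursor != NULL && *cursor != NULL);
    start = *cursor;
    end = strchr(start, '\n');
    if (end == NULL) {
        end = start + strlen(start);    /* last line without newline */
        *cursor = end;
    } else {
        *cursor = end + 1;              /* skip the newline itself */
    }
    len = (size_t) (end - start);
    if (len > 0 && start[len - 1] == '\r')
        len--;                          /* drop the CR of a CRLF pair */
    if (len == 0)
        return NULL;                    /* empty line: nothing to free */
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, start, len);
    copy[len] = '\0';
    return copy;
}
```

With this convention the caller has to treat NULL as "blank line" (for example the end of the HTTP headers) rather than as a failure.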
=====================================================================
2.) ERROR HANDLING:
I need better error handling, in which all errors/warnings,
even validation errors/warnings, are counted.
tree.c: xmlNewDoc() (518 ff)
----------------------------
INSERTED AT:
518 cur->verr = 0; /* Validator Error Counter */
519 cur->vwarn = 0; /* Validator Warning Counter */
520 cur->perr = 0; /* Parser Error Counter */
521 cur->pwarn = 0; /* Parser Warning Counter */
tree.h: struct _xmlDoc (354 ff)
-------------------------------
INSERTED AT:
355 int verr; /* Validator Error Counter */
356 int vwarn; /* Validator Warning Counter */
357 int perr; /* Parser Error Counter */
358 int pwarn; /* Parser Warning Counter */
error.c: xmlParserError()
error.c: xmlParserWarning()
error.c: xmlParserValidityError()
error.c: xmlParserValidityWarning()
----------------------------------
ADDED LINE "ctxt->myDoc->...++; /* increase Error or Warning Counter */":
void
xmlParserError(void *ctx, const char *msg, ...)
{
    xmlParserCtxtPtr ctxt = (xmlParserCtxtPtr) ctx;
    xmlParserInputPtr input;
    xmlParserInputPtr cur = NULL;
    va_list args;

    /* ctx may be NULL, so check it before touching myDoc */
    if ((ctxt != NULL) && (ctxt->myDoc != NULL))
        ctxt->myDoc->perr++; /* increase Parser Error Counter */
    ...
void
xmlParserWarning(void *ctx, const char *msg, ...)
{
    xmlParserCtxtPtr ctxt = (xmlParserCtxtPtr) ctx;
    xmlParserInputPtr input;
    xmlParserInputPtr cur = NULL;
    va_list args;

    if ((ctxt != NULL) && (ctxt->myDoc != NULL))
        ctxt->myDoc->pwarn++; /* increase Parser Warning Counter */
    ...
void
xmlParserValidityError(void *ctx, const char *msg, ...)
{
    xmlParserCtxtPtr ctxt = (xmlParserCtxtPtr) ctx;
    xmlParserInputPtr input;
    va_list args;

    if ((ctxt != NULL) && (ctxt->myDoc != NULL))
        ctxt->myDoc->verr++; /* increase Validator Error Counter */
    ...
void
xmlParserValidityWarning(void *ctx, const char *msg, ...)
{
    xmlParserCtxtPtr ctxt = (xmlParserCtxtPtr) ctx;
    xmlParserInputPtr input;
    va_list args;

    if ((ctxt != NULL) && (ctxt->myDoc != NULL))
        ctxt->myDoc->vwarn++; /* increase Validator Warning Counter */
    ...
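
As a self-contained illustration of the counting scheme, here is a minimal sketch that does not depend on libxml. The types DocCounters and ParserCtxt and the functions report_parser_error and report_validity_warning are hypothetical miniatures of xmlDoc, xmlParserCtxt and the error.c handlers; only the counter names are taken from the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical miniature of a document carrying the four counters. */
typedef struct {
    int verr;   /* Validator Error Counter */
    int vwarn;  /* Validator Warning Counter */
    int perr;   /* Parser Error Counter */
    int pwarn;  /* Parser Warning Counter */
} DocCounters;

/* Hypothetical miniature of a parser context pointing at its document. */
typedef struct {
    DocCounters *myDoc;
} ParserCtxt;

static void
report_parser_error(void *ctx, const char *msg, ...)
{
    ParserCtxt *ctxt = (ParserCtxt *) ctx;
    va_list args;

    assert(msg != NULL);
    /* count first, and only if there is a document to count on */
    if (ctxt != NULL && ctxt->myDoc != NULL)
        ctxt->myDoc->perr++;
    va_start(args, msg);
    vfprintf(stderr, msg, args);
    va_end(args);
}

static void
report_validity_warning(void *ctx, const char *msg, ...)
{
    ParserCtxt *ctxt = (ParserCtxt *) ctx;
    va_list args;

    assert(msg != NULL);
    if (ctxt != NULL && ctxt->myDoc != NULL)
        ctxt->myDoc->vwarn++;
    va_start(args, msg);
    vfprintf(stderr, msg, args);
    va_end(args);
}
```

After parsing, the application can then decide what to do with a document just by looking at the counters, for example refusing to store it when perr or verr is non-zero.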
=====================================================================
3.) IGNORE COMMENTS IN DOM TREE:
I don't need comments in my DOM tree, so I added a variable to switch them off.
parser.c:
---------
ADDED AT:
51 int xmlGetCommentsDefaultValue = 1;
parser.h:
---------
ADDED AT:
307 extern int xmlGetCommentsDefaultValue;
SAX.c: xmlDefaultSAXHandlerInit() (1512 ff)
-------------------------------------------
ADDED AT:
1512 if (xmlGetCommentsDefaultValue == 0)
1513 xmlDefaultSAXHandler.comment = NULL;
1514 else
1515 xmlDefaultSAXHandler.comment = comment;
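
The mechanism here is that a NULL callback slot means the event is simply dropped. A minimal sketch of that pattern, independent of libxml, could look like this. SaxHandler, keep_comments, sax_handler_init, emit_comment and count_comment are hypothetical names; keep_comments plays the role of xmlGetCommentsDefaultValue.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type for comment events (hypothetical miniature). */
typedef void (*CommentFunc)(void *ctx, const char *value);

/* Hypothetical miniature of a SAX handler table with a single slot. */
typedef struct {
    CommentFunc comment;
} SaxHandler;

/* Plays the role of xmlGetCommentsDefaultValue: 1 keeps comments. */
static int keep_comments = 1;

/* Example callback: counts the comments it is given via ctx. */
static void
count_comment(void *ctx, const char *value)
{
    (void) value;
    (*(int *) ctx)++;
}

/* Fill in the handler table according to the global switch. */
static void
sax_handler_init(SaxHandler *h)
{
    assert(h != NULL);
    if (keep_comments == 0)
        h->comment = NULL;
    else
        h->comment = count_comment;
}

/* What the parser side does: skip the event if no callback is set. */
static void
emit_comment(const SaxHandler *h, void *ctx, const char *value)
{
    if (h->comment != NULL)
        h->comment(ctx, value);
}
```

The switch has to be set before the handler table is initialised, since it is only read at that point.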
Best regards,
Stefan Bambach
--
Stefan Bambach
triplex-agentur fuer neue medien GmbH
herzog-heinrich-strasse 11-13
80336 muenchen
tel: 089-209 138 29 fax: 089-209 138 10
mailto:bambach@triplex.de http://www.triplex.de