From: Marc Sanfacon (sanm@copernic.com)
Date: Tue Aug 01 2000 - 14:36:30 EDT
Hi there,
I am new to libxml (I've been using it for less than 1 week). I
have written a C++ interface on top of it. It is not yet finished, but it
includes most features I need for now. BTW, I am working under Windows 2000
using MSVC 6.0 SP3.
I have tried to parse a file using the HTML push interface and I am getting
strange results.
Here is the code:
    FILE *f = fopen(CGL::ConvertString(p_FileName).c_str(), "r");
    if (f != NULL) {
        int res, size = 4096;
        char chars[4096];
        htmlParserCtxtPtr ctxt;
        res = fread(chars, 1, 4, f);
        if (res > 0) {
            ctxt = htmlCreatePushParserCtxt(NULL, NULL,
                chars, res, 0, static_cast<xmlCharEncoding>(0));
            InitContext(ctxt);
            while ((res = fread(chars, 1, size, f)) > 0) {
                htmlParseChunk(ctxt, chars, res, 0);
            }
            htmlParseChunk(ctxt, chars, 0, 1);
            pDoc = ctxt->myDoc;
            htmlFreeParserCtxt(ctxt);
        }
        fclose(f);
    }
This is mainly the code presented in 'testHTML.c' from the package, except
that I use a bigger buffer. In my tests, one strange thing happened. When
using a buffer large enough to hold one of my documents in full, the result
of the parsing is not complete. For now, I have only one document that shows
this behavior and I have attached it to this email.
For example, the document is 2001 bytes long. When it is read with fread, the
'\r' characters are stripped, which leaves a total of 1971 bytes. When I set
the buffer size to 1967 (1971 - 4 bytes for the header) or more, I get the
error: a big chunk of my document is skipped. If I set it to 1966 or less,
the document is parsed OK.
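To make the arithmetic above concrete, here is a small sketch that only mimics
the read loop; it does not call libxml at all. The helper name chunk_sizes is
made up for illustration. It lists the chunk sizes that the loop would hand to
the parser for a given stream length, header size, and buffer size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of libxml): returns the sizes of the chunks
// that the read loop would pass to the parser, given the total number of
// bytes fread delivers, the size of the initial header read, and the size
// of the buffer used for the remaining reads.
std::vector<std::size_t> chunk_sizes(std::size_t total, std::size_t header,
                                     std::size_t buffer) {
    assert(buffer > 0);  // a zero-sized buffer would never make progress

    std::vector<std::size_t> sizes;

    // First read: at most `header` bytes, used to create the push context.
    std::size_t first = total < header ? total : header;
    if (first == 0) {
        return sizes;  // empty input, nothing is fed to the parser
    }
    sizes.push_back(first);

    // Remaining reads: full buffers, then one shorter final chunk if needed.
    std::size_t left = total - first;
    while (left > 0) {
        std::size_t n = left < buffer ? left : buffer;
        sizes.push_back(n);
        left -= n;
    }
    return sizes;
}
```

With 1971 bytes in total and a 4-byte header, a 1967-byte buffer gives chunks
of 4 and 1967 bytes, while a 1966-byte buffer gives 4, 1966 and 1. So the
failure seems to appear exactly when everything after the header arrives in a
single htmlParseChunk call.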
I even modified 'testHTML.c' to use a buffer of 1967 bytes to make sure the
problem was not in my own code, and I got the same error using:
testHTML -debug -repeat -push doc2.htm
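A side note on the '\r' stripping mentioned above: as far as I understand, it
comes from opening the file in text mode ("r"), where the Windows C runtime
turns each CR/LF pair into a single LF. Opening with "rb" should deliver the
bytes unchanged. I do not know whether this is related to the skipped chunk.
The sketch below only counts what fread delivers in each mode; count_bytes is
a made-up name, not something from libxml.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: counts the bytes that fread delivers when `path` is
// opened with the given fopen mode ("r" for text, "rb" for binary), reading
// in fixed-size chunks like the loop in the code above.
// Returns -1 if the file cannot be opened.
long count_bytes(const char *path, const char *mode) {
    std::FILE *f = std::fopen(path, mode);
    if (f == NULL) {
        return -1;
    }

    char buf[4096];
    long total = 0;
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof buf, f)) > 0) {
        total += static_cast<long>(n);
    }

    std::fclose(f);
    return total;
}
```

On Windows, count_bytes("doc2.htm", "rb") should report the size on disk (2001
for my document), while count_bytes("doc2.htm", "r") should report the smaller
figure (1971), a difference of one byte per CR/LF line ending.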
Can anyone help me?
Regards,
Marc.
<<doc2.htm>>
---------------------------------------------------------------------
"If you choose not to decide, you still have made a choice."
Neil Peart
---------------------------------------------------------------------
Marc Sanfacon, Software developer Copernic.com
e-mail: sanm@copernic.com R&D Group
Tel : (418) 527-0528 ext 1212 ICQ #7355101
----
Message from the list xml@xmlsoft.org
Archived at: http://xmlsoft.org/messages/
To unsubscribe: echo "unsubscribe xml" | mail majordomo@xmlsoft.org