From: Wayne Davison (wayned@blorf.net)
Date: Mon Aug 28 2000 - 14:23:57 EDT
On Mon, 28 Aug 2000, Daniel Veillard wrote:
> - or just reconvert to ISO-Latin-1 (if possible) using
> UTF8Toisolat1() which is part of libxml2
This function is currently not mentioned in the .h files (only the
function UTF8ToHtml() is mentioned in HTMLparser.h). Should it be? Or
should the user's code have to extern it manually?
Also, these decoders expect to get called by some UTF-8-savvy code that
handles things like outputting the numeric entities. It might be nice to
have a generic way to use the decoder functions for in-memory conversion
without having to know how to handle the more common error conditions of
the decoder functions. Such an interface could potentially remove the
need for my just-added htmlEncodeEntities() function in favor of using the
new interface and the actual UTF8ToHtml() function (though it would likely
be less efficient, since the conversion of things like '&' to "&amp;"
would have to happen in a separate step).
Without this, the user needs to know that when the conversion routine
returns -2, they must advance the input pointer by the returned "inlen",
use xmlGetUTF8Char() to get the Unicode value (and its length in bytes),
and then skip that character in the input and do whatever they like to
the output buffer (such as adding a numeric entity).
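As a rough illustration of the loop described above, here is a minimal,
self-contained sketch. It does not call libxml2: sketch_utf8_to_latin1()
and sketch_get_utf8_char() are hypothetical stand-ins that only mimic the
contract as described in this message (a return of -2 for an unconvertible
character, with "inlen" and "outlen" updated to the bytes consumed and
produced). The wrapper name sketch_convert_with_entities() and the choice
of decimal numeric entities are likewise assumptions, and the real
functions may differ in detail.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a UTF-8 -> Latin-1 decoder (not libxml2 code).
 * On entry *inlen and *outlen are the buffer sizes; on return they hold
 * the bytes consumed and produced. Returns 0 on success, -2 on a character
 * with no Latin-1 equivalent, -1 on malformed input or a full output. */
static int sketch_utf8_to_latin1(unsigned char *out, int *outlen,
                                 const unsigned char *in, int *inlen)
{
    int i = 0, o = 0, ret = 0;
    while (i < *inlen) {
        unsigned char c = in[i];
        if (c < 0x80) {                          /* plain ASCII */
            if (o >= *outlen) { ret = -1; break; }
            out[o++] = c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < *inlen &&
                   (in[i + 1] & 0xC0) == 0x80) { /* U+0080..U+00FF */
            if (o >= *outlen) { ret = -1; break; }
            out[o++] = (unsigned char)(((c & 0x03) << 6) | (in[i + 1] & 0x3F));
            i += 2;
        } else if (c >= 0xC4 && c <= 0xF4) {     /* beyond Latin-1 */
            ret = -2;
            break;
        } else {                                 /* malformed sequence */
            ret = -1;
            break;
        }
    }
    *inlen = i;
    *outlen = o;
    return ret;
}

/* Hypothetical stand-in for a "get one UTF-8 character" helper.
 * *len is the number of bytes available on entry and the number of bytes
 * used on return. Returns the code point, or -1 on error. */
static int sketch_get_utf8_char(const unsigned char *utf, int *len)
{
    int need, cp, k;
    unsigned char c;
    if (*len < 1) return -1;
    c = utf[0];
    if (c < 0x80)                { need = 1; cp = c; }
    else if ((c & 0xE0) == 0xC0) { need = 2; cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { need = 3; cp = c & 0x0F; }
    else if ((c & 0xF8) == 0xF0) { need = 4; cp = c & 0x07; }
    else return -1;
    if (need > *len) return -1;
    for (k = 1; k < need; k++) {
        if ((utf[k] & 0xC0) != 0x80) return -1;
        cp = (cp << 6) | (utf[k] & 0x3F);
    }
    *len = need;
    return cp;
}

/* In-memory conversion that hides the -2 handling from the caller:
 * unconvertible characters become decimal numeric entities.
 * Returns the number of bytes written (excluding the NUL), or -1 on
 * malformed input or if the output buffer is too small. */
int sketch_convert_with_entities(char *out, int outsize,
                                 const char *in, int insize)
{
    int ipos = 0, opos = 0;
    assert(out != NULL && in != NULL && outsize > 0);
    while (ipos < insize) {
        int inlen = insize - ipos;
        int outlen = outsize - 1 - opos;         /* keep room for the NUL */
        int ret = sketch_utf8_to_latin1((unsigned char *)out + opos, &outlen,
                                        (const unsigned char *)in + ipos,
                                        &inlen);
        ipos += inlen;                           /* skip what was consumed */
        opos += outlen;
        if (ret == 0) break;
        if (ret != -2) return -1;
        /* The decoder stopped at a character it cannot represent. */
        int clen = insize - ipos;
        int cp = sketch_get_utf8_char((const unsigned char *)in + ipos, &clen);
        if (cp < 0) return -1;
        int n = snprintf(out + opos, (size_t)(outsize - opos), "&#%d;", cp);
        if (n < 0 || n >= outsize - opos) return -1;
        opos += n;
        ipos += clen;                            /* skip the character */
    }
    out[opos] = '\0';
    return opos;
}
```

With something along these lines, a caller would pass an input buffer and
an output buffer and never see the -2 case directly. Escaping of
characters such as '&' is not attempted here and would still need a
separate step.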
..wayne..
----
Message from the list xml@xmlsoft.org
Archived at: http://xmlsoft.org/messages/
To unsubscribe: echo "unsubscribe xml" | mail majordomo@xmlsoft.org