From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Fri Mar 03 2000 - 13:16:10 EST
On Fri, Mar 03, 2000 at 10:52:12AM +0100, Tobias Peters wrote:
>
> After doc = xmlParseFile(..) I have to read doc->encoding and convert all
> strings to utf8 myself. Is this true, and will this change in future?
Don't base any assumptions on the libxml-1.x code in this area.
Fetch a pre-2.0 snapshot if you plan to work with I18N.
As explained a couple of times, but maybe not clearly:
- internally everything will be kept using an 8-bit encoding or
  UTF-8 (i.e. ISO-Latin-X doesn't have to be converted)
- everything else is converted to UTF-8
- UTF-16 is decoded automatically after detecting the
  byte order mark
- other encodings should be processed by first registering
  converters to/from UTF-8. libxml will take care of
  doing the conversion on the fly
This is nearly completely implemented in the 2.0 snapshots.
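
To make the UTF-16 case in the list above concrete, here is a rough standalone sketch of detecting the byte order mark and decoding to UTF-8. It is not libxml code: the names detect_utf16_bom, put_utf8 and utf16_to_utf8 are made up for this illustration, and the error handling (return -1 on anything suspicious) is an assumption, not what the parser actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte order found from a leading UTF-16 byte order mark (U+FEFF). */
typedef enum { BOM_NONE, BOM_UTF16_LE, BOM_UTF16_BE } bom_kind;

/* Look at the first two bytes of the input for a UTF-16 BOM. */
bom_kind detect_utf16_bom(const unsigned char *in, size_t len)
{
    if (len >= 2) {
        if (memcmp(in, "\xFF\xFE", 2) == 0) return BOM_UTF16_LE;
        if (memcmp(in, "\xFE\xFF", 2) == 0) return BOM_UTF16_BE;
    }
    return BOM_NONE;
}

/* Write one code point (<= 0x10FFFF) as UTF-8; returns bytes written (1..4). */
size_t put_utf8(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/*
 * Convert BOM-prefixed UTF-16 input to UTF-8.
 * Returns the number of bytes written to out, or -1 on error
 * (no BOM, odd length, bad surrogate pair, or output buffer too small).
 */
long utf16_to_utf8(const unsigned char *in, size_t inlen,
                   unsigned char *out, size_t outlen)
{
    assert(in != NULL && out != NULL);

    bom_kind bom = detect_utf16_bom(in, inlen);
    if (bom == BOM_NONE || inlen % 2 != 0)
        return -1;

    /* Offset of the high-order byte inside each 2-byte code unit. */
    size_t hi = (bom == BOM_UTF16_BE) ? 0 : 1;
    size_t o = 0;

    for (size_t i = 2; i < inlen; i += 2) {      /* start after the BOM */
        unsigned long cp = ((unsigned long)in[i + hi] << 8) | in[i + 1 - hi];

        if (cp >= 0xD800 && cp <= 0xDBFF) {      /* high surrogate */
            if (i + 4 > inlen)
                return -1;                       /* missing low surrogate */
            unsigned long lo =
                ((unsigned long)in[i + 2 + hi] << 8) | in[i + 3 - hi];
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            i += 2;                              /* consumed two code units */
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;                           /* stray low surrogate */
        }

        if (outlen - o < 4)                      /* conservative space check */
            return -1;
        o += put_utf8(cp, out + o);
    }
    return (long)o;
}
```

A caller would pass the raw bytes of the document plus an output buffer; the BOM itself is consumed and not copied to the output. A converter for some other encoding would have the same general shape (bytes in, UTF-8 out), which is what gets registered with the library in the last item of the list.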
Daniel
--
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
http://www.w3.org/People/all#veillard%40w3.org | RPM badminton Kaffe
----
Message from the list xml@xmlsoft.org
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail majordomo@xmlsoft.org