Re: [xml] Encoding question

From: Daniel Veillard (Daniel.Veillard@w3.org)
Date: Fri Mar 03 2000 - 13:16:10 EST


On Fri, Mar 03, 2000 at 10:52:12AM +0100, Tobias Peters wrote:
>
> After doc = xmlParseFile(..) I have to read doc->encoding and convert all
> strings to utf8 myself. Is this true, and will this change in future?

  Don't base any assumptions on the libxml-1.x code in this area.
Fetch a pre-2.0 snapshot if you plan to work with I18N.
As explained a couple of times, but maybe not clearly:
  - internally everything will be kept in an 8-bit encoding or
    UTF-8 (i.e. ISO-Latin-X doesn't have to be converted)
  - everything else is converted to UTF-8
  - UTF-16 is decoded automatically after detecting the
    byte order mark
  - other encodings should be handled by first registering
    converters to/from UTF-8; libxml will take care of
    doing the conversion on the fly
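
The last item in the list, registering converters for additional encodings, can be pictured with a small standalone sketch. It does not use libxml itself: register_handler, find_handler, demo_to_utf8 and the "x-demo" encoding are all hypothetical names invented for illustration, and the callback shape (out, outlen, in, inlen) is only modeled on the converter functions that libxml2 later declared in encoding.h. Check the headers of the snapshot you are using before relying on any of it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Converter callback. On entry *outlen is the output capacity and *inlen
 * the number of input bytes; on return they hold the bytes written and
 * the bytes consumed. Returns 0 on success, -1 if the output filled up. */
typedef int (*to_utf8_fn)(unsigned char *out, int *outlen,
                          const unsigned char *in, int *inlen);

/* Hypothetical registry entry: an encoding name plus its converter. */
typedef struct {
    const char *name;
    to_utf8_fn to_utf8;
} enc_handler;

#define MAX_HANDLERS 8
static enc_handler handlers[MAX_HANDLERS];
static int nb_handlers = 0;

/* Register a converter under a name; returns 0 on success, -1 on error. */
static int register_handler(const char *name, to_utf8_fn fn)
{
    if (name == NULL || fn == NULL || nb_handlers >= MAX_HANDLERS)
        return -1;
    handlers[nb_handlers].name = name;
    handlers[nb_handlers].to_utf8 = fn;
    nb_handlers++;
    return 0;
}

/* Look up a converter by exact (case-sensitive) name. */
static to_utf8_fn find_handler(const char *name)
{
    int i;
    for (i = 0; i < nb_handlers; i++)
        if (strcmp(handlers[i].name, name) == 0)
            return handlers[i].to_utf8;
    return NULL;
}

/* Toy encoding invented for this sketch: bytes below 0x80 are ASCII,
 * bytes 0x80..0xFF map to U+0400..U+047F (always two UTF-8 bytes). */
static int demo_to_utf8(unsigned char *out, int *outlen,
                        const unsigned char *in, int *inlen)
{
    int i = 0, o = 0, complete;

    assert(out != NULL && outlen != NULL && in != NULL && inlen != NULL);
    while (i < *inlen) {
        unsigned int cp = in[i] < 0x80 ? in[i] : 0x0400u + (in[i] - 0x80u);
        int need = cp < 0x80 ? 1 : 2;

        if (*outlen - o < need)
            break;                      /* output buffer is full */
        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
        i++;
    }
    complete = (i == *inlen);
    *inlen = i;                         /* bytes consumed */
    *outlen = o;                        /* bytes produced */
    return complete ? 0 : -1;
}
```

A caller would register the converter once, for example register_handler("x-demo", demo_to_utf8), and the parser side would call find_handler with the encoding name declared by the document, then feed each input chunk through the returned function.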
This design is nearly completely implemented in the 2.0 snapshots.
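
To make the UTF-16 item concrete, here is a minimal standalone sketch of byte-order-mark detection followed by decoding to UTF-8. It is not libxml code: detect_utf16_bom, read_unit, put_utf8 and utf16_to_utf8 are hypothetical helpers written for this illustration, and the sketch assumes the input always starts with a BOM.

```c
#include <assert.h>
#include <stddef.h>

/* Byte order inferred from a UTF-16 byte order mark (BOM). */
typedef enum { BOM_NONE, BOM_UTF16_LE, BOM_UTF16_BE } bom_kind;

/* Inspect the first two bytes: FF FE is little endian, FE FF big endian. */
static bom_kind detect_utf16_bom(const unsigned char *in, size_t len)
{
    if (len >= 2) {
        if (in[0] == 0xFF && in[1] == 0xFE) return BOM_UTF16_LE;
        if (in[0] == 0xFE && in[1] == 0xFF) return BOM_UTF16_BE;
    }
    return BOM_NONE;
}

/* Read one 16-bit code unit in the given byte order. */
static unsigned long read_unit(const unsigned char *p, bom_kind bom)
{
    return bom == BOM_UTF16_LE ? (unsigned long)p[0] | ((unsigned long)p[1] << 8)
                               : (unsigned long)p[1] | ((unsigned long)p[0] << 8);
}

/* Encode one code point as UTF-8; returns the number of bytes written. */
static size_t put_utf8(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Convert BOM-prefixed UTF-16 to UTF-8. Returns the number of bytes
 * written, or -1 on a missing BOM, malformed input, or when fewer than
 * four bytes of output space remain (a deliberately conservative check). */
static long utf16_to_utf8(const unsigned char *in, size_t inlen,
                          unsigned char *out, size_t outlen)
{
    bom_kind bom;
    size_t i, o = 0;

    assert(in != NULL && out != NULL);
    bom = detect_utf16_bom(in, inlen);
    if (bom == BOM_NONE || (inlen % 2) != 0)
        return -1;
    for (i = 2; i < inlen; i += 2) {        /* skip the BOM itself */
        unsigned long u = read_unit(in + i, bom);

        if (u >= 0xD800 && u <= 0xDBFF) {   /* high surrogate: needs a pair */
            unsigned long lo;
            if (i + 3 >= inlen)
                return -1;
            lo = read_unit(in + i + 2, bom);
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
            i += 2;
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            return -1;                      /* stray low surrogate */
        }
        if (outlen - o < 4)
            return -1;
        o += put_utf8(u, out + o);
    }
    return (long)o;
}
```

The BOM is consumed rather than copied to the output, so the caller sees only the document text in UTF-8.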

 Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe
----
Message from the list xml@xmlsoft.org
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail  majordomo@xmlsoft.org


This archive was generated by hypermail 2b29 : Wed Aug 02 2000 - 12:30:07 EDT