What is libxml

XML

XML is a metalanguage

XML defines the structural rules of the markup but not its semantics

XML 1.0 is a W3C Recommendation

Namespaces in XML is a separate W3C Recommendation

XML Example

<?xml version="1.0" encoding="ISO-8859-1"?>
<exemple>
 <titre>Un exemple</titre>
 <chapitre numéro="1">
  <titre>Introduction</titre>
  <p>Ceci est un exemple très succint</p>
  <img source="logo.gif"/>
 </chapitre>
 <chapitre numéro="2"/>
</exemple> 

Architecture

The XML family

Parsing interfaces

libxml embeds an XML parser and an HTML parser (an SGML DocBook parser is available too).

SAX

DOM
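The difference between the two interfaces can be sketched as follows. This is only an illustration written with Python's standard library (`xml.sax` and `xml.dom.minidom`), not libxml's own C API; the helper names `count_chapters_sax` and `count_chapters_dom` and the trimmed sample document are assumptions made for the example.

```python
import xml.sax
from xml.dom import minidom

# Trimmed version of the sample document (hypothetical, ASCII only).
DOC = "<exemple><titre>Un exemple</titre><chapitre/><chapitre/></exemple>"


class ChapterCounter(xml.sax.ContentHandler):
    """SAX style: the parser calls back for each event, no tree is built."""

    def __init__(self):
        super().__init__()
        self.chapters = 0

    def startElement(self, name, attrs):
        if name == "chapitre":
            self.chapters += 1


def count_chapters_sax(text):
    handler = ChapterCounter()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.chapters


def count_chapters_dom(text):
    """DOM style: the whole document is loaded as a tree, then queried."""
    doc = minidom.parseString(text)
    try:
        return len(doc.getElementsByTagName("chapitre"))
    finally:
        doc.unlink()  # release the tree explicitly


print(count_chapters_sax(DOC), count_chapters_dom(DOC))
```

The SAX version keeps only a counter in memory, while the DOM version holds the full tree until it is released, which is the usual trade-off between the two interfaces.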

DTD validation support

libxml does not try to validate by default

the API allows:

Problem: validation depends on the DOM
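To make the distinction between well-formedness checking and validation concrete, here is a minimal sketch. It assumes Python's standard `xml.etree.ElementTree`, whose expat-based parser never validates, as a stand-in for a parser run without validation; it does not show libxml's API, and `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Well-formed, but not valid: the DTD declares <a> as EMPTY, yet it has a child.
WELL_FORMED_NOT_VALID = "<!DOCTYPE a [<!ELEMENT a EMPTY>]><a><b/></a>"

# Not even well-formed: <b> is never closed.
NOT_WELL_FORMED = "<a><b></a>"


def is_well_formed(text):
    """Return True if a non-validating parser accepts the document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(WELL_FORMED_NOT_VALID))
print(is_well_formed(NOT_WELL_FORMED))
```

A non-validating parse rejects only the second document; catching the first one requires an explicit validation step against the DTD.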

Memory management

all allocations and deallocations are centralized

an API allows the allocation functions to be redefined

a debugging module keeps a list of allocated blocks and reports leaks

I18N Support

This is a serious problem in libxml1

Fixed in libxml2:

The I/O API

The parser is progressive, allowing either pull or push data flow
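A progressive parser can be illustrated with a short sketch. It uses `XMLPullParser` from Python's standard library rather than libxml itself: data is pushed in with `feed()` as it arrives, and events are pulled out with `read_events()`. The function name `parse_in_chunks` and the chunk boundaries are assumptions chosen for the example.

```python
from xml.etree.ElementTree import XMLPullParser


def parse_in_chunks(chunks):
    """Feed the document piece by piece and collect (event, tag) pairs."""
    parser = XMLPullParser(events=("start", "end"))
    seen = []
    for chunk in chunks:
        parser.feed(chunk)  # push: hand over whatever data is available
        for event, elem in parser.read_events():  # pull: drain ready events
            seen.append((event, elem.tag))
    parser.close()
    # Events still queued when the parser is closed can be read afterwards.
    for event, elem in parser.read_events():
        seen.append((event, elem.tag))
    return seen


# Chunk boundaries deliberately fall in the middle of tags.
CHUNKS = ["<exemple><ti", "tre>Un exemple</titre>", "<chapitre/></exe", "mple>"]
EVENTS = parse_in_chunks(CHUNKS)
print(EVENTS)
```

The caller never needs the whole document in one buffer, which is what makes this style suitable for network streams and very large inputs.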

multiple input mechanisms:

XPath

An XPath tree representation

Examples
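As an illustration of path expressions evaluated against a document tree, the following sketch uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree`. It is not libxml's XPath implementation and not the original examples from this talk; the sample document is a shortened variant of the one shown earlier, with a hypothetical ASCII `id` attribute added for the sake of the example.

```python
import xml.etree.ElementTree as ET

DOC = """<exemple>
 <titre>Un exemple</titre>
 <chapitre id="c1">
  <titre>Introduction</titre>
  <img source="logo.gif"/>
 </chapitre>
 <chapitre id="c2"/>
</exemple>"""

root = ET.fromstring(DOC)

# Child axis: titles that are direct children of a chapter.
chapter_titles = [t.text for t in root.findall("./chapitre/titre")]

# Descendant axis: every title, in document order.
all_titles = [t.text for t in root.findall(".//titre")]

# Attribute predicate: images whose source attribute matches.
logo_images = root.findall(".//img[@source='logo.gif']")

# Positional predicate: the second chapter element.
second_chapter = root.find("./chapitre[2]")

print(chapter_titles, all_titles, len(logo_images), second_chapter.get("id"))
```

Each expression selects a set of nodes from the tree, which is the model XPath is built on; a full XPath 1.0 engine adds functions, further axes, and typed results on top of this.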

XPointer

A few examples:

XML Base, XInclude

XSLT

XSLT is a transformation language.

an XSLT transform

an XSLT sheet is an XML document

processing is based on templates

recursive transitive closure on templates

allows output to XML, HTML, or text

libxslt

libxslt is the library implementing XSLT on top of libxml2

the xsltproc program runs it from the command line

The API is relatively simple:

Should be compliant with XSLT 1.0; implements some 1.1 extensions

Relatively fast as long as the machine does not start swapping

A few examples

Future work in libxml and libxslt

finish XSLT, bugfixing

basic event support

support for non-deterministic DTD content models

support for different tree models, large files, databases

XML Schemas, validation with decent type support

Future work on top of libxml

XML-RPC, SOAP, or other XML-based protocols (Jabber)