Using XPath to Retrieve Element Content

In addition to walking the document tree to find an element, Libxml2 supports XPath expressions for retrieving sets of nodes that match specified criteria. Full documentation of the XPath API is available in the Libxml2 API reference.

In the example below, we search a document for the contents of all keyword elements.

Note

A full discussion of XPath is beyond the scope of this document. For details on its use, see the XPath specification.

Full code for this example is in Appendix D, Code for XPath Example.

Using XPath requires setting up an xmlXPathContext and then supplying the XPath expression and the context to the xmlXPathEvalExpression function. The function returns an xmlXPathObjectPtr, which includes the set of nodes satisfying the XPath expression.

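	xmlXPathObjectPtr
	getnodeset (xmlDocPtr doc, xmlChar *xpath){
		xmlXPathContextPtr context;	/* 1 */
		xmlXPathObjectPtr result;

		context = xmlXPathNewContext(doc);	/* 2 */
		if (context == NULL) {
			printf("Error in xmlXPathNewContext\n");
			return NULL;
		}
		result = xmlXPathEvalExpression(xpath, context);	/* 3 */
		xmlXPathFreeContext(context);	/* the context is not needed after evaluation */
		if (result == NULL) {
			printf("Error in xmlXPathEvalExpression\n");
			return NULL;
		}
		if (xmlXPathNodeSetIsEmpty(result->nodesetval)) {	/* 4 */
			xmlXPathFreeObject(result);
			printf("No result\n");
			return NULL;
		}
		return result;	/* the caller frees this with xmlXPathFreeObject */
	}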

1. First we declare our variables.

2. Initialize the context variable.

3. Apply the XPath expression.

4. Check the result and free the memory allocated to result if no result is found.

The xmlXPathObjectPtr returned by the function contains a set of nodes and other information needed to iterate through the set and act on the results. For this example, our function returns the xmlXPathObjectPtr, and we use it to print the contents of the keyword nodes in our document. The node set object includes the number of elements in the set (nodeNr) and an array of nodes (nodeTab):

	for (i=0; i < nodeset->nodeNr; i++) {	/* 1 */
		keyword = xmlNodeListGetString(doc, nodeset->nodeTab[i]->xmlChildrenNode, 1);	/* 2 */
		printf("keyword: %s\n", keyword);
		xmlFree(keyword);
	}

1. The value of nodeset->nodeNr holds the number of elements in the node set. Here we use it to iterate through the array.

2. Here we print the contents of each of the nodes returned.

Note

We print the child node of each node that is returned, because the contents of a keyword element are held in a child text node.