XPath in JavaScript

This document describes the JavaScript interface for evaluating XPath expressions from a plain HTML page.

The XPath JavaScript Library implements much of the DOM Level 3 XPath specification. This allows XPath expressions to be run against both HTML and XML documents.

The simplest interface to XPath is the evaluate function of the document object, which returns an object of type XPathResult:

var xpathResult = document.evaluate(xpathExpression, contextNode, namespaceResolver, resultType, result);

The evaluate function takes five arguments:

  1. xpathExpression: A string containing the XPath expression to be evaluated.
  2. contextNode: A node in the document against which the XPath expression should be evaluated.
  3. namespaceResolver: A function that takes a string containing a namespace prefix from the xpathExpression and returns a string containing the URI to which that prefix corresponds. This enables conversion between the prefixes used in the XPath expression and the (possibly different) prefixes used in the document.
  4. resultType: A numeric constant indicating the type of result to return. These constants are available on the global XPathResult object and are defined in the relevant section of the XPath spec. For most purposes it is fine to pass XPathResult.ANY_TYPE, which causes the result of the XPath expression to be returned as the most natural type.
  5. result: An existing XPathResult to reuse for the results. Passing null causes a new XPathResult to be created.
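
To make the namespaceResolver argument more concrete, here is a minimal sketch of a resolver written as a plain function. The prefix table, the name namespaceResolver and the expression //svg:circle are assumptions chosen for illustration; they are not part of the library. The call to document.evaluate is guarded so that it only runs where a DOM is available.

```javascript
// Hypothetical prefix-to-URI table; these prefixes are chosen for illustration only.
var namespaces = {
  xhtml: "http://www.w3.org/1999/xhtml",
  svg: "http://www.w3.org/2000/svg"
};

// Returns the namespace URI for a known prefix, or null for an unknown one.
function namespaceResolver(prefix) {
  return Object.prototype.hasOwnProperty.call(namespaces, prefix)
    ? namespaces[prefix]
    : null;
}

// Only attempt the query where a DOM exists (for example, in a browser).
if (typeof document !== "undefined") {
  var circles = document.evaluate("//svg:circle", document, namespaceResolver,
                                  XPathResult.ANY_TYPE, null);
}
```

The prefixes in the table only need to match those used in the XPath expression. They do not have to match the prefixes that appear in the document itself.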

A simple example

A simple use of XPath is to extract the level 2 headings of an HTML document. The XPath expression in this case is simply //h2. The code for this is then:

var headings = document.evaluate("//h2", document, null, XPathResult.ANY_TYPE, null);

Notice that, since HTML does not have namespaces, we have passed in null as the namespaceResolver. Since we wish to search over the entire document for headings, we have used the document object itself as the contextNode.

The result of this expression is an XPathResult object. If we wish to know the type of result returned, we may inspect the resultType property of the returned object. In this case it evaluates to 4, which, as per the ECMAScript language binding for XPath, represents UNORDERED_NODE_ITERATOR_TYPE. This is the default return type when the result of the XPath expression is a node set. It gives us access to a single node at a time and makes no promises about the order in which the nodes are returned. To access the returned nodes, we may use the iterateNext method of the returned object:

var thisHeading = headings.iterateNext();
var alertText = "Level 2 headings in this document are:\n";
while (thisHeading) {
  alertText += thisHeading.textContent + "\n";
  thisHeading = headings.iterateNext();
}
alert(alertText);
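
If the same loop is needed in several places, it can be wrapped in a small helper. The sketch below is one possible way to do this, not part of the library: collectText is a hypothetical name, and the helper assumes only that its argument has an iterateNext method that returns null when no nodes remain. It also checks resultType against the named constant instead of the literal 4.

```javascript
// Hypothetical helper: drains any object exposing iterateNext() and
// returns the textContent of each node, in the order the nodes are returned.
function collectText(iterator) {
  var texts = [];
  var node = iterator.iterateNext();
  while (node) {
    texts.push(node.textContent);
    node = iterator.iterateNext();
  }
  return texts;
}

// Only attempt the query where a DOM exists (for example, in a browser).
if (typeof document !== "undefined") {
  var h2Result = document.evaluate("//h2", document, null, XPathResult.ANY_TYPE, null);
  // Compare against the named constant to confirm an iterator was returned.
  if (h2Result.resultType === XPathResult.UNORDERED_NODE_ITERATOR_TYPE) {
    console.log(collectText(h2Result).join("\n"));
  }
}
```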
