How to Filter and Traverse a DOM Tree with JavaScript

Did you know there’s a JavaScript API whose sole purpose is to filter and iterate through the nodes we want from a DOM tree? In fact, there is not one but two such APIs: NodeIterator and TreeWalker. They are quite similar to one another, with some useful differences. Both can return a list of nodes that are present under a given root node while complying with any predefined and/or custom filter rules applied to them.

The predefined filters available in the APIs help us target different kinds of nodes, such as text nodes or element nodes, and custom filters (added by us) can narrow the set further, for instance by looking for nodes with specific contents. The returned list of nodes is iterable, i.e. it can be looped through, and we can work with each individual node in the list.

How to use the NodeIterator API

A NodeIterator object can be created using the createNodeIterator() method of the document interface. This method takes three arguments. The first one is required; it is the root node that contains all the nodes we want to filter.

The second and third arguments are optional. They are the predefined and custom filters, respectively. The predefined filters are available as constants of the NodeFilter object.

For example, if the NodeFilter.SHOW_TEXT constant is passed as the second parameter, createNodeIterator() will return an iterator over all the text nodes under the root node. NodeFilter.SHOW_ELEMENT will return only the element nodes. See the NodeFilter documentation for the full list of available constants.
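The predefined filters are bit flags, so more than one can be passed at once by combining them with bitwise OR. The sketch below illustrates this under a few assumptions: the numeric values are copied from the DOM standard so the snippet also runs where NodeFilter is not defined, maskIncludes() is a hypothetical helper written for this illustration, and the browser-only part assumes the page has an element with the id wrapper.

```javascript
// Numeric values of three whatToShow constants, as defined by the DOM standard.
// In a browser they are exposed as NodeFilter.SHOW_ELEMENT and so on; they are
// repeated here only so the sketch also runs where NodeFilter is not defined.
const SHOW_ELEMENT = 0x1;
const SHOW_TEXT = 0x4;
const SHOW_COMMENT = 0x80;

// Combine two predefined filters with bitwise OR.
const elementsAndText = SHOW_ELEMENT | SHOW_TEXT;

// Hypothetical helper: does a whatToShow mask include a given nodeType?
// The bit for a node type sits at position (nodeType - 1).
function maskIncludes(mask, nodeType) {
  return (mask & (1 << (nodeType - 1))) !== 0;
}

console.log(maskIncludes(elementsAndText, 1)); // element nodes (nodeType 1): true
console.log(maskIncludes(elementsAndText, 3)); // text nodes (nodeType 3): true
console.log(maskIncludes(elementsAndText, 8)); // comment nodes (nodeType 8): false

// Browser-only usage; assumes an element with id "wrapper" exists.
if (typeof document !== 'undefined' && typeof NodeFilter !== 'undefined') {
  const root = document.querySelector('#wrapper');
  if (root) {
    const iterator = document.createNodeIterator(
      root,
      NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_TEXT
    );
    let node;
    while ((node = iterator.nextNode()) !== null) {
      console.log(node.nodeName); // element tag names and "#text"
    }
  }
}
```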

The third argument (the custom filter) is a function that implements the filter.

Here is an example code snippet:

<!doctype html>
<html lang='en'>
  <head>
    <meta charset='UTF-8'>
    <title>Document</title>
  </head>
  <body>
    <header><h1>title</h1></header>
    <div id='wrapper'>
      this is a page wrapper
      <p>Hello</p>
      <p>How are you?</p>
    </div>
    <span>txt</span>
    <a href='#'>some link</a>
    <footer>copyrights</footer>
  </body>
</html>

Assuming we want to extract the contents of all the text nodes that are inside the #wrapper div, this is how we go about it using NodeIterator:

var div = document.querySelector('#wrapper');
var nodeIterator = document.createNodeIterator(
  div,
  NodeFilter.SHOW_TEXT
);
while(nodeIterator.nextNode()) {
  console.log(nodeIterator.referenceNode.nodeValue.trim());
}
/* console output
[Log] this is a page wrapper
[Log] Hello
[Log]
[Log] How are you?
[Log]
*/

The nextNode() method of the NodeIterator API returns the next node in the list of iterable text nodes. When we use it in a while loop to access each node in the list, we log the trimmed contents of each text node to the console. The referenceNode property of NodeIterator returns the node the iterator is currently attached to.
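Since nextNode() returns the node itself, and null once the list is exhausted, the loop above could also be wrapped in a small helper that gathers the nodes into an array. This is only a sketch: collectNodes() is a hypothetical name, and the browser-only part assumes the page contains the #wrapper div.

```javascript
// Hypothetical helper: drain an iterator into a plain array.
// It relies only on nextNode(), which returns null when no nodes are left.
function collectNodes(iterator) {
  const nodes = [];
  let node;
  while ((node = iterator.nextNode()) !== null) {
    nodes.push(node);
  }
  return nodes;
}

// Browser-only usage; assumes an element with id "wrapper" exists.
if (typeof document !== 'undefined' && typeof NodeFilter !== 'undefined') {
  const div = document.querySelector('#wrapper');
  if (div) {
    const textNodes = collectNodes(
      document.createNodeIterator(div, NodeFilter.SHOW_TEXT)
    );
    // Array methods are now available on the result.
    console.log(textNodes.map((n) => n.nodeValue.trim()));
  }
}
```

Because the helper only calls nextNode(), the same function should work with a TreeWalker as well.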

As we can see in the output, there are some text nodes with only empty spaces for their contents. We can avoid showing these empty contents by using a custom filter:

var div = document.querySelector('#wrapper');
var nodeIterator = document.createNodeIterator(
  div,
  NodeFilter.SHOW_TEXT,
  function(node) {
    return (node.nodeValue.trim() !== "") ?
      NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_REJECT;
  }
);
while(nodeIterator.nextNode()) {
  console.log(nodeIterator.referenceNode.nodeValue.trim());
}
/* console output
[Log] this is a page wrapper
[Log] Hello
[Log] How are you?
*/

The custom filter function returns the constant NodeFilter.FILTER_ACCEPT if the text node is not empty, which leads to the inclusion of that node in the list of nodes the iterator will be iterating over. Conversely, the NodeFilter.FILTER_REJECT constant is returned in order to exclude the empty text nodes from the iterable list of nodes.
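The custom filter does not have to be an inline function. The DOM standard also allows an object with an acceptNode() method, which makes the rule easy to name and reuse. The sketch below assumes this form; nonBlankText is a hypothetical name, and the numeric fallbacks (1 and 2) are the standard values of the two constants, used only where NodeFilter is not defined.

```javascript
// Numeric fallbacks match the values the DOM standard assigns to these
// constants; they are only used where NodeFilter is not defined.
const hasNodeFilter = typeof NodeFilter !== 'undefined';
const FILTER_ACCEPT = hasNodeFilter ? NodeFilter.FILTER_ACCEPT : 1;
const FILTER_REJECT = hasNodeFilter ? NodeFilter.FILTER_REJECT : 2;

// Hypothetical filter object: an object with an acceptNode() method can be
// passed in place of a bare function.
const nonBlankText = {
  acceptNode(node) {
    return node.nodeValue.trim() !== '' ? FILTER_ACCEPT : FILTER_REJECT;
  }
};

// The filter only reads nodeValue, so plain objects can stand in for nodes.
console.log(nonBlankText.acceptNode({ nodeValue: ' Hello ' }) === FILTER_ACCEPT); // true
console.log(nonBlankText.acceptNode({ nodeValue: '\n   ' }) === FILTER_REJECT); // true

// Browser-only usage; assumes an element with id "wrapper" exists.
if (typeof document !== 'undefined' && hasNodeFilter) {
  const div = document.querySelector('#wrapper');
  if (div) {
    const iterator = document.createNodeIterator(
      div,
      NodeFilter.SHOW_TEXT,
      nonBlankText
    );
    let node;
    while ((node = iterator.nextNode()) !== null) {
      console.log(node.nodeValue.trim());
    }
  }
}
```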

How to use the TreeWalker API

As we mentioned before, the NodeIterator and TreeWalker APIs are similar to each other.

A TreeWalker can be created using the createTreeWalker() method of the document interface. This method, just like createNodeIterator(), takes three arguments: the root node, a predefined filter, and a custom filter.

If we use the TreeWalker API instead of NodeIterator, the previous code snippet looks like the following:

var div = document.querySelector('#wrapper');
var treeWalker = document.createTreeWalker(
  div,
  NodeFilter.SHOW_TEXT,
  function(node) {
    return (node.nodeValue.trim() !== "") ?
      NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_REJECT;
  }
);
while(treeWalker.nextNode()) {
  console.log(treeWalker.currentNode.nodeValue.trim());
}
/* output
[Log] this is a page wrapper
[Log] Hello
[Log] How are you?
*/

Instead of referenceNode, the currentNode property of the TreeWalker API is used to access the node to which the walker is currently attached. In addition to the nextNode() method, TreeWalker has other useful methods. The previousNode() method (also present in NodeIterator) returns the node that precedes the one the walker is currently anchored to.

Similar navigation is provided by the parentNode(), firstChild(), lastChild(), previousSibling(), and nextSibling() methods, each of which moves the walker to the corresponding relative of the current node. These methods are only available in the TreeWalker API.
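As a rough illustration of how these methods fit together, firstChild() and nextSibling() can be chained to list the children of the current node that pass the walker's filters. In the sketch below, filteredChildren() is a hypothetical helper, and the browser-only part assumes the page contains the #wrapper div.

```javascript
// Hypothetical helper: list the children of the walker's current node that
// pass its filters, then put the walker back where it started.
function filteredChildren(walker) {
  const start = walker.currentNode;
  const children = [];
  for (
    let child = walker.firstChild();
    child !== null;
    child = walker.nextSibling()
  ) {
    children.push(child);
  }
  walker.currentNode = start; // currentNode is writable on a TreeWalker
  return children;
}

// Browser-only usage; assumes an element with id "wrapper" exists.
if (typeof document !== 'undefined' && typeof NodeFilter !== 'undefined') {
  const wrapper = document.querySelector('#wrapper');
  if (wrapper) {
    const walker = document.createTreeWalker(wrapper, NodeFilter.SHOW_ELEMENT);
    // Logs the text of each element child of #wrapper.
    filteredChildren(walker).forEach((el) => console.log(el.textContent));
    // The walker never leaves its root, so parentNode() returns null here.
    console.log(walker.parentNode());
  }
}
```

Restoring currentNode at the end is a design choice: it lets the caller keep navigating from the same position after the helper returns.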

Here’s a code example that outputs the last child of the node the walker is anchored to:

var div = document.querySelector('#wrapper');
var treeWalker = document.createTreeWalker(
  div,
  NodeFilter.SHOW_ELEMENT
);
console.log(treeWalker.lastChild());
/*  output
[Log] <p>How are you?</p>
*/

Which API to choose

Choose the NodeIterator API when we need just a simple iterator to filter and loop through the selected nodes. Pick the TreeWalker API when we need to access the filtered nodes' family, such as their immediate siblings.
