Chapter 7 - Text Nodes · DOM Enlightenment

## 7.1 *Text* object overview

Text in an HTML document is represented by instances of the *Text()* constructor function, which produces text nodes. When an HTML document is parsed, the text mixed in among the elements of the page is converted to text nodes.

live code: [http://jsfiddle.net/domenlightenment/kuz5Z](http://jsfiddle.net/domenlightenment/kuz5Z)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<p>hi</p>

<script>

//select 'hi' text node
var textHi = document.querySelector('p').firstChild

console.log(textHi.constructor); //logs Text()

//logs Text {textContent="hi", length=2, wholeText="hi", ...}
console.log(textHi);

</script>
</body>
</html>
~~~

The code above shows that the *Text()* constructor function constructs the text node, but keep in mind that *Text* inherits from *CharacterData*, *Node*, and *Object*.

## 7.2 *Text* object & properties

To get accurate information pertaining to the available properties and methods on a *Text* node, it's best to ignore the specification and to ask the browser what is available. Examine the arrays created in the code below detailing the properties and methods available from a text node.

live code: [http://jsfiddle.net/domenlightenment/Wj3uS](http://jsfiddle.net/domenlightenment/Wj3uS)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<p>hi</p>

<script>

var text = document.querySelector('p').firstChild;

//text own properties
console.log(Object.keys(text).sort());

//text own properties & inherited properties
var textPropertiesIncludeInherited = [];
for(var p in text){
	textPropertiesIncludeInherited.push(p);
}
console.log(textPropertiesIncludeInherited.sort());

//text inherited properties only
var textPropertiesOnlyInherited = [];
for(var p in text){
	if(!text.hasOwnProperty(p)){
		textPropertiesOnlyInherited.push(p);
	}
}
console.log(textPropertiesOnlyInherited.sort());

</script>
</body>
</html>
~~~

The available properties are many, even when the inherited properties are not considered.
Below I've hand-picked a list of noteworthy properties and methods for the context of this chapter.

* *textContent*
* *splitText()*
* *appendData()*
* *deleteData()*
* *insertData()*
* *replaceData()*
* *substringData()*
* *normalize()*
* *data*
* *document.createTextNode()* (not a property or inherited property of text nodes but discussed in this chapter)

## 7.3 White space creates *Text* nodes

When a DOM is constructed, either by the browser or by programmatic means, text nodes are created from white space as well as from text characters. After all, white space is a character. In the code below the second paragraph, containing an empty space, has a child *Text* node while the first paragraph does not.

live code: [http://jsfiddle.net/domenlightenment/YbtnZ](http://jsfiddle.net/domenlightenment/YbtnZ)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<p id="p1"></p>
<p id="p2"> </p>

<script>

console.log(document.querySelector('#p1').firstChild) //logs null
console.log(document.querySelector('#p2').firstChild.nodeName) //logs #text

</script>
</body>
</html>
~~~

Don't forget that white space and text characters in the DOM are typically represented by a text node. This of course means that carriage returns are considered text nodes. In the code below we log a carriage return, highlighting the fact that this type of character is in fact a text node.

live code: [http://jsfiddle.net/domenlightenment/9FEzq](http://jsfiddle.net/domenlightenment/9FEzq)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<p id="p1"></p>
//yes there is a carriage return text node before this comment, even this comment is a node
<p id="p2"></p>

<script>

console.log(document.querySelector('#p1').nextSibling) //logs Text

</script>
</body>
</html>
~~~

The reality is that if you can input a character or white space into an HTML document using a keyboard, then it can potentially be interpreted as a text node.
If you think about it, unless you minify/compress the HTML document, the average HTML page contains a great deal of white space and carriage return text nodes.

## 7.4 Creating & Injecting *Text* Nodes

*Text* nodes are created automatically for us when a browser interprets an HTML document and a corresponding DOM is built based on the contents of the document. After this, it's also possible to programmatically create *Text* nodes using *createTextNode()*. In the code below I create a text node and then inject that node into the live DOM tree.

live code: [http://jsfiddle.net/domenlightenment/xC9q3](http://jsfiddle.net/domenlightenment/xC9q3)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<div></div>

<script>

var textNode = document.createTextNode('Hi');
document.querySelector('div').appendChild(textNode);

console.log(document.querySelector('div').innerText); // logs Hi

</script>
</body>
</html>
~~~

Keep in mind that we can also inject text nodes into programmatically created DOM structures. In the code below I place a text node inside of a *`<p>`* element before I inject it into the live DOM.

live code: [http://jsfiddle.net/domenlightenment/PdatJ](http://jsfiddle.net/domenlightenment/PdatJ)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<div></div>

<script>

var elementNode = document.createElement('p');
var textNode = document.createTextNode('Hi');
elementNode.appendChild(textNode);
document.querySelector('div').appendChild(elementNode);

console.log(document.querySelector('div').innerHTML); //logs <p>Hi</p>

</script>
</body>
</html>
~~~

## 7.5 Getting a *Text* node value with *.data* or *nodeValue*

The text value/data represented by a *Text* node can be extracted from the node by using the *.data* or *nodeValue* property. Both of these return the text contained in a *Text* node. Below I demonstrate both of these to retrieve the value of the first *Text* node contained in the *`<p>`*.
live code: [http://jsfiddle.net/domenlightenment/dPLkx](http://jsfiddle.net/domenlightenment/dPLkx)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<p>Hi, <strong>cody</strong></p>

<script>

console.log(document.querySelector('p').firstChild.data); //logs 'Hi, '
console.log(document.querySelector('p').firstChild.nodeValue); //logs 'Hi, '

</script>
</body>
</html>
~~~

Notice that the *`<p>`* contains two child nodes, a *Text* node and an *Element* node (i.e. *`<strong>`*), and that we are only getting the value of the first child node contained in the *`<p>`*.

### Notes

Getting the length of the characters contained in a text node is as simple as accessing the *length* property of the node itself or of the actual text value/data of the node (i.e. *document.querySelector('p').firstChild.length* or *document.querySelector('p').firstChild.data.length* or *document.querySelector('p').firstChild.nodeValue.length*).

## 7.6 Manipulating *Text* nodes with *appendData()*, *deleteData()*, *insertData()*, *replaceData()*, *substringData()*

The *CharacterData* object that *Text* nodes inherit methods from provides the following methods for manipulating and extracting sub values from *Text* node values.

* *appendData()*
* *deleteData()*
* *insertData()*
* *replaceData()*
* *substringData()*

Each of these is leveraged in the code example below.
live code: [http://jsfiddle.net/domenlightenment/B6AC6](http://jsfiddle.net/domenlightenment/B6AC6)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<p>Go big Blue Blue</p>

<script>

var pElementText = document.querySelector('p').firstChild;

//add !
pElementText.appendData('!');
console.log(pElementText.data); //logs 'Go big Blue Blue!'

//remove first 'Blue'
pElementText.deleteData(7,5);
console.log(pElementText.data); //logs 'Go big Blue!'

//insert it back 'Blue'
pElementText.insertData(7,'Blue ');
console.log(pElementText.data); //logs 'Go big Blue Blue!'

//replace first 'Blue' with 'Bunny'
pElementText.replaceData(7,5,'Bunny ');
console.log(pElementText.data); //logs 'Go big Bunny Blue!'

//extract substring 'Bunny Blue'
console.log(pElementText.substringData(7,10));

</script>
</body>
</html>
~~~

### Notes

These same manipulation and sub extraction methods can be leveraged by *Comment* nodes.

## 7.7 When multiple sibling *Text* nodes occur

Typically, immediate sibling *Text* nodes do not occur because DOM trees created by browsers intelligently combine text nodes; however, two cases exist that make sibling text nodes possible.

The first case is rather obvious. If a run of text contains an *Element* node (e.g. *`<p>`Hi, `<strong>`cody`</strong>` welcome!`</p>`*) then the text will be split into the proper node groupings. It's best to look at a code example, as this might sound more complicated than it really is. In the code below the contents of the *`<p>`* element are not a single *Text* node; they are in fact 3 nodes: a *Text* node, an *Element* node, and another *Text* node.

live code: [http://jsfiddle.net/domenlightenment/2ZCn3](http://jsfiddle.net/domenlightenment/2ZCn3)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<p>Hi, <strong>cody</strong> welcome!</p>

<script>

var pElement = document.querySelector('p');

console.log(pElement.childNodes.length); //logs 3

console.log(pElement.firstChild.data); // is text node or 'Hi, '
console.log(pElement.firstChild.nextSibling); // is Element node or <strong>
console.log(pElement.lastChild.data); // is text node or ' welcome!'

</script>
</body>
</html>
~~~

The next case occurs when we programmatically add *Text* nodes to an element we created in our code. In the code below I create a *`<p>`* element and then append two *Text* nodes to this element, which results in sibling *Text* nodes.

live code: [http://jsfiddle.net/domenlightenment/jk3Jn](http://jsfiddle.net/domenlightenment/jk3Jn)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<div></div>

<script>

var pElementNode = document.createElement('p');
var textNodeHi = document.createTextNode('Hi ');
var textNodeCody = document.createTextNode('Cody');

pElementNode.appendChild(textNodeHi);
pElementNode.appendChild(textNodeCody);

document.querySelector('div').appendChild(pElementNode);

console.log(document.querySelector('div p').childNodes.length); //logs 2

</script>
</body>
</html>
~~~

## 7.8 Remove markup and return all child *Text* nodes using *textContent*

The *textContent* property can be used to get all child text nodes, as well as to set the contents of a node to a specific *Text* node. When it's used on a node to get the textual content of the node, it will return a concatenated string of all text nodes contained within the node you read the property on. This functionality makes it very easy to extract all of the text from an HTML document. Below I extract all of the text contained within the *`<body>`* element. Notice that *textContent* gathers not just immediate child text nodes but all descendant text nodes, no matter the depth of encapsulation inside of the node the property is read on.

live code: N/A

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<h1> Dude</h1>
<p>you <strong>rock!</strong></p>

<script>

//logs 'Dude you rock!' with some added white space,
//followed by this script element's own text since it also sits inside the body
console.log(document.body.textContent);

</script>
</body>
</html>
~~~

When *textContent* is used to set the text contained within a node it will remove all child nodes first, replacing them with a single *Text* node.
In the code below I replace all the nodes inside of the *`<div>`* element with a single *Text* node.

live code: [http://jsfiddle.net/domenlightenment/m766T](http://jsfiddle.net/domenlightenment/m766T)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<div>
<h1> Dude</h1>
<p>you <strong>rock!</strong></p>
</div>

<script>

//set on the <div> itself; setting it on the body would remove the <div> from the DOM
document.querySelector('div').textContent = 'You don\'t rock!'
console.log(document.querySelector('div').textContent); //logs 'You don't rock!'

</script>
</body>
</html>
~~~

### Notes

*textContent* returns *null* if used on a document or doctype node.

*textContent* returns the contents of *`<script>`* and *`<style>`* elements.

## 7.9 The difference between *textContent* & *innerText*

Most modern browsers support a seemingly similar property to *textContent* named *innerText* (Firefox only added it in version 45). However, these properties are not the same. You should be aware of the following differences between *textContent* & *innerText*.

* *innerText* is aware of CSS. So if you have hidden text, *innerText* ignores this text, whereas *textContent* will not
* Because *innerText* cares about CSS it will trigger a reflow, whereas *textContent* will not
* *innerText* ignores the *Text* nodes contained in *`<script>`* and *`<style>`* elements
* *innerText*, unlike *textContent*, will normalize the text that is returned. Just think of *textContent* as returning exactly what is in the document with the markup removed. This will include white space, line breaks, and carriage returns
* *innerText* started out as a non-standard, browser-specific property and was only later added to the HTML specification, while *textContent* is implemented from the DOM specifications

If you intend to use *innerText* you'll have to create a workaround for Firefox versions older than 45.

## 7.10 Combine sibling *Text* nodes into one text node using *normalize()*

Sibling *Text* nodes are typically only encountered when text is programmatically added to the DOM. To eliminate sibling *Text* nodes that have no *Element* nodes between them we can use *normalize()*.
This will concatenate sibling text nodes in the DOM into a single *Text* node. In the code below I create sibling text nodes, append them to the DOM, then normalize them.

live code: [http://jsfiddle.net/domenlightenment/LG9WR](http://jsfiddle.net/domenlightenment/LG9WR)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<div></div>

<script>

var pElementNode = document.createElement('p');
var textNodeHi = document.createTextNode('Hi');
var textNodeCody = document.createTextNode('Cody');

pElementNode.appendChild(textNodeHi);
pElementNode.appendChild(textNodeCody);

document.querySelector('div').appendChild(pElementNode);

console.log(document.querySelector('p').childNodes.length); //logs 2

document.querySelector('div').normalize(); //combine our sibling text nodes

console.log(document.querySelector('p').childNodes.length); //logs 1

</script>
</body>
</html>
~~~

## 7.11 Splitting a text node using *splitText()*

When *splitText()* is called on a *Text* node it will alter the text node it's being called on (leaving the text up to the offset) and return a new *Text* node that contains the text split off from the original text based on the offset. In the code below the text node *Hey Yo!* is split after *Hey*. *Hey* is left in the original text node while *Yo!* is moved into a new text node, which is inserted into the DOM as the next sibling and returned by the *splitText()* method.

live code: [http://jsfiddle.net/domenlightenment/Tz5ce](http://jsfiddle.net/domenlightenment/Tz5ce)

~~~
<!DOCTYPE html>
<html lang="en">
<body>

<p>Hey Yo!</p>

<script>

//returns a new text node holding the text split off at the offset
console.log(document.querySelector('p').firstChild.splitText(4).data); //logs Yo!

//What remains in the original text node...
console.log(document.querySelector('p').firstChild.textContent); //logs 'Hey '

</script>
</body>
</html>
~~~
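
To make the offset arithmetic behind *splitText()* easier to see, here is a minimal sketch that models the split with plain strings, so it also runs outside of a browser. The helper name *splitTextModel* and its *remaining*/*splitOff* result fields are hypothetical and only used for illustration; they are not part of the DOM. The model also simplifies error handling: it throws a *RangeError* for a bad offset, where the real method raises a DOM exception.

```javascript
// Hypothetical helper (not a DOM API): models Text.splitText(offset) with plain strings.
function splitTextModel(data, offset) {
  // Simplification: the real method rejects an offset past the end of the text
  // with a DOM exception; this model throws a RangeError instead and also
  // rejects negative or non-integer offsets.
  if (!Number.isInteger(offset) || offset < 0 || offset > data.length) {
    throw new RangeError('offset must be an integer between 0 and ' + data.length);
  }
  return {
    remaining: data.slice(0, offset), // text kept by the node splitText() was called on
    splitOff: data.slice(offset)      // text held by the new node splitText() returns
  };
}

var result = splitTextModel('Hey Yo!', 4);
console.log(JSON.stringify(result.remaining)); // prints "Hey "
console.log(JSON.stringify(result.splitOff));  // prints "Yo!"

// When a DOM is available, compare the model against the real method.
if (typeof document !== 'undefined') {
  var node = document.createTextNode('Hey Yo!');
  var tail = node.splitText(4);
  console.log(node.data === result.remaining, tail.data === result.splitOff);
}
```

Note that the offset counts the characters that stay behind, so the space after *Hey* remains in the original node. An offset equal to the length of the text is allowed and simply splits off an empty string.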