
PHP - DOM Parser Example



The DOM extension in PHP provides extensive functionality for working with XML and HTML documents. We can construct a DOM object dynamically, or load a DOM document from an HTML file or from a string containing an HTML tag tree. We can also save the DOM document to an XML file, or extract the DOM tree from an XML document.

The DOMDocument class is one of the most important classes defined in the DOM extension.

$obj = new DOMDocument($version = "1.0", $encoding = "")

It represents an entire HTML or XML document and serves as the root of the document tree. The DOMDocument class defines a number of methods, called on a DOMDocument object, some of which are introduced here −

Sr.No   Method                  Description
1       createElement           Create new element node
2       createAttribute         Create new attribute
3       createTextNode          Create new text node
4       getElementById          Searches for an element with a certain id
5       getElementsByTagName    Searches for all elements with given local tag name
6       load                    Load XML from a file
7       loadHTML                Load HTML from a string
8       loadHTMLFile            Load HTML from a file
9       loadXML                 Load XML from a string
10      save                    Dumps the internal XML tree back into a file
11      saveHTML                Dumps the internal document into a string using HTML formatting
12      saveHTMLFile            Dumps the internal document into a file using HTML formatting
13      saveXML                 Dumps the internal XML tree back into a string
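
To illustrate several of the methods listed above, here is a minimal sketch that builds a small XML document with createElement, createAttribute and createTextNode, dumps it with saveXML, and reads it back with loadXML and getElementsByTagName. The element and attribute names (courses, course, manager) are made up for this illustration and are not part of the DOM extension.

```php
<?php

/*** a new dom object with explicit version and encoding ***/
$dom = new DOMDocument("1.0", "UTF-8");
$dom->formatOutput = true;

/*** root element (name chosen for illustration) ***/
$courses = $dom->createElement("courses");
$dom->appendChild($courses);

/*** a child element ***/
$course = $dom->createElement("course");
$courses->appendChild($course);

/*** createAttribute returns a DOMAttr; set its value, then attach it ***/
$attr = $dom->createAttribute("manager");
$attr->value = "Gopal";
$course->setAttributeNode($attr);

/*** text content of the element ***/
$course->appendChild($dom->createTextNode("Android"));

/*** dump the tree into a string ***/
$xml = $dom->saveXML();
echo $xml;

/*** load the string into a second document and read it back ***/
$copy = new DOMDocument();
$copy->loadXML($xml);
$first = $copy->getElementsByTagName("course")->item(0);
echo $first->getAttribute("manager"), ": ", $first->nodeValue, "\n";
```

The last line prints "Gopal: Android". Nodes must be created through the document object and then attached with appendChild (or setAttributeNode for attributes); creating a node does not insert it into the tree by itself.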

Example

Let us use the following HTML file for this example −

<html>
<head> 
   <title>Tutorialspoint</title>
</head> 
<body> 
   <h2>Course details</h2> 
   <table border = "0"> 
      <tbody> 
         <tr> 
            <td>Android</td> 
            <td>Gopal</td> 
            <td>Sairam</td> 
         </tr>
         <tr> 
            <td>Hadoop</td> 
            <td>Gopal</td> 
            <td>Satish</td> 
         </tr> 
         <tr> 
            <td>HTML</td> 
            <td>Gopal</td> 
            <td>Raju</td> 
         </tr> 
         <tr> 
            <td>Web technologies</td> 
            <td>Gopal</td> 
            <td>Javed</td> 
         </tr> 
         <tr> 
            <td>Graphic</td> 
            <td>Gopal</td> 
            <td>Satish</td> 
         </tr> 
         <tr> 
            <td>Writer</td> 
            <td>Kiran</td> 
            <td>Amith</td> 
         </tr> 
         <tr> 
            <td>Writer</td> 
            <td>Kiran</td> 
            <td>Vineeth</td> 
         </tr> 
      </tbody> 
   </table> 
</body> 
</html>

We shall now extract the Document Object Model from the above HTML file, saved as hello.html, by calling the loadHTMLFile() method in the following PHP code −

<?php 

   /*** a new dom object ***/ 
   $dom = new DOMDocument(); 

   /*** discard white space; takes effect only if set before loading ***/ 
   $dom->preserveWhiteSpace = false; 

   /*** load the html into the object ***/ 
   $dom->loadHTMLFile("hello.html");

   /*** the table by its tag name ***/ 
   $tables = $dom->getElementsByTagName('table'); 

   /*** get all rows from the first table ***/ 
   $rows = $tables->item(0)->getElementsByTagName('tr'); 

   /*** loop over the table rows ***/ 
   foreach ($rows as $row) {
   
      /*** get each column by tag name ***/ 
      $cols = $row->getElementsByTagName('td'); 

      /*** echo the values ***/ 
      echo 'Designation: '.$cols->item(0)->nodeValue.'<br />'; 
      echo 'Manager: '.$cols->item(1)->nodeValue.'<br />'; 
      echo 'Team: '.$cols->item(2)->nodeValue; 
      echo '<hr />'; 
   }
   
?>

It will produce the following output −

Designation: Android
Manager: Gopal
Team: Sairam
________________________________________
Designation: Hadoop
Manager: Gopal
Team: Satish
________________________________________
Designation: HTML
Manager: Gopal
Team: Raju
________________________________________
Designation: Web technologies
Manager: Gopal
Team: Javed
________________________________________
Designation: Graphic
Manager: Gopal
Team: Satish
________________________________________
Designation: Writer
Manager: Kiran
Team: Amith
________________________________________
Designation: Writer
Manager: Kiran
Team: Vineeth
________________________________________
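
The same extraction can be tried without a separate file by passing the markup to loadHTML() as a string. The following is a minimal sketch under that assumption: the $html string is a shortened, hypothetical copy of the table above, and the array keys (designation, manager, team) are chosen only for this illustration. It collects the rows into an array instead of echoing them, and skips any row that has fewer than three cells.

```php
<?php

/*** shortened copy of the table, kept inline for illustration ***/
$html = '<table border="0"><tbody>'
      . '<tr><td>Android</td><td>Gopal</td><td>Sairam</td></tr>'
      . '<tr><td>Hadoop</td><td>Gopal</td><td>Satish</td></tr>'
      . '</tbody></table>';

/*** load the html string into a new dom object ***/
$dom = new DOMDocument();
$dom->loadHTML($html);

/*** collect one record per table row ***/
$records = [];
foreach ($dom->getElementsByTagName('tr') as $row) {

    $cols = $row->getElementsByTagName('td');

    /*** skip rows that do not have all three cells ***/
    if ($cols->length < 3) {
        continue;
    }

    $records[] = [
        'designation' => trim($cols->item(0)->nodeValue),
        'manager'     => trim($cols->item(1)->nodeValue),
        'team'        => trim($cols->item(2)->nodeValue),
    ];
}

print_r($records);
```

Checking the length of the node list before calling item() avoids reading a property of null when a row has fewer cells than expected, for example a header row built from th elements.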