Simple Script Usage Examples with PHP Simple HTML DOM Parser


If you want to parse HTML, regular expressions are not the right tool. PHP Simple HTML DOM Parser is a far better fit for web developers, because it lets you locate DOM elements directly from PHP. Its selector support is extensive, much like that of the early JavaScript frameworks and their selector engines. With it you can pull content out of individual DOM nodes in PHP, for example to check a web page for changes.

Functionality of PHP Simple HTML DOM Parser

Simple HTML DOM Parser has the functions you need to manipulate HTML. You can extract all the text content from an HTML document in a single line, and you can find tags on a page with selectors, much as you would with jQuery. Best of all, it copes with invalid HTML.

If you want to scrape data from a web page, or add or remove parts of an HTML document, Simple HTML DOM Parser is a must-download.

In this post on WDJ we will walk you through some handy scripts that will help you work wonders with PHP Simple HTML DOM Parser.

Quick Start

<?php
// Load the library (assumes simple_html_dom.php is in the same directory)
include 'simple_html_dom.php';

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from an HTML file
$html = file_get_html('test.htm');
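
As a rough sketch of the one-line content extraction mentioned earlier, the snippet below loads a small, made-up HTML string and reads its text through the plaintext property. The sample markup and variable names are assumptions for illustration, and the require_once path assumes simple_html_dom.php sits next to the script. Exact whitespace in the extracted text may differ between library versions.

```php
<?php
// Assumption: the library file lives in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical sample markup, used only for illustration.
$markup = '<div class="post"><h1>Title</h1><p>Some <b>bold</b> text.</p></div>';

// Build a DOM object from the string.
$html = str_get_html($markup);

// One line is enough to drop the tags and keep the text.
$text = $html->plaintext;

echo $text, PHP_EOL;
```

The same property is available on individual elements, so it can also be read from a single node returned by find() rather than from the whole document.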

Object-Oriented Way:

<?php
// Create a DOM object
$html = new simple_html_dom();
 
// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');
 
// Load HTML from a URL 
$html->load_file('http://www.google.com/');
 
// Load HTML from an HTML file
$html->load_file('test.htm');

Find HTML Elements

// Find all anchors; returns an array of element objects
$ret = $html->find('a');

// Find the (N)th anchor (zero-based); returns an element object, or null if not found
$ret = $html->find('a', 0);

// Find the last anchor (negative indexes count from the end); returns an element object, or null if not found
$ret = $html->find('a', -1);
 
// Find all <div> elements that have an id attribute
$ret = $html->find('div[id]');

// Find all <div> elements with attribute id=foo
$ret = $html->find('div[id=foo]');

// Find all elements with id=foo
$ret = $html->find('#foo');

// Find all elements with class=foo
$ret = $html->find('.foo');

// Find all elements that have an id attribute
$ret = $html->find('*[id]');
 
// Find all anchors and images 
$ret = $html->find('a, img'); 
 
// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');
 
// Find all <li> in <ul> 
$es = $html->find('ul li');
 
// Find nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> elements with class=hello
$es = $html->find('table.hello td');

// Find all <td> tags with attribute align=center inside <table> tags
$es = $html->find('table td[align=center]');
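
To show how the results of find() are typically consumed, here is a minimal sketch that runs a few of the selectors above against a small, invented HTML string. The markup, the variable names, and the require_once path are assumptions made for this example.

```php
<?php
// Assumption: the library file lives in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical markup used only to exercise the selectors.
$html = str_get_html(
    '<ul class="nav">'
    . '<li><a href="/home" title="Home page">Home</a></li>'
    . '<li><a href="/about">About</a></li>'
    . '</ul>'
    . '<img src="logo.png" title="Logo">'
);

// Without an index, find() returns an array, so it can be looped over.
$hrefs = [];
foreach ($html->find('ul.nav li a') as $a) {
    $hrefs[] = $a->href;
}

// With an index, find() returns a single element; -1 means the last match.
$last = $html->find('a', -1);

// A comma-separated selector collects matches for each part.
$titled = $html->find('a[title], img[title]');

// An indexed lookup that matches nothing yields null, so guard before use.
$table = $html->find('table', 0);
if ($table === null) {
    echo 'No <table> found', PHP_EOL;
}

echo implode(', ', $hrefs), PHP_EOL;
```

Checking for null before touching a single-element result avoids a fatal error when a page does not contain the tag you expected.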

Modify HTML Element

// $e is an element object, for example one returned by $html->find('a', 0)

// Get an attribute (for a non-value attribute such as checked or selected, this returns true or false)
$value = $e->href;

// Set an attribute (for a non-value attribute such as checked or selected, set its value to true or false)
$e->href = 'my link';

// Remove an attribute by setting its value to null
$e->href = null;

// Determine whether an attribute exists
if (isset($e->href))
        echo 'href exists!';

// Extract contents from HTML
echo $html->plaintext;

// Wrap an element
$e->outertext = '<div class="wrap">' . $e->outertext . '</div>';

// Remove an element by setting its outertext to an empty string
$e->outertext = '';

// Append an element
$e->outertext = $e->outertext . '<div>foo</div>';

// Insert an element
$e->outertext = '<div>foo</div>' . $e->outertext;
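
The following sketch strings several of these operations together on a tiny, made-up document: it sets an attribute, replaces the inner content of one element, and removes another. The markup, the ids, and the require_once path are assumptions for illustration only.

```php
<?php
// Assumption: the library file lives in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical document with three sibling <div> elements.
$html = str_get_html(
    '<div id="hello">Hello</div><div id="world">World</div><div id="ad">Buy now</div>'
);

// Set an attribute on the second <div> (index 1, zero-based).
$html->find('div', 1)->class = 'bar';

// Replace the inner content of the element with id=hello.
$html->find('div[id=hello]', 0)->innertext = 'foo';

// Remove the element with id=ad by blanking its outertext.
$html->find('div[id=ad]', 0)->outertext = '';

// Casting the DOM object to a string dumps the modified markup.
$result = (string) $html;

echo $result, PHP_EOL;
```

Changes made through outertext and innertext take effect when the tree is dumped, which is why the result is read back by converting the DOM object to a string.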

Dump contents of DOM object the Quick Way

// Dump the internal DOM tree back into a string
// (the cast is needed; a plain assignment would only copy the object reference)
$str = (string) $html;

// Print it!
echo $html;

Dump contents of DOM object the Object-Oriented Way

// Dumps the internal DOM tree back into a string
$str = $html->save();
 
// Dumps the internal DOM tree back into a file 
$html->save('result.htm');

Customize the parsing behavior with a Callback Function

// Write a function with parameter "$element"
function my_callback($element) {
        // Hide all <b> tags 
        if ($element->tag=='b')
                $element->outertext = '';
} 
 
// Register the callback function by its function name
$html->set_callback('my_callback');
 
// Callback function will be invoked while dumping
echo $html;
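
For a complete, runnable picture of the callback mechanism, the sketch below registers a callback on a small invented document and captures the dumped markup. The function name strip_bold, the sample markup, and the require_once path are assumptions made for this example.

```php
<?php
// Assumption: the library file lives in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical callback: blank out every <b> element as the tree is dumped.
function strip_bold($element)
{
    if ($element->tag == 'b') {
        $element->outertext = '';
    }
}

// Hypothetical sample markup.
$html = str_get_html('<p>Hello <b>World</b></p>');

// Register the callback by name; it is invoked for elements during dumping.
$html->set_callback('strip_bold');

// Dumping the tree (here by casting to string) triggers the callback.
$output = (string) $html;

echo $output, PHP_EOL;
```

Because the callback only runs while the tree is being dumped, nothing changes until the DOM object is printed, cast to a string, or saved.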

The code snippets above are largely self-explanatory and need little further explanation.
