Parsing HTML in PHP Using Native Classes - CoralNodes
Disclosure: Affiliate links used
Table of Contents
1. What is Parsing & What are its Uses?
2. Important DOM classes in PHP
3. DOMDocument, Nodes & Elements
4. Practical Examples
4.1. Selecting by ID
4.2. Selecting a Tag by Its Name
4.3. Find elements with a particular class
4.4. Extract links from a page
4.5. Modifying & Saving HTML
4.5.1. Inserting new HTML element into the document
4.5.2. Deleting an element from the document
4.6. Manipulating Attributes
5. Conclusion
What is Parsing & What are its Uses?
Parsing (in this case) is the process of extracting or modifying useful information from an HTML or XML string. A parser gives us easy ways to query raw data instead of using regex.
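To make the contrast with regex concrete, here is a minimal sketch that reads two values out of a parsed tree. The markup in `$html` is a hypothetical sample, not the page used elsewhere in this post, and `libxml_use_internal_errors()` is used here only to keep parser warnings out of the output.

```php
<?php
// Hypothetical sample markup, used only for illustration.
$html = '<html><head><title>Sample Page</title></head>'
      . '<body><p class="intro">Hello, <b>world</b>!</p></body></html>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Query the parsed tree instead of pattern-matching the raw string.
$title = $dom->getElementsByTagName('title')->item(0)->textContent;
$intro = $dom->getElementsByTagName('p')->item(0)->textContent;

echo $title, "\n";   // Sample Page
echo $intro, "\n";   // Hello, world!
```

Note that `textContent` flattens nested tags such as `<b>`, which is awkward to do reliably with a regular expression.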
Important DOM classes in PHP
DOMDocument (extends DOMNode class)
DOMNode
DOMNodeList
DOMXPath
DOMElement (extends DOMNode class)
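The relationships in the list above can be checked with `instanceof`. This is only a sketch on a hypothetical two-paragraph document; the variable names are illustrative.

```php
<?php
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p id="a">One</p><p>Two</p></body></html>');
libxml_clear_errors();

$paras = $dom->getElementsByTagName('p');   // a DOMNodeList
$first = $paras->item(0);                   // a DOMElement
$xpath = new DOMXPath($dom);
$found = $xpath->query('//p');              // also a DOMNodeList

var_dump($dom instanceof DOMNode);          // bool(true): DOMDocument extends DOMNode
var_dump($first instanceof DOMElement);     // bool(true)
var_dump($first instanceof DOMNode);        // bool(true): DOMElement extends DOMNode
var_dump($paras instanceof DOMNodeList);    // bool(true)
var_dump($found instanceof DOMNodeList);    // bool(true)
```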
DOMDocument, Nodes & Elements
//examples
$documentElement = $d
//object of DOMElemen
Nodes
An HTML or XML document is represented as a tree structure made up of individual nodes. These nodes can be of any type, such as an element, text, comment, or attribute. DOMNode is the base class from which all node classes inherit.
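As a rough illustration of the different node types, the sketch below walks the children of a single paragraph and prints each node's class. The markup is a hypothetical sample chosen to contain a text node, a comment and an element.

```php
<?php
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>Hi<!-- note --><b>there</b></p></body></html>');
libxml_clear_errors();

$root = $dom->documentElement;   // the DOMElement for <html>
echo $root->nodeName, "\n";      // html

$p = $dom->getElementsByTagName('p')->item(0);
foreach ($p->childNodes as $node) {
    // nodeType is an integer constant such as XML_ELEMENT_NODE or XML_TEXT_NODE
    echo get_class($node), ' (nodeType ', $node->nodeType, ")\n";
}
```

Every object printed here is a DOMNode, so properties such as `nodeName`, `parentNode` and `textContent` are available on all of them.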
Elements
Practical Examples
optimization.
Select element by ID
Find elements by class
Inserting HTML element
Deleting an element
Dealing with attributes
header('Content-Type:
$url = "https://www.c
$ch = curl_init();
curl_setopt($ch, CURL
curl_setopt($ch, CURL
curl_setopt($ch, CURL
$res = curl_exec($ch)
curl_close($ch);
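Once the response body is available, it has to be loaded into a DOMDocument before any of the queries below can run. The following sketch assumes that step and replaces the cURL response with a hypothetical inline string so it can run without network access; `libxml_use_internal_errors()` is one way to silence warnings about HTML5 tags.

```php
<?php
// Stand-in for the cURL response ($res above); the content is hypothetical.
$res = '<!DOCTYPE html><html><head><meta charset="utf-8"><title>Demo</title></head>'
     . '<body><article><h2>First</h2><p>Text</p></article></body></html>';

libxml_use_internal_errors(true);   // tags unknown to libxml may raise warnings
$dom = new DOMDocument();
$loaded = $dom->loadHTML($res);     // true on success
$errors = libxml_get_errors();      // warnings collected during parsing, if any
libxml_clear_errors();

var_dump($loaded);
echo count($errors), " parser warning(s) collected\n";
```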
Selecting by ID
$table = $dom->getEle
$child_elements = $ta
$row_count = $child_e
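A self-contained sketch of the same idea: select a table by its `id` and count its rows. The table markup and the id `results` are hypothetical, and counting `tr` elements is an assumption about what the row count should measure.

```php
<?php
// Hypothetical table; the id "results" is illustrative only.
$html = '<html><body><table id="results">'
      . '<tr><td>A</td><td>1</td></tr>'
      . '<tr><td>B</td><td>2</td></tr>'
      . '<tr><td>C</td><td>3</td></tr>'
      . '</table></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$table = $dom->getElementById('results');   // DOMElement, or null if the id is absent
$rows = $table->getElementsByTagName('tr'); // DOMNodeList of descendant <tr> elements
$row_count = $rows->length;

echo $row_count, "\n";   // 3
```

In real code it is worth checking that `$table` is not null before using it.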
Selecting a Tag by Its Name
$h2s = $dom->getEleme
foreach( $h2s as $h2
echo $h2->textCon
}
The result:
Test Images
Results after Compre
ShortPixel
reSmush.it
Imagify
TinyPNG Compress JPE
Kraken.IO
EWWW Image Optimizer
WP Smush
Do you actually need
Consclusion
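The same loop can be tried on a small hypothetical document. This sketch collects the heading text into an array before printing it; the headings here are made up and do not reproduce the sample page.

```php
<?php
// Hypothetical markup with three <h2> headings.
$html = '<html><body><h1>Title</h1><h2>Test Images</h2><p>Text</p>'
      . '<h2>Results</h2><h2>Conclusion</h2></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$h2s = $dom->getElementsByTagName('h2');   // DOMNodeList, iterable with foreach
$headings = [];
foreach ($h2s as $h2) {
    $headings[] = trim($h2->textContent);
}

echo implode("\n", $headings), "\n";
```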
Find elements with a particular class
In JavaScript, the querySelectorAll() method makes it easy to select any elements using a CSS selector. In PHP, it is not that straightforward. Instead, we have to use the DOMXPath class to query and traverse the DOM tree.
Just like getElementsByTagName(), the query() method of DOMXPath also returns a DOMNodeList. It takes an XPath expression as an argument. XPath expressions are versatile enough to express almost any type of query.
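One common way to match a class name with XPath 1.0 is the `contains(concat(...))` idiom shown below. This is a sketch rather than the exact expression used in this post; the class name `note` and the markup are hypothetical.

```php
<?php
// Hypothetical markup: two elements carry the class "note", one carries "footnote".
$html = '<html><body><p class="note">A</p><p class="footnote">B</p>'
      . '<div class="box note">C</div></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Pad the class attribute with spaces so "note" matches only as a whole word.
$expr = "//*[contains(concat(' ', normalize-space(@class), ' '), ' note ')]";
$nodes = $xpath->query($expr);

foreach ($nodes as $node) {
    echo $node->nodeName, ': ', $node->textContent, "\n";
}
```

A plain `contains(@class, 'note')` would also match `footnote`, which is why the padded form is usually preferred.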
Extract links from a page
Suppose I want to find all the external links to a particular website on a web page. In our sample page, I want to find all the outbound links from the blog post to the wordpress.org website. This is how I did it.
$links = $dom->getEle
$urls = [];
foreach($links as $li
$url = $link->get
$parsed_url = par
if( isset($parsed
$urls[] = $ur
}
}
var_dump($urls);
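For reference, here is a runnable sketch of the same approach on hypothetical markup. It assumes the filter compares the `host` part returned by `parse_url()` against `wordpress.org`; the links themselves are made up.

```php
<?php
// Hypothetical links; only two of them point to wordpress.org.
$html = '<html><body>'
      . '<a href="https://wordpress.org/plugins/">Plugins</a>'
      . '<a href="/about/">About</a>'
      . '<a href="https://example.com/">Example</a>'
      . '<a href="https://wordpress.org/themes/">Themes</a>'
      . '<a>No href</a>'
      . '</body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$urls = [];
foreach ($dom->getElementsByTagName('a') as $link) {
    $url = $link->getAttribute('href');   // empty string when the attribute is missing
    $parsed_url = parse_url($url);        // array of URL parts, or false if malformed
    if (is_array($parsed_url) && isset($parsed_url['host'])
        && $parsed_url['host'] === 'wordpress.org') {
        $urls[] = $url;
    }
}

var_dump($urls);
```

Relative links such as `/about/` have no `host` key, so they are skipped without extra checks.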
Modifying & Saving HTML
adding or deleting
elements and attributes.
Inserting new HTML element into the document
In this example, we will see how to add an image with a link after the first paragraph. This is how you insert banner ads between posts.
$ps = $dom->getElemen
$first_para = $ps->it
$html_to_add = '<div>
$dom_to_add = new DOM
@ $dom_to_add->loadHT
$new_element = $dom_t
$imported_element = $
$first_para->parentNo
$output = @ $dom->sav
echo $output;
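The steps above can be sketched end to end as follows. The banner markup, URL and file name are hypothetical, and `libxml_use_internal_errors()` stands in for the `@` operator to keep warnings quiet.

```php
<?php
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>First paragraph.</p><p>Second paragraph.</p></body></html>');

$ps = $dom->getElementsByTagName('p');
$first_para = $ps->item(0);

// Hypothetical banner: an image wrapped in a link.
$html_to_add = '<div class="banner"><a href="https://example.com/">'
             . '<img src="banner.png" alt="Banner"></a></div>';

// Parse the fragment in its own document, then pick out the new element.
$dom_to_add = new DOMDocument();
$dom_to_add->loadHTML($html_to_add);
$new_element = $dom_to_add->getElementsByTagName('div')->item(0);
libxml_clear_errors();

// A node must be imported before it can be inserted into another document.
$imported_element = $dom->importNode($new_element, true);   // true = deep copy

// Insert before the first paragraph's next sibling, i.e. right after it.
$first_para->parentNode->insertBefore($imported_element, $first_para->nextSibling);

$output = $dom->saveHTML();
echo $output;
```

When the first paragraph has no next sibling, `insertBefore()` receives null as its second argument and appends the element at the end of the parent.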
Deleting an element from the document
To delete an element from our HTML, we can use the removeChild() method, which DOMElement inherits from the DOMNode class. It is called on the parent of the node to be removed.
$html = '<p>This is o
<div class="del">Dele
<p>This is our second
<p>This is our third
<div class="del">Dele
foreach( $elems as $e
$elem->parentNode
}
echo '<br><br>-------
echo $dom->saveHTML()
Here we have performed an XPath query to find all the elements with the class del. Then we remove each node from the document by iterating over the DOMNodeList object with a foreach loop.
-------after deletio
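A complete, runnable sketch of the deletion might look like this. The paragraph text is hypothetical, and copying the node list into an array before removing anything is a defensive choice in this sketch, not something the original snippet is known to do.

```php
<?php
// Hypothetical markup: two elements with the class "del" to be removed.
$html = '<html><body><p>Keep one</p><div class="del">Remove me</div>'
      . '<p>Keep two</p><div class="del">Remove me too</div></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$elems = $xpath->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' del ')]");

// Copy the list first so removals cannot disturb the iteration.
foreach (iterator_to_array($elems) as $elem) {
    $elem->parentNode->removeChild($elem);
}

echo $dom->saveHTML();
```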
Manipulating Attributes
getAttribute($attribute_name) – gets the value of an attribute
setAttribute($attribute_name, $attribute_value) – sets the value of an attribute
hasAttribute($attribute_name) – checks whether an element has a certain attribute and returns true or false
if( $elem->hasAttribu
echo 'attribute v
$elem->setAttribu
echo '<br>updated
}
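Here is a small sketch that exercises all three methods on a hypothetical link element. The id `link` and both URLs are made up for the example.

```php
<?php
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML('<html><body><a id="link" href="http://example.com/">Example</a></body></html>');
libxml_clear_errors();

$elem = $dom->getElementById('link');

if ($elem->hasAttribute('href')) {
    echo 'attribute value: ', $elem->getAttribute('href'), "\n";
    $elem->setAttribute('href', 'https://example.com/');   // overwrite the existing value
    echo 'updated value: ', $elem->getAttribute('href'), "\n";
}

// getAttribute() returns an empty string for attributes that do not exist.
var_dump($elem->getAttribute('title'));
```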
Conclusion
About the author
Abhinav R (Vishnu) is a blogger with a keen interest in learning web trends and exploring the world of WordPress. Apart from that, he also has a passion for nature photography and travel.
https://www.coralnodes.com/parsing-html-in-php/
30.09.2019 Parsing HTML in PHP using Native Classes | CoralNodes