In this article, we will see how to find and extract all headings from a web page in PHP, explained step by step.
This PHP tutorial shows how to fetch the HTML content of a web page by URL, extract all of its headings, and list them. To do this, we will use PHP’s DOMDocument class.
Header tags, also known as heading tags, are used to separate headings and subheadings on a webpage. They rank in order of importance, from H1 to H6, with H1s usually being the title. Header tags improve the readability and SEO of a webpage.
HTML provides six heading tags, from H1 to H6, like this:
<h1>...</h1> <h2>...</h2> ... <h6>...</h6>
PHP’s DOMDocument class is often referred to as the PHP DOM parser. We will go step by step through finding and extracting all headings from an HTML document using this parser.
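Before working with a live URL, it can help to try the parser on a small inline HTML string. The following is a minimal sketch, not the article's final script: the helper name extractHeadingText() and the sample markup are assumptions made for illustration. It uses libxml_use_internal_errors() to keep parser warnings from being printed.

```php
<?php
// Hypothetical helper: returns the trimmed text of every <$tagName> element in $html.
function extractHeadingText(string $html, string $tagName): array
{
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($dom->getElementsByTagName($tagName) as $node) {
        $texts[] = trim($node->textContent);
    }

    return $texts;
}

// Sample markup, assumed for illustration only.
$sampleHtml = '<html><body><h1>Site Title</h1><h2> First </h2><p>Text</p><h2>Second</h2></body></html>';

$h2Texts = extractHeadingText($sampleHtml, 'h2');
print_r($h2Texts);
```

Calling the same helper with 'h1' or 'h3' returns the headings of that level, or an empty array when the page has none.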
Let’s get started.
Find & Extract All Headings From a Web Page
In this example, we will use a web page URL to fetch the page and extract all of its headings.
Create a file named index.php inside your application.
Open index.php and add the following code to it.
<?php
// Fetch the HTML content of the web page.
$htmlString = file_get_contents('https://onlinewebtutorblog.com/');

// Stop early if the page could not be fetched.
if ($htmlString === false) {
    exit('Unable to fetch the web page.');
}

// Create a new DOMDocument object.
$htmlDom = new DOMDocument;

// Load the HTML string into our DOMDocument object.
// The @ operator suppresses warnings caused by malformed HTML.
@$htmlDom->loadHTML($htmlString);

// Array to store H1 to H6 headings, keyed by tag name.
$headingsArray = [];

foreach (['h1', 'h2', 'h3', 'h4', 'h5', 'h6'] as $tagName) {
    $headingsArray[$tagName] = [];

    // Extract all elements / tags of this heading level from the HTML.
    $tags = $htmlDom->getElementsByTagName($tagName);

    foreach ($tags as $tag) {
        // Get the trimmed node value (text) of the heading tag.
        $headingsArray[$tagName][] = trim($tag->nodeValue);
    }
}

echo "<pre>";
print_r($headingsArray);
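file_get_contents() returns false when a request fails, and by default it can wait a long time on a slow server. The sketch below is one way to make the fetch step more defensive. The helper name fetchHtml(), the 10-second timeout, and the user agent string are assumptions chosen for illustration, not part of the script above.

```php
<?php
// Hypothetical helper: returns the content at $source as a string, or null on failure.
function fetchHtml(string $source, int $timeoutSeconds = 10): ?string
{
    // These HTTP options apply to http(s) URLs; they are ignored for local paths.
    $context = stream_context_create([
        'http' => [
            'timeout'    => $timeoutSeconds,
            'user_agent' => 'HeadingExtractor/1.0', // assumed value
        ],
    ]);

    // @ hides the warning; failure is reported through the false return value.
    $html = @file_get_contents($source, false, $context);

    return $html === false ? null : $html;
}

// Example usage (performs a network request when uncommented):
// $htmlString = fetchHtml('https://onlinewebtutorblog.com/');
```

A null result can then be handled before the HTML is passed to DOMDocument, for example by printing a short message and stopping the script.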
Output
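When we run index.php, it prints an array with one key per heading level (h1 to h6), each holding the text of the headings found at that level.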
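Grouping by level loses the order in which headings appear on the page. If that order matters, for example to rebuild a page outline, one alternative is a single DOMXPath query. This is a hedged sketch of that variation rather than the method used above: the helper name extractHeadingsInOrder() and the sample markup are assumptions.

```php
<?php
// Hypothetical helper: lists every h1-h6 heading in the order it appears in $html.
function extractHeadingsInOrder(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // One location path that matches any heading element, returned in document order.
    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query(
        '//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6]'
    );

    $headings = [];
    foreach ($nodes as $node) {
        $headings[] = [
            'level' => $node->nodeName,           // e.g. "h2"
            'text'  => trim($node->textContent),
        ];
    }

    return $headings;
}

// Sample markup, assumed for illustration only.
$outlineHtml = '<html><body><h1>Title</h1><h2>Intro</h2><h3>Detail</h3><h2>Outro</h2></body></html>';

$orderedHeadings = extractHeadingsInOrder($outlineHtml);
print_r($orderedHeadings);
```

Each entry keeps both the tag name and the text, so the result can still be regrouped by level later if needed.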
We hope this article helped you learn how to find and extract all headings from a web page in PHP.