HTML manipulation with PHP

While it would be nice if WordPress plugins always offered a way to override their template files, that’s not the reality.

This is especially frustrating when working with a library like Alpine.js, which relies on directives.

As an aside: a directive is just an HTML attribute that has a special meaning for the JS library, for example:

<div x-bind:class="{ '--open': isOpen }"></div>

Without Alpine.js, x-bind:class is a meaningless attribute that doesn't do anything.

DOMDocument and DOMXPath

When there is no direct way of modifying the HTML, I tend to reach for the DOMDocument class.

With DOMDocument you can parse HTML, remove parts of it, and add new elements or attributes.

If you have never used DOMDocument before, here's how it typically looks and works:

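// Convert multibyte characters to HTML entities so loadHTML() doesn't mangle UTF-8 text
// Note: the 'HTML-ENTITIES' target is deprecated as of PHP 8.2
$normalizedHtml = mb_convert_encoding($htmlContentThatComesFromSomewhere, 'HTML-ENTITIES', 'UTF-8');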
 
// New up the objects and pass the HTML that you have
$dom = new DOMDocument();
$dom->loadHTML($normalizedHtml);
 
$xPath = new DOMXPath($dom);
// Find all <sup> tags
$supTags = $xPath->query('//sup');
 
foreach ($supTags as $node) {
    // Get the text content of the <sup> tag
    $nodeContent = $node->nodeValue;

    // Create a new <a> tag
    $anchor = $dom->createElement('a');
    $anchor->setAttribute('href', '#');
    // Add the directives, which are just attributes with certain values
    $anchor->setAttribute('x-data', '{}');
    // Single quotes inside the value so it serializes within a double-quoted attribute
    $anchor->setAttribute('x-on:click', '$dispatch(\'open-footnotes\')');
    // Set the contents of the <a> tag to be the same as the <sup> tag
    $anchor->nodeValue = $nodeContent;

    // Replace the <sup> tag content with the <a> tag
    $node->nodeValue = '';
    $node->appendChild($anchor);
}
 
// Save the modification
$modifiedHtml = $dom->saveHTML();

Ignoring the doctype and the <html> and <body> wrappers that loadHTML() adds by default, this turns HTML like this:

<p>Lorem ipsizzle<sup>1</sup> dolizzle sit break it down...</p>

into this:

<p>Lorem ipsizzle<sup><a href="#" x-data="{}" x-on:click="$dispatch('open-footnotes')">1</a></sup> dolizzle sit break it down...</p>
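
If you only want the fragment back, without the wrappers, libxml's parser flags can help. The following is a minimal sketch, not the only way to do it: the sample markup is made up for illustration, and it assumes the fragment has a single root element.

```php
<?php
// Sample fragment, made up for this sketch
$html = '<p>Lorem ipsizzle<sup>1</sup> dolizzle</p>';

// Collect libxml parse errors internally instead of raising PHP warnings
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
// LIBXML_HTML_NOIMPLIED: don't add implied <html>/<body> elements
// LIBXML_HTML_NODEFDTD: don't add a default doctype
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// Discard collected errors and restore the previous error handling mode
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xPath = new DOMXPath($dom);
foreach ($xPath->query('//sup') as $sup) {
    $sup->setAttribute('x-data', '{}');
}

// saveHTML() ends its output with a newline, hence the trim()
$fragment = trim($dom->saveHTML());
echo $fragment, PHP_EOL;
```

As far as I can tell, LIBXML_HTML_NOIMPLIED behaves best with a single root element. Fragments with several top-level elements may get restructured, so wrapping them in a container element first is a common workaround.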

By the way, there are many wrapper packages that offer the same functionality and more, with a nicer API.

It's also handy to know that tools for converting CSS selectors to XPath queries exist. This is because XPath queries quickly get more verbose than you expect. Selecting a plain HTML tag is straightforward, but selecting every element that has the has-blue-color class looks like this:

$xPath->query('.//*[contains(concat(" ",normalize-space(@class)," ")," has-blue-color ")]');
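
If you don't want to pull in a converter for a single class lookup, the expression can be hidden behind a small function. This is only a sketch: classToXPath() is a hypothetical helper, not part of PHP or any library, and it assumes a plain class name without quotes.

```php
<?php
// Hypothetical helper: builds an XPath expression that matches elements
// whose class attribute contains $class as a whole, space-separated token.
// Assumes $class is a plain class name with no quote characters.
function classToXPath(string $class, string $tag = '*'): string
{
    return sprintf(
        './/%s[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $tag,
        $class
    );
}

$dom = new DOMDocument();
$dom->loadHTML(
    '<p class="intro has-blue-color">A</p><p class="has-blue-color-dark">B</p>'
);

$xPath = new DOMXPath($dom);
$matches = $xPath->query(classToXPath('has-blue-color'));

// Only the first paragraph matches; "has-blue-color-dark" is a different token
foreach ($matches as $node) {
    echo $node->nodeValue, PHP_EOL;
}
```

Padding both the attribute value and the class name with spaces is what keeps has-blue-color from also matching has-blue-color-dark.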