Parsing HTML with regular expressions is not ideal. A better alternative is to use DOMDocument and DOMXPath.
The code below uses DOMDocument and DOMXPath to parse and modify the HTML without relying on regular expressions. Each link in the content is given an onclick attribute containing the appropriate window.open() call, built from the URL in the link's href attribute. The href attribute is then removed from the link.
add_filter( 'the_content', 'wpse_cordova_links', 10, 1 );
function wpse_cordova_links( $content ) {
    // Create an instance of DOMDocument.
    $dom = new \DOMDocument();

    // Suppress errors due to malformed HTML.
    // See http://stackoverflow.com/a/17559716/3059883
    $libxml_previous_state = libxml_use_internal_errors( true );

    // Populate $dom with $content, making sure to handle UTF-8, otherwise
    // problems will occur with UTF-8 characters.
    // Note: the 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2.
    $dom->loadHTML( mb_convert_encoding( $content, 'HTML-ENTITIES', 'UTF-8' ) );

    // Restore previous state of libxml_use_internal_errors() now that we're done.
    // Again, see http://stackoverflow.com/a/17559716/3059883
    libxml_use_internal_errors( $libxml_previous_state );

    // Create an instance of DOMXPath.
    $xpath = new \DOMXPath( $dom );

    // Query all links within our content.
    $links = $xpath->query( '//a' );

    // Iterate over the $links.
    foreach ( $links as $link ) {
        // Only touch links that have an href; otherwise there is no URL to open.
        if ( $link->hasAttribute( 'href' ) ) {
            // Get the value of the href attribute.
            $link_href = $link->getAttribute( 'href' );

            // Set the onclick attribute; setAttribute() creates it if it does not exist.
            $link->setAttribute( 'onclick', "window.open( '" . $link_href . "', '_blank', 'location=no' );" );

            // Remove the href attribute.
            $link->removeAttribute( 'href' );
        }
    }

    // Save the updated HTML.
    $content = $dom->saveHTML();
    return $content;
}
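One thing to watch for: called with no argument, saveHTML() serializes the whole document, including the doctype and the html/body wrapper that loadHTML() adds around a fragment. The following is a minimal sketch of one way to return only the original fragment. The helper name wpse_cordova_body_html is hypothetical and not part of the code above.

```php
<?php
// Hypothetical helper (an assumption, not part of the original answer):
// serialize only the children of <body>, leaving out the doctype and the
// <html>/<body> wrapper that loadHTML() adds around a fragment.
function wpse_cordova_body_html( \DOMDocument $dom ) {
    $body = $dom->getElementsByTagName( 'body' )->item( 0 );
    if ( null === $body ) {
        return '';
    }

    $html = '';
    foreach ( $body->childNodes as $child ) {
        // saveHTML( $node ) serializes just that node and its descendants.
        $html .= $dom->saveHTML( $child );
    }
    return $html;
}

// Usage with a small fragment.
$dom = new \DOMDocument();
$dom->loadHTML( '<p><a onclick="x">click</a></p><p>More text</p>' );
echo wpse_cordova_body_html( $dom );
```

In the filter above, this would stand in for the final `$dom->saveHTML()` call, for example `$content = wpse_cordova_body_html( $dom );`.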
Example HTML before processing
<p><a href="http://example.com/">click</a></p>
<p>Lorem ipsum dolor sit amet, ecclesiam mittam est amet constanter approximavit te. Introivit gubernum defunctam vivum eum ego esse ait mea Christianis<br>
<a class="test-class test-other-class" href="http://example.com/1/">click me too</a> aedificatur ergo accipiet duxit ad te. Ascendi in modo invenit ubi diu requievit agi coepit. Apollonii appropinquat tation ulterius quod ait mea Christianis aedificatur ergo accipiet si mihi esse deprecor cum. Equidem deceptum in fuerat eum est in, quoque sed quod ait est in fuerat.</p>
<p><a data-test="55" href="http://example.com/2/">click me as well</a></p>
<p>Lorem ipsum dolor sit amet, ecclesiam mittam est amet constanter approximavit te. Introivit gubernum defunctam vivum eum ego esse ait mea Christianis aedificatur ergo accipiet duxit ad te. Ascendi in modo invenit ubi diu requievit agi coepit. Apollonii appropinquat tation ulterius quod ait mea Christianis aedificatur ergo accipiet si mihi esse deprecor cum. Equidem deceptum<br>
<a id="test-id" class="test-class" href="http://example.com/3/">clicky</a> in fuerat eum est in, quoque sed quod ait est in <a>This link has no href attribute</a> fuerat.</p>
Example HTML after processing
<p><a onclick="window.open( 'http://example.com/', '_blank', 'location=no' );">click</a></p>
<p>Lorem ipsum dolor sit amet, ecclesiam mittam est amet constanter approximavit te. Introivit gubernum defunctam vivum eum ego esse ait mea Christianis<br>
<a class="test-class test-other-class" onclick="window.open( 'http://example.com/1/', '_blank', 'location=no' );">click me too</a> aedificatur ergo accipiet duxit ad te. Ascendi in modo invenit ubi diu requievit agi coepit. Apollonii appropinquat tation ulterius quod ait mea Christianis aedificatur ergo accipiet si mihi esse deprecor cum. Equidem deceptum in fuerat eum est in, quoque sed quod ait est in fuerat.</p>
<p><a data-test="55" onclick="window.open( 'http://example.com/2/', '_blank', 'location=no' );">click me as well</a></p>
<p>Lorem ipsum dolor sit amet, ecclesiam mittam est amet constanter approximavit te. Introivit gubernum defunctam vivum eum ego esse ait mea Christianis aedificatur ergo accipiet duxit ad te. Ascendi in modo invenit ubi diu requievit agi coepit. Apollonii appropinquat tation ulterius quod ait mea Christianis aedificatur ergo accipiet si mihi esse deprecor cum. Equidem deceptum<br>
<a id="test-id" class="test-class" onclick="window.open( 'http://example.com/3/', '_blank', 'location=no' );">clicky</a> in fuerat eum est in, quoque sed quod ait est in <a>This link has no href attribute</a> fuerat.</p>
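The sample URLs above contain no quotes or ampersands. Because the filter concatenates the href value straight into a single-quoted JavaScript string, a URL containing an apostrophe would break the generated window.open() call. Below is a hedged sketch of one way to guard against that, using json_encode() to produce the JavaScript string literal. The helper name wpse_cordova_onclick_value is hypothetical, and the sample URL is made up for illustration.

```php
<?php
// Hypothetical helper (an assumption, not part of the original answer):
// build the onclick value with the URL encoded as a JavaScript string literal.
function wpse_cordova_onclick_value( $url ) {
    // json_encode() yields a double-quoted JS string. The JSON_HEX_* flags
    // turn quotes, ampersands, and angle brackets inside the URL into \uXXXX
    // escapes, so they cannot terminate the string or the attribute.
    $js_url = json_encode(
        $url,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_UNESCAPED_SLASHES
    );

    return "window.open( " . $js_url . ", '_blank', 'location=no' );";
}

// Usage: a link whose URL contains both an ampersand and an apostrophe.
$dom = new \DOMDocument();
$dom->loadHTML( '<p><a href="http://example.com/?a=1&amp;b=it\'s">click</a></p>' );

$link = $dom->getElementsByTagName( 'a' )->item( 0 );
// setAttribute() takes care of HTML-escaping the value on serialization.
$link->setAttribute( 'onclick', wpse_cordova_onclick_value( $link->getAttribute( 'href' ) ) );
$link->removeAttribute( 'href' );

echo $dom->saveHTML( $link );
```

The design choice here is to let json_encode() handle JavaScript escaping and let the DOM serializer handle HTML attribute escaping, rather than escaping by hand in either layer.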