Update to email obfuscator

I have done a slightly novel WordPress hack to get my Graceful Email Obfuscation plugin to work properly. WordPress does not quite have enough hooks to do what I wanted, mainly because themes vary too much from one to the next. Ideally, I would like to filter the main content block of the page and replace it with my plugin’s text when it is needed (try turning off JavaScript in your browser and clicking on my email address in the sidebar on the left to get an example of this in action).

To work reliably across themes, I kludged together a system which parses the page as a DOM, after some regular expressions have neatened it up, because there is no way in WP to grab the entire page. It then follows a few heuristics, looking for a div with particular ids or roles (id="content", role="main", then id="main"), to guess which part of the page is the main content block. The tricky technicalities are in parsing and outputting the page correctly so that it conforms to XML/SGML/HTML5 rules. In the end, I gave up on doing it perfectly, with the result that strictly compliant user agents will notice that the pages they are served are not valid, but I have managed to restrict the changes to the DOM structure to ones which will not change the displayed page.

The relevant snippets are:

<?php
add_action('init','geo_hijackemailload');

function geo_hijackemailload() {
  if (isset($_GET['geo-address'])) {
    add_action('wp_head', 'geo_buffer_start');
    add_action('wp_footer', 'geo_buffer_end');
  }
}

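//Buffer everything output between wp_head
//and wp_footer, and pass it through
//geo_content_replace when it is flushed
function geo_buffer_start() {
  ob_start('geo_content_replace');
}

function geo_buffer_end() { ob_end_flush(); }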

function geo_content_replace($content) {
  /*
   * Stage 1: Generate the text to return
   */
  //--Snip--
  $insert = "<div>Text to replace with</div>";


  /*
   * Stage 2: Work out where to put the text.
   *   Scan for </head>, grab everything
   *   after it; <body> is closed
   *   automatically
   */
  $split = explode('</head>',$content);
  $content = $split[count($split) - 1];

  $return = '';
  if (count($split) > 1) $return .= $split[0];
  $return .= '</head>';

  $doc = new DOMDocument();
  //Very grunky, but only way to get PHP DOM
  //to read in UTF-8
  $html_meta = '<html><head>
    <meta http-equiv="content-type"
        content="text/html; charset=utf-8">
    </head>';
  $content = "$html_meta$content</html>";
  if ($doc->loadHTML($content)) {
    $xpath = new DOMXPath($doc);
    $el = $xpath->query(
      '//div[@id=\'content\'] |
       //div[@role=\'main\']'
    );
    if ($el->length == 0) {
      $el = $xpath->query(
        '//div[@id=\'main\']'
      );
    }

    if ($el->length != 0) {
      $el = $el->item(0);
      //We are all set to go: use DOM
      //methods to put our text in place

      //Remove children
      while($el->hasChildNodes()) {
        $el->removeChild(
          $el->childNodes->item(0)
        );
      }

      $geo_rep = new DOMDocument();
      $geo_rep->loadHTML(
        "$html_meta
          <body>$insert</body>
        </html>"
      );
      //Have to get rid of the <html> and
      //<body> levels inserted by loadHTML;
      //look <body> up by name rather than by
      //position, which depends on whitespace
      $imported = $doc->importNode(
        $geo_rep->getElementsByTagName('body')
                ->item(0)->firstChild,
        true
      );
      $el->appendChild($imported);

      $insert = $doc->saveXML(
        $doc->documentElement,
        LIBXML_NOEMPTYTAG
      );
      $insert = explode('</body>', $insert);
      $insert = $insert[0];
      $insert = explode('</head>', $insert);
      $insert = $insert[1];
    } else {
      $return .= '<body>';
    }
  } else {
    $return .= '<body>';
  }

  $return .= $insert;

  return "<!--MARKER-->
             $return
          <!--MARKER-->";
}
?>
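
The content-block heuristic is the part most likely to need tweaking for a given theme, so it can be handy to try it outside WordPress. Below is a minimal sketch of that, not part of the plugin itself: the helper name geo_find_content_block and the sample markup are made up for illustration, and the queries are simply the ones used in the snippet above, tried in the same order.

```php
<?php
// Hypothetical helper (not in the plugin): returns the first element
// matched by the heuristics above, or null if none of them match.
function geo_find_content_block(DOMDocument $doc) {
  $xpath = new DOMXPath($doc);
  $queries = array(
    "//div[@id='content'] | //div[@role='main']",
    "//div[@id='main']",
  );
  foreach ($queries as $query) {
    $found = $xpath->query($query);
    if ($found !== false && $found->length > 0) {
      return $found->item(0);
    }
  }
  return null;
}

// Made-up sample markup standing in for a theme's output.
// The meta tag is the same trick as above to get UTF-8 read correctly.
$sample = '<html><head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8">
  </head><body>
  <div id="sidebar">Sidebar</div>
  <div id="main"><p>Original post text</p></div>
  </body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($sample);

$el = geo_find_content_block($doc);
echo $el ? $el->getAttribute('id') : 'no match';
```

Swapping the sample string for the saved source of a real page is a quick way to check whether a theme will be picked up, or whether another id needs adding to the list of queries.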