Setting custom search engine indexing for a “dynamic WordPress page” with htaccess

You’ll need the Header set ... directive, but set conditionally based on the URL. One way of doing this is to use mod_rewrite to set an environment variable (e.g. ROBOTS_INDEX) when the request matches one of the URLs you want indexed, and then use the env= argument of the Header directive to set the X-Robots-Tag header only when this env var is not set.

I found it easier to express the logic in this manner, rather than checking for the URLs that you don’t want indexed, setting the opposite env var (e.g. ROBOTS_NOINDEX) and then setting the response header when that var is set. Although that approach may be worth researching some more.

You’ll need to use mod_rewrite, as opposed to mod_setenvif, to set the env var, since you need to examine the query string portion of the URL. (The SetEnvIf directive only allows you to examine the URL-path portion of the URL.)
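
To illustrate the limitation, here is a minimal sketch of what mod_setenvif can see. The variable name ON_EXAMPLE_PAGE is hypothetical, and the pattern assumes the site is served from the document root. This is not part of the solution; it only shows that the query string is invisible to this test.

    # Hypothetical illustration only (ON_EXAMPLE_PAGE is a made-up name).
    # Request_URI holds the URL-path without the query string, so this sets
    # the variable for /example-page, /example-page?para_a=1&para_b=1 and
    # /example-page?para_a=1&para_b=2 alike.
    SetEnvIf Request_URI "^/example-page$" ON_EXAMPLE_PAGE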

The complication is that these parameters can be in any order, and there can be additional, unrelated parameters that need to be ignored. Also, the parameter values cannot be mixed, i.e. para_a=1&para_b=2 is presumably a “noindex” situation.

  1. Set the env var ROBOTS_INDEX when the URLs you want indexed are requested. Note that these mod_rewrite directives must go before the WordPress front-controller, i.e. before the # BEGIN WordPress section.

    # INDEXABLE: Any request that does not include a query string
    # Includes /example-page (no query string at all)
    RewriteCond %{QUERY_STRING} ^$
    RewriteRule ^ - [E=ROBOTS_INDEX:1]
    
    # INDEXABLE: /example-page?para_a=1&para_b=1 (parameters in any order)
    RewriteCond %{QUERY_STRING} (^|&)para_a=1($|&)
    RewriteCond %{QUERY_STRING} (^|&)para_b=1($|&)
    RewriteRule ^example-page$ - [E=ROBOTS_INDEX:1]
    
    # INDEXABLE: /example-page?para_a=2&para_b=2 (parameters in any order)
    RewriteCond %{QUERY_STRING} (^|&)para_a=2($|&)
    RewriteCond %{QUERY_STRING} (^|&)para_b=2($|&)
    RewriteRule ^example-page$ - [E=ROBOTS_INDEX:1]
    
  2. Conditionally set the X-Robots-Tag header when the env var is not set. Note the ! negation prefix on the env var.

    Header set X-Robots-Tag "noindex" env=!ROBOTS_INDEX
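
One thing to check, as a hedged aside: in a .htaccess context, WordPress’s internal rewrite to /index.php makes mod_rewrite process the rules a second time, and env vars set on the first pass may then be renamed with a REDIRECT_ prefix (i.e. ROBOTS_INDEX becomes REDIRECT_ROBOTS_INDEX). If the noindex header shows up on URLs that should be indexable, a possible workaround, assuming that is the cause, is to restore the variable with an extra rule placed alongside the rules from step 1 (still before the # BEGIN WordPress section):

    # Assumption: these rules live in .htaccess and the request has been
    # internally rewritten to /index.php by the WordPress front-controller.
    # If the first pass set ROBOTS_INDEX, it may now only be visible as
    # REDIRECT_ROBOTS_INDEX, so set ROBOTS_INDEX again on the second pass.
    RewriteCond %{ENV:REDIRECT_ROBOTS_INDEX} .
    RewriteRule ^ - [E=ROBOTS_INDEX:1]

Checking the response headers for each of the URL variations (no query string, matching values, mixed values) should confirm whether this extra rule is needed on your server.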
    

However, I do feel there is probably a better “WordPress” way of doing this, without using .htaccess.