dynamic page not displaying correctly when Varnish hosting ignores query string parameters

Caching content with query string parameters

By default, Varnish hashes the URL, together with the Host header, and uses the result as its cache key. When a single character in the URL changes, the cache key changes, which results in a cache miss.

Query string parameters are especially prone to this: omitting a parameter, adding a parameter, or changing the order of parameters all produce a different URL and therefore a cache miss.
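
To see why every change to the URL matters, it helps to look at how the cache key is computed. The built-in vcl_hash logic is roughly the following. This is a sketch of the built-in VCL as shipped with Varnish 4.x through 6.x; other releases may organize it differently, so check the built-in VCL of your own version:

sub vcl_hash {
    # The full URL, including the query string, goes into the hash
    hash_data(req.url);

    # The Host header is added, or the server IP when there is no Host header
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}

Because req.url is hashed as a plain string, /?a=1&b=2 and /?b=2&a=1 end up as two separate cache entries unless the URL is normalized in vcl_recv first.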

Adding the following VCL snippet ensures that query string parameters are sorted alphabetically, so that URLs that differ only in parameter order share a single cache key. This should increase your hit rate:

import std;

sub vcl_recv {
    set req.url = std.querysort(req.url);
}

Dynamic logo

VCL gives you the flexibility to decide how requests are handled at the edge. Even when a response is already cached, or the origin server returns a certain value, you can still change what the client sees by writing some VCL code.

You could, in fact, capture requests for the logo and internally re-route them to a different URL. You could even inspect the query string parameters to compose the URL of the dynamic logo.

The VCL code you need to make this happen depends on a lot of factors. It’s up to you to describe these rules and the required logic.
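
Purely as an illustration of the idea, here is a minimal sketch. Every name in it is an assumption, because the real rules aren't known: the logo is assumed to be requested as /logo.png, a hypothetical theme query string parameter selects the variant, and the variants are assumed to live under /logos/ on the origin:

sub vcl_recv {
    # Only handle requests for the (hypothetical) logo URL
    if (req.url ~ "^/logo\.png(\?|$)") {
        if (req.url ~ "[?&]theme=[a-z]+(&|$)") {
            # Compose the internal URL from the value of the (hypothetical)
            # theme parameter, e.g. /logo.png?theme=dark becomes /logos/dark.png
            set req.url = "/logos/" + regsub(req.url, "^.*[?&]theme=([a-z]+)(&.*)?$", "\1") + ".png";
        } else {
            # No usable theme parameter: fall back to an assumed default logo
            set req.url = "/logos/default.png";
        }
    }
}

The client keeps requesting /logo.png, while Varnish looks up and fetches the rewritten URL. Restricting the value to lowercase letters keeps arbitrary input out of the composed path.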

Cleaning up UTM parameters

Query string parameters that are used by JavaScript libraries for tracking can be stripped off in VCL. The origin server doesn’t need them to render the page, and keeping them in the URL only causes more cache misses.

Here’s some VCL to clean up your URL:

sub vcl_recv {
  # Remove tracking parameters
  if (req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=") {
    # Strip tracking parameters that follow an "&", including their value up to the next "&" or "#"
    set req.url = regsuball(req.url, "&(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=[^&#]*", "");
    # Strip a tracking parameter that directly follows the "?", but keep the "?" itself
    set req.url = regsuball(req.url, "\?(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=[^&#]*", "?");
    # Tidy up a leftover "?&" or a trailing "?"
    set req.url = regsub(req.url, "\?&", "?");
    set req.url = regsub(req.url, "\?$", "");
  }

  # Strip HTML anchors
  if (req.url ~ "\#") {
    set req.url = regsub(req.url, "\#.*$", "");
  }

  # Strip a trailing ? if it exists
  if (req.url ~ "\?$") {
    set req.url = regsub(req.url, "\?$", "");
  }
}

  • First, known tracking query string parameters are stripped using regsuball() find-and-replace calls, and any leftover "?&" or trailing "?" is cleaned up with regsub()
  • Next, HTML anchors are removed from the URL
  • Finally, a trailing question mark is removed, because at that point it means no query string parameters are left
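
The order in which you normalize matters when you combine this cleanup with sorting: strip first, then sort. Here is a sketch of the two steps in a single vcl_recv. It uses a shortened parameter list to keep the example readable, and matching each value up to the next "&" or "#" is a choice made for this sketch:

import std;

sub vcl_recv {
    # Step 1: strip tracking parameters (shortened, illustrative list)
    if (req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|gclid)=") {
        set req.url = regsuball(req.url, "&(utm_source|utm_medium|utm_campaign|gclid)=[^&#]*", "");
        set req.url = regsuball(req.url, "\?(utm_source|utm_medium|utm_campaign|gclid)=[^&#]*", "?");
        set req.url = regsub(req.url, "\?&", "?");
        set req.url = regsub(req.url, "\?$", "");
    }

    # Step 2: sort whatever parameters are left
    set req.url = std.querysort(req.url);
}

With this in place, /?utm_source=news&b=2&a=1 and /?a=1&gclid=xyz&b=2 should both be looked up as /?a=1&b=2, so they share one cached object.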

Summary

VCL has the necessary syntax to tackle your problems. You’ll probably use the regsub() and regsuball() functions to remove the proverbial garbage from your URL to ensure a better hit rate.

You can also perform dynamic decision making using VCL, but you’ll have to describe the necessary logic and rules before we can talk about the VCL implementation.