Why is the loop not empty on some 404s?

This may be surprising, but there is nothing strange going on.

First of all, let’s clarify that in WordPress, whenever you visit a frontend URL, you trigger a query. Always.

That query is a standard WP_Query, just like the ones run via:

$query = new WP_Query( $args );

There is only one difference: the $args array is generated by WordPress in the WP::parse_request() method. That method looks at the URL and at the rewrite rules, and converts the URL into an array of query arguments.

But what happens when that method cannot do that because the URL is not valid? The query args are just an array like this:
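To see what that array looks like for a given URL, one option is to log it once parsing is done. This is a minimal sketch, assuming the code lives in a plugin or in the theme’s functions.php. The helper name wpse_describe_query_vars() is made up for illustration; the parse_request action and the $wp->query_vars property come from WordPress itself.

```php
<?php
// Hypothetical helper: turn the parsed query vars into a single log line.
function wpse_describe_query_vars( $wp ) {
    return 'query_vars: ' . json_encode( $wp->query_vars );
}

// Register the hook only when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
    // 'parse_request' fires at the end of WP::parse_request() and receives the WP object.
    add_action( 'parse_request', function ( $wp ) {
        error_log( wpse_describe_query_vars( $wp ) );
    } );
}
```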

array( 'error' => '404' );

(Source here and here).

So that array is passed to WP_Query.

Now try to do:

$query = new WP_Query( array( 'error' => '404' ) );
var_dump( $query->request );

Are you surprised that the query is exactly the one shown in the question? I’m not.

So,

  1. parse_request() builds an array with an 'error' key
  2. That array is passed to WP_Query, which just runs it
  3. handle_404(), which runs after the query, looks at the 'error' parameter and sets is_404() to true

So have_posts() and is_404() are not related. The problem is that WP_Query has no system to short-circuit the query when something goes wrong, so once the object is built and some args are passed to it, the query will run…
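The same sequence can be observed on a live site. This is a rough sketch, assuming the wp action (which fires after handle_404()) and a hypothetical helper named wpse_report_404_state(). It only logs the state of the main query and reads post_count rather than calling have_posts(), so the loop itself is left untouched.

```php
<?php
// Hypothetical helper: summarize the 404 flag, the number of posts and the SQL of a query object.
function wpse_report_404_state( $query ) {
    return sprintf(
        'is_404=%s post_count=%d request=%s',
        $query->is_404() ? 'yes' : 'no',
        $query->post_count,
        $query->request
    );
}

// Register the hook only when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
    // By the time 'wp' fires, the main query has run and handle_404() has been called.
    add_action( 'wp', function () {
        global $wp_query;
        error_log( wpse_report_404_state( $wp_query ) );
    } );
}
```

A log line with is_404=yes and a post_count above zero is the situation described in the title: the 404 flag is set, yet the loop is not empty.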

Edit:

There are 2 ways to overcome this problem:

  • Create a 404.php template; WordPress will load that on 404 URLs and there you don’t have to check for have_posts()
  • Force $wp_query to be empty on 404, something like:

    add_action( 'wp', function() {
        global $wp_query;
        if ( $wp_query->is_404() ) {
            $wp_query->init();
            $wp_query->is_404 = true; // init() resets the is_404 flag too, so set it again
        }
    } );
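A possible variation, assuming WordPress 4.6 or later where (as far as I know) the posts_pre_query filter is available: instead of emptying the query after it has run, return an empty result before the SQL is executed. The callback name wpse_skip_query_on_404() is hypothetical, and this is a sketch rather than a tested drop-in.

```php
<?php
// Hypothetical callback: hand back an empty result set for the main query on a 404 request.
function wpse_skip_query_on_404( $posts, $query ) {
    if ( $query->is_main_query() && '404' === (string) $query->get( 'error' ) ) {
        return array(); // A non-null value tells WP_Query to skip its own SQL.
    }
    return $posts; // Unchanged (null by default): WP_Query queries the database as usual.
}

// Register the filter only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'posts_pre_query', 'wpse_skip_query_on_404', 10, 2 );
}
```

The check reads the 'error' query var directly, so it does not depend on when the is_404 flag gets set.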
    
