What is the proper way to use pre_get_posts?

Update

From your comments, it seems you’re having a hard time understanding the WordPress main query and the is_main_query() method of the WP_Query class, so hopefully the following helps clear up your doubts:

  1. Please check codex.wordpress.org/Query_Overview to learn how WordPress determines what posts or pages to display on a page. The main query is set up in step 4, and there you’ll see “$wp_query is an object of class WP_Query”.

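    That variable, however, is only a reference to the actual object which holds the main query, and that object is stored in the global named wp_the_query, i.e.

    $GLOBALS['wp_the_query'] = new WP_Query();
    $GLOBALS['wp_query']     = $GLOBALS['wp_the_query']; // same object, not a separate one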
    
  2. So, the is_main_query method is used to check whether the query object – that’s passed as the 1st parameter to the callbacks/functions hooked on pre_get_posts – is the main query object or a custom/secondary query object.

    // WordPress runs the pre_get_posts hook like so, where the $this is an instance
    // of the WP_Query class:
    do_action_ref_array( 'pre_get_posts', array( &$this ) );
    
    // So a callback/function hooked on the pre_get_posts hook like so, will receive
    // the above $this as the first (and the only) parameter:
    function my_pre_get_posts_callback( $query ) {
        // do something here
    }
    
    // And inside of the above function, you can call the is_main_query METHOD if you
    // need to check whether the $query is for the main query or not:
    if ( $query->is_main_query() ) {
        // it's the main query object
    }
    
    // The above `if` block is equivalent to this, but do not do this and instead, use
    // the is_main_query METHOD when inside a filter or action hook callback which is
    // passed the WP_Query object:
    global $wp_the_query; // access the main query object which should NEVER be modified
    if ( $query === $wp_the_query ) {
        // it's the main query object
    }
    
  3. See also the global function named is_main_query, which calls the above class method on the global variable $wp_query (the one mentioned in the Codex article linked above) and thereby checks whether that variable still references the main query object.

    // The global is_main_query FUNCTION will always compare the global $wp_query object
    // with the global $wp_the_query object, i.e.
    // This function call:
    if ( is_main_query() ) {
        // it's the main query object
    }
    
    // is equivalent to, but this is just a demo, so do not do this:
    global $wp_query, $wp_the_query;
    if ( $wp_query === $wp_the_query ) {
        // it's the main query object
    }
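
To make the method/function distinction concrete, here is a minimal sketch of a pre_get_posts callback. The callback name wpse_log_query_type and the log messages are hypothetical. As far as I know, since WordPress 3.7 calling the global is_main_query() function inside a pre_get_posts callback also triggers a _doing_it_wrong() notice, because the function inspects the global $wp_query rather than the query that was passed in.

```php
// Hypothetical callback name, used for illustration only.
function wpse_log_query_type( $query ) {
    // Correct: ask the WP_Query instance that was passed to the callback.
    if ( $query->is_main_query() ) {
        error_log( 'pre_get_posts: main query' );
    } else {
        error_log( 'pre_get_posts: secondary query' );
    }

    // Avoid calling the global is_main_query() FUNCTION here: it checks the
    // global $wp_query, not the $query passed to this callback.
}
add_action( 'pre_get_posts', 'wpse_log_query_type' );
```

On a normal front-end page load, the first branch should run once for the main query, and the second branch once for each secondary WP_Query created by the theme or plugins.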
    

Remember, a global function is a function defined in the global scope and can be called anywhere in PHP, whereas a class method is a function defined inside a class.

<?php
/*
Plugin Name: Example Plugin
Version: 0.1
*/

class Example_Plugin {
    public function a_class_method() {
        // do something
    }
}

function a_global_function() {
    // do something
}



Original Answer

Is $query->is_main_query() needed?

From the pre_get_posts documentation:

Targeting the right query

Be aware of the queries you are changing when using the pre_get_posts action. Make use of conditional tags to target the right query. For example, it’s recommended to use the is_admin() conditional to not change queries in the admin screens. With the $query->is_main_query() conditional from the query object you can target the main query of a page request. The main query is used by the primary post loop that displays the main content for a post, page or archive. Without these conditionals you could unintentionally be changing the query for custom loops in sidebars, footers, or elsewhere.

So that should answer your question, but I’d like to add some additional details:

  • pre_get_posts runs for all WP_Query requests and runs pretty much everywhere in WordPress, e.g. the front-end/non-admin side of the site, the admin area (wp-admin), the REST API and feed requests, and many more, including favicon requests (at /favicon.ico)!

    So if you’re modifying the query arguments such as posts_per_page and/or post_type, then they will by default be modified for all queries including the main WordPress query which runs automatically on page load before the template is determined.

    Therefore, if for example post_type is set to post, then the admin pages for managing posts in other post types would always show posts of the post type instead!

    And thus, make sure to target the right query, in order to prevent issues like the above one from happening.
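
As a rough sketch of the problem described above, compare an unguarded callback with a guarded one. The function names are hypothetical, and restricting search results to the post type is only an assumed use case for illustration.

```php
// Problematic (not hooked here): if hooked on pre_get_posts, this would run
// for every WP_Query, so admin list tables, widgets and other secondary
// loops would all be forced to the "post" post type.
function wpse_force_post_type_everywhere( $query ) {
    $query->set( 'post_type', 'post' );
}

// Safer: only touch the main query of front-end search results pages.
function wpse_search_posts_only( $query ) {
    if ( is_admin() || ! $query->is_main_query() || ! $query->is_search() ) {
        return; // leave every other query untouched
    }

    $query->set( 'post_type', 'post' );
}
add_action( 'pre_get_posts', 'wpse_search_posts_only' );
```

Only the guarded function is hooked; the unguarded one is there just to show what not to register.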

If the aim is to only target the search page, I’m unclear as to why it couldn’t/wouldn’t be written differently.

As for that if ( $query->is_search() ), you can of course use that conditional, but remember that $query could be the main query (the global $wp_query object) or a custom instance of the WP_Query class, e.g. $query = new WP_Query( 's=foo' );. Hence you may or may not need to check for the main query, depending on your specific use case and what your code modifies.
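
To illustrate, here is a hedged sketch in which a secondary search query reaches the same callback as the main one. The $custom_search variable and the log messages are illustrative only.

```php
add_action( 'pre_get_posts', function ( $query ) {
    if ( ! $query->is_search() ) {
        return; // not a search query, nothing to do
    }

    if ( $query->is_main_query() ) {
        error_log( 'Search via the main query' );
    } else {
        error_log( 'Search via a secondary WP_Query' );
    }
} );

// A custom search query, e.g. in a template or widget. Creating it fires
// pre_get_posts too: is_search() is true, but is_main_query() is false.
$custom_search = new WP_Query( array( 's' => 'foo' ) );
```

So a callback that checks only is_search() would also modify $custom_search, which may or may not be what you want.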

As for your 1st example (Example 2 below), that can certainly be done. Either of the following works: posts_per_page is set to 5 only for the main query on search results pages, e.g. at https://example.com/?s=keyword:

  • Example 1: The if block here is used to exit the function if the current query is not the main query or is not a search query.

    add_action( 'pre_get_posts', function ( $query ) {
        if ( is_admin() || ! $query->is_search() || ! $query->is_main_query() ) {
            return; // do nothing and exit the function
        }
    
        $query->set( 'posts_per_page', 5 );
    } );
    
  • Example 2: The if block here is used to do something only if the current query is the main query and is a search query.

    add_action( 'pre_get_posts', function ( $query ) {
        if ( $query->is_search() && $query->is_main_query() && ! is_admin() ) {
            $query->set( 'posts_per_page', 5 );
        }
    } );
    

So the difference is only in how you write the logic; both examples behave the same.