Query posts distinct authors

With what you are asking, it is really difficult to come up with some performant easy way that is also reliable. As @birgire already stated, his solution is not reliable, but from tests, it seem to be the fastest clocking in at 2 db queries in about 0.015s average.

From a quick discussion between @birgire and me in comments to his answer, I’ve decided to try and do it with what WordPress offers. It is the most reliable way by far, but does come at some price, clocking in at 20 db queries in about 0.05s average.

The real big problem is that you cannot use any kind of cache due to the request for random results, so I had to look at other ways to cut down on time and db calls.

For the first section, I used this solution by @birgire (also mentioned by him in comments) where we make use of pre_user_query to to alter the SQL query of WP_User_Query to get only the amount of authors needed randomly. This section of the code is really fast. To increase performance, I have also set the WP_User_Query to only get the author ID’s as this is all we need. This section clock in at 2 db queries in 0.002s average.

To get a single post from each author, we need to do a single WP_Query for each, which amount to a total of 4 in your case. This is the rather expensive side of the function. Contributing factors that really slows the queries down is the fact that we need random order and we are doing a tax_query as well which uses join clauses.

The only way I could speed this section up was to also just get the post ID’s. Overall, this came in quicker than getting the complete posts and displaying them than just taking the post ID’s and running another WP_Query afterwards. I got 7 queries less running an extra instance of WP_Query

OK, enough talk, here is the function: (Can maybe use some fine tuning)

function wpse177162_random_author_posts($number = 4, $args = [])
{

    function my_pre_user_query( $q )
    {

        $limit = preg_replace( '/[^\d]/', '', $q->query_limit );

        $from   = 'WHERE 1=1';
        $to     = sprintf( 'WHERE RAND()<(SELECT ((%d/COUNT(*))*10) FROM %susers)', 
                            $limit, 
                            $GLOBALS['wpdb']->prefix 
                 );

        $q->query_where   = str_replace( $from, $to, $q->query_where );
        $q->query_orderby = ' ORDER BY RAND() ';

        // remove the hook    
        remove_action( current_filter() , __FUNCTION__ );
    }

    $user_query_args = [
        'who'    => 'authors',
        'fields' => 'ID',
        'number' => $number,
    ];

    add_action( 'pre_user_query', 'my_pre_user_query' );
    $user_query = new WP_User_Query($user_query_args);
    remove_action( 'pre_user_query', 'my_pre_user_query' );

    $users = $user_query->results;
    $post_ids="";
    if ($users) {

        foreach ($users as $user) {

            $user_args = [
                'author' => $user,
                'fields' => 'ids',
             'no_found_rows' => true
        ];
            $combined_args = wp_parse_args($args, $user_args);
            $q = new WP_Query($combined_args);

            $q_posts[] = $q->posts;

        }

        foreach ($q_posts as $q_post) {

            foreach ($q_post as $v ) {

                $post_ids[] = $v;

            }

        }

    }
    return (array) $post_ids;

}

Just a few notes on the function

  • The first parameter $number is the amount of authors to get

  • The second parameter is $args which is the same as the parameters used by WP_Query and is also directly fed into WP_Query as query arguments. You can use this in the same exact way, with one exception, don’t set the author parameters, this will break the function

To come to how it is used, you would then use it in your template like this

$author_posts = wpse177162_random_author_posts(4, array('posts_per_page' => 1, 'cat' => 1, 'orderby' => 'rand'));
$q = new WP_Query(array('post__in' => $author_posts));

As already said, the reason for the extra WP_Query is for performance, as doing it this way gave better numbers in testing

EDIT

On recommendation from @birgire, I have updated the code above with no_found_rows to save on db calls. I did get a slight performance enhancement, on test I saved 4 db calls although the time basically stayed the same.

update_post_meta_cache and update_post_term_cache actually doubled the time taken to complete the queries, and the queries stayed fixed at 20 throughout. So not a way to go 🙂

Leave a Comment