Sorting a list of posts displayed under a list of associated terms (which should be sorted without initial articles)

This is a really interesting question, which can result in a very expensive, resource intensive process which can really slow your page down dramatically

PRELUDE

All my code in this question is based on the following post type and taxonomy, so before you run any code, make sure to adjust the post type and taxonomy to your own requirements

  • Custom post type -> release

  • Custom taxonomy -> game

IMPORTANT UPDATE

Since you need to include all terms and the post belonging to these terms, and not just sort and include the first term assigned to a post, we need to completely look at another solution, but still keeping performance in mind.

The new solution would fit the solution by @birgire to your previous question perfectly as we would now sort the query itself, and not the returned posts.

My initial solution to the same question you’ve asked are therefor not needed anymore in order to solve this issue, so it is up to you whether or not you would still want to create and sort the loop with the hidden custom field. My solution using the custom field is still totally valid though. Because my original answer to this question involved my custom field solution, I will carry on using that, although I will try to show changes which you can make if you decide to use the @birgire’s solution rather than my custom field solution

PART ONE POINT ONE

Your first block of code where you query your terms and then run a custom query for each term to get the posts assigned to a term is extremely expensive. As you say, you have a large amount of terms. With any amount of terms, you are really hitting the db really really hard.

You can check this by doing echo get_num_queries() after your code. This will display just how much db queries you are actually doing on that specific page load. I would suggest you sit down before doing this 😉

(NOTE: First get a benchmark. Remove all your code, then run echo get_num_queries(), make a note, replace your code and run echo get_num_queries() again and subtract the two to get an accurate count)

Before we start anything, lets first look at a helper function which will handle the removal of our banned words from term and post names. This function is important and will be used everywhere to keep our code DRY and organised

/**
 * Function get_name_banned_removed()
 *
 * A helper function to handle removing banned words
 * 
 * @param string $tring  String to remove banned words from
 * @param array  $banned Array of banned words to remove
 * @return string $string
 */
function get_name_banned_removed( $string = '', $banned = [] )
{
    // Make sure we have a $string to handle
    if ( !$string )
        return $string;

    // Sanitize the string
    $string = filter_var( $string, FILTER_SANITIZE_STRING );

    // Make sure we have an array of banned words
    if (    !$banned
         || !is_array( $banned )
    )
        return $string; 

    // Make sure that all banned words is lowercase
    $banned = array_map( 'strtolower', $banned );

    // Trim the string and explode into an array, remove banned words and implode
    $text          = trim( $string );
    $text          = strtolower( $text );
    $text_exploded = explode( ' ', $text );

    if ( in_array( $text_exploded[0], $banned ) )
        unset( $text_exploded[0] );

    $text_as_string = implode( ' ', $text_exploded );

    return $string = $text_as_string;
}

This code should go into functions.php or inside a custom plugin (preferably)

Now that we have that covered, lets look at the next part

PART ONE POINT TWO

NOTE: If you are going to use @birgire’s method to remove banned words and sort by that modified post titles, then you can skip this part completely as we will be setting the hidden custom field which we will sort by)

Just again, in short, here is my solution from the link above, this code (which should go into a plugin or functions.php) and only needs to be run once to set a hidden custom field called _custom_sort_post_title to every post. This saves the post title with the leading banned word removed

(BIG NOTE: I have heavily edited version of the original code from my answer in the link)

add_action( 'wp', function ()
{
    add_filter( 'posts_fields', function ( $fields, \WP_Query $q ) 
    {
        global $wpdb;

        remove_filter( current_filter(), __FUNCTION__ );

        // Only target a query where the new custom_query parameter is set with a value of custom_meta_1
        if ( 'custom_meta_1' === $q->get( 'custom_query' ) ) {
            // Only get the ID and post title fields to reduce server load
            $fields = "$wpdb->posts.ID, $wpdb->posts.post_title";
        }

        return $fields;
    }, 10, 2);

    $args = [
        'post_type'        => 'release',       // Set according to needs
        'posts_per_page'   => -1,              // Set to execute smaller chucks per page load if necessary
        'suppress_filters' => false,           // Allow the posts_fields filter
        'custom_query'     => 'custom_meta_1', // New parameter to allow that our filter only target this query
        'meta_query'       => [
            [
                'key'      => '_custom_sort_post_title', // Make it a hidden custom field
                'compare'  => 'NOT EXISTS'
            ]
        ]
    ];
    $q = get_posts( $args );

    // Make sure we have posts before we continue, if not, bail
    if ( !$q ) 
        return;

    foreach ( $q as $p ) {
        $new_post_title = strtolower( $p->post_title );

        if ( function_exists( 'get_name_banned_removed' ) )
            $new_post_title = get_name_banned_removed( $new_post_title, ['the'] );

        // Set our custom field value
        add_post_meta( 
            $p->ID,                    // Post ID
            '_custom_sort_post_title', // Custom field name
            $new_post_title            // Custom field value
        );  
    } //endforeach $q
});

After this code is removed, the only code you will need in functions.php or a plugin will be the following: (NOTE: We will be adding some more work to this action in PART TWO)

add_action( 'transition_post_status', function ( $new_status, $old_status, $post )
{
    // Make sure we only run this for the release post type
    if ( 'release' !== $post->post_type )
        return;

    $text = strtolower( $post->post_title );   

    if ( function_exists( 'get_name_banned_removed' ) )
        $text = get_name_banned_removed( $text, ['the'] );

    // Set our custom field value
    update_post_meta( 
        $post->ID,                 // Post ID
        '_custom_sort_post_title', // Custom field name
        $text                      // Custom field value
    );
}, 10, 3 );

This code will make sure that each and every new post published, or every post edited, that the _custom_sort_post_title custom field is set accordingly.

PART TWO

This is the part that can really get expensive. Simply getting all the terms and looping through them and querying the posts accordingly is NOT an option here as it is a really expensive db operation.

What we will need to do is, we will only run one query here to get all our post ID’s, update the term cache, use get_object_term_cache() to retrieve the post terms, sort the array of ID’s according to the terms they belong to and then safe our results in a transient. We will only safe post ID’s in out transient as we do not want to cramp a huge amount of post data into a messy serialized array

We will rely a lot on the get_name_banned_removed() function to create term names and post titles with the banned words removed.

FEW NOTES

  • I have made this section’s code a bit more dynamic. Instead of hard-coding the post type and taxonomy, I have made them arguments you can pass to the function. You should read the code blocks that come with the code

THE CODE

/**
 * Function get_sorted_post_ids_terms_and_fields()
 *
 * Return a sorted array of post ids. These ID's are sorted according to
 * - Post title with the banned words removed before sorting
 * - Post terms with the banned words removed before sorting
 *
 * @param string $post_type Post type to get posts from Default 'release'
 * @param string $taxonomy  Taxonomy  to get posts from Default 'game'
 * @return array $ids
 */
function get_sorted_post_ids_terms_and_fields( $post_type="release", $taxonomy = 'game' )
{
    $array_combine = [];

    // Sanitize our post type and taxonomy names
    if ( 'release' !== $post_type )
        $post_type = filter_var( $post_type, FILTER_SANITIZE_STRING );

    if ( 'game' !== $taxonomy )
        $taxonomy = filter_var( $taxonomy, FILTER_SANITIZE_STRING );

    // Make sure that the taxonomy exist to avoid bugs later on
    if ( !taxonomy_exists( $taxonomy ) )
        return $array_combine;

    // Our taxonomy exist, let's continue
    // Create a unique transient name
    $transient_name="spbtaf_" . md5( $taxonomy.$post_type );    

    if ( false === ( $array_combine = get_transient ( $transient_name ) ) ) {

        // Set our query arguments. Note, we will not do any sorting here
        $args = [
            'fields'         => 'ids', // Only get post ID's
            'post_type'      => $post_type,
            'posts_per_page' => -1,
            'meta_key'       => '_custom_sort_post_title', // Remove if you use @birgire's solution
            'orderby'        => 'meta_value',               // Change to '_custom' if you use @birgire's solution
            'order'          => 'ASC',
            'tax_query'      => [
                [
                    'taxonomy' => $taxonomy,
                    'operator'  => 'EXISTS'
                ]
            ],     
        ];
        $ids = get_posts( $args );

        // Make sure we have posts
        if ( $ids ) {

            // Update the object term cache, saves plenty db time and calls
            update_object_term_cache( $ids, $post_type );

            $term_post_ids_array = [];
            $term_names          = [];

            // Loop through the posts and save in an array
            foreach ( $ids as $id ) {

                // Get the post terms from our post term cache
                $terms = get_object_term_cache( $id, $taxonomy );

                // Loop through the terms. We definitely have terms
                foreach ( $terms as $term ) {

                    // Remove the banned words from the term name
                    $term_name = strtolower( $term->name );

                    if ( function_exists( 'get_name_banned_removed' ) ) 
                        $term_name = get_name_banned_removed( $term_name, ['the'] );

                    // Save the term name and post ids in an array
                    $term_post_ids_array[$term_name][] = $id;

                    // Save the real term names in an array
                    $term_names[$term_name] = $term->name;

                } //endforeach $terms 
                unset( $term ); 

            } //endforeach $ids
            unset( $id );

            // Sort the array according to our modified term ids
            ksort( $term_post_ids_array );
            ksort( $term_names );

            // Lets replace the modified term names with their proper names
            $array_combine = array_combine( $term_names, $term_post_ids_array );

        } // endif $ids

        // Set the transient
        set_transient( $transient_name, $array_combine, 30*DAY_IN_SECONDS );    
    } // endif get_transient

    return $array_combine;
}

This should function should return a sorted array of post ID’s which is sorted according to

  • the term they belong to with the leading banned words (like the) removed

  • post title with the leading banned word removed

The array is also sorted according to term

The transient is set for 30 days, you can adjust this as necessary.

We need to remove and reset the transient when a new post is published, or when a post is updated, deleted or undeleted. For this we will use the transition_post_status action hook. (To keep everything neat and together, lets combine the action from PART ONE POINT TWO together with this action. I have marked a section which you can remove if you are sorting with @birgire’s custom filter)

add_action( 'transition_post_status', function ( $new_status, $old_status, $post )
{
    /* ----------START DELETE IF YOU USE @birgire's SORTING FILTER------------*/
    // Make sure we only run this for the release post type
    if ( 'release' !== $post->post_type )
        return;

    $text = strtolower( $post->post_title );   

    if ( function_exists( 'get_name_banned_removed' ) )
        $text = get_name_banned_removed( $text, ['the'] );

    // Set our custom field value
    update_post_meta( 
        $post->ID,                 // Post ID
        '_custom_sort_post_title', // Custom field name
        $text                      // Custom field value
    );
    /* -------------END DELETE IF YOU USE @birgire's SORTING FILTER------------*/

    global $wpdb;

    // Delete the transients
    $wpdb->query( "DELETE FROM $wpdb->options WHERE `option_name` LIKE ('_transient%_spbtaf_%')" );
    $wpdb->query( "DELETE FROM $wpdb->options WHERE `option_name` LIKE ('_transient_timeout%_spbtaf_%')" );

    // Reset the transient
    if ( function_exists( 'get_sorted_post_ids_terms_and_fields' ) )
        get_sorted_post_ids_terms_and_fields(); //REMEMBER TO SET POST TYPE AND TAXONOMY

}, 10, 3 );

As you can see, we reset the transient in the transition_post_status action, this also takes a lot of load away from the front end

PART THREE

The loop is a bit tricky. We have a multidimensional array of term names and post ID’s coming from the get_sorted_post_ids_terms_and_fields() function. We need to be clever here to keep our db calls etc as low as possible. The tricky part is to get the full post objects from their respective ID’s. Also, because posts belongs to multiple terms, we have duplicate ID’s as well.

THE PLAN

We will use a custom query to get all the posts. We have a slight issue here as WP_Query does not return duplicate posts. This is where the trick comes in, WP_Query adds the posts it returns into a cache. Once a post is in the cache, we can query it with get_post() over and over without making a db call. This is the clever part of all the code.

First, we also need a way to flatten the multidimensional array to grab all the post_ids before passing it to WP_Query

/**
 * Function flatten_array()
 *
 * Function to flatten an array and get all array values
 * Special thanks to zdenko
 * @link https://gist.github.com/kohnmd/11197713
 *
 * @param array  $array The multidimensional array to flatten
 * @return array $array The flattened array
 */
function flatten_array( $array ) 
{
    // Make sure $array is an array, if not return $array as an array
    if ( !is_array( $array ) )
        return [$array];

    return array_reduce( $array, function ( $a, $b ) {
            return array_merge( $a, flatten_array( $b ) ); 
        }, [] );
}

What we will do now is to query our posts from the flatten array of ID’s in order to store them in the post cache. For this we will use a custom SQL query fro the following reasons

  • It is superfast

  • We do not need any filters or action to alter this query

  • All the hard work was already done in the get_sorted_post_ids_terms_and_fields(). All we need to do now is to take the ID’s from that function and get the full post objects.

Once we have the full posts, we can use update_post_cache to add our posts in the post cache

/**
 * Function set_posts_to_cache()
 *
 * Function to query all the full post objects who's ID's is in the 
 * get_sorted_post_ids_terms_and_fields() function and then to add these
 * post objects into the post cache so we can query them over and over again 
 * with get_post()
 *
 * @param string $post_type Post type to get posts from Default 'release'
 * @param string $taxonomy  Taxonomy  to get posts from Default 'game'
 */
function set_posts_to_cache( $post_type="release", $taxonomy = 'game' )
{
    global $wpdb;

    // Check if the taxonomy exists
    if ( !taxonomy_exists( $taxonomy ) ) 
        return false;

    // Sanitize the taxonomy name
    $taxonomy = filter_var( $taxonomy, FILTER_SANITIZE_STRING );

    // Sanitize the post type
    if ( 'release' !== $post_type )
        $post_type = filter_var( $post_type, FILTER_SANITIZE_STRING );

    // Get our post ID's

    if ( function_exists( 'get_sorted_post_ids_terms_and_fields' ) ) {

        $combined_array = get_sorted_post_ids_terms_and_fields( $post_type, $taxonomy );

        if ( $combined_array ) {

            if ( function_exists( 'flatten_array' ) ) {

                // Flatten our array in order to pass it to WP_Query    
                $flatten_array = flatten_array( $combined_array );
                $unique_ids    = array_unique(  $flatten_array  );
                $string_ids    = implode( ', ', array_map( 'absint', $unique_ids ) );

                /**
                 * Run our custom SQL query and add our posts in cache
                 *
                 * We only need to get the posts by ID and post type. Remember, the function
                 * get_sorted_post_ids_terms_and_fields() has already taken care of all the hard 
                 * work. All this query needs to do is to retrieve the posts which ID's are returned 
                 * by get_sorted_post_ids_terms_and_fields() to add the posts in cache
                 */
                $posts_to_cache = $wpdb->get_results( 
                    $wpdb->prepare( 
                        "SELECT $wpdb->posts.*
                        FROM $wpdb->posts
                        WHERE 1=1
                        AND $wpdb->posts.ID IN ($string_ids)
                        AND $wpdb->posts.post_type = %s
                        ORDER BY $wpdb->posts.post_date DESC
                        LIMIT 0, %d",
                        $post_type,
                        count( $unique_ids )
                    )
                );  
                // Update the post cache
                update_post_caches( $posts_to_cache );
            } // endif function_exists( 'flatten_array' )
        } // endif if ( $combined_array )
    } // endif ( function_exists( 'get_sorted_post_ids_terms_and_fields' ) )
}

Lets look at the loop now

if ( function_exists( 'get_sorted_post_ids_terms_and_fields' ) ) {

    $combined_array = get_sorted_post_ids_terms_and_fields();

    // Make sure we have a valid array
    if ( $combined_array ) {

        if ( function_exists( 'set_posts_to_cache' ) ) {
            // Set all our posts into the post cache. remember to pass the correct post type and taxonomy
            set_posts_to_cache( 'release', 'game');

            // Set a variable to hold the first letter of the term name
            $first_letter="";
            foreach ( $combined_array as $term_name=>$post_ids ) {
                // Display the first letter from the terms
                $term_name_modified = strtolower( $term_name );
                if ( function_exists( 'get_name_banned_removed' ) ) 
                    $term_name_modified = get_name_banned_removed( $term_name_modified, ['the'] );

                $starting_letter = strtoupper( mb_substr( $term_name_modified, 0, 1 ) );

                if ( $first_letter !== $starting_letter )
                    echo '<p>' . $starting_letter . '</p>'; 

                // Update the $first_letter variable

                $first_letter = $starting_letter;
                // Display the term name above the posts
                echo $term_name . '</br>';

                // Apply the get_post() function to all post ids to get full posts objects
                $posts_array = array_map( 'get_post', $post_ids );

                // Now that we have full posts, lets display them in our loop
                foreach ( $posts_array as $post ) {
                    setup_postdata( $post );

                    // APPLY YOUR LOOP AS PER NORMAL AS PER YOUR LINK. 
                    echo '<li>' . get_the_title() . '</li>';

                } // endforeach $posts_array

                wp_reset_postdata(); // VERY VERY IMPORTANT
            } // endforeach $combined_array

        } // endif function_exists flatten_array

    } // endif $combined_array

} // endif function_exists get_sorted_post_ids_terms_and_fields

Your posts should now show as follow:

  • All posts are alphabetically sorted under the term they belong to according to post name with banned word removed in the initial sorting process

  • All terms are sorted alphabetically with the banned words removed from initial sorting. All terms have their posts sorted beneath them

  • All terms are sorted under the letter their name starts with with the banned words removed in initial sorting

You would just need to apply your own custom styling and mark up to the loop

TEST RESULTS – THE PROOF IS IN THE PUDDING, NJAMMIE!!!

After this huge marathon, lets crunch the numbers. On 24 posts, with all the huge amount of work done to strip and sort post titles and term names, I ended up with doing

  • 6db calls in about 0.16 seconds

  • 5db calls in about 0.14 seconds

and we did not abuse our transient. Pretty impressive for such a huge job, lol.

Just for interest sake, I copied the loop from your first block of code, and got the following result

  • 73 queries in 0.5 seconds

Huge difference, proofed my point 😉

PART UNDER CONSTRUCTION

If we want to page the query according to the first letter of the term, we would need some extra work here. This section will stay under construction until I get a proper solution to this.

Whatever solution I come up with will not affect parts 1 and 2 of my answer. Part three will definitely be affected and will need some kind of rewriting

Leave a Comment