WP REST API V2 – Retrieve sub page by full slug (URL/Path)

Unfortunately, this functionality is not supported out of the box. The problem is that most post types, including page, use the base WP_REST_Posts_Controller, which maps the slug parameter to the post_name__in WP_Query argument, and that argument cannot resolve hierarchical slugs. The pagename query variable can, but it accepts only one path per query, which might be why it’s not already leveraged for REST requests for hierarchical post types.

There are a number of solutions and work-arounds. Please note that the code below has not been thoroughly tested, and the JavaScript in particular neglects important authentication and error-handling practices – it is intended for illustrative purposes only.


Make Multiple Requests

Your client can work through the path one part at a time. Use the _fields parameter to minimize server load by requesting only the ID of each ancestor post, then pass that ID as the parent argument of the next request:

async function wpse261645_fetchPage( path ) {
  // Split the path into slugs, dropping empty parts left by leading or trailing slashes.
  const parts = path.split( '/' ).filter( Boolean );
  const uri = '/wp-json/wp/v2/pages';
  let parent_id;

  for( let i = 0; i < parts.length; i++ ) {
    const params = new URLSearchParams( { slug: parts[i] } );

    // Ancestor posts are only needed for their IDs.
    if( i < parts.length - 1 )
      params.append( '_fields', 'id' );

    if( parent_id )
      params.append( 'parent', parent_id );

    const res = await fetch(
      `${uri}?${params}`,
      {
        method: 'GET',
        headers: { 'Content-Type': 'application/json' }
      }
    ).then( res => res.json() );

    if( i === parts.length - 1 )
      return res;

    // Stop early if an ancestor could not be found.
    if( ! Array.isArray( res ) || ! res.length )
      return [];

    parent_id = res[0].id;
  }
}
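
As a rough sketch, the per-level query construction in the loop above can be pulled out into a small pure function, which makes the sequence of requests easier to inspect without touching the network. The helper name buildLevelQuery and the example path are hypothetical and not part of WordPress:

```javascript
// Hypothetical helper: builds the query string for one level of a nested path.
function buildLevelQuery( parts, index, parentId ) {
  const params = new URLSearchParams( { slug: parts[ index ] } );

  // Ancestor lookups only need the post ID.
  if ( index < parts.length - 1 ) {
    params.append( '_fields', 'id' );
  }

  // Constrain every level below the first to the previously resolved parent.
  if ( parentId ) {
    params.append( 'parent', String( parentId ) );
  }

  return params.toString();
}

// Example path and parent ID are made up for illustration.
const parts = 'about/team/jane'.split( '/' );
console.log( buildLevelQuery( parts, 0 ) );     // slug=about&_fields=id
console.log( buildLevelQuery( parts, 2, 42 ) ); // slug=jane&parent=42
```

Only the final request omits _fields, so only the target page is fetched in full.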

Modify slug REST Param/QV Mapping

This is probably the most appealing solution as it can effectively address the original issue directly without requiring special handling client-side. But it’s a bit convoluted and experimental due to the number of moving parts involved and my own lack of familiarity with the REST API – I’m totally open to suggestions and improvements!

Before the WP_REST_Posts_Controller executes a query to retrieve the items corresponding to the request, it runs the query args and the request object through the rest_{$this->post_type}_query filter. We can leverage this filter to selectively re-map slug as necessary for chosen post-types or controllers.

It’s also necessary to adjust the slug parameter’s schema, parsing, and sanitization routines for the relevant controllers so that they don’t strip /s or %2Fs from the slug value or trip validation errors. I don’t think this should create any compatibility issues, as the only change to the schema and parameter registration is swapping out the sanitization callbacks. The changes should be nearly invisible to clients and discovery, and wholly backwards-compatible with the original functionality (unless someone’s been relying on the REST API to transform /s in slugs into -s), but I haven’t tested it thoroughly.

Parsing, Schema, and Sanitization Adjustments (to keep the /s)

function wpse261645_sanitize_nested_slug( $slug ) {
  // Sanitize each path segment on its own so the "/" separators survive.
  $slug_parts = array_map( 'sanitize_title', explode( '/', $slug ) );

  return implode( '/', $slug_parts );
}

function wpse261645_parse_nested_slug_list( $slugs ) {
  $slugs = wp_parse_list( $slugs );

  return array_unique( array_map( 'wpse261645_sanitize_nested_slug', $slugs ) );
}

function wpse261645_nested_slug_schema( $schema ) {
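  // The filter receives the full item schema; field definitions live under 'properties'.
  $schema['properties']['slug']['arg_options']['sanitize_callback'] = 'wpse261645_sanitize_nested_slug';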

  return $schema;
}
add_filter( 'rest_page_item_schema', 'wpse261645_nested_slug_schema' );

function wpse261645_nested_slug_collection_params( $params ) {
  $params['slug']['sanitize_callback'] = 'wpse261645_parse_nested_slug_list';

  return $params;
}
add_filter( 'rest_page_collection_params', 'wpse261645_nested_slug_collection_params' );

Remapping the slug REST Param to Query Variables

Now that /s are persisted in the slug parameter, we can map the values into a variety of different queries:

function wpse261645_remap_slug_param_qv( $args, $request ) {
  $slugs = $request->get_param( 'slug' );

  // If the `slug` param was not even set, skip further processing.
  if( empty( $slugs ) )
    return $args;

  // Pull out hierarchical slugs into their own list.
  $nested_slugs = [];
  foreach( $slugs as $index => $slug ) {
    if( strpos( $slug, '/' ) !== false ) {
      $nested_slugs[] = $slug;
      unset( $slugs[ $index ] );
    }
  }

  if( count( $slugs ) ) {
    $args['post_name__in'] = $slugs;

    if( count( $nested_slugs ) ) {
      $args['wpse261645_compound_query'] = true;
      $args['post__in'] = array_map( 'url_to_postid', $nested_slugs );

      add_filter( 'posts_where', 'wpse261645_compound_query_where', 10, 2 );
    }
  }
  else {
    unset( $args['post_name__in'] );

    if( count( $nested_slugs ) === 1 )
      $args['pagename'] = $nested_slugs[0];
    elseif( count( $nested_slugs ) > 1 )
      $args['post__in'] = array_map( 'url_to_postid', $nested_slugs );
  }

  return $args;
}
add_filter( 'rest_page_query', 'wpse261645_remap_slug_param_qv', 10, 2 );

function wpse261645_compound_query_where( $where, $query ) {
  global $wpdb;

  if( ! isset( $query->query['wpse261645_compound_query'] ) )
    return $where;

  return preg_replace(
    "/ AND ({$wpdb->posts}.post_name IN \([^)]*\)) AND ({$wpdb->posts}.ID IN \([^)]*\))/",
    ' AND ($1 OR $2)',
    $where
  );
}

The logic above handles a number of different situations depending on the value of slug:

  • Any number of flat slugs are mapped into the post_name__in QV as per normal.
  • A single hierarchical slug will be mapped into the pagename QV, letting WP_Query natively handle the path resolution.
  • A list of hierarchical slugs will be resolved via url_to_postid() and mapped into the post__in QV. The lookups add additional overhead.
  • A list of intermixed hierarchical and flat slugs will be mapped into the post__in and post_name__in QVs respectively and the WHERE clause modified to OR these conditions instead of ANDing them. Hierarchical slugs will be resolved to IDs, adding additional overhead.

In summary, the most efficient queries are the product of passing slug as a single hierarchical slug, or any number of flat slugs in a list. Lists of hierarchical or intermixed slugs will result in additional overhead.
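
To make the four cases easier to reason about, here is a minimal model of the same branching as a pure function. It is only a sketch: describeSlugStrategy is a hypothetical name, it runs nothing against WordPress, and the returned strings are just labels for the query variables discussed above.

```javascript
// Hypothetical model of the branching in wpse261645_remap_slug_param_qv().
function describeSlugStrategy( slugs ) {
  const nested = slugs.filter( slug => slug.includes( '/' ) );
  const flat = slugs.filter( slug => ! slug.includes( '/' ) );

  if ( ! slugs.length ) return 'unchanged';                      // slug param not set
  if ( flat.length && nested.length ) return 'post_name__in OR post__in';
  if ( flat.length ) return 'post_name__in';                     // flat slugs only
  if ( nested.length === 1 ) return 'pagename';                  // one hierarchical slug
  return 'post__in';                                             // several hierarchical slugs
}

console.log( describeSlugStrategy( [ 'about', 'contact' ] ) );    // post_name__in
console.log( describeSlugStrategy( [ 'about/team' ] ) );          // pagename
console.log( describeSlugStrategy( [ 'about/team', 'contact' ] ) ); // post_name__in OR post__in
```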

As an added benefit, this implementation also inherently facilitates explicitly requesting a top-level slug by including a /. E.g. ?slug=foobar would return all posts with the slug foobar as per usual, but ?slug=/foobar will return just the post with the slug foobar which has no parent.
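
On the client side, a request for nested or explicitly top-level slugs might be built as follows. This is a sketch that assumes the server-side filters above are installed; buildSlugUrl is a hypothetical helper, and the slugs are made-up examples.

```javascript
// Hypothetical helper: builds a pages collection URL for one or more slugs.
function buildSlugUrl( slugs, base = '/wp-json/wp/v2/pages' ) {
  // URLSearchParams percent-encodes "/" as %2F and "," as %2C.
  const params = new URLSearchParams( { slug: slugs.join( ',' ) } );

  return `${base}?${params}`;
}

console.log( buildSlugUrl( [ 'about/team' ] ) ); // /wp-json/wp/v2/pages?slug=about%2Fteam
console.log( buildSlugUrl( [ '/foobar' ] ) );    // /wp-json/wp/v2/pages?slug=%2Ffoobar
```

Letting URLSearchParams do the encoding avoids hand-escaping the slashes.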


Use a HEAD Request to the Web Path

This is a horrible dirty hack that relies on conventions outside of the REST API – ones which may incur a substantial overhead and which some sites may be prone to disabling to boot – I strongly recommend against doing this. Doubly so in any code intended for distribution; you will end up with a lot of unhappy users.

By default, WordPress returns a number of Link HTTP headers in responses to content requests, one of which is the REST route to the resource or collection corresponding to the content. As such, it’s possible to resolve any nested slug path to a REST resource in two requests by leveraging WordPress’s frontend permalink routing:

async function wpse261645_fetchPage( path ) {
  let uri = `/wp-json/wp/v2/pages/?slug=${encodeURIComponent( path )}`;

  if( path.includes( '/' ) ) {
    const web_res = await fetch( `/${path}`, { method: 'HEAD' } );
    // The Link header may be missing if the site has disabled it.
    const link_header = ( web_res.headers.get( 'Link' ) || '' ).split( ', ' )
      .find( val => val.includes( ' rel="alternate"; type="application/json"' ) );

    if( ! link_header )
      throw new Error( `No REST route advertised for /${path}` );

    // Strip the surrounding angle brackets from "<url>; rel=...".
    uri = link_header.substring( 1, link_header.indexOf( '>' ) );
  }

  return fetch(
    uri,
    {
      method: 'GET',
      headers: { 'Content-Type': 'application/json' }
    }
  ).then( res => res.json() );
}
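
The header parsing is the fragile part of this approach, so it may help to isolate it. Below is a sketch of a slightly more defensive parser; extractRestUrl is a hypothetical helper, and the sample header value uses invented example.com URLs rather than output captured from a real site.

```javascript
// Hypothetical helper: extracts the REST route from a combined Link header value.
function extractRestUrl( linkHeader ) {
  if ( ! linkHeader ) return null;

  // Split on commas that start a new "<url>" entry, not on commas inside a URL.
  for ( const entry of linkHeader.split( /,\s*(?=<)/ ) ) {
    const match = entry.match( /^<([^>]+)>(.*)$/ );
    if ( ! match ) continue;

    const attrs = match[2];
    if ( attrs.includes( 'rel="alternate"' ) && attrs.includes( 'type="application/json"' ) ) {
      return match[1];
    }
  }

  return null;
}

// Invented sample value for illustration only.
const sample = '<https://example.com/wp-json/>; rel="https://api.w.org/", '
  + '<https://example.com/wp-json/wp/v2/pages/12>; rel="alternate"; type="application/json"';
console.log( extractRestUrl( sample ) ); // https://example.com/wp-json/wp/v2/pages/12
```

Returning null instead of throwing lets the caller decide whether to fall back to one of the other approaches.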
