Remove special characters in a URL

http://www.example.com/ar/shop-listing/health-beauty~@/#$&&&%%$%3Cscript%3E/

In this particular URL, everything after the # (hash) is the fragment identifier. The browser does not send the fragment to the server, so it cannot be blocked server-side in .htaccess.

The server only sees:

/ar/shop-listing/health-beauty~@/

If the # were removed from this URL, it would be a wholly invalid URL (because of the improperly encoded % characters) and the server would respond with a “400 Bad Request”.

Aside: I'm not sure why WordPress still resolves this URL to the “correct” URL. It is technically a different URL.

Rather than blocking special characters, it’s probably much easier to allow only a whitelist of characters in the URL-path. Most URL-paths will consist only of lowercase a-z, - (hyphen) and / (slash, the path separator), so we could simply block the request if any other character is present in the URL-path. This could be implemented using the following mod_rewrite directive at the top of your .htaccess file, before the existing WordPress directives:

RewriteRule [^a-z/-] - [R=404]

If the URL-path contains any character other than those listed (this includes uppercase letters, digits and dots), the request triggers a 404.
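
Taken literally, the rule also rejects requests for real files such as stylesheets, images and wp-login.php, because their paths contain dots or digits. One way to soften this is to skip the check for anything that maps to an existing file or directory. The sketch below assumes that static assets and the WordPress admin scripts exist as physical files under the document root. The two conditions are the same file and directory tests used in the standard WordPress block. Whether this trade-off suits your site is something to test, not a given:

```apache
# Sketch only: assumes mod_rewrite is enabled and that this block
# sits above the "# BEGIN WordPress" section.
RewriteEngine On

# Skip the check for requests that map to an existing file or directory
# (stylesheets, images, wp-login.php, wp-admin/ and so on).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Anything else containing a character outside a-z, "-" and "/" gets a 404.
# The pattern is matched against the %-decoded URL-path.
RewriteRule [^a-z/-] - [R=404]
```

Virtual URLs that legitimately contain digits or dots (for example date-based permalinks, or REST API paths such as /wp-json/wp/v2/) would still be blocked, so the character class may need widening, e.g. [^a-z0-9/-].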

Note that this only checks the URL-path, not the query string part of the URL.
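
If the query string needs similar treatment, one option is a RewriteCond against the QUERY_STRING server variable. This is only a sketch: the allowed set below (letters, digits, =, &, _ and -) is an assumption for illustration, not something WordPress requires. Unlike the URL-path, QUERY_STRING is not %-decoded, so any percent-encoded value would be rejected by this set:

```apache
# Sketch only: the permitted query-string characters are an assumed example set.
# An empty query string passes, since there is no character to match.
RewriteCond %{QUERY_STRING} [^a-zA-Z0-9=&_-]
RewriteRule ^ - [R=404]
```

Plugins and the admin area often pass encoded values in the query string, so a rule like this should be tested carefully before it is deployed.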