Strange characters – despite everything being UTF-8

This typically happens when you copy and paste text from MS Word into the WordPress content editor. WordPress applies something called “Smart Quotes” through a function named wptexturize().

Ideal Solution

The ideal solution would be to go back through your content and retype every single and double quote from the keyboard.

However, if you’re working with massive copy/pastes, this may not be feasible.

Disable wptexturize() Filter

Another option is to stop the wptexturize() filter from running, which you can do by placing the following code in your child theme's functions.php file:

remove_filter('the_content', 'wptexturize');

You may also wish to remove the filter from comments and/or excerpts:

remove_filter('comment_text', 'wptexturize');
remove_filter('the_excerpt', 'wptexturize');

Or for titles:

remove_filter('single_post_title', 'wptexturize');
remove_filter('the_title', 'wptexturize');
remove_filter('wp_title', 'wptexturize');
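
If you end up removing the filter from several hooks, the calls above can be collapsed into a loop. This is only a sketch for a child theme's functions.php: the $hooks variable name is made up for illustration, and the list simply repeats the hooks mentioned in this article.

```php
// Goes inside your existing functions.php (after its opening <?php tag).
// $hooks is an illustrative name; trim the list to the hooks you need.
$hooks = array(
    'the_content',
    'comment_text',
    'the_excerpt',
    'single_post_title',
    'the_title',
    'wp_title',
);

foreach ($hooks as $hook) {
    // Same call as the one-liners above, once per hook.
    remove_filter($hook, 'wptexturize');
}
```

On WordPress 4.0 or later, add_filter('run_wptexturize', '__return_false'); should switch texturizing off everywhere in one line, which may be simpler if you do not need per-hook control.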

Clean Database

For existing content that already has the “weird” characters saved in the database, you may need to clean the database by running the following queries from phpMyAdmin (be sure to take a database backup first). The queries assume the default wp_ table prefix; adjust the table name if yours differs.

-- Replace the three-character sequences first.
-- In the two dash patterns the third character is a curly double quote (” for an em dash, “ for an en dash).
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€œ', '“');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€™', '’');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€˜', '‘');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€”', '—');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€“', '–');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€¢', '-');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€¦', '…');
-- Run this one last: 'â€' is the prefix of every sequence above.
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€', '”');
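
If you would rather test the replacements on a sample string before touching the database, the same mapping can be sketched in plain PHP. This is an illustration under assumptions, not part of WordPress: fix_mojibake() is a hypothetical helper name, the code needs PHP 7.0 or later for the "\u{...}" escapes, and it assumes the damage came from UTF-8 punctuation being read as Windows-1252.

```php
<?php
// Hypothetical helper (not a WordPress function): repairs common mojibake
// left behind when UTF-8 punctuation was misread as Windows-1252.
function fix_mojibake(string $text): string
{
    $prefix = "\u{00E2}\u{20AC}"; // the two characters "â€"

    $map = array(
        $prefix . "\u{0153}" => "\u{201C}", // left double quote
        $prefix . "\u{2122}" => "\u{2019}", // right single quote
        $prefix . "\u{02DC}" => "\u{2018}", // left single quote
        $prefix . "\u{201D}" => "\u{2014}", // em dash
        $prefix . "\u{201C}" => "\u{2013}", // en dash
        $prefix . "\u{00A2}" => "-",        // bullet, swapped for a hyphen as in the queries
        $prefix . "\u{00A6}" => "\u{2026}", // ellipsis
        $prefix              => "\u{201D}", // right double quote (bare prefix)
    );

    // strtr() with an array tries the longest keys first, so the bare
    // prefix only matches where no longer sequence applies.
    return strtr($text, $map);
}

// Example: a mangled apostrophe in "It's fixed".
echo fix_mojibake("It\u{00E2}\u{20AC}\u{2122}s fixed"), PHP_EOL; // prints: It’s fixed
```

Because strtr() handles the whole map in a single pass, the ordering problem that affects the separate SQL statements does not arise here.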

Plugins

Well… it’s WordPress. You can always use a plugin to help manage the wptexturize() filter. Take a look through This List, and see if one is right for you.
