Thousands of 404 errors on old posts due to embedded links

I am about 99% sure the reason for much of this is that someone has used relative links in the post body.

A link that looks like this:

<a href="www.whufc.com/">Some link Test</a>

Will end up looking like this:

2010/premiership-forecast-title-race-hots-up-gunners-prepare-for-adebayor/www.whufc.com/

If it shows up on this page of the site:

2010/premiership-forecast-title-race-hots-up-gunners-prepare-for-adebayor/

You need to have the http:// part. Without it, the browser treats the href as a path relative to the current page.

Some of our editors have been driving me nearly insane with exactly this.
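
If you want to see the resolution for yourself, here is a rough Python sketch using the standard library's urljoin, which follows the same relative-reference rules a browser applies. The example.com host and the resolve helper are placeholders I made up for illustration; swap in your own domain.

```python
from urllib.parse import urljoin

# Hypothetical host; only the path comes from the example above.
PAGE = ("http://example.com/2010/"
        "premiership-forecast-title-race-hots-up-gunners-prepare-for-adebayor/")

def resolve(href, base=PAGE):
    """Resolve an href against the page it appears on, as a browser would."""
    return urljoin(base, href)

# No scheme: the href is treated as a relative path and appended to the page URL.
broken = resolve("www.whufc.com/")

# With the scheme: the href is absolute and the page URL is ignored.
fixed = resolve("http://www.whufc.com/")

print(broken)
print(fixed)
```

The first result is the page URL with www.whufc.com/ tacked on the end, which is the 404 you are seeing. The second is just http://www.whufc.com/.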

As for the URL encoding, your % signs have themselves been encoded: %25 is the encoding of %. Look at your string, %25E2%2580%2593, and try to decode it at http://urldecode.org/ to see what is happening. The correctly encoded string should be %E2%80%93 (at least I think that is what you were going for), and you will note that this is exactly what Firefox and Chrome resolve the string to, by correctly decoding the only percent-encoded characters (the % signs). I don’t know how the encoding got the way it is.
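
You can also check this without the website. A minimal Python sketch with the standard library's unquote and quote: decoding once gives the string I think you wanted, and decoding again gives the actual character. The last part shows one way such a string can come about (encoding an already-encoded string); that is only a guess, not necessarily what happened on your site.

```python
from urllib.parse import quote, unquote

double_encoded = "%25E2%2580%2593"

# First pass: each %25 becomes a literal %, nothing else changes.
once = unquote(double_encoded)

# Second pass: %E2%80%93 is the UTF-8 encoding of U+2013 (en dash).
twice = unquote(once)

# Assumption for illustration: encoding twice reproduces the broken string.
re_encoded = quote(quote("\u2013"))

print(once)
print(repr(twice))
print(re_encoded)
```

If re_encoded matches your broken string, something in your setup is probably encoding the URL a second time after it has already been encoded.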