What is a good robots.txt?

FWIW, trackback URLs issue redirects and have no content, so they won’t get indexed.

And, at the risk of not answering the question, regarding your points 2 and 3, see:

http://googlewebmastercentral.blogspot.com/2008/09/demystifying-duplicate-content-penalty.html

Put otherwise, I think you’re wasting your time worrying about duplicate content, and your robots.txt should be limited to:

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-content/cache
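
If you want to sanity-check what those rules actually block, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The `example.com` host, the sample paths, and the helper names `build_parser` and `is_allowed` are made up for illustration. Real crawlers may interpret rules slightly differently from this parser, so treat it as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# The rules suggested above, verbatim.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-content/cache
"""


def build_parser(text):
    """Parse robots.txt content from a string (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` on a placeholder host."""
    # example.com is an assumption; only the path matters for matching.
    return parser.can_fetch(agent, "http://example.com" + path)


parser = build_parser(ROBOTS_TXT)

# Sample paths, chosen for illustration only.
for path in [
    "/",
    "/wp-admin/options.php",
    "/wp-content/cache/page.html",
    "/wp-content/uploads/photo.jpg",
    "/cgi-bin/script.cgi",
]:
    print(path, "allowed" if is_allowed(parser, path) else "blocked")
```

Note that `Disallow` lines match by path prefix, so `/wp-admin` also covers everything underneath it, while posts, pages, and uploads stay crawlable.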
