Age of Article Warning:
This article was originally published 46 months ago. The tips and techniques explained may be outdated, or the information may no longer be applicable. Please consider this when viewing the content below.

An essential aspect of SEO is to ensure that Google and other search engines know which content from your website is important to you.

How to prevent duplicate content affecting your search engine rankings?

By adding a meta robots noindex tag to the header of the pages that you don’t want indexed.

WordPress is a great platform for websites and blogging. By default, however, it tends to leave a lot of duplicate content for Google to index. When you write a post and give it a category and some tags, the post is displayed on its own page as well as on tag archive pages, category archive pages, date archive pages, author pages and so on.

It’s Déjà vu for the Google bots

The result is too much duplicated content. When the Google bots crawl your website, they find exactly the same content repeated on page after page. So how can Google decide which pages are most important and deserve the highest ranking? Stop diluting your search engine results and making it harder for your pages to rank well.

How to fix duplicate content?

For starters you want to make sure that these archive pages are only displaying the excerpt of the posts, not the whole content. If your archive.php template is showing the whole post content, that is, listing all the posts on the archive page in full, then you can quickly fix this by editing the archive.php template.

Change “the_content()” to “the_excerpt()”.

Before you edit archive.php, check whether your WordPress theme has a settings page that lets you make this change without touching the template.
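As a rough illustration of that change, here is a minimal sketch of what the loop in archive.php might look like afterwards. The surrounding markup (the h2 and the link) is an assumption for the example; your theme's template will differ, and the only line that matters is the call to the_excerpt().

```php
<?php
// archive.php loop (sketch): print a linked title and an excerpt for each post.
// The <h2> markup is a placeholder; keep whatever markup your theme already uses.
if ( have_posts() ) :
    while ( have_posts() ) :
        the_post();
        ?>
        <h2><a href="<?php the_permalink(); ?>"><?php the_title(); ?></a></h2>
        <?php
        the_excerpt(); // previously: the_content();
    endwhile;
endif;
?>
```

If your theme pulls the loop body from a separate template part, the call to the_content() may live in that file rather than in archive.php itself.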

Secondly, you have to decide which pages you want Google to index and which pages you don’t want indexed. Let’s say that on your website, your home page shows your latest posts, and you have some category pages in your navigation. Then the best way would be to add:

<meta name="robots" content="noindex,follow" />

to the header of date archives, author archives and tag archives. You also want only the first page of each category to be indexed, not page 2, page 3 and so on. Depending on how your theme handles images, attachment pages are another candidate for noindex.

Use “noindex” to instruct search engines to not index the page, but use “follow” to let the search engine know that it’s ok to crawl through the page and pass link juice along.

Add code to header.php

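<?php
// Print noindex,follow on paginated views (page 2 and up) and on author, tag, date and attachment pages.
// is_paged() is the built-in equivalent of checking $paged > 1.
if ( is_paged() || is_author() || is_tag() || is_date() || is_attachment() ) {
  echo '<meta name="robots" content="noindex,follow" />' . "\n";
} ?>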

Adding the above code snippet inside the <head> section of your header.php file will output the meta tag on your date archives, tag archives, author archives and attachment pages, and on page 2 onwards of any paginated listing, including your category pages.
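If you would rather not edit header.php directly, the same check can probably be attached to the wp_head action from your theme's functions.php instead. This is only a sketch of that alternative: the function name mytheme_noindex_archives is made up for the example, and it assumes your theme calls wp_head() in its header, as most themes do.

```php
<?php
// functions.php (sketch). "mytheme_noindex_archives" is a hypothetical name;
// pick any prefix that is unique to your theme.
function mytheme_noindex_archives() {
    // Same conditions as the header.php snippet: paginated views plus
    // author, tag, date and attachment pages.
    if ( is_paged() || is_author() || is_tag() || is_date() || is_attachment() ) {
        echo '<meta name="robots" content="noindex,follow" />' . "\n";
    }
}
// wp_head fires inside <head> wherever the theme calls wp_head().
add_action( 'wp_head', 'mytheme_noindex_archives', 1 );
```

Use one approach or the other, not both, otherwise the tag is printed twice. If an SEO plugin already outputs a robots meta tag on these pages, check the page source before adding your own.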

Don’t forget your sitemap.xml file

In your sitemap.xml file, list only the pages that you would like to encourage the search engines to discover and crawl.
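For illustration, a trimmed-down sitemap.xml following the sitemaps.org format might look like the fragment below. The example.com URLs are placeholders, not taken from any real site; the point is that it lists the home page, a category page and a post, and leaves out tag, date and author archives.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Pages you want indexed: home page, first page of a category, a single post -->
  <url>
    <loc>https://example.com/</loc>
  </url>
  <url>
    <loc>https://example.com/category/news/</loc>
  </url>
  <url>
    <loc>https://example.com/a-sample-post/</loc>
  </url>
</urlset>
```

If a plugin generates your sitemap, look for its settings to exclude archive types rather than editing the file by hand.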

This will help to reduce issues with duplicate content and improve SEO. If you feel like sharing your own tips for dealing with duplicate content, let us know in the comments below.

Updates:

update 10/3/2013
This came up while working on a website where the theme was producing excess pages with query strings, such as domainName.com/nav?=most_viewed&=most_commented. As these pages were duplicate content, it was necessary to tell Googlebot and other search bots not to crawl them.
Create a robots.txt file and upload it to your website root. See this post for an example robots.txt file. Add the following to the bottom of the file.

Disallow: /*?
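
A Disallow rule only takes effect inside a User-agent group, so if you are starting from an empty file, a minimal robots.txt carrying this rule would look something like the sketch below. It assumes you want the rule to apply to every crawler and that your site uses pretty permalinks, because the pattern blocks every URL containing a question mark, including default WordPress URLs such as ?p=123.

```
# Applies to all crawlers
User-agent: *
# Block any URL that contains a query string
Disallow: /*?
```

Keep in mind that robots.txt stops crawling rather than indexing, so it complements the noindex tag rather than replacing it.
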
How to add noindex, follow to pages in WordPress to stop duplicate content for SEO was last modified: September 6th, 2016 by David Tiong