How to add noindex, follow to pages in WordPress to stop duplicate content for SEO

Age of Article Warning:
This article was originally published 67 months ago. The information, tips and techniques explained may be outdated. Examples shown on this page may no longer work. Please consider this when viewing the content below.

An essential aspect of SEO is to ensure that Google and other search engines know which content from your website is important to you.

How to prevent duplicate content affecting your search engine rankings?

Add a meta robots noindex tag to the head section of the pages that you don’t want indexed.

WordPress is a great platform for websites and blogging. By default, however, it tends to leave you with a lot of duplicate content being indexed by Google. When you write a post and give it a category and some tags, the post is displayed on its own page, as well as on tag archive pages, category archive pages, date archive pages, author pages and so on.

It’s Déjà vu for the Google bots

The result is too much duplicated content. When the Google bots crawl your website, they find exactly the same content repeated on page after page. So how can Google decide which pages are most important and deserve the highest ranking? Duplicate pages dilute your search engine results and make it harder for your pages to rank well.

How to fix duplicate content?

For starters you want to make sure that these archive pages are only displaying the excerpt of the posts, not the whole content. If your archive.php template is showing the whole post content, that is, listing all the posts on the archive page in full, then you can quickly fix this by editing the archive.php template.

Change “the_content” to “the_excerpt”.

Check of course that your WordPress theme doesn’t have a theme settings page that allows you to make this change without editing your archive.php page.
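
As a rough sketch of the template edit described above, assuming a classic theme whose archive.php runs the standard WordPress loop, the edited loop might look something like this. The surrounding markup is illustrative only and your theme's will differ; the point is simply that the_excerpt() is called where the_content() used to be.

```php
<?php
// archive.php in a child theme: illustrative loop fragment only.
// Real templates usually also call get_header() and get_footer().
if ( have_posts() ) :
    while ( have_posts() ) :
        the_post();
        ?>
        <article>
            <h2><a href="<?php the_permalink(); ?>"><?php the_title(); ?></a></h2>
            <?php the_excerpt(); // was: the_content(); ?>
        </article>
        <?php
    endwhile;
endif;
?>
```

Make the change in a child theme's copy of the template if you can, so that a parent theme update does not overwrite it.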

Secondly, you have to decide which pages you want Google to index and which pages you don’t want indexed. Let’s say that on your website, your home page shows your latest posts, and you have some category pages in your navigation. Then the best way would be to add:

<meta name="robots" content="noindex,follow" />

to the head section of date archives, author archives and tag archives. You also only want to index the first page of each category, not page 2, page 3 and so on. Depending on how your theme handles images, the attachment pages are another candidate for noindex.

Use “noindex” to instruct search engines to not index the page, but use “follow” to let the search engine know that it’s ok to crawl through the page and pass link juice along.

Add code to header.php

<?php
// Noindex paginated pages (page 2 onwards), author, tag and date archives, and attachment pages.
if($paged > 1 || is_author() || is_tag() || is_date() || is_attachment()){
  echo '<meta name="robots" content="noindex,follow" />';
} ?>

Adding the above code snippet to your header.php file, inside the head section, will output the meta tag on your date archives, tag archives, author archives and attachment pages, and on page 2 onwards of paginated listings such as your category pages.
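
If you would rather not edit header.php, the same check can live in the functions.php file of a child theme and be attached to the wp_head action. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function names mytheme_should_noindex and mytheme_noindex_meta are made up for illustration, and is_paged() stands in for the $paged > 1 test.

```php
<?php
// functions.php of a child theme. Function names are hypothetical.

// True for the page types the article suggests keeping out of the index.
function mytheme_should_noindex() {
    return is_paged()       // page 2, 3, ... of any paginated listing
        || is_author()      // author archives
        || is_tag()         // tag archives
        || is_date()        // date archives
        || is_attachment(); // media attachment pages
}

// Prints the meta tag inside the head section when the request qualifies.
function mytheme_noindex_meta() {
    if ( mytheme_should_noindex() ) {
        echo '<meta name="robots" content="noindex,follow" />' . "\n";
    }
}

// Guarded so the file can also be loaded outside WordPress, e.g. for a quick test.
if ( function_exists( 'add_action' ) ) {
    add_action( 'wp_head', 'mytheme_noindex_meta' );
}
```

Removing a condition from the list, for example is_attachment(), leaves those pages indexable. If an SEO plugin already prints a robots meta tag for these pages, avoid adding a second one. WordPress 5.7 and later also provide a wp_robots filter, which may be a better fit on current installs.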

Don’t forget your sitemap.xml file

In your sitemap.xml file, list only the pages that you would like the search engines to discover and crawl.

This will help to reduce issues with duplicate content and improve SEO. If you have tips of your own for dealing with duplicate content, let us know in the comments below.

Updates:

Update 10/3/2013
I was working on a website where the theme was producing excess pages with query strings, such as domainName.com/nav?=most_viewed&=most_commented. As this was producing duplicate content, it was necessary to tell Googlebot and other search bots not to crawl these pages.
Create a robots.txt file and upload it to your website root. See this post for an example robots.txt file. Add the following to the bottom of the file.

Disallow: /*?

15 thoughts on “How to add noindex, follow to pages in WordPress to stop duplicate content for SEO”

    • Hi Afzal, the comments strip code. You need to add the code snippet to your header.php WordPress file, inside the head section, before the closing </head> tag.

    • Hi Adam,
      thanks for your comment. The answer really comes down to what pages you want indexed on Google and other search engines. The idea is to prevent duplicate content on your website, however if you are just showing excerpts then this is not a complete duplicate of a post’s text.
      You can of course customise the text that goes onto category pages, so you could make each page contain some unique text as well as the excerpts. If you look at each of my category pages under the “webwork” navigation tab, you will see that each of these pages contains a short description that is different for each category. You can also modify the meta descriptions for each category page.
      Only use meta robots “noindex, follow” if you don’t want that particular page indexed.

      Hope this makes sense,
      David

  1. I am using a short code for a post/slider to show excerpts of my posts on my static home page.
    I would like to put a no index, follow, but do not know where to place it.

    thank you for your time

    • Hi Brian,

      “Noindex, follow” is for a page itself, not for an excerpt. You don’t want your home page to have noindex in the meta either, otherwise you could remove your home page from the Google index. If you wanted to add nofollow to the post links themselves, then you would need to edit the function that is called by the shortcode, and add nofollow to the link. However it’s probably not necessary if these are posts on your own website.

  2. david, thank you for this informative post. I looked at the code the you instruct inserting on the header.php. I’m using yoast wordpress seo which already handles well the archives, categories, etc… or so I guess. what I’m having issues with is that when I run screaming frog it’s giving me duplicate titles for blog/page2, page3 and category/page2. how can I get rid of just that? thank you for your assistance.

    • Hi Souleye,

      thanks for your comment. If you add the header code to instruct search engines not to index pages 2, 3, 4 and so on of your WordPress archives, then you shouldn’t need to worry about duplicate titles, as those pages aren’t indexed.

      Otherwise if you still want to change the titles, you could look at appending page numbers to the titles. Not sure how this interacts with Yoast seo, but you could try adding a function like on http://codex.wordpress.org/Function_Reference/wp_title

  3. thank you so much david for your feedback. I relentlessly pursued the matter and I came up with a robot.txt directive that apparently worked
    User-agent: *
    Disallow: */page/*
    but I needed to make sure that I understood your function as I’m dabbling into php. so what you’re saying is ‘if $paged and the page number is greater than 1, or, or, or.. then apply robot directive ‘no index, follow’. since yoast already takes care of authors, tags, attachments, can your function be just simplified to:

    <?php if($paged > 1){
      echo '<meta name="robots" content="noindex,follow" />';
    } ?>
    because that’s the only part of it that I’m interested in. second, I’m a photographer. so, attachments (photographs) are nearly as – if not more important than text. so my question is: why would I want to no index some of my most important assets? thank you again.

    • Hi Souleye,

      yes, if it is a subsequent page (greater than page 1), then apply robots “noindex” to the page. You can just simplify it to:

      <?php if($paged > 1){
        echo '<meta name="robots" content="noindex,follow" />';
      } ?>

      Alternatively, if you would prefer not to edit the header.php, you could also set up a simple function in the functions.php file of your WordPress child theme to apply the “noindex” to the applicable pages.

      If you don’t want to block the attachment pages from search engines, then that’s ok. It all comes down to how you have set up your photography pages. When you add a photo to the media library, note that “media items are also ‘Posts’ in their own right” – http://codex.wordpress.org/Using_Image_and_File_Attachments

      Blocking the attachment pages from search engines prevents those media attachment posts from being indexed. So if your photos are displayed in an actual gallery page, or in custom portfolio posts, then you don’t need to have the media attachment posts indexed as well.

  4. thank you again for your clarifications. I’m gonna read the codex page to get a full understanding. most of the images are attached to actual pages or posts. I don’t have gallery pages or portfolios. I used to have galleries but when I found that they were all flash based, I did away with them since flash pages are not google friendly. one thing I do want to make sure of is that I don’t want to be penalized through improper use of images. I wouldn’t want either to have a setup that would prevent my images being indexed since my main product is images. thanks again.

    • Hi Ameer,

      The code snippet is for adding to the header.php file of your WordPress theme. This is a standard theme file, but you would want to set up a child theme for this sort of customisation, then copy the parent theme’s header.php file across to your child theme. Then add the code snippet there.

      Alternatively you could modify it and create a function, which you would then add to the functions.php file of your child theme.
