The word on blogs is that they’re built for SEO.  The trick to getting your entire site indexed is to link between its pages.  When you write a new post, think about linking to an older article that hasn’t gotten much traction online or isn’t indexed yet.  The value of blogs is that interlinking is automatic: on any blog, a spider can travel from the main page to the recent posts and on through the archives, which helps get more of the site indexed.

Except there’s a problem with this (there always is).  Creating additional archive and category pages can actually be anti-optimization.  Why?  Because each archive or category page repeats content that already lives on the post pages.  So if you click on Link Building, you’ll find the same content under a different URL, /category/link-building, rather than under the URL of a post in that category, such as Dofollow Blogs and Directories.  This can be read as duplicate content.  After all, there are two URLs on your site that serve the exact same content.
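To make the duplicate idea concrete, here is a minimal sketch of how two URLs can be flagged as carrying the same text.  The URLs, the sample text, and the function names are all made up for illustration, and the comparison is an exact match after tidying whitespace and case.  How Google actually compares pages isn’t public, so treat this as a way to audit your own site, not as a model of the search engine.

```python
import hashlib
import re


def content_fingerprint(body: str) -> str:
    """Hash the page text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose body text produces the same fingerprint."""
    groups: dict[str, list[str]] = {}
    for url, body in pages.items():
        groups.setdefault(content_fingerprint(body), []).append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical URLs and text, for illustration only.
pages = {
    "/category/link-building": "Dofollow blogs and directories can help...",
    "/dofollow-blogs-and-directories": "Dofollow blogs and  directories can help...",
    "/about": "About this blog.",
}
duplicates = find_duplicates(pages)
print(duplicates)
```

Here the category page and the post page land in the same group, while the About page stands alone.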

Problems with Indexing

The bad part is that your archive pages can sometimes be indexed first and rank higher than the blog post itself.  Have you ever clicked on a Google result for a search term only to end up in a general archive, where it’s difficult to find the content you’re looking for?  Archive indexing is the problem.  Granted, only a small percentage of your traffic is going to come through archives, but SEO is all about playing the percentages.

There are a couple of ways you can deal with this problem:

  1. Put a nofollow or noindex robots tag on your archives so spiders skip the archives and don’t dock you for duplicate content.
  2. Use a smart archives plugin to create archive pages that don’t show content.
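
As a rough sketch of what the first option might look like in markup: the robots meta tag accepts both a nofollow and a noindex value.  The version below assumes noindex,follow on archive-type pages, so the archive stays out of the results while spiders can still follow its links to the posts.  The page-type labels and the function name are hypothetical, and this isn’t taken from any particular plugin.

```python
# Page types are hypothetical labels, not names from any plugin.
ARCHIVE_TYPES = {"archive", "category", "tag"}


def robots_meta(page_type: str) -> str:
    """Return a robots meta tag for the given page type."""
    if page_type in ARCHIVE_TYPES:
        # noindex asks engines to keep the page out of results; follow
        # still lets spiders travel through its links to the posts.
        return '<meta name="robots" content="noindex,follow">'
    return '<meta name="robots" content="index,follow">'


print(robots_meta("category"))
print(robots_meta("post"))
```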

In my view, option number one should never be used.  But if you’re interested, here’s the plugin: Duplicate Content Cure.  I would say that it’s better to have traffic from your archives than no traffic at all.  Google is still going to index your archive and post pages, and it is better to have more of your pages filling up Google, as it gives your site better authority.  Yes, this is true even if the archives have duplicate content.  Sometimes you have to bite the bullet and weigh the consequences.

The other method is to create an archive page that merely lists links: dates such as November 18, 2008, plus post titles, and so forth, with no post content repeated.  This is a better method.  Here’s a good post on the issue, which also advocates using smart archives.
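
Here is a minimal sketch of that kind of links-only archive, assuming each post has a title, a URL, and a publish date.  The Post class, the render_archive function, and the sample URL are hypothetical, and this is not how any smart archives plugin is actually written.  It just shows the idea of listing dates and titles without repeating the post body.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from html import escape


@dataclass
class Post:
    title: str
    url: str
    published: date


def render_archive(posts: list[Post]) -> str:
    """Render a links-only archive: month headings, dates, and titles."""
    by_month: dict[tuple[int, int], list[Post]] = defaultdict(list)
    for post in posts:
        by_month[(post.published.year, post.published.month)].append(post)

    lines = []
    for year, month in sorted(by_month, reverse=True):  # newest month first
        lines.append(f"<h2>{date(year, month, 1):%B %Y}</h2>")
        lines.append("<ul>")
        newest_first = sorted(
            by_month[(year, month)], key=lambda p: p.published, reverse=True
        )
        for post in newest_first:
            d = post.published
            day = f"{d:%B} {d.day}, {d.year}"
            link = f'<a href="{escape(post.url)}">{escape(post.title)}</a>'
            lines.append(f"  <li>{day}: {link}</li>")
        lines.append("</ul>")
    return "\n".join(lines)


# Sample data; the URL is made up for illustration.
posts = [
    Post(
        "Dofollow Blogs and Directories",
        "/dofollow-blogs-and-directories",
        date(2008, 11, 18),
    ),
]
archive_html = render_archive(posts)
print(archive_html)
```

Because the page holds only dates, titles, and links, there is nothing on it for a spider to read as a copy of the post itself.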

Frankly, it’s annoying that this is even an issue.  You’d think that Google would be able to tell the difference between a main blog post and an archive or category page on something as common as a blog.  My guess is that Google will improve post indexing and stop docking bloggers for duplicate content that comes from the default linking structure.  Google’s indexing process is something of a mystery, but it’s more than likely that duplicate content within a blog framework is not going to have lasting effects.  But if this concerns you, I’d say go with Smart Archives Reloaded, an easier-to-use plugin than the original.
