Sometimes the best blog posts are a result of day-to-day interactions with one’s clients. This past week was very odd inasmuch as I discovered the exact same issue with 4 of our clients’ websites. The stars must have been aligned or something. Who knows?

Duplicate content is alive and well on the Web! And here’s why you as a website owner should care.

Real life examples are great fodder for Blog posts. So let’s dig into the topic of ‘duplicate content’ and why it can be harmful to your website’s rankings with Search Engines. I’ll be using Google in my explanation. But my comments hold true for all Search Engines.

First and foremost, Search Engines are not “smart”. Blasphemy? No, they really are not! If anything they are too thorough, in that they will index everything they can find.

Sidenote: Indexing is a related topic which I will cover in a future Blog Post: “To Index or Not to Index, That is the Question!”.

Back to duplicate content. In the simplest of terms, duplicate content happens when the same website content is indexed via more than one URL. What do I mean? Let’s look at some examples.

Example 1: Canonical URLs
Here we are looking at the same website where duplicate content is hosted on 2 different URLs:

URL 1: www.myniftybusiness.ca
URL 2: www.myniftybusiness.ca/home.html or www.myniftybusiness.ca/index.php

Looking at the above example, if you see the exact same content in your web browser with both URLs, you can be sure that Google will too. And that, dear reader, is ‘duplicate content’.
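
To make this concrete, here is a minimal Python sketch showing how the URLs from Example 1 all collapse to a single address. The function name `canonical_url` and the list of "default document" file names are my own assumptions for illustration. This is not how Google itself decides what counts as a duplicate.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed list of "default document" file names; adjust it for your own server.
DEFAULT_DOCUMENTS = {"index.php", "index.html", "home.html", "index.aspx"}

def canonical_url(url: str) -> str:
    """Collapse a URL that ends in a default document to its folder URL."""
    # Bare addresses such as "www.example.ca" carry no scheme, so add one.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    folder, _, last = parts.path.rpartition("/")
    if last.lower() in DEFAULT_DOCUMENTS:
        path = folder + "/"
    else:
        path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))

urls = [
    "www.myniftybusiness.ca",
    "www.myniftybusiness.ca/home.html",
    "www.myniftybusiness.ca/index.php",
]
# All three addresses reduce to the same canonical URL.
print({canonical_url(u) for u in urls})
```

Three addresses going in and one coming out is the situation a Search Engine faces: one page, several URLs.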

Example 2: www and non-www versions
www.myniftybusiness.ca and myniftybusiness.ca. You should be able to reach your website by typing either example into your browser’s address bar. So far so good. Here’s a 2-step test to check if your website is at risk of being identified as having duplicate content:

Type www.myniftybusiness.ca into your browser’s address bar. The site loads and www.myniftybusiness.ca still shows in the browser’s address bar – okay. Now type myniftybusiness.ca (no www) into your browser’s address bar. If the site loads and myniftybusiness.ca still shows in the address bar, rather than switching to the www version, you have a problem that needs to be resolved. I call this the worst case of ‘duplicate content’, as basically your website’s entire contents are at risk of being double-indexed by Google (and Bing, and Yahoo… etc.). In short, both addresses should get visitors to your website, but only one of them, either the www version or the non-www version, should end up showing in the address bar. Not both!
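
The 2-step test above can also be scripted. What follows is a rough Python sketch, not a definitive tool: the names `final_url` and `serves_both_hosts` are my own, and it assumes the site answers on plain http and that any redirect is an ordinary HTTP redirect. The demonstration at the bottom uses pretend resolvers, so nothing is actually fetched.

```python
from typing import Callable
from urllib.parse import urlsplit
from urllib.request import urlopen

def final_url(url: str) -> str:
    """Load a URL, following redirects, and return the address we end up on."""
    with urlopen(url, timeout=10) as response:
        return response.geturl()

def serves_both_hosts(domain: str, resolve: Callable[[str], str] = final_url) -> bool:
    """Return True if the www and non-www addresses both stay put (duplicate risk)."""
    bare = domain[4:] if domain.startswith("www.") else domain
    www_host = urlsplit(resolve(f"http://www.{bare}/")).hostname
    bare_host = urlsplit(resolve(f"http://{bare}/")).hostname
    # If both requests end on the same host, one of them was redirected: good.
    return www_host != bare_host

# Pretend site with no redirect: each address stays exactly where it was typed.
print(serves_both_hosts("myniftybusiness.ca", resolve=lambda url: url))

# Pretend site that always lands on the www version.
print(serves_both_hosts("myniftybusiness.ca",
                        resolve=lambda url: "http://www.myniftybusiness.ca/"))
```

To check a live site, call `serves_both_hosts("yourdomain.ca")` without the `resolve` argument. A result of `True` means both versions are being served as is and the site is at risk.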

I have also seen websites that are constructed in such a way as to allow duplicate content scenarios solely through the coding that is done when the site is built.

Example 3: Coding errors
A well laid out site has a form and flow to it that allows for expandability. Most good website designers understand this concept. After all, it is logical. Things, however, can fall off the rails in a scenario such as what follows. A designer is working with a client and they come up with a well thought out structure that categorizes a large volume of products: www.myniftywebsite.ca/product1, www.myniftywebsite.ca/product2, www.myniftywebsite.ca/product3, etc. You get the idea. Then, because of the programming language they are using, they may end up having to do something like this: www.myniftywebsite.ca/product1/index.aspx, www.myniftywebsite.ca/product2/index.aspx, www.myniftywebsite.ca/product3/index.aspx. In short, each product page can be reached by keying in either www.myniftywebsite.ca/product1 or www.myniftywebsite.ca/product1/index.aspx (to take 1 example). Google could index both. Duplicate content!

Example 4: Duplicate websites
Another example, and one that really gets under my skin, is the cookie-cutter websites that are being flogged all over the Internet. You know the ones I mean. The sales pitch is usually something like, “Let us build your website and fill it with relevant content. $XXXX and you will be off and running.” Sounds good eh? Not! Why? Because they will be selling the exact same content to everyone who is in the same line of business as you are. Google will eventually find it and will have to decide which content is more relevant (you don’t get a say in this). My favorite is real estate agents in the same city who buy into this scenario. In a competitive business such as real estate, you want unique and engaging content… not duplicate content.

So why is duplicate content so bad?

Duplicate content is bad because it can dilute your website’s positioning within the Search Engines. How bad is that? This is tougher to answer as the algorithms that each search engine uses are constantly changing. So one cannot say something like: “My site has duplicate content and it has gone to page 400 on Google’s SERP!”. Rather, you would say, “Thanks Helen for letting me know I have a duplicate content situation on my website. We better work together to get this resolved.” I say this tongue in cheek.

How to check for scenarios where duplicate content may be getting indexed by Google (and the other Search Engines)

Review the above examples and see if any of what I outline is happening on your site. Then look at your Google Analytics reports. The same page showing up under more than one URL will jump out at you. And if it shows up in GA, then you know that you have an issue that needs fixing.
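
If your reports are long, grouping the page paths by the page they really point to can make the duplicates easier to spot. Here is a rough Python sketch that assumes you have already copied the paths out of a report into a plain list. The helper names `page_key` and `find_duplicates` and the list of default document names are illustrative assumptions, not a Google Analytics feature.

```python
from collections import defaultdict

# Assumed "default document" file names; adjust for your own site.
DEFAULT_DOCUMENTS = ("index.aspx", "index.php", "index.html", "home.html")

def page_key(path: str) -> str:
    """Reduce a page path to a key shared by all of its duplicate forms."""
    # Assumption: query strings and letter case do not change the page shown.
    path = path.split("?", 1)[0].lower()
    for name in DEFAULT_DOCUMENTS:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    # Treat "/product1" and "/product1/" as the same page.
    return path.rstrip("/") or "/"

def find_duplicates(paths):
    """Return only the pages that were reached through more than one path."""
    groups = defaultdict(set)
    for path in paths:
        groups[page_key(path)].add(path)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}

# Hypothetical paths of the kind described in the examples above.
report_paths = ["/product1", "/product1/index.aspx", "/product2", "/", "/home.html"]
print(find_duplicates(report_paths))
```

Any page listed in the result was reached through more than one URL and is worth a closer look. If your site uses query strings to pick which page to show, drop the line that strips them, or distinct pages will be lumped together.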

How do I fix duplicate content?

I won’t tell you how to fix it as there are so many scenarios to take into consideration. However, the first step in fixing a problem is understanding what that problem is. So #1: Identify whether or not you have a duplicate content scenario. #2: Work with your SEO professional (Me!) and your Webmaster to work through the technical issues that are causing your woes. Together you will get through it!

More information on solving duplicate content issues

If you aren’t pooped by now, here’s a great video (featuring Google Geek Greg Grothaus) on duplicate content and solving related issues. Grab some popcorn and prepare to learn some great stuff in the next 15 minutes!

Hint: Skip to about 4 minutes in for the real meat.

So…. do you have this SEO technical issue with your site?

  • Wow! All of a sudden I felt like I needed to give my site a shower! Dirty dirty …

    But seriously, I can understand how and why this is important! I think I am OK — but I had better watch my coding as my site grows!

    Can you talk to us about how use of subdomains and/or subdirectories can influence SEO? :o)

    Thanks again Helen, you are sailing above the SEO clouds!

    Greg

    • Les

      Hey Greg – I’ll take this one…..

      In the strictest sense you are talking apples and oranges. For example http://www.gothamglassworks.com/stuff is a subdirectory. It is part of your domain http://www.gothamglassworks.com. Whereas stuff.gothamglassworks.com would be a subdomain of http://www.gothamglassworks.com.

      In the SEO world you want to look at subdomains as a separate entity from your domain whereas a subdirectory is part of your domain. Since anything in a subdirectory is part of a domain, you need to look at things holistically. Whereas you can look at a subdomain as a separate entity. That is why you sometimes see people putting their blog in a subdomain versus a subdirectory. They can then track each instance on its own. Plus there are all sorts of smart techie reasons for doing this.

      I.e., never host a secondary site in a subdirectory of your domain. Rather, use a subdomain. This way everything is on its own – including your SEO efforts!
