The question of duplicate content comes up time and again. In fact, I hear the exact same question (what is duplicate content?) repeated over and over!
If you missed my attempt at SEO humour in the opening two lines, you are excused. I am no Eddie Murphy, and search engine optimisation is hardly the best fuel for witty repartee.
But if you are a website owner with duplicate content issues, it is no joke.
When you use duplicate content in an illicit scheme to trick the search engines, such as creating multiple copies of a single website on different domains (more on that later), you jump the queue for a ban by the search engines. Even if you never set out to trick the search engines with shady tactics, you could still fall foul of a duplicate content problem if you are not careful.
So, what is duplicate content?
Duplicate content, as the name implies, is an exact copy of a webpage (or part thereof) reproduced in more than one place on the internet. For the most part it is not a huge issue: the search engines just pick one of the duplicated pages to list in their index and ignore the others.
No big deal?
Well, consider this scenario. Imagine your competitor copies content from your site, puts it on their website, and then the search engines choose to index their content, not yours. Problem? You bet it is. Or, your website’s content management system duplicates your product pages, choosing to display them in multiple categories, and because the search engine finds so many duplicates it decides not to index your site at all. Problem? You bet!
How can duplicate content issues occur?
Duplicate content problems mostly occur because of technology errors or because you distribute content for wide exposure, but sometimes they arise from ill intent. Here are some examples of the latter:
1. Duplicate websites – A “shady” tactic is to make multiple copies of the same website and host them on different domains. The sites are fully functional, and the content on each site, whilst not identical, is almost the same. The idea is to trick the search engines into indexing all the sites, giving the webmaster multiple listings for their search terms.
2. Doorway pages – Another tactic is to place content on one domain for the search engines to index, but to redirect any visitor who reaches that page very quickly to a second domain. Sneaky stuff!
Needless to say, do not employ either of these tactics. If the search engines discover you are the owner of sites like this, you are in big trouble and your sites will get banned.
Assuming you don’t partake in this kind of trickery, how might you innocently get caught with duplicate content? Here are just a few examples:
1. Syndication Arrangements – If you have deals whereby your content gets distributed to other websites, or even if you distribute it to your own social media profiles, be aware that you are creating duplicate content. Know also that the search engines decide which content they index, and they may decide to index a copy other than the one on your own site.
2. Posting To More Than One Category On Your Own Site – You might have perfectly innocent reasons for cross-publishing the same piece of content to multiple places on your own site, but again be aware that this is considered duplicate content.
3. Site Architecture Errors – Duplicate content issues will occur if your website generates multiple paths to the same page. For example, http://www.mywebsite.com/?prod=1&type=6 might lead to the same page as http://www.mywebsite.com/?type=6. Different URLs, same content.
4. No 301 Redirect In Place – To a search engine, http://www.mywebsite.com and http://mywebsite.com are two separate addresses serving the same content. This is a very common problem and can easily be fixed by implementing a 301 redirect from the “non-www” version of your site to the “www” version of your site (or vice versa).
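To make the 301 redirect in item 4 concrete, here is a minimal sketch written as a small Python WSGI application. In practice the redirect is usually set up in the web server or hosting configuration rather than in application code. The host name in CANONICAL_HOST and the function name app are assumptions chosen for illustration.

```python
# Assumption: the "www" version is the one you want indexed.
CANONICAL_HOST = "www.mywebsite.com"


def app(environ, start_response):
    """Answer on the canonical host; 301-redirect every other host to it."""
    host = environ.get("HTTP_HOST", "").lower()
    if host != CANONICAL_HOST:
        # Keep the path and query string so deep links survive the redirect.
        location = "http://" + CANONICAL_HOST + (environ.get("PATH_INFO") or "/")
        if environ.get("QUERY_STRING"):
            location += "?" + environ["QUERY_STRING"]
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Served from the canonical host"]
```

A request for http://mywebsite.com/about would be answered with a 301 pointing at http://www.mywebsite.com/about, which tells the search engines to treat the “www” address as the permanent home of the page.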
Duplicate content problems can also occur because of parameters added to affiliate links and subdomain issues. There are many reasons why the problem may occur, and the above list is by no means exhaustive.
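One way to see how many addresses really point at the same page is to reduce each URL to a single normalised form and compare the results. The Python sketch below does this under some assumptions: the function name canonical_url and the parameter names in IGNORED_PARAMS are hypothetical, and you would substitute whatever affiliate or tracking parameters your own site uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change what the page shows (assumption).
IGNORED_PARAMS = {"aff", "ref", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url, prefer_www=True):
    """Return one normalised form for URLs that serve the same page.

    Assumes the site lives on a bare domain or its "www" variant; other
    subdomains and explicit port numbers are not handled in this sketch.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if prefer_www and not host.startswith("www."):
        host = "www." + host
    elif not prefer_www and host.startswith("www."):
        host = host[4:]
    # Drop ignored parameters and sort the rest so their order does not matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(query), ""))
```

If two URLs from your site return the same value here, they are candidates for a redirect or a canonical tag.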
How to detect duplicate content:
A simple tool we use in our own business is CopyScape. CopyScape scours the internet and returns a list of URLs with content that is the same as, or similar to, your own. You might find these pages on your own site, in which case you can remove the duplicate pages and redirect their URLs to the primary page.
If you find duplicates of your own website pages on other domains, and you are the originating author, you should write to the website owner and request their removal. Unleash fury if it’s your competitor!
In the event that you discover duplicates on your own social profiles or partner websites, first check that your site is ranking higher in the SERPs for your target terms. If it is, you need not be concerned. If it is not, you may want to consider rewriting the content to make it unique.
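If you want a rough feel for how similar two pieces of text are before deciding whether a rewrite is needed, a simple comparison of overlapping word sequences can help. The Python sketch below is only an illustration of the general idea and is not how CopyScape or any search engine works; the function names shingles and similarity are my own.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single sequence.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Score from 0.0 (nothing shared) to 1.0 (identical word sequences)."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a or not set_b:
        return 0.0
    # Jaccard index: shared sequences divided by all distinct sequences.
    return len(set_a & set_b) / len(set_a | set_b)
```

A score close to 1.0 suggests the two texts are near copies, while a lightly edited version will score somewhere in between. What counts as “too similar” is a judgment call, and no particular threshold is implied here.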
Best practices going forward:
If you plan to distribute your content beyond your own website (and there are certainly advantages to doing so), here are some tips that can save you from duplicate content problems:
1. Publish To Your Own Site First – This doesn’t guarantee that your content will get preference, but it helps signal to the search engines that you are the originating author. The original author will normally be given preference unless the duplicate piece is placed on a site considerably more powerful than the author’s own. Wait 7-10 days before syndicating to other websites to give the search engines time to index the content on your own site.
2. Make Syndicated Content Unique – Even if only a slight edit, try and make the syndicated content pieces different to the original.
3. Link Back To Your Own Site – By linking from your syndicated content back to the original content you identify your website as the original source.
4. Implement rel=canonical – This tag tells the search engines that the given page (where your syndicated content sits) should be treated as if it were a copy of the original content. It is added to the head section of the page on your syndication partner’s website, and it informs the search engines that all of the links and content metrics they apply should be credited toward the provided URL.
Here’s what the rel=canonical tag looks like:
<link href="http://www.example.com/canonical-version-of-page/" rel="canonical" />
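To check whether a syndicated page actually carries the tag, you can look for it in the page’s HTML. The following Python sketch uses only the standard library; the names CanonicalFinder and find_canonical are hypothetical, and it assumes you already have the page’s HTML as a string.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

If find_canonical returns the address of your original article, the partner page is crediting your site as intended. If it returns None, or a different URL, it is worth raising with the partner.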
For the most part, duplicate content is not a big issue, and more is made of it than should be. However, as described above, there are times when it can inadvertently cause you a problem.
By implementing a 301 redirect from the “non-www” version of your site to the “www” version of your site, and by checking CopyScape for duplicate content and Google Webmaster Tools for duplicate page titles, you help keep your site clean. Then all that remains is to adopt a sensible content distribution strategy where your content remains king.