What is Duplicate Content?
Duplicate content refers to instances where the same text, or substantial portions of it, appears in at least two different places on the web. When the same content is found on multiple websites, it can cause ranking issues for one or all of those websites, because Google does not want to show several search results that contain exactly the same information. Generally, the version that was indexed first is treated as the original and would not be penalized. Duplicate content can result from plagiarism, automated content scrapers, or lazy web design.
Duplicate content can also be a problem within a single website. If multiple versions of a page exist, Google may not understand which version to show in search results, and the pages end up competing against each other, a situation also known as keyword cannibalization. Issues like this can occur when new versions of pages are added without deleting or redirecting the old versions, or through poor URL structures.
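As a rough illustration of how poor URL structures create duplicates within one site, the sketch below groups URLs that probably serve the same page. It is a minimal example under stated assumptions, not a description of how any search engine works: the `TRACKING_PARAMS` set and the function names are hypothetical, and treating `http`/`https`, `www`/non-`www`, and trailing-slash variants as the same page is an assumption that does not hold for every site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters assumed not to change page content.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize_url(url: str) -> str:
    """Reduce a URL to one comparable form (assumed equivalences, see lead-in)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]                      # assume www and non-www are the same site
    path = parts.path.rstrip("/") or "/"     # ignore trailing slashes, keep the root path
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS        # drop parameters assumed to be tracking-only
    ]
    query = urlencode(sorted(kept))          # parameter order should not matter
    return urlunsplit(("https", host, path, query, ""))


def group_duplicates(urls):
    """Return {normalized URL: [original URLs]} for groups with more than one member."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize_url(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    crawl = [
        "http://www.example.com/shoes/",
        "https://example.com/shoes?utm_source=newsletter",
        "https://example.com/boots",
    ]
    for canonical, variants in group_duplicates(crawl).items():
        print(canonical, "<-", variants)
```

Each group in the output is a candidate set of duplicates. Deciding which version to keep, and redirecting the others to it, remains a judgment call for the site owner.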
Why does duplicate content matter?
For search engines
Duplicate content can present three main issues for search engines:
- They don’t know which version(s) to include/exclude from their indices.
- They don’t know whether to direct the link metrics (trust, authority, anchor text, link equity, etc.) to one page or keep them split across multiple versions.
- They don’t know which version(s) to rank for query results.
For site owners
- To provide the best search experience, search engines will rarely show multiple versions of the same content and thus are forced to choose which version is most likely to be the best result. This dilutes the visibility of each of the duplicates.
- Link equity can be further diluted because other sites have to choose between the duplicates as well. Instead of all inbound links pointing to one piece of content, they point to several pieces, spreading the link equity among the duplicates. Because inbound links are a ranking factor, this can reduce the search visibility of that content.
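The dilution effect can be illustrated with a toy tally. The sketch below counts inbound links per URL, then counts them again as if every duplicate pointed to a single preferred version. All data, the `PREFERRED` mapping, and the function names are hypothetical, and a raw link count is only a crude stand-in for link equity, which search engines weigh in ways this example does not model.

```python
from collections import Counter

# Hypothetical inbound links: (linking site, URL it points at).
INBOUND_LINKS = [
    ("blog-a.example", "https://example.com/guide"),
    ("blog-b.example", "https://example.com/guide/"),
    ("news-c.example", "http://www.example.com/guide"),
    ("forum-d.example", "https://example.com/guide?ref=footer"),
    ("blog-e.example", "https://example.com/pricing"),
]

# Hypothetical mapping from each duplicate URL to the version the owner prefers.
PREFERRED = {
    "https://example.com/guide/": "https://example.com/guide",
    "http://www.example.com/guide": "https://example.com/guide",
    "https://example.com/guide?ref=footer": "https://example.com/guide",
}


def links_per_url(links):
    """Count inbound links for each target URL exactly as linked."""
    return Counter(target for _, target in links)


def links_after_consolidation(links, preferred):
    """Count inbound links as if every duplicate resolved to its preferred URL."""
    return Counter(preferred.get(target, target) for _, target in links)


if __name__ == "__main__":
    print("As linked:   ", dict(links_per_url(INBOUND_LINKS)))
    print("Consolidated:", dict(links_after_consolidation(INBOUND_LINKS, PREFERRED)))
```

In this made-up data set, four links to the same guide are spread across four URL variants with one link each. After consolidation, the preferred URL is credited with all four.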