How to use the Duplicate Content Checker
Duplicate content fragments your ranking signals across multiple URLs. The checker quantifies similarity between two pages so you can decide whether to consolidate, canonicalize, or differentiate.
Paste two pieces of content
Either URLs (the tool fetches and extracts main content) or raw text blocks. The comparison ignores nav, footer, and boilerplate by default.
Read the similarity score
0–100% match. Above 70%, Google may treat the pages as duplicates and pick only one to index. 30–70% is "similar" — review whether the pages can be differentiated. Under 30% is generally safe.
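The score bands above can be sketched as a simple triage function (the cutoffs mirror this guide's heuristics, not thresholds published by Google):

```python
def triage(similarity: float) -> str:
    """Map a 0-100 similarity score to a recommended action.

    Cutoffs follow the guidance above; they are heuristics,
    not numbers published by Google.
    """
    if similarity > 70:
        return "duplicate: consolidate, canonicalize, or rewrite"
    if similarity >= 30:
        return "similar: review whether differentiation is possible"
    return "distinct: no action needed"
```

For example, `triage(85)` lands in the duplicate band, while `triage(10)` needs no action.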
Review the sentence diff
The tool highlights which sentences are identical, which are paraphrased, and which are unique to each page. Identical sentences are the highest-priority targets for rewriting.
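A sentence-level diff like this can be approximated with the Python standard library — a sketch, not the tool's actual algorithm, and the 0.6 paraphrase cutoff is an illustrative guess:

```python
import difflib
import re


def classify_sentences(a: str, b: str) -> dict:
    """Bucket each sentence of text `a` as identical, paraphrased,
    or unique relative to text `b`. The 0.6 ratio cutoff is an
    assumed paraphrase threshold, not a published constant."""
    def split(text):
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    b_sents = split(b)
    result = {"identical": [], "paraphrased": [], "unique": []}
    for sent in split(a):
        # Best match ratio against any sentence in the other text.
        best = max(
            (difflib.SequenceMatcher(None, sent.lower(), other.lower()).ratio()
             for other in b_sents),
            default=0.0,
        )
        if best == 1.0:
            result["identical"].append(sent)
        elif best >= 0.6:
            result["paraphrased"].append(sent)
        else:
            result["unique"].append(sent)
    return result
```

Identical sentences (ratio 1.0) surface first, matching the rewrite-priority ordering described above.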
Decide: consolidate, canonical, or rewrite
If the duplicates are intentional (category pages, paginated lists), use a canonical tag to point all versions to one URL. If accidental, rewrite the duplicate page or 301 it to the original. If both pages deserve to rank for different intents, rewrite them to differentiate.
Why duplicate content is the silent ranking killer
Duplicate content rarely earns a manual penalty, but it almost always splits ranking signals. Two pages targeting the same keyword each rank weaker than one consolidated page would. Google picks one and may choose poorly.
Where duplicates come from
- URL parameters — /product?color=red and /product?color=blue showing the same content.
- Trailing slash inconsistency — /page and /page/ accessible separately.
- http vs https — both versions indexed before HTTPS migration completed.
- www vs non-www — same site reachable at both subdomains.
- Pagination — page 1, 2, 3 all containing the same intro.
- Faceted navigation — filtered category pages with mostly identical content.
- Syndication — your article republished on partner sites without canonical.
- Scrapers — third-party sites scraping your content.
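Several of the internal causes above (parameters, trailing slashes, http vs https, www vs non-www) can be caught before comparing pages by normalizing URLs. A rough sketch, where the ignored-parameter list is hypothetical and site-specific:

```python
from urllib.parse import urlsplit, urlunsplit

# Query parameters assumed to change presentation only, not content.
# Hypothetical list — adjust to your own site.
IGNORED_PARAMS = {"color", "sort", "utm_source"}


def normalize(url: str) -> str:
    """Collapse common duplicate-URL variants into one canonical form."""
    parts = urlsplit(url)
    scheme = "https"                                      # http vs https
    host = parts.netloc.lower().removeprefix("www.")      # www vs non-www
    path = parts.path.rstrip("/") or "/"                  # trailing slash
    query = "&".join(
        p for p in parts.query.split("&")
        if p and p.split("=")[0] not in IGNORED_PARAMS    # parameter variants
    )
    return urlunsplit((scheme, host, path, query, ""))
```

Two URLs that normalize to the same string are duplicate candidates: `http://www.example.com/product/?color=red` and `https://example.com/product` both collapse to the latter.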
Three fixes for duplicate content
- Canonical tag — both URLs stay reachable; ranking signals consolidate on the canonical URL.
- 301 redirect — the duplicate URL permanently routes to the canonical one.
- Rewrite — make the pages genuinely different (different angles, different audiences, different depth).
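For the 301 option, here is a minimal sketch using Python's standard http.server; in practice you would configure this in your web server or CDN rather than application code, and the URL mapping below is hypothetical:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical duplicate-to-canonical URL mapping.
REDIRECTS = {
    "/page/": "/page",
    "/old-article": "/new-article",
}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = REDIRECTS.get(self.path)
        if target:
            # 301 signals a permanent move, so crawlers consolidate
            # ranking signals on the target URL.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
        else:
            self.send_response(404)
            self.end_headers()

# To serve: HTTPServer(("", 8000), RedirectHandler).serve_forever()
```

The key detail is the status code: a 302 (temporary) would not consolidate signals the same way a 301 (permanent) does.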
Is duplicate content a penalty?
No formal penalty exists for unintentional duplicates — Google just picks one version to rank and ignores the others. Manual penalties only kick in for clear spam patterns: scraping at scale, doorway pages, or auto-generated content. The harm of unintentional duplicates is dilution, not punishment.
Frequently asked questions
What is duplicate content?
Content that appears at multiple URLs — either on your own site (internal duplicates) or across multiple sites (external duplicates). Google deduplicates by picking one version to rank and consolidating signals there. The non-canonical versions get effectively deindexed.
Will duplicate content get me penalized?
Not for unintentional duplicates — Google just picks one version to rank. Manual penalties only apply to clear spam (scraped content, auto-generated doorway pages, content syndicated at scale without permission). Your real cost is signal dilution, not punishment.
What's the duplicate content threshold?
There's no published number. Empirically, pages above ~70% similarity start being treated as duplicates by Google's near-duplicate detection. The match is calculated on shingles (overlapping word sequences), not whole-page comparison.
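A shingle-based score like the one described can be sketched as a Jaccard similarity over overlapping word n-grams. The shingle size and scoring here are illustrative; Google's actual near-duplicate detection is not public:

```python
def shingles(text: str, k: int = 4) -> set:
    """All overlapping k-word sequences (shingles) in `text`."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a: str, b: str, k: int = 4) -> float:
    """Jaccard similarity of the two shingle sets, as a percentage:
    |intersection| / |union| * 100."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 100.0
    return 100.0 * len(sa & sb) / len(sa | sb)
```

Because shingles overlap, changing one word breaks up to k shingles at once, which is why light paraphrasing drops the score faster than a whole-page character diff would suggest.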
How do I fix duplicate content?
Three options: (1) canonical tag — both URLs stay reachable but signals consolidate; (2) 301 redirect — physically merge URLs; (3) rewrite — make the pages substantively different. Pick based on whether both URLs need to remain reachable to users.
Are short identical sections (boilerplate, nav, footer) duplicate content?
No — Google ignores boilerplate (nav, footer, sidebar, legal disclaimers) when calculating duplicate content. Duplicate detection focuses on the main content area. This is why your nav being on every page doesn't make every page a duplicate.