Duplicate Content Checker

Paste two texts side by side to compare similarity. See matching sentences highlighted and get a clear duplicate/unique verdict.

How to use the Duplicate Content Checker

Duplicate content fragments your ranking signals across multiple URLs. The checker quantifies similarity between two pages so you can decide whether to consolidate, canonicalize, or differentiate.

1. Paste two pieces of content

Enter either two URLs (the tool fetches each page and extracts its main content) or two raw text blocks. By default, the comparison ignores navigation, footers, and other boilerplate.

2. Read the similarity score

The score runs from 0 to 100%. Above 70%, Google may treat the pages as duplicates and pick one to index. A score of 30–70% counts as "similar": review the pages to see whether they can be differentiated. Under 30% is fine.

3. Review the sentence diff

The tool highlights which sentences are identical, which are paraphrased, and which are unique to each page. Identical sentences are the highest-priority targets for rewriting.

4. Decide: consolidate, canonicalize, or rewrite

If the duplicates are intentional (category pages, paginated lists), use a canonical tag to point every variant at one preferred URL. If they are accidental, rewrite the duplicate page or 301-redirect it. If both pages deserve to rank for different intents, rewrite them so they are clearly distinct.
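Steps 2 and 3 can be sketched in Python as follows. This is a minimal illustration, not the checker's actual implementation: the naive sentence splitter, the use of `difflib.SequenceMatcher`, and the 0.6 paraphrase cut-off are all assumptions. Only the 30% and 70% bands come from the steps above.

```python
import re
from difflib import SequenceMatcher

# Assumed cut-off for calling a sentence "paraphrased"; not the tool's real value.
PARAPHRASE_RATIO = 0.6


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def normalize(sentence: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(sentence.lower().split())


def classify_sentences(text_a: str, text_b: str) -> list[tuple[str, str]]:
    """Label each sentence of text A as identical, paraphrased, or unique."""
    b_norm = [normalize(s) for s in split_sentences(text_b)]
    labels = []
    for sentence in split_sentences(text_a):
        a = normalize(sentence)
        if a in b_norm:
            labels.append((sentence, "identical"))
            continue
        # Best character-level similarity against any sentence in text B.
        best = max((SequenceMatcher(None, a, b).ratio() for b in b_norm), default=0.0)
        label = "paraphrased" if best >= PARAPHRASE_RATIO else "unique"
        labels.append((sentence, label))
    return labels


def verdict(similarity_pct: float) -> str:
    """Map a 0-100 similarity score to the bands described in step 2."""
    if not 0 <= similarity_pct <= 100:
        raise ValueError("similarity must be between 0 and 100")
    if similarity_pct > 70:
        return "duplicate"
    if similarity_pct >= 30:
        return "similar"
    return "unique"
```

Sentences labeled `identical` would be the first candidates for rewriting. The `paraphrased` label depends heavily on the assumed cut-off and should be reviewed by hand.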

Why duplicate content is the silent ranking killer

Duplicate content rarely earns a manual penalty, but it almost always splits ranking signals. Two pages targeting the same keyword each tend to rank lower than one consolidated page would. Google picks one version to show, and it may not be the one you would prefer.

Is duplicate content a penalty?

No formal penalty exists for unintentional duplicates — Google just picks one version to rank and ignores the others. Manual penalties only kick in for clear spam patterns: scraping at scale, doorway pages, or auto-generated content. The harm of unintentional duplicates is dilution, not punishment.

Frequently asked questions

What is duplicate content?

Content that appears at multiple URLs, either on your own site (internal duplicates) or across multiple sites (external duplicates). Google deduplicates by picking one version to rank and consolidating signals there. The non-canonical versions are usually filtered out of search results.

Will duplicate content get me penalized?

Not for unintentional duplicates — Google just picks one version to rank. Manual penalties only apply to clear spam (scraped content, auto-generated doorway pages, content syndicated at scale without permission). Your real cost is signal dilution, not punishment.

What's the duplicate content threshold?

There's no published number. Empirically, pages above ~70% similarity start being treated as duplicates by Google's near-duplicate detection. The match is calculated on shingles (overlapping word sequences), not whole-page comparison.
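To make the idea of shingles concrete, here is a small sketch that scores two texts by the overlap of their word shingles. The shingle length `k=3` and the Jaccard formula (shared shingles divided by all distinct shingles) are illustrative assumptions. Neither this tool's nor Google's actual method is documented here.

```python
import re


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping k-word sequences in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Text shorter than one shingle: treat all of its words as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity_pct(text_a: str, text_b: str, k: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, as a percentage."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        return 0.0  # two empty texts: report no similarity
    return 100.0 * len(a & b) / len(a | b)
```

For example, "the cat sat on the mat" and "the cat sat on the rug" each produce four 3-word shingles, three of which are shared. That gives five distinct shingles in total and a score of 60%.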

How do I fix duplicate content?

Three options: (1) canonical tag — both URLs stay reachable but signals consolidate; (2) 301 redirect — physically merge URLs; (3) rewrite — make the pages substantively different. Pick based on whether both URLs need to remain reachable to users.
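The choice between the three options can be written down as a small decision helper. This is only a sketch of the rule of thumb in this guide: the function and parameter names are hypothetical, and real cases often need more judgment than two yes/no questions.

```python
from html import escape


def choose_fix(both_urls_must_stay_reachable: bool, different_intents: bool) -> str:
    """Pick a fix using the rule of thumb from this guide (hypothetical helper)."""
    if different_intents:
        return "rewrite"    # both pages deserve to rank: differentiate them
    if both_urls_must_stay_reachable:
        return "canonical"  # keep both URLs, consolidate signals on one
    return "301"            # merge the duplicate into the preferred URL


def canonical_tag(preferred_url: str) -> str:
    """Render the <link> element to place in the duplicate page's <head>."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'
```

Calling `canonical_tag("https://example.com/shoes")` returns `<link rel="canonical" href="https://example.com/shoes">`; the URL is a placeholder.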

Are short identical sections (boilerplate, nav, footer) duplicate content?

No — Google ignores boilerplate (nav, footer, sidebar, legal disclaimers) when calculating duplicate content. Duplicate detection focuses on the main content area. This is why your nav being on every page doesn't make every page a duplicate.
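A rough way to approximate "main content only" before comparing two pages is to drop text nested inside boilerplate elements. The sketch below uses Python's standard `html.parser`. The tag list is an assumption chosen for illustration, and production extractors rely on richer heuristics than tag names alone.

```python
from html.parser import HTMLParser

# Assumed boilerplate containers; "aside" stands in for sidebars here.
BOILERPLATE_TAGS = {"nav", "footer", "aside", "script", "style"}


class MainTextExtractor(HTMLParser):
    """Collect text that is not nested inside a boilerplate element."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only while we are outside every boilerplate element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def main_text(html: str) -> str:
    """Return the page text with nav, footer, and similar blocks removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Running both pages through `main_text` before scoring keeps a shared menu or footer from inflating the similarity figure.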
