Content Similarity Checker
Compare two URLs and get a plagiarism score with shared-passage evidence.
Plagiarism score
Cosine TF-IDF
Word-frequency overlap. The primary plagiarism indicator.
Jaccard 5-gram
Overlap of 5-word phrases. High = long verbatim runs.
Words on page A
Words on page B
Shared passages (≥ 10 consecutive words)
No passages of 10+ consecutive matching words were found.
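One way to locate runs of 10 or more consecutive matching words is sketched below in Python. This is an assumption about how such a check could work, not the tool's actual code: the function name `shared_passages`, the tokenizer regex, and the use of `difflib.SequenceMatcher` are all illustrative choices. Note that `get_matching_blocks()` returns matches in document order, so a passage that appears in a different position relative to other shared text may be missed.

```python
import re
from difflib import SequenceMatcher


def shared_passages(text_a, text_b, min_words=10):
    """Return runs of at least min_words consecutive words found in both texts."""
    # Hypothetical tokenizer: lowercase, keep letters, digits and apostrophes.
    words_a = re.findall(r"[a-z0-9']+", text_a.lower())
    words_b = re.findall(r"[a-z0-9']+", text_b.lower())

    # autojunk=False turns off the heuristic that ignores very frequent tokens,
    # which would otherwise drop common words on long pages.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)

    passages = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_words:
            passages.append(" ".join(words_a[block.a:block.a + block.size]))
    return passages
```

Calling `shared_passages(page_a_text, page_b_text)` on two plain-text strings returns a list of matching passages, or an empty list when no run reaches the 10-word minimum.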
How the score works
- Both pages are fetched server-side and stripped to plain text
- Cosine TF-IDF measures how similar the word distributions are
- Jaccard 5-gram checks how many 5-word phrases overlap
- A score above 70% is flagged as a duplicate; above 85%, as a verbatim copy
- Shared passages of 10+ consecutive words are listed as evidence
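
The two metrics above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the tokenizer, the smoothed IDF formula, and the function names (`tokenize`, `cosine_tfidf`, `jaccard_ngrams`, `label`) are hypothetical, and the page does not say how the two metrics are combined into the final score, so the sketch leaves them separate. The "below threshold" label is also an assumption; only the 70% and 85% cut-offs come from the list above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_tfidf(tokens_a, tokens_b):
    """Cosine similarity of TF-IDF vectors built from just these two documents."""
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    vocab = set(tf_a) | set(tf_b)

    def idf(term):
        # Smoothed IDF (assumption): ln((1 + N) / (1 + df)) + 1 with N = 2.
        # Plain ln(N / df) would zero out every term shared by both pages.
        df = (term in tf_a) + (term in tf_b)
        return math.log(3 / (1 + df)) + 1

    vec_a = {t: tf_a[t] * idf(t) for t in vocab}
    vec_b = {t: tf_b[t] * idf(t) for t in vocab}
    dot = sum(vec_a[t] * vec_b[t] for t in vocab)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_ngrams(tokens_a, tokens_b, n=5):
    """Jaccard overlap of the sets of n-word phrases in each document."""
    grams_a = {tuple(tokens_a[i:i + n]) for i in range(len(tokens_a) - n + 1)}
    grams_b = {tuple(tokens_b[i:i + n]) for i in range(len(tokens_b) - n + 1)}
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


def label(score_percent):
    """Map a percentage score to the thresholds listed above."""
    if score_percent > 85:
        return "verbatim copy"
    if score_percent > 70:
        return "duplicate"
    return "below threshold"  # hypothetical label for scores of 70% or less
```

For example, `jaccard_ngrams(tokenize(a), tokenize(b))` on two nine-word sentences that share their first six words gives 2 shared 5-grams out of 8 distinct ones, or 0.25.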