A similarity score measures text overlap against a comparison database: it is a matching metric, not a plagiarism verdict. An AI-writing score measures how statistically predictable your writing is: it is a probabilistic estimate of resemblance to machine-generated text, not proof of AI use (Foltýnek et al., 2020; Liang et al., 2023). The two numbers answer completely different questions, and neither is, on its own, a finding of misconduct.
Author: MAAS Academic Skills Publishing Desk · Reviewed by a Principal Academic Mentor
Last updated: 2026-07-04
Category: writing-tips
What does a similarity score actually measure?
Direct answer: A similarity score is the percentage of your submitted text that a text-matching tool finds identical or near-identical to text already stored in its comparison database, including other student papers, journals, websites, and books. It is a lexical-overlap calculation, not an assessment of originality or intent, and it is generated the same way regardless of why the text matches.
Evidence: A large, methodologically detailed comparison of fifteen web-based text-matching systems concluded that such software cannot determine whether plagiarism has occurred; it can only flag text similarity that may constitute plagiarism, with human judgement required to interpret every match (Foltýnek et al., 2020). The University of Melbourne's academic-integrity guidance likewise treats a high similarity result as a prompt for further review, not evidence in itself, because extensive quotation that is properly cited can inflate the same percentage as uncited copying (University of Melbourne, n.d.-a).
Why it matters: Because the score is purely mechanical, a high number can be entirely innocent: a long reference list, direct quotations, or standard technical phrasing all count as "matches" even when fully and correctly cited. The companion guide on what your similarity score actually means walks through reading a report match by match.
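To make the "purely mechanical" point concrete, here is a minimal sketch of a lexical-overlap percentage. It is an illustration under stated assumptions, not how any commercial tool is implemented: the function names, the three-word matching window, and the one-sentence "database" are all hypothetical, and "Smith (2020)" is a made-up citation used only for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity_percent(submission, database, n=3):
    """Percentage of submission words covered by an n-word run found in the database."""
    words = tokenize(submission)
    if len(words) < n:
        return 0.0

    # Collect every n-word run that appears anywhere in the comparison database.
    known_runs = set()
    for document in database:
        doc_words = tokenize(document)
        for i in range(len(doc_words) - n + 1):
            known_runs.add(tuple(doc_words[i:i + n]))

    # Mark each submission word that sits inside at least one matching run.
    matched = [False] * len(words)
    for i in range(len(words) - n + 1):
        if tuple(words[i:i + n]) in known_runs:
            for j in range(i, i + n):
                matched[j] = True

    return 100.0 * sum(matched) / len(words)


# Hypothetical one-document database and a correctly cited quotation.
database = ["The quick brown fox jumps over the lazy dog."]
cited_quote = 'As Smith (2020) wrote, "the quick brown fox jumps over the lazy dog" in a fable.'
print(similarity_percent(cited_quote, database))  # 56.25: the quotation matches even though it is cited
```

The calculation never looks at quotation marks or citations, which is why a properly attributed quotation raises the percentage exactly as an uncited copy would.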
What does an AI-writing score actually measure?
Direct answer: An AI-writing indicator estimates the probability that a passage was generated or paraphrased by an AI system, based on statistical patterns in the writing itself, chiefly how predictable each word choice is (perplexity) and how much sentence length and structure vary (burstiness). It does not compare your text to any external source at all; it evaluates the internal statistical fingerprint of the writing.
Evidence: The University of Melbourne's guidance describes the AI indicator as the proportion of a submission the tool calculates as "98% or more likely" to have been generated or paraphrased by AI, an internally computed likelihood rather than a source-matched fact (University of Melbourne, n.d.-b). Because the estimate rests on a statistical model rather than a database lookup, its accuracy is inherently different from, and on current evidence less stable than, that of a matching calculation (Weber-Wulff et al., 2023).
Why it matters: A similarity score can be verified against a specific source. An AI-writing score cannot be verified the same way: there is no underlying document to point to, only a probability the model assigns to a pattern. That distinction is the reason the two numbers cannot be read against the same mental yardstick.
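The two signals named in the direct answer above can be sketched in a few lines. This is a toy illustration only: real detectors use large neural language models whose internals are not public, whereas this sketch assumes a simple word-frequency model with add-one smoothing for perplexity and uses the spread of sentence lengths as a stand-in for burstiness. All function names and sample texts are hypothetical.

```python
import math
import re
import statistics
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def unigram_perplexity(text, reference_text):
    """Perplexity of `text` under a word-frequency model built from `reference_text`.

    Lower values mean each word was easier for the model to predict.
    """
    counts = Counter(tokenize(reference_text))
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for words never seen before
    words = tokenize(text)
    if not words:
        raise ValueError("text contains no words")
    # Add-one smoothing gives unseen words a small non-zero probability.
    log_prob = sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in words)
    return math.exp(-log_prob / len(words))


def burstiness(text):
    """Population standard deviation of sentence lengths, measured in words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(tokenize(s)) for s in sentences]
    if not lengths:
        raise ValueError("text contains no sentences")
    return statistics.pstdev(lengths)


reference = "the cat sat on the mat the cat sat on the mat"
predictable = "the cat sat on the mat"
surprising = "quantum zebras juggle violet umbrellas"
print(unigram_perplexity(predictable, reference) < unigram_perplexity(surprising, reference))

uniform = "I like tea. I like rice. I like soup."
varied = "Stop. The results of the second experiment surprised everyone in the lab."
print(burstiness(uniform), burstiness(varied))
```

Neither function consults an external source. Both numbers are computed from the text alone plus a model of what "typical" language looks like, which is why the result is a probability rather than a match you can point to.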
Why are the two scores computed in completely different ways?
Direct answer: Similarity checking is a retrieval-and-comparison problem: the system searches a reference collection for candidate sources, then aligns your text against them to measure overlap. AI detection is a classification problem: the system has no reference document to search for at all, so it scores the internal statistical texture of your own sentences against patterns learned from known human and machine writing. One method looks outward at other documents; the other looks inward at your writing style.
Evidence: Systematic reviews of plagiarism-detection research describe text-matching as a two-stage process of candidate retrieval followed by detailed textual alignment against a large reference corpus (Foltýnek et al., 2020). AI-text detectors, by contrast, rely on language-model-derived signals such as perplexity and burstiness rather than any corpus comparison. Independent testing has found that their accuracy drops sharply once text is even lightly edited (Weber-Wulff et al., 2023). That particular vulnerability has no direct equivalent in similarity matching, which does not depend on modelling "how AI writes."
Why it matters: Because the mechanisms do not overlap, a document can score high on one metric and low on the other for entirely unrelated reasons. A heavily quoted, meticulously cited literature review can carry a high similarity score and a low AI score. A short, plain, evenly structured paragraph a student wrote from scratch can carry a near-zero similarity score and still trigger an AI flag.
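The two-stage matching process described in the evidence above can be sketched as follows. This is a simplified illustration, not a description of any real product: it assumes word-set (Jaccard) overlap for the retrieval stage and Python's standard-library `difflib.SequenceMatcher` for the alignment stage, and the function names, corpus, and three-word minimum are hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def retrieve_candidates(submission, corpus, top_k=2):
    """Stage 1: rank corpus documents by shared vocabulary (Jaccard overlap)."""
    sub_words = set(tokenize(submission))
    scored = []
    for name, document in corpus.items():
        doc_words = set(tokenize(document))
        union = sub_words | doc_words
        score = len(sub_words & doc_words) / len(union) if union else 0.0
        scored.append((score, name))
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]


def align(submission, source, min_words=3):
    """Stage 2: list the word runs the submission shares with one source."""
    a = tokenize(submission)
    b = tokenize(source)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        " ".join(a[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]


# Hypothetical two-document reference corpus.
corpus = {
    "fable": "The quick brown fox jumps over the lazy dog.",
    "recipe": "Simmer the rice gently for twenty minutes.",
}
submission = "My essay notes that the quick brown fox jumps over the lazy dog every day."

for name in retrieve_candidates(submission, corpus, top_k=1):
    print(name, align(submission, corpus[name]))
```

Every match this process reports comes with a named source and the exact overlapping words, so a reader can check it. An AI-writing indicator has no equivalent of the `corpus` argument, which is the practical meaning of "looks inward" in the direct answer above.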
What does a high score on either metric NOT mean?
Direct answer: Neither score is proof of misconduct. A high similarity result frequently reflects correctly quoted and cited material rather than copying. A high AI-writing result frequently reflects genuine human writing that happens to be unusually uniform, a pattern that is common in careful academic or second-language prose, not exclusive to machine text.
Evidence: Academic-integrity guidance is explicit that a high similarity result "should not, in and of itself, be taken as evidence that an assessment offence has occurred," since extensive properly cited quotation produces the same headline number as uncited copying (University of Melbourne, n.d.-a). On the AI side, the same guidance instructs staff to weigh whether a student's writing style tends to be "regular, routine or formulaic," because that quality raises the likelihood of a false positive rather than indicating AI use (University of Melbourne, n.d.-b). A peer-reviewed study of several widely used AI detectors found they misclassified authentic essays written by non-native English speakers as AI-generated at far higher rates than essays by native speakers (Liang et al., 2023). A likely reason is that the plain, uniform style many second-language writers are taught lowers exactly the perplexity signal these detectors read as a marker of machine text.
Why it matters: For ESL and Vietnamese students in particular, the writing habits that make second-language prose clear and grammatically safe (shorter sentences, simpler vocabulary, consistent structure) are the same habits that statistically resemble AI output. That is a documented limitation of the detection method, not a reflection of how the essay was actually produced. The companion guide on why original writing gets flagged as AI explains this mechanism in more depth.
How are institutions advised to treat each score?
Direct answer: As a starting point for human judgement, not an automatic verdict. Both similarity and AI-writing results are meant to prompt a closer look by a person, and academic-integrity procedures generally require corroborating evidence before any finding of misconduct is made on either metric.
Evidence: University of Melbourne staff guidance states plainly that "a high AI detection score alone does not constitute grounds for making an allegation of academic misconduct," and lists additional evidence staff should seek, such as inconsistencies with a student's earlier work or unusual file metadata, before proceeding (University of Melbourne, n.d.-b). The same principle applies to similarity results, which the guidance says must be "balanced with further evidence" rather than treated as conclusive on their own (University of Melbourne, n.d.-a). Independent testing of detection tools reached a parallel conclusion from the research side: automated tools can support an investigation, but they cannot themselves determine whether misconduct occurred (Foltýnek et al., 2020; Weber-Wulff et al., 2023).
Why it matters: If your institution's process is working as designed, a number on either report should trigger a conversation, not a punishment. Knowing that in advance can make the process feel less like an accusation and more like what it is meant to be: a check.
What should you do if you are flagged on either score?
Direct answer: Respond calmly, ask exactly what was reported, and be ready to show your process rather than argue with the number. For a similarity flag, review the matches yourself and identify which are quotations, references, or standard phrasing versus genuine uncited overlap. For an AI flag, gather evidence of authorship, such as draft history, notes, outlines, and annotated sources, since that evidence speaks to a question the statistical score cannot answer.
Evidence: Guidance for both types of flag treats them as an invitation to investigate rather than a finding, and explicitly allows students to explain their process, provide drafts, or discuss how the work was produced before any allegation is made (University of Melbourne, n.d.-a, n.d.-b). Because AI-detection accuracy is known to be inconsistent, particularly for non-native English writers, having a documented drafting trail is the strongest and most direct way to answer a flag that a statistical score cannot resolve on its own (Liang et al., 2023; Weber-Wulff et al., 2023).
Why it matters: Understanding which score you were flagged on changes what evidence is actually relevant. A similarity flag is answered by pointing to citations. An AI flag is answered by showing how the work was written. Confusing the two wastes time you could spend addressing the actual question being asked.
MAAS SUPPORTS UNDERSTANDING, NOT SHORTCUTS
If a similarity or AI-writing report has left you unsure what it actually means for your assignment, academic integrity support at MAAS can help you read the result correctly, distinguishing genuine issues from artefacts of how each score is calculated, and pointing you toward what to fix versus what is already fine. A mentor can ask questions, point out where a citation is missing, and give feedback on your reasoning, but the work stays yours: MAAS advises and reviews, it does not write, rewrite, or submit anything on a student's behalf.
Frequently asked questions
Can a document have a high similarity score and a low AI score at the same time?
Yes, and it is common. A heavily quoted, carefully cited literature review can show a high similarity percentage from matched quotations and references while scoring low on AI-writing likelihood, because the two metrics measure unrelated things (Foltýnek et al., 2020).
Can a document have a low similarity score and a high AI score?
Yes. Text can be entirely original in the sense that it matches no existing source, and still read as statistically uniform enough to trigger an AI-writing indicator, since that indicator does not check for matching sources at all (University of Melbourne, n.d.-b).
Is one score more reliable than the other?
They are unreliable in different ways. Similarity matching can miss paraphrased plagiarism and can also flag properly cited material as a "match." AI detection can misclassify genuine human writing, and its accuracy is documented to fall further once text is lightly edited (Weber-Wulff et al., 2023). Neither is designed to be read as a standalone verdict.
Why do non-native English writers get flagged on the AI score more often?
Because writing habits commonly taught to second-language writers, such as shorter sentences, simpler vocabulary, and consistent structure, statistically resemble the low-variation pattern AI detectors associate with machine text. A peer-reviewed study found significantly higher false-positive rates for non-native English writers than for native speakers (Liang et al., 2023).
Does a citation fix an AI-writing flag the way it fixes a similarity match?
No. A citation addresses a similarity match because similarity is about attributing matched text to its source. An AI-writing flag is not about a missing citation; it is a statistical estimate about how the text was produced, so the relevant response is evidence of your own drafting process, not a reference list.
Should I try to change my writing style to avoid an AI-writing flag?
Do not distort your work to "beat" a detector. Focus on genuine authorship and keep the natural evidence of your process, such as draft versions and notes. Institutions treat both scores as a prompt for a human conversation, not a target to optimise against.
Can MAAS interpret my similarity and AI-writing reports for me?
Yes. A MAAS mentor can walk through both reports with you, explain what each number is and is not measuring, and help you identify what genuinely needs addressing in your own work, without writing or altering the work itself.
References
- Foltýnek, T., Dlabolová, D., Anohina-Naumeca, A., Razı, S., Kravjar, J., Kamzola, L., Guerrero-Dib, J., Çelik, Ö., & Weber-Wulff, D. (2020). Testing of support tools for plagiarism detection. International Journal of Educational Technology in Higher Education, 17, Article 46. https://doi.org/10.1186/s41239-020-00192-4
- Liang, W., Yuksekgonul, M., Mao, Y., Wu, E., & Zou, J. (2023). GPT detectors are biased against non-native English writers. Patterns, 4(7), Article 100779. https://doi.org/10.1016/j.patter.2023.100779
- University of Melbourne. (n.d.-a). Advice for students regarding Turnitin and AI writing detection. Academic Integrity. Retrieved July 4, 2026, from https://academicintegrity.unimelb.edu.au/plagiarism-and-collusion/advice-for-students-regarding-turnitin-and-ai-writing-detection
- University of Melbourne. (n.d.-b). Turnitin's AI writing detection tool. Academic Integrity. Retrieved July 4, 2026, from https://academicintegrity.unimelb.edu.au/staff-resources/turnitins-ai-writing-detection-tool
- Weber-Wulff, D., Anohina-Naumeca, A., Bjelobaba, S., Foltýnek, T., Guerrero-Dib, J., Popoola, O., Šigut, P., & Waddington, L. (2023). Testing of detection tools for AI-generated text. International Journal for Educational Integrity, 19, Article 26. https://doi.org/10.1007/s40979-023-00146-z
This article is academic-integrity guidance and does not replace your own work or your institution's policy. MAAS coaches students to produce genuinely their own work through the Outline → Draft → Final model; we do not write, submit, or disguise work on a student's behalf.
