
How AI Detection Tools Work — and What They Actually Detect

By the Humanizor Team · January 8, 2025 · 6 min read

AI detection tools are everywhere now — Turnitin, GPTZero, Originality.ai, and dozens more. But how do they actually work? What do they measure? And how accurate are they really? This article cuts through the hype with a clear, honest look at the technology.

The two core signals: perplexity and burstiness

Most AI detection tools rely on two core statistical signals derived from language model research.

Perplexity

Perplexity measures how "surprising" a piece of text is to a language model. When an AI writes text, it naturally chooses high-probability, low-surprise word sequences — the words that statistically fit best in context. This produces low-perplexity text. Human writers, by contrast, make unexpected word choices, use unusual structures, and take creative risks — producing higher-perplexity text overall.

Detection tools measure this: low perplexity suggests AI authorship; high perplexity suggests human authorship.
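As a rough illustration of the perplexity signal, the sketch below scores two sentences under a toy unigram model with add-one smoothing. Real detectors score text with a large neural language model; the `UnigramModel` class, the tiny reference corpus, and the sample sentences are all assumptions made for this example.

```python
import math
from collections import Counter


class UnigramModel:
    """Toy language model: word probabilities from counts, with add-one smoothing."""

    def __init__(self, corpus: str):
        words = corpus.lower().split()
        self.counts = Counter(words)
        self.total = len(words)
        # The extra +1 reserves probability mass for words never seen in the corpus.
        self.vocab_size = len(self.counts) + 1

    def prob(self, word: str) -> float:
        return (self.counts[word.lower()] + 1) / (self.total + self.vocab_size)


def perplexity(model: UnigramModel, text: str) -> float:
    """Exponential of the average negative log-probability per word."""
    words = text.split()  # naive whitespace tokenization, no punctuation handling
    if not words:
        raise ValueError("text must contain at least one word")
    avg_neg_log_prob = -sum(math.log(model.prob(w)) for w in words) / len(words)
    return math.exp(avg_neg_log_prob)


# Hypothetical reference corpus standing in for a real model's training data.
model = UnigramModel("the cat sat on the mat and the dog sat on the rug")

predictable = "the cat sat on the mat"
surprising = "quantum marmalade sat beneath the mat"

print(perplexity(model, predictable))
print(perplexity(model, surprising))
```

The second sentence contains words the model has never seen, so it scores a higher perplexity than the first. A detector built on this signal would treat the lower-scoring sentence as more "AI-like".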

Burstiness

Burstiness refers to variation in sentence length and complexity. Human writing is "bursty" — we write long complex sentences, then short punchy ones, then medium ones. AI writing tends to be uniform: each sentence is roughly the same length and complexity, creating low burstiness. Detectors measure this variation (or lack of it) as an additional signal.
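One simple way to put a number on burstiness, assumed here purely for illustration, is the coefficient of variation of sentence lengths: the standard deviation divided by the mean. Commercial detectors may define the signal differently; the `burstiness` function and the naive sentence splitter below are hypothetical.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Word count per sentence, splitting naively on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length (0.0 means perfectly uniform)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # variation is undefined for fewer than two sentences
    return statistics.pstdev(lengths) / statistics.mean(lengths)


uniform = (
    "The model writes a sentence. The model writes another one. "
    "The model writes one more."
)
varied = (
    "Stop. After a long afternoon of rewriting the same paragraph, "
    "I finally gave up and went outside. It helped."
)

print(burstiness(uniform))
print(burstiness(varied))
```

The uniform sample has three five-word sentences and scores zero, while the varied sample mixes a one-word sentence with a sixteen-word one and scores much higher.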

The major AI detection tools compared

| Tool | Primary audience | Method | Reported accuracy |
|---|---|---|---|
| GPTZero | Educators | Perplexity + burstiness scoring | ~85% on clean AI text |
| Turnitin AI | Universities | Proprietary ML + similarity | ~98% claimed, debated |
| Originality.ai | Publishers, SEO | Fine-tuned detection model | ~94% on GPT-4 content |
| Copyleaks | Enterprises | Multi-model detection | ~99% claimed |
| ZeroGPT | General public | Text analysis heuristics | Inconsistent in testing |

Note: Accuracy figures are self-reported or from limited testing. Real-world performance on edited or humanized text is significantly lower for all tools.

The false positive problem

This is the most important and least-discussed issue with AI detection: false positives are common and consequential.

Research has consistently shown that certain types of human writing score very high on AI detection tools, not because they were AI-generated but because they happen to share stylistic traits with AI output.

A 2023 study found that GPTZero flagged over 50% of essays written by non-native English speakers as AI-generated. This is a serious problem when detection results are used to make academic misconduct decisions.
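To see why false positives matter even when a detector's error rate sounds low, it helps to ask what share of flagged essays were actually written by humans. The sketch below applies Bayes' rule. Every number in it (the share of AI-written submissions, the detection rate, the false positive rate) is a hypothetical input chosen for illustration, not a measured figure for any tool.

```python
def share_of_flags_that_are_human(
    ai_share: float, detection_rate: float, false_positive_rate: float
) -> float:
    """Fraction of flagged texts that were written by humans (Bayes' rule)."""
    flagged_ai = ai_share * detection_rate
    flagged_human = (1.0 - ai_share) * false_positive_rate
    flagged_total = flagged_ai + flagged_human
    if flagged_total == 0:
        raise ValueError("no texts would be flagged with these inputs")
    return flagged_human / flagged_total


# Hypothetical class: 10% of essays are AI-written, the detector catches 90%
# of those, and it wrongly flags 5% of human-written essays.
wrongly_accused = share_of_flags_that_are_human(0.10, 0.90, 0.05)
print(f"{wrongly_accused:.0%} of flagged essays are human-written")
```

Under these assumed inputs, roughly one flagged essay in three belongs to a student who did not use AI, because human-written essays far outnumber AI-written ones.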

"AI detectors don't detect AI — they detect writing that resembles AI. That's a crucial distinction, and most institutions aren't making it clearly enough."

What makes text harder to detect?

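Several factors consistently make AI-generated text harder for detection tools to catch. The ones this article touches on are meaningful human editing, humanizer tools that shift perplexity and burstiness, and output from newer model generations.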

The arms race problem

AI detection is fundamentally an arms race that detection tools are losing. As language models improve, their output naturally becomes harder to distinguish from human writing. Each new model generation produces more varied, nuanced text — and the detection tools trained on older model outputs are less accurate on new ones.

OpenAI released its own AI classifier in January 2023 and retired it in July of that year, citing its low rate of accuracy. Anthropic has not released a detection tool, citing similar concerns about accuracy.

What this means for writers

If you're a writer using AI tools for any purpose — whether that's drafting, editing, research, or brainstorming — the most important things to know are:

  1. Detection tools are imperfect and produce false positives, especially for non-native speakers
  2. No tool can prove that text is AI-generated — only that it shares statistical properties with AI output
  3. Editing your AI drafts meaningfully reduces detection scores and, more importantly, improves writing quality
  4. Humanizer tools change the statistical properties that detectors measure, primarily perplexity and burstiness

Our recommendation: Use AI tools to accelerate and assist your writing, then genuinely edit and personalise the output. This produces better writing than either pure AI or pure manual effort, and it reflects your actual voice.

