ICML Diversity Boosts AI-Generated Text Detection

Poster in Workshop: DIG-BUGS: Data in Generative Models (The Bad, the Ugly, and the Greats)

Diversity Boosts AI-Generated Text Detection

Advik Basani · Pin-Yu Chen

Keywords: [ llms ] [ ai text detection ] [ interpretability ] [ zero-shot ]

Sat 19 Jul 3 p.m. PDT — 3:45 p.m. PDT

Abstract:

Detecting AI-generated text is increasingly important to prevent misuse in education, journalism, and social media, where synthetic fluency can obscure misinformation. Existing detectors often rely on likelihood heuristics or black-box classifiers, which struggle with high-quality outputs and lack interpretability. We propose DivEye, a novel detection framework that leverages surprisal-based features to capture fluctuations in lexical and structural unpredictability, a signal more prominent in human-authored text. DivEye outperforms existing zero-shot detectors by up to 33.2%, matches fine-tuned baselines, and boosts existing detectors by up to 18.7% when used as an auxiliary signal. DivEye is also robust to paraphrasing and adversarial attacks, generalizes across domains, and offers interpretable insights into rhythmic unpredictability as a key indicator of AI-generated text.
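The abstract does not spell out which features DivEye computes, so the following is only a rough sketch of what "surprisal-based features" that capture fluctuations in unpredictability might look like. It assumes per-token probabilities have already been obtained from some language model; the function name `surprisal_features` and the three statistics chosen here are illustrative assumptions, not the authors' feature set.

```python
import math
from statistics import mean, pvariance


def surprisal_features(token_probs):
    """Summarize how much per-token surprisal fluctuates across a text.

    token_probs: probabilities a language model assigned to each observed
    token (hypothetical input; obtaining them is outside this sketch).
    """
    if len(token_probs) < 2:
        raise ValueError("need at least two token probabilities")
    # Surprisal of each token in bits: -log2 p(token | context).
    surprisals = [-math.log2(p) for p in token_probs]
    # First differences capture token-to-token jumps in unpredictability.
    deltas = [b - a for a, b in zip(surprisals, surprisals[1:])]
    return {
        "mean": mean(surprisals),
        "variance": pvariance(surprisals),
        "mean_abs_delta": mean(abs(d) for d in deltas),
    }


# Uniformly predictable tokens versus tokens with uneven predictability.
flat = surprisal_features([0.5, 0.5, 0.5, 0.5])
bursty = surprisal_features([0.5, 0.25, 0.5, 0.125])
```

In this toy comparison the second sequence yields a higher variance and larger token-to-token jumps than the first, which is the kind of signal the abstract describes as more prominent in human-authored text. A real detector would feed such statistics to a classifier or threshold, a step the abstract leaves unspecified.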