Hitting a Moving Target: Test-Time Adaptation for AI Text Detection under Continual Distribution Shift

arXiv cs.CL·Kevin Ren, Manish Raghavan, Nikhil Garg

6/25/2026

Quick Take

The proposed test-time adaptation (TTA) approach keeps AI text detection robust under distribution shift: it detects 90.5% of the authors' adversarial AI-generated text, compared with 24.1% for the commercial detector Pangram. The method uses semi-supervised learning to exploit inference-time homogeneity among unlabeled samples. This addresses a weakness of supervised detectors, which systematically fail when human or AI-generated writing shifts after deployment. The code is publicly available.

Key Points

Test-time adaptation (TTA) exploits homogeneity among unlabeled samples observed at inference time.
State-of-the-art supervised detectors systematically fail under both adversarial and natural distribution shifts.
The commercial model Pangram detects only 24.1% of the authors' adversarial AI-generated text.
TTA detects 90.5% of the same text.
Code for training, evaluation, and plots is publicly available on GitHub.

Article Content

From source RSS / original summary

arXiv:2606.25152v1

Abstract: Deployed approaches for AI text detection often rely on training-time access to labeled datasets of both human-written and AI-generated text. This approach is vulnerable to three types of distribution shifts that occur continually post-deployment, and for which labeled data is often unavailable: adversarial humanization, new LLMs being released, and temporal drift in human writing.

Simultaneously, existing approaches do not leverage a key signal of LLM usage: inference-time homogeneity. We propose a test-time adaptation (TTA) approach, using semi-supervised learning, that adapts to distribution shifts by leveraging homogeneity among unlabeled samples observed at inference time.
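As a rough illustration of the idea (not the authors' algorithm, which the abstract does not spell out), the sketch below uses a generic self-training loop with a nearest-centroid classifier. The two-dimensional feature vectors, the function names, and the toy data are all assumptions made for this example. A static detector fit on labeled data misses most of a shifted AI cluster, while re-estimating the class centroids from pseudo-labeled test samples recovers it, because the shifted AI samples sit close to one another.

```python
from math import dist

def centroid(points):
    """Mean of a non-empty list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def predict(points, human_c, ai_c):
    """Label each point 1 (AI) if it is closer to the AI centroid, else 0 (human)."""
    return [1 if dist(p, ai_c) < dist(p, human_c) else 0 for p in points]

def adapt(test_points, human_c, ai_c, max_iters=10):
    """Self-training on an unlabeled batch: pseudo-label, refit centroids, repeat."""
    labels = predict(test_points, human_c, ai_c)
    for _ in range(max_iters):
        humans = [p for p, y in zip(test_points, labels) if y == 0]
        ais = [p for p, y in zip(test_points, labels) if y == 1]
        # Keep the previous centroid if a class received no pseudo-labels.
        if humans:
            human_c = centroid(humans)
        if ais:
            ai_c = centroid(ais)
        new_labels = predict(test_points, human_c, ai_c)
        if new_labels == labels:  # pseudo-labels stopped changing
            break
        labels = new_labels
    return labels

# Hypothetical labeled training features (toy values, not from the paper).
train_human = [(-0.5, 0.0), (0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]
train_ai = [(3.5, 0.0), (4.5, 0.0), (4.0, 0.5), (4.0, -0.5)]
human_c, ai_c = centroid(train_human), centroid(train_ai)

# Unlabeled test batch: 4 human samples, then 6 AI samples whose features
# have shifted away from the training AI cluster but remain tightly grouped.
test_human = [(-0.4, 0.2), (0.3, -0.3), (0.1, 0.4), (-0.2, -0.1)]
test_ai = [(1.4, 2.0), (1.6, 2.2), (1.8, 1.9), (1.9, 2.1), (2.1, 2.0), (2.3, 2.2)]
batch = test_human + test_ai

static_labels = predict(batch, human_c, ai_c)
adapted_labels = adapt(batch, human_c, ai_c)

print("static detector flags", sum(static_labels[4:]), "of 6 shifted AI samples")
print("adapted detector flags", sum(adapted_labels[4:]), "of 6 shifted AI samples")
```

In this toy setup the adaptation step only works because a few shifted samples are still pseudo-labeled correctly at the start; a practical system would need real text features and safeguards against pseudo-label drift.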

Empirically, we find that state-of-the-art supervised detectors systematically fail when they encounter distribution shifts in AI-generated and human writing, both adversarial and natural, while test-time adaptation with semi-supervised learning is largely robust; e.g., the commercial model Pangram detects just 24.1% of our adversarial AI-generated text, compared to 90.5% for our test-time approach. We establish that test-time adaptation is a promising framework for AI text detection in the wild.

We publicly release our code (which includes code for model training, evaluation, and plots) at https://github.com/kkr36/llm_detection.
