Gaming AI-Assisted Peer Reviews Poses New Risks to the Scientific Community
Quick Answer
This paper shows that AI-assisted peer review, tested with Gemini 3 Flash and GPT 5.4 Mini reviewers, can be manipulated by superficially rephrasing a manuscript's abstract. The strongest attack succeeds about 38% of the time and raises acceptance ratings by up to +1.31 on a 10-point scale.
Quick Take
Rephrasing an abstract, without changing the science behind it, is enough to inflate AI review scores. Because inflated AI reviews may bias editorial decisions towards acceptance, the finding raises concerns about the integrity of scientific evaluation and underscores the need for robustness testing and human oversight of AI review tools.
Key Points
- AI-mediated peer review can be manipulated by simple abstract rephrasing.
- Acceptance ratings rose by +1.31 for Gemini 3 Flash reviewers and +0.88 for GPT 5.4 Mini reviewers on a 10-point scale.
- The attack takes only about 5 minutes and $1 for a 10-page AI conference submission.
- The success rate exceeds 50% when the original AI review suggests 'reject'.
- AI tools should not be viewed as neutral without proper safeguards.
Article Content
From the source RSS summary (arXiv:2606.10159v1). Abstract: AI is increasingly used to support scientific peer review, from manuscript screening and reviewer assistance to editorial triage. Although such systems promise to reduce reviewer burden and accelerate publication, their robustness to strategic manipulation remains poorly understood. Here we show that AI-mediated peer review is vulnerable to a simple, low-cost manipulation: superficial rephrasing of the manuscript abstract.
Without changing the underlying scientific content and communication, and even without knowledge of the reviewing model, adversarially rewritten abstracts substantially improve AI review outcomes. We see this across disciplines and publication venues, for both human-written and AI-generated papers. Our strongest attack achieves an attack-success-rate of about 38%, increasing acceptance ratings by +1.31 for Gemini 3 Flash reviewers and by +0.88 for GPT 5.4 Mini reviewers on a 10-point scale.
When the original AI review suggests 'reject', the success rate rises to more than 50%. This effect extends beyond overall score inflation, increasing review confidence and scores on core scientific criteria such as soundness, significance and perceived contribution. The attack is practical, requiring only about 5 minutes and $1 for a 10-page AI conference submission, and is hard to distinguish from ordinary scientific editing.
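The metrics quoted above (mean rating shift, overall success rate, and success rate among papers first rated 'reject') can be illustrated with a minimal sketch. The abstract does not spell out how an attack "success" is counted, so the code assumes one plausible criterion: a paper originally scored below an acceptance cutoff is scored at or above it after the abstract is rewritten. The cutoff value, the function name `attack_metrics`, and the toy scores are all hypothetical and are not taken from the paper.

```python
from statistics import mean

ACCEPT_THRESHOLD = 6.0  # assumed cutoff on a 10-point scale; not from the paper


def attack_metrics(original, attacked, threshold=ACCEPT_THRESHOLD):
    """Summarize how rewritten abstracts shift AI review scores.

    original, attacked: equal-length lists of scores for the same papers,
    before and after the abstract is rephrased.
    """
    if len(original) != len(attacked) or not original:
        raise ValueError("need two non-empty, equal-length score lists")

    pairs = list(zip(original, attacked))

    # Mean change in rating across all papers.
    mean_shift = mean(a - o for o, a in pairs)

    # Papers the AI reviewer originally scored below the cutoff ("reject").
    rejected = [(o, a) for o, a in pairs if o < threshold]

    # Assumed success criterion: a rejected paper crosses the cutoff.
    flipped = sum(1 for o, a in rejected if a >= threshold)

    return {
        "mean_shift": mean_shift,
        "success_rate_all": flipped / len(pairs),
        "success_rate_given_reject": flipped / len(rejected) if rejected else None,
    }


# Toy scores invented for illustration; these are not results from the paper.
original_scores = [4.0, 5.0, 5.5, 6.5, 7.0]
attacked_scores = [5.0, 6.0, 6.5, 7.0, 7.0]
metrics = attack_metrics(original_scores, attacked_scores)
print(metrics)
```

Conditioning on originally rejected papers shrinks the denominator to the cases where a flip is possible under this criterion, which is one reason a conditional success rate can be noticeably higher than the overall one.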
Inflated AI reviews could bias downstream human decision-making, shifting editorial recommendations from rejection towards acceptance. These findings reveal a general vulnerability in AI-assisted scientific evaluation: when AI-generated reviews influence editorial decisions, authors may be incentivized to optimize manuscripts for AI judgment rather than scientific merit.
Our results suggest that AI tools should not be treated as neutral evaluators in high-stakes peer review without systematic robustness testing, transparent safeguards and careful human oversight.