Detecting undisclosed LLM-generated content in parliamentary texts
Quick Answer
This study reveals a rising trend of undisclosed LLM-generated content in UK and Swedish parliamentary texts since 2022, raising concerns about transparency.
Quick Take
To estimate the extent of undisclosed AI use, the authors trained an interpretable text classifier on parliamentary texts. The findings point to a need for clearer disclosure guidelines in parliamentary writing.
Key Points
- Interpretable text classifier trained on pre-LLM and LLM-generated parliamentary texts.
- Steady increase in undisclosed LLM usage observed from 2022 onwards.
- Study emphasizes the need for transparency in parliamentary writing.
- Guidelines on AI disclosure in parliamentary texts are currently vague.
- Research conducted on texts from the UK and Swedish parliaments.
Article Excerpt
From source RSS / original summary
arXiv:2606.14209v1
Abstract: In this paper, we evaluate the extent of undisclosed LLM-generated content in texts from the parliaments of the United Kingdom and Sweden. In many areas, such as in journalism or in academic writing, there are often requirements to clearly disclose whether AI tools, such as LLMs, have been used. In the case of parliamentary texts, the guidelines on disclosure of AI use are more vague.
However, in order to maintain transparency and retain public trust, it is generally recommended that parliamentarians should state whether or not they have used AI when writing texts, such as parliamentary motions. Here, we train an interpretable (glass-box) text classifier using pre-LLM parliamentary texts and LLM-generated versions of such texts.
We then apply the classifier to a test set containing recent parliamentary texts, finding a steady increase in undisclosed LLM use, in both parliaments, from 2022 onwards.
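The train-then-apply workflow described in the abstract can be illustrated with a small sketch. The abstract does not say which glass-box model or features the authors used, so the multinomial Naive Bayes model below is an assumption chosen because every word's contribution to a decision can be inspected directly. The function names (`train_log_odds`, `score`, `flagged_share_by_year`), the zero threshold, and all example sentences are hypothetical and invented for illustration; none of them come from the paper or its data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_log_odds(human_texts, llm_texts, alpha=1.0):
    """Fit a two-class multinomial Naive Bayes model with Laplace smoothing.

    Returns (weights, bias). weights[w] is the log-odds contribution of
    word w towards the 'LLM' class, so each decision can be read word by word.
    """
    human = Counter(t for doc in human_texts for t in tokenize(doc))
    llm = Counter(t for doc in llm_texts for t in tokenize(doc))
    vocab = set(human) | set(llm)
    h_total = sum(human.values()) + alpha * len(vocab)
    l_total = sum(llm.values()) + alpha * len(vocab)
    weights = {
        w: math.log((llm[w] + alpha) / l_total)
        - math.log((human[w] + alpha) / h_total)
        for w in vocab
    }
    bias = math.log(len(llm_texts) / len(human_texts))  # class prior log-odds
    return weights, bias


def score(text, weights, bias):
    """Log-odds that a text is LLM-generated; unseen words contribute 0."""
    return bias + sum(weights.get(t, 0.0) for t in tokenize(text))


def flagged_share_by_year(dated_texts, weights, bias, threshold=0.0):
    """Fraction of texts per year whose score exceeds the threshold."""
    totals, flagged = Counter(), Counter()
    for year, text in dated_texts:
        totals[year] += 1
        if score(text, weights, bias) > threshold:
            flagged[year] += 1
    return {year: flagged[year] / totals[year] for year in sorted(totals)}


# Toy training data (invented): pre-LLM texts vs. LLM-written counterparts.
human_texts = [
    "I beg to move that this House notes the closure of rural bus routes.",
    "The committee asks the minister what steps are being taken on flooding.",
]
llm_texts = [
    "This motion underscores the crucial need to foster a robust and holistic framework.",
    "It is crucial to underscore the pivotal role of a comprehensive framework.",
]
weights, bias = train_log_odds(human_texts, llm_texts)

# Toy test set (invented): (year, text) pairs standing in for recent texts.
dated_texts = [
    (2021, "The minister asks the committee about rural bus routes."),
    (2023, "It is crucial to foster a robust framework."),
    (2023, "I beg to move that this House notes flooding."),
]
shares = flagged_share_by_year(dated_texts, weights, bias)
print(shares)  # {2021: 0.0, 2023: 0.5}
```

Sorting `weights` by value shows which words push a text towards the LLM class, which is the kind of inspection a glass-box model allows. A real study would need far more data, held-out validation, and a calibrated threshold before any per-year share could be interpreted.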