Decomposing how prompting steers behavior

arXiv cs.AI·Fan L. Cheng, Nikolaus Kriegeskorte

6/3/2026

Quick Take

This study introduces a nested geometric decomposition framework for analyzing how prompts change the internal representations of LLMs and VLMs. Shape-preserving maps (translation, and rigid transformation with uniform scaling) capture much of the prompt-induced activation change and already improve behavioral agreement, but affine transformation is the first tier to nearly recover target-prompt task geometry, with corresponding behavioral gains.

Key Points

Introduces a nested geometric decomposition framework for analyzing prompting effects.
Demonstrates that prompts reshape representations toward instructed task structures.
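Translation and rigid transformation with uniform scaling capture much of the prompt-induced activation change and already improve behavioral agreement.
Affine transformation is the first tier to nearly recover target-prompt task geometry, with corresponding behavioral gains.
Study spans three LLMs, three VLMs, and six text or image datasets covering style, emotion, scene content, and number.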

Article Content

From source RSS / original summary

arXiv:2606.03093v1. Abstract: Prompting steers large language models (LLMs) and vision-language models (VLMs) without weight updates, but it remains unclear how instruction changes reshape internal representations to produce behavior. We introduce a nested geometric decomposition framework that treats prompting as a transformation of the representational geometry of the content following the prompt.

For each prompt pair, we align representations of the same stimuli under two prompts using increasingly expressive stimulus-invariant maps: translation, rigid transformation with uniform scaling, sequential axis scaling, affine transformation, and nonlinear transformation. We then causally test each map by replacing a single layer's prompt-A hidden state for held-out stimuli with its mapped counterpart and measuring recovery of prompt-B representational geometry and behavior.
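The causal test described above can be illustrated with a toy sketch. It is not the authors' implementation: the hidden states are synthetic NumPy arrays rather than model activations, the "rest of the model" is a single hypothetical readout matrix `W_out`, and only the affine tier is fitted. The names `H_a`, `H_b`, `fit_affine`, and `behavior` are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d, n_classes = 300, 100, 16, 3

# Synthetic stand-ins for one layer's hidden states (assumption: not real activations).
H_a = rng.normal(size=(n_train + n_test, d))            # stimuli under prompt A
M = np.eye(d) + 0.5 * rng.normal(size=(d, d))           # cross-dimensional mixing
b = rng.normal(size=d)                                  # shift shared by all stimuli
H_b = H_a @ M + b + 0.01 * rng.normal(size=H_a.shape)   # same stimuli under prompt B

W_out = rng.normal(size=(d, n_classes))                 # hypothetical downstream readout

def behavior(H):
    """Toy 'behavior': the class the downstream readout picks for each stimulus."""
    return (H @ W_out).argmax(axis=1)

def fit_affine(X, Y):
    """Least-squares stimulus-invariant affine map, Y ~ X @ W + bias."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Xa, Y, rcond=None)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ W

train, test = slice(0, n_train), slice(n_train, None)
affine = fit_affine(H_a[train], H_b[train])             # fit on training stimuli only

# Patch: replace held-out prompt-A states with their mapped counterparts.
target = behavior(H_b[test])
baseline_agreement = float((behavior(H_a[test]) == target).mean())
patched_agreement = float((behavior(affine(H_a[test])) == target).mean())
print(f"unpatched: {baseline_agreement:.2f}  patched: {patched_agreement:.2f}")
```

In a real model the patch would be applied by intercepting the chosen layer's output during the forward pass and letting the remaining layers run unchanged. The readout matrix here only stands in for that downstream computation.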

Across three LLMs, three VLMs, and six text or image datasets spanning style, emotion, scene content, and number, prompts consistently reshape representations toward the instructed task structure. Cross-validated variance decomposition shows that much prompt-induced activation change is captured by shape-preserving maps, especially translation and rigid transformation with uniform scaling, while tier profiles reveal model- and task-specific routing strategies across layers.
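One way to read "cross-validated variance decomposition" is sketched below, under several assumptions that the summary does not confirm. The explained fraction is defined here as one minus the held-out residual divided by the held-out squared change between prompts. Only three of the five tiers are fitted, and the rigid tier is fitted by orthogonal Procrustes, which allows reflections. The data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def fit_translation(X, Y):
    """Shift only: Y ~ X + t."""
    t = Y.mean(axis=0) - X.mean(axis=0)
    return lambda Z: Z + t

def fit_similarity(X, Y):
    """Orthogonal Procrustes with uniform scale: Y ~ s * (X - mx) @ R + my."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    U, S, Vt = np.linalg.svd(Xc.T @ Yc)
    R = U @ Vt                          # orthogonal (rotation or reflection)
    s = S.sum() / (Xc ** 2).sum()       # optimal uniform scale
    return lambda Z: s * ((Z - mx) @ R) + my

def fit_affine(X, Y):
    """Unconstrained linear map plus shift, by least squares."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Xa, Y, rcond=None)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ W

TIERS = {"translation": fit_translation, "similarity": fit_similarity, "affine": fit_affine}

def cv_explained_change(X, Y, n_folds=5, seed=0):
    """Held-out fraction of the prompt-induced change captured by each tier."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(X)), n_folds)
    resid = {name: 0.0 for name in TIERS}
    total = 0.0
    for k, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        total += ((Y[test] - X[test]) ** 2).sum()
        for name, fit in TIERS.items():
            mapped = fit(X[train], Y[train])(X[test])
            resid[name] += ((mapped - Y[test]) ** 2).sum()
    return {name: 1.0 - r / total for name, r in resid.items()}

# Synthetic prompt pair: mostly rotation, scale, and shift, plus some mixing and noise.
rng = np.random.default_rng(1)
n, d = 400, 8
X = rng.normal(size=(n, d))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
Y = X @ (1.2 * Q + 0.3 * rng.normal(size=(d, d))) + 3.0 * rng.normal(size=d)
Y = Y + 0.05 * rng.normal(size=(n, d))

ev = cv_explained_change(X, Y)
gains, prev = {}, 0.0
for name in TIERS:                      # increment contributed by each successive tier
    gains[name] = ev[name] - prev
    prev = ev[name]
print(ev)
print(gains)
```

Because the tiers are nested, the increments in `gains` sum to the fraction explained by the most expressive tier. A profile of these increments across layers is one plausible reading of the "tier profiles" the abstract mentions.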

Crucially, although translation and rigid tiers already improve behavioral agreement, affine transformation is the first tier to nearly recover target-prompt task geometry and yields corresponding behavioral gains. This suggests that cross-dimensional linear mixing is a key mechanism by which prompts reorganize representations toward instructed task structure.
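The contrast between shape-preserving and affine tiers can be made concrete with a small sketch. It assumes that representational geometry is summarized by pairwise Euclidean distances between stimuli and compared by Pearson correlation. The summary does not say which measure the paper uses, so this is an illustrative choice, and all names and data are hypothetical. Under this particular measure, translation, rotation, and uniform scaling cannot change the score at all, whereas a fitted affine map with cross-dimensional mixing can.

```python
import numpy as np

def rdm(H):
    """Upper-triangle vector of pairwise Euclidean distances between stimuli."""
    diff = H[:, None, :] - H[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    return D[np.triu_indices(len(H), k=1)]

def geometry_match(H1, H2):
    """Pearson correlation between two distance vectors (assumed geometry score)."""
    return float(np.corrcoef(rdm(H1), rdm(H2))[0, 1])

def fit_affine(X, Y):
    """Least-squares affine map, Y ~ X @ W + bias."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Xa, Y, rcond=None)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ W

rng = np.random.default_rng(2)
n_train, n_test, d = 50, 30, 6
H_a = rng.normal(size=(n_train + n_test, d))             # synthetic prompt-A states
A = rng.normal(size=(d, d))                              # mixing across dimensions
H_b = H_a @ A + rng.normal(size=d) + 0.01 * rng.normal(size=H_a.shape)

train, test = slice(0, n_train), slice(n_train, None)

# An arbitrary shape-preserving map: rotation/reflection, uniform scale, shift.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
shift = rng.normal(size=d)
shape_mapped = 1.7 * (H_a[test] @ Q) + shift

affine_mapped = fit_affine(H_a[train], H_b[train])(H_a[test])

match_raw = geometry_match(H_a[test], H_b[test])
match_shape = geometry_match(shape_mapped, H_b[test])    # equals match_raw up to rounding
match_affine = geometry_match(affine_mapped, H_b[test])
print(match_raw, match_shape, match_affine)
```

With other geometry measures, such as cosine or correlation distance, a translation would change the score, so the invariance shown here is a property of the assumed measure and not a claim about the paper's results.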

Our framework decomposes prompt-induced representational change into interpretable geometric components and reveals how models route task-relevant structure to produce prompt-driven behavior.

