Linear Probes Detect Task Format, Not Reasoning Mode in Language Model Hidden States

arXiv cs.CL·Subramanyam Sahoo, Vinija Jain, Aman Chadha, Divya Chaudhary

6/3/2026

Quick Answer

Linear probes on Qwen3-14B hidden states separate reasoning types with high accuracy, but this paper shows that the separation tracks task format rather than underlying computational structure.

Quick Take

At layer 32 of 40, linear probes reached 100% cross-validated accuracy at telling apart LogiQA 2.0 (deductive), ARC-Challenge (inductive), and αNLI (abductive). Residualizing source identity, option count, and response length reduced that accuracy to chance, and a separate trace-anchor similarity analysis points to largely shared reasoning across the three tasks.

Key Points

Linear probes on Qwen3-14B hidden states (layer 32 of 40) reach 100% cross-validated accuracy at separating LogiQA 2.0, ARC-Challenge, and αNLI.
Accuracy drops to chance after residualizing source identity, option count, and response length.
Trace-anchor similarity shows 42.5% agreement against a 33.3% chance level, indicating largely shared reasoning across tasks.
Causal steering with random controls (n=20) finds no functional link between geometry and reasoning mode (p=0.286).
The authors argue for routine format deconfounding in mechanistic interpretability.
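The deconfounding idea in the points above can be illustrated with a small synthetic sketch. It is not the paper's pipeline: the data are simulated, the only confound modelled is response length, and a nearest-centroid classifier stands in for the linear probe. All function names (make_data, fit_residualizer, residualize, fit_centroids, accuracy, run_demo) are hypothetical. The simulated hidden states depend on length alone, so any probe accuracy before residualization comes purely from the format confound.

```python
import random
from statistics import fmean


def make_data(n_per_source=200, dim=4, seed=0):
    """Simulate hidden states that encode response length only (an assumption)."""
    rng = random.Random(seed)
    rows = []
    for label, mean_len in enumerate((100.0, 200.0, 300.0)):
        for _ in range(n_per_source):
            length = rng.gauss(mean_len, 10.0)
            # Each feature is a multiple of length plus noise; the label never enters.
            h = [0.05 * (j + 1) * length + rng.gauss(0.0, 1.0) for j in range(dim)]
            rows.append((h, length, label))
    rng.shuffle(rows)
    return rows


def fit_residualizer(H, c):
    """Per-feature simple linear regression of hidden state on the covariate c."""
    c_bar = fmean(c)
    ss_c = sum((ci - c_bar) ** 2 for ci in c)
    coefs = []
    for j in range(len(H[0])):
        col = [h[j] for h in H]
        h_bar = fmean(col)
        slope = sum((ci - c_bar) * (hj - h_bar) for ci, hj in zip(c, col)) / ss_c
        coefs.append((slope, h_bar - slope * c_bar))
    return coefs


def residualize(H, c, coefs):
    """Subtract the fitted covariate contribution from every feature."""
    return [
        [hj - (slope * ci + intercept) for hj, (slope, intercept) in zip(h, coefs)]
        for h, ci in zip(H, c)
    ]


def fit_centroids(H, y):
    """Nearest-centroid probe: one mean vector per class."""
    cents = {}
    for label in sorted(set(y)):
        members = [h for h, yi in zip(H, y) if yi == label]
        cents[label] = [fmean(col) for col in zip(*members)]
    return cents


def accuracy(cents, H, y):
    def predict(h):
        return min(cents, key=lambda k: sum((a - b) ** 2 for a, b in zip(h, cents[k])))
    return sum(predict(h) == yi for h, yi in zip(H, y)) / len(y)


def run_demo(seed=0):
    rows = make_data(seed=seed)
    half = len(rows) // 2
    train, test = rows[:half], rows[half:]
    H_tr, c_tr, y_tr = (list(x) for x in zip(*train))
    H_te, c_te, y_te = (list(x) for x in zip(*test))

    # Probe on raw hidden states.
    acc_raw = accuracy(fit_centroids(H_tr, y_tr), H_te, y_te)

    # Fit the residualizer on the training split only, then apply it to both splits.
    coefs = fit_residualizer(H_tr, c_tr)
    R_tr = residualize(H_tr, c_tr, coefs)
    R_te = residualize(H_te, c_te, coefs)
    acc_resid = accuracy(fit_centroids(R_tr, y_tr), R_te, y_te)
    return acc_raw, acc_resid


if __name__ == "__main__":
    raw, resid = run_demo()
    print(f"probe accuracy, raw: {raw:.3f}; after residualizing length: {resid:.3f}")
```

In this toy setting the raw probe separates the three sources almost perfectly, while the residualized probe should land near the three-way chance level of one third. A real analysis would residualize all measured confounds jointly and use a regularized, cross-validated probe.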

Article Excerpt

From source RSS / original summary

arXiv:2606.02907v1 (announce type: new)

Abstract: Linear probing of large language model (LLM) hidden states is widely used to claim that models learn distinct representations for different reasoning types. We test this by probing Qwen3-14B on three benchmarks spanning the classical trichotomy: LogiQA 2.0 (deductive), ARC-Challenge (inductive), and αNLI (abductive). At layer 32 of 40, linear probes achieve 100% cross-validated accuracy with well-separated geometry (intrinsic dimensionalities: 20.6, 28.5, 33.6; convex hull contamination ≤1.5%). However, this separation is entirely driven by format confounds. Residualizing source identity, option count, and response length reduces accuracy to chance. Trace-anchor similarity indicates largely shared reasoning across tasks (42.5% agreement vs. 33.3% chance), and causal steering with random controls (n=20) shows no functional link between geometry and reasoning mode (p=0.286). Thus, high probe accuracy reflects task format rather than computational structure, motivating routine format deconfounding in mechanistic interpretability.
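For readers unfamiliar with random-control comparisons, the sketch below shows one common way to turn such a comparison into an empirical p-value: count how many random-direction controls produce an effect at least as large as the probe-derived direction, then apply an add-one correction. The abstract does not say how its p-value was computed, so this estimator is an assumption, and the effect sizes below are invented purely for illustration. The function name empirical_p_value is hypothetical.

```python
def empirical_p_value(observed, control_effects):
    """One-sided add-one empirical p-value against random controls.

    Assumed convention (not confirmed by the source):
    p = (1 + #controls >= observed) / (1 + #controls)
    """
    exceed = sum(1 for c in control_effects if c >= observed)
    return (1 + exceed) / (1 + len(control_effects))


# Hypothetical steering effects: one probe-derived direction and 20 random controls.
observed_effect = 0.30
random_controls = [
    0.12, 0.35, 0.08, 0.31, 0.22, 0.40, 0.05, 0.18, 0.33, 0.27,
    0.10, 0.15, 0.30, 0.02, 0.25, 0.19, 0.07, 0.28, 0.14, 0.21,
]

p = empirical_p_value(observed_effect, random_controls)
print(f"empirical p = {p:.3f}")
```

With 20 controls the smallest p-value this estimator can return is 1/21, so a handful of controls matching the probe direction is enough to make the result clearly non-significant.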
