Learn-by-Wire Training Control Governance: Bounded Autonomous Training Under Stress for Stability and Efficiency

arXiv cs.AI·Anis Radianis

17h ago

·~2 min·5/20/2026·en·0

Quick Take

LBW-Guard enhances LLM training stability and efficiency under stress without replacing the optimizer.

Key Points

Introduces a governance layer above AdamW.
Achieves 18.7% perplexity reduction in tests.
Maintains trainability under aggressive learning rates.

📖 Reader Mode

~2 min read

[Submitted on 18 May 2026]

View PDF HTML (experimental)

Abstract:Modern language-model training is increasingly exposed to instability, degraded runs, and wasted compute, especially under aggressive learning-rate, scale, and runtime-stress conditions. This paper introduces Learn-by-Wire Guard (LBW-Guard), a bounded autonomous training-control governance layer that operates above AdamW. Rather than replacing the optimizer update rule, LBW-Guard observes training telemetry, interprets instability-sensitive regimes, and applies bounded control to optimizer execution while preserving fixed training objectives.
We evaluate LBW-Guard in a Qwen2.5-centered stress-and-robustness suite using WikiText-103, with Qwen2.5-7B as the empirical anchor, model-size comparisons against Qwen2.5-3B and Qwen2.5-14B, learning-rate stress tests, gradient-clipping baselines, and a no-LoRA TinyLlama-1B full-parameter sanity check. In the 7B reference setting, LBW-Guard reduces final perplexity from 13.21 to 10.74, an 18.7% improvement, while reducing end-to-end time from 392.54s to 357.02s, a 1.10x speedup. Under stronger learning-rate stress, AdamW degrades to 1885.24 final perplexity at LR=3e-3 and 659.76 at LR=1e-3, whereas LBW-Guard remains trainable at 11.57 and 10.33, respectively. Gradient-clipping baselines do not reproduce this effect.
These results support a scoped systems conclusion that stability-sensitive LLM training can benefit from a governance plane above the optimizer. LBW-Guard provides evidence that bounded runtime control can preserve productive compute under stress while remaining distinct from optimizer replacement and local gradient suppression.

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2605.19008 [cs.AI]
	(or arXiv:2605.19008v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2605.19008 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Anis Radianis [view email]
[v1] Mon, 18 May 2026 18:32:25 UTC (24 KB)

— Originally published at arxiv.org

Continue reading on arxiv.org

Learn-by-Wire Training Control Governance: Bounded Autonomous Training Under Stress for Stability and Efficiency

Quick Take

Key Points

📖 Reader Mode

Submission history

More from arXiv cs.AI

From Prompts to Protocols: An AI Agent for Laboratory Automation

Agentic Trading: When LLM Agents Meet Financial Markets

Invisible Orchestrators Suppress Protective Behavior and Dissociate Power-Holders: Safety Risks in Multi-Agent LLM Systems

Related in this space

Time to REFLECT: Can We Trust LLM Judges for Evidence-based Research Agents?

Verifiable Agentic Infrastructure: Proof-Derived Authorization for Sovereign AI Systems

Trustworthy Agent Network: Trust in Agent Networks Must Be Baked In, Not Bolted On