Energy per Successful Goal: Goal-Level Energy Accounting for Agentic AI Systems

arXiv cs.AI·Deepak Panigrahy, Aakash Tyagi

5/25/2026


Quick Answer

The A-LEMS framework introduces Energy per Successful Goal (EpG) as the unit of energy accounting for agentic AI systems. Under this metric, agentic workflows consume 4.33x more mean energy per successful goal than linear baselines (888.1 J vs 205.3 J).

Quick Take

A-LEMS moves the unit of AI energy accounting from energy per inference to Energy per Successful Goal (EpG). Measured this way, agentic workflows average 888.1 J per successful goal against 205.3 J for linear baselines, a 4.33x gap that the paper attributes to orchestration structure rather than inference compute. Per-inference benchmarks therefore misstate the energy cost of completing a goal in agentic systems.

Key Points

EpG aggregates energy across all execution attempts, including failures and retries.
Orchestration Overhead Index (OOI) isolates the energy cost of orchestration relative to linear execution under identical task criteria.
Agentic workflows consume 4.33x more mean energy per successful goal than linear baselines (888.1 J vs 205.3 J).
For tool-augmented tasks, OOI falls below 1.0x, meaning agentic execution is cheaper than linear execution.
A-LEMS provides a reproducibility protocol linking measurements to hardware configurations.

Article Content

From source RSS / original summary

arXiv:2605.22883v1. Abstract: Current AI energy benchmarks measure consumption at the granularity of a single model invocation or training run. For classical single-turn workloads this unit remains coherent.

For agentic systems, where a single user goal may trigger multi-step orchestration, tool calls, retries, and failure-recovery cycles, the invocation count is an implementation artifact rather than a task property, and inference-level normalization misrepresents the energy cost of goal completion. We present A-LEMS (Agentic LLM Energy Measurement System), a cross-layer measurement framework that redefines the unit of AI energy accounting from energy per inference to Energy per Successful Goal (EpG).

EpG aggregates total workflow energy across all execution attempts, including failures and retries, normalized by successfully completed goals. A-LEMS formalizes energy attribution through a temporal boundary model, a five-layer observation pipeline mapping RAPL signals to workflow-level energy, and a reproducibility protocol binding every measurement to hardware and runtime configuration.
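The EpG definition above can be illustrated with a minimal sketch. It assumes per-attempt energy has already been attributed in joules; the `Attempt` record, its field names, and the sample values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One execution attempt toward a goal (hypothetical record layout)."""
    goal_id: str
    energy_j: float   # energy attributed to this attempt, in joules
    succeeded: bool


def energy_per_successful_goal(attempts):
    """Total energy over all attempts divided by the number of goals completed.

    Failed attempts and retries still contribute to the numerator;
    only successfully completed goals count in the denominator.
    """
    total_energy = sum(a.energy_j for a in attempts)
    successful_goals = {a.goal_id for a in attempts if a.succeeded}
    if not successful_goals:
        raise ValueError("EpG is undefined when no goal succeeds")
    return total_energy / len(successful_goals)


# Illustrative values only: g1 succeeds on a retry, g2 succeeds, g3 fails.
attempts = [
    Attempt("g1", 120.0, False),
    Attempt("g1", 150.0, True),
    Attempt("g2", 90.0, True),
    Attempt("g3", 40.0, False),
]
epg = energy_per_successful_goal(attempts)
print(f"EpG = {epg:.1f} J per successful goal")
```

In this sample the 160 J spent on failed attempts is charged to the two goals that did succeed, which is the behavior a per-inference average would hide.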

Building on EpG, we define the Orchestration Overhead Index (OOI), isolating the energy cost of orchestration relative to linear execution under identical task criteria. Across five reasoning and three tool-augmented task families, agentic workflows consume 4.33x higher mean energy per successful goal than linear baselines (888.1 J vs 205.3 J). This overhead is driven by orchestration structure, not inference compute. For tool-augmented tasks, OOI inverts below 1.0x: agentic execution is cheaper than linear, confirming the metric captures orchestration structure rather than a fixed upward bias.

These findings establish that energy-per-inference is insufficient for agentic AI. EpG and OOI provide the measurement foundation for accurate benchmarking, where orchestration structure is the primary determinant of energy cost.
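The abstract does not spell out the OOI formula. The sketch below assumes the simplest reading consistent with the reported "x" values: the ratio of agentic EpG to linear EpG for the same task criteria. The function name is hypothetical, and the second pair of inputs is made up purely to show the below-1.0x case.

```python
def orchestration_overhead_index(epg_agentic_j, epg_linear_j):
    """Assumed form of OOI: agentic EpG divided by linear EpG (both in joules).

    Values above 1.0 mean orchestration adds energy per successful goal;
    values below 1.0 mean the agentic workflow is cheaper than the linear one.
    """
    if epg_linear_j <= 0:
        raise ValueError("linear EpG must be positive")
    return epg_agentic_j / epg_linear_j


# Mean figures reported in the abstract.
ooi = orchestration_overhead_index(888.1, 205.3)
print(f"OOI = {ooi:.2f}x")  # OOI = 4.33x

# Made-up figures illustrating an inverted OOI (agentic cheaper than linear).
ooi_inverted = orchestration_overhead_index(100.0, 200.0)
print(f"OOI = {ooi_inverted:.2f}x")  # OOI = 0.50x
```

Under this assumed ratio form, the reported means reproduce the 4.33x headline figure.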
