Guide

What is Tool Use in LLMs?

A guide to LLM tool use: browsing, code execution, APIs, MCP, agents, function calls, guardrails and evaluation.

Tool use in LLMs is the integration of capabilities such as browsing, code execution, API calls, and agent workflows that extend what a large language model can do. It matters because it can improve reasoning and practical usefulness without additional training, and governance layers can make that use safer. For example, the MAVEN framework raised GPT-OSS-120b's accuracy from 48% to 71% on MAVEN-Bench, while Microsoft's Agent Governance Toolkit routes AI agent actions through a governance layer before they execute.

Quick Answer

Tool use in LLMs refers to the ability of large language models to interact with external systems and perform tasks through APIs, code execution, and other mechanisms. This capability is increasingly important as AI applications expand, with GPT-OSS-120b reaching 71% accuracy on MAVEN-Bench when paired with the MAVEN framework. Recent developments also highlight the growing adoption of governance frameworks, such as Microsoft's Agent Governance Toolkit, to keep tool use safe.

Evidence base: 30 filtered articles
Cited sources: 13 citations across 5 sources
Refresh cadence: Weekly
Last updated: Jun 1, 2026

FAQ

What is tool use in LLMs?

Tool use in LLMs refers to their ability to interact with external systems and perform tasks through APIs, code execution, and other mechanisms.

Why is tool use important?

Tool use is crucial for enhancing the functionality and applicability of LLMs across various sectors, including finance, healthcare, and software development.

What recent advancements have been made in LLM tool use?

Recent advancements include the MAVEN framework raising GPT-OSS-120b accuracy from 48% to 71% on MAVEN-Bench and the introduction of governance frameworks such as Microsoft's Agent Governance Toolkit.

Current Read

Tool use in large language models (LLMs) covers capabilities such as browsing, code execution, and API interactions, which make the models more useful across applications. For example, the MAVEN framework has improved the accuracy of the GPT-OSS-120b model from 48% to 71% on MAVEN-Bench, a significant advance in agentic tool use. In addition, governance measures such as Microsoft's Agent Governance Toolkit underline the need for safety and compliance in AI agent workflows: actions are evaluated against identity and trust scores before they execute.

Recent trends indicate a growing reliance on AI agents in various sectors, with tools like OpenAI's Codex being utilized to streamline processes in tax filing and software development. Companies like Endava have reported reducing software delivery timelines from weeks to hours by leveraging Codex, while Cisco and OpenAI's collaboration aims to enhance enterprise engineering through AI-native development. As the landscape evolves, the focus on secure and efficient tool use in LLMs will continue to shape AI's role in business and technology.

Key Takeaways

MAVEN improves GPT-OSS-120b accuracy from 48% to 71% on MAVEN-Bench.
Microsoft's Agent Governance Toolkit enhances safety in AI agent workflows.
Codex is being used to automate tax filings and improve software delivery timelines.
Endava reduced software delivery from weeks to hours using Codex.
Nvidia's Vera CPU sets a new benchmark for agentic workloads in AI factories.

Topic Map

Understanding Tool Use in LLMs

Tool use in LLMs lets models interact with external systems and perform tasks through browsing, code execution, and API interactions. Recent work such as the MAVEN framework has shown measurable gains in accuracy and reasoning, with GPT-OSS-120b reaching 71% accuracy on MAVEN-Bench without additional training.
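To make the idea of a model calling an external function concrete, here is a minimal Python sketch of a tool dispatcher. It is an illustration only, not the API of any specific model provider or of MAVEN: the JSON call format ({"tool": ..., "arguments": ...}), the TOOLS registry, and the example tools add and word_count are all assumptions made for this example.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical tool registry; names and functions are illustrative only.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Register a function under a name the model can refer to."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return decorator


@tool("add")
def add(a: float, b: float) -> float:
    return a + b


@tool("word_count")
def word_count(text: str) -> int:
    return len(text.split())


def dispatch(model_output: str) -> Dict[str, Any]:
    """Parse a JSON tool call emitted by a model and run the matching tool.

    Assumed call format: {"tool": "<name>", "arguments": {...}}.
    Errors are returned as data so they could be fed back to the model.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON: {exc}"}
    if not isinstance(call, dict):
        return {"ok": False, "error": "tool call must be a JSON object"}

    name = call.get("tool")
    args = call.get("arguments", {})
    if name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    if not isinstance(args, dict):
        return {"ok": False, "error": "arguments must be a JSON object"}

    try:
        result = TOOLS[name](**args)
    except TypeError as exc:  # missing or unexpected arguments
        return {"ok": False, "error": f"bad arguments: {exc}"}
    return {"ok": True, "result": result}


if __name__ == "__main__":
    # Stand-in for text a model might emit when it decides to call a tool.
    print(dispatch('{"tool": "add", "arguments": {"a": 2, "b": 3}}'))
```

In a real system the string passed to dispatch would come from the model, and the returned dictionary would be appended to the conversation so the model can use the result. Returning errors as data rather than raising is one possible design choice, since it gives the model a chance to correct a malformed call.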

MAVEN: Improving Generalization in Agentic Tool Calling

Governance Frameworks for AI Agents

The implementation of governance frameworks is crucial for ensuring safe tool use in AI agents. Microsoft's Agent Governance Toolkit serves as a model for creating governed workflows, where actions are evaluated based on identity and trust scores before execution. This approach enhances the safety and reliability of AI agents in various applications.
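The pattern described above, where every action is checked against identity and trust score before it runs, can be sketched in a few lines of Python. This is not the Microsoft Agent Governance Toolkit API: the classes AgentIdentity and GovernanceLayer, the MIN_TRUST_BY_TIER thresholds, and the 0.0 to 1.0 trust scale are hypothetical names and values chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class AgentIdentity:
    agent_id: str
    trust_score: float  # assumed scale: 0.0 (untrusted) to 1.0 (fully trusted)


# Hypothetical policy: minimum trust score required for each risk tier.
MIN_TRUST_BY_TIER: Dict[str, float] = {"low": 0.2, "medium": 0.5, "high": 0.8}


@dataclass
class GovernanceLayer:
    known_agents: Set[str]
    audit_log: List[dict] = field(default_factory=list)

    def _decide(self, agent: AgentIdentity, risk_tier: str) -> str:
        """Return "allow" or a "deny: ..." reason for the requested action."""
        if agent.agent_id not in self.known_agents:
            return "deny: unknown identity"
        required = MIN_TRUST_BY_TIER.get(risk_tier)
        if required is None:
            return "deny: unknown risk tier"
        if agent.trust_score < required:
            return "deny: trust score below threshold"
        return "allow"

    def execute(self, agent: AgentIdentity, tool_name: str,
                risk_tier: str, action: Callable[[], object]) -> object:
        """Run action only if policy allows it; log every decision."""
        decision = self._decide(agent, risk_tier)
        self.audit_log.append({
            "agent": agent.agent_id,
            "tool": tool_name,
            "risk_tier": risk_tier,
            "decision": decision,
        })
        if decision != "allow":
            raise PermissionError(f"{tool_name} blocked ({decision})")
        return action()


if __name__ == "__main__":
    gov = GovernanceLayer(known_agents={"agent-1"})
    agent = AgentIdentity("agent-1", trust_score=0.9)
    print(gov.execute(agent, "read_report", "high", lambda: "report contents"))
    print(gov.audit_log)
```

The key property is that the tool body is passed in as a callable and is only invoked after the policy check, so a denied action never runs, while the audit log records both allowed and denied attempts.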

An Implementation of the Microsoft Agent Governance Toolkit for Safe AI Agent Tool Use with Policies, Approvals, Audit Logs, and Risk Controls

Related Guides

What is Function Calling?

A guide to function calling in LLMs: structured tool calls, schemas, APIs, agent workflows, reliability and safety checks.

What are AI Agents?

A living guide to AI agents: how they work, where they are useful, what can fail, and the latest agent news from trusted AI sources.

What is Agentic AI?

A guide to agentic AI: planning, tool use, memory, workflows, autonomy levels, risks and the latest agent product signals.

Source-Linked Articles

MAVEN: Improving Generalization in Agentic Tool Calling

MAVEN (Modular Agentic Verification and Execution Network) enhances reasoning in agentic tool-calling environments, improving GPT-OSS-120b accuracy from 48% to 71% on MAVEN-Bench without extra training. This lightweight framework also remains competitive against proprietary models at a cost ratio of 1/10, highlighting its potential for better compositional reasoning.

arXiv cs.AI · Jun 1, 2026

An Implementation of the Microsoft Agent Governance Toolkit for Safe AI Agent Tool Use with Policies, Approvals, Audit Logs, and Risk Controls

This tutorial demonstrates the implementation of Microsoft's Agent Governance Toolkit to create a governed AI-agent workflow. The framework ensures that all actions by AI agents pass through a governance layer that evaluates identity, trust score, risk tier, and other factors before execution, enhancing safety in tool use.

MarkTechPost · May 31, 2026

Related evidence

AI Research Papers This Week

Best Authentication Platforms for AI Agents and MCP Servers in 2026

VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model with Curriculum Learning and Native Tool Use

Building self-improving tax agents with Codex

How Endava builds an agentic organization with Codex

Cisco and OpenAI redefine enterprise engineering with Codex

Boston Children’s uses AI to unlock new diagnoses

Warp’s big bet on building open source with GPT-5.5

The next phase of OpenAI’s Education for Countries

OpenAI and Dell partner to bring Codex to hybrid and on-premise enterprise environments

OpenAI named a Leader in enterprise coding agents by Gartner

We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks