
Understanding AI Agent Tool Use: Governance Implications of External API Access

When AI agents can browse the web, send emails, execute code, and call APIs, governance becomes critical. Here's how to manage tool use safely at enterprise scale.

Anchor8 Team · 5 min read

Agents With Hands#

The most significant advancement in practical AI over the last two years isn't better language understanding — it's tool use. Modern AI agents don't just generate text. They execute code, send emails, query databases, browse the web, place orders, and interact with external APIs. They have, in a very real sense, hands.

This capability is what makes AI agents genuinely useful for enterprise automation. It's also what makes governance non-negotiable.

When an agent can only produce text, the worst case is a bad answer. When an agent can execute code, send communications, modify records, and authorize transactions — the worst case is a business-critical incident.

The Tool Use Attack Surface#

Every tool an agent can invoke represents both a capability and an attack surface. Consider the blast radius of different tools:

| Tool Type | Capability | Worst-Case Scenario |
|-----------|-----------|---------------------|
| Web search | Information retrieval | Privacy leak, misinformation amplification |
| Email / Slack | Communications | Unauthorized disclosure, social engineering, legal liability |
| Code execution | Computation, automation | Data destruction, system compromise |
| Database access | Data read/write | Data breach, unauthorized modification |
| Payment processing | Financial transactions | Financial fraud, unauthorized charges |
| Infrastructure APIs | System configuration | Outage, data loss, configuration drift |
| Calendar / scheduling | Meeting management | Unauthorized access to schedules, misdirection |

Principle of Least Privilege for AI Agents#

The first governance principle for tool use is least privilege: every agent should have access to only the tools it strictly needs to accomplish its designated task, with no excess capability.

This sounds obvious, but in practice, many teams provision agents with broad tool access "just in case" — the equivalent of giving a contractor master keys to the building because they need access to one room.

Least privilege for AI agents means:

  • Explicitly declaring which tools each agent can invoke
  • Scoping database access to specific tables and operations (read-only where write isn't needed)
  • Using separate API credentials per agent so access can be revoked individually
  • Implementing time-bound tool access for task-specific agents
  • Logging every tool invocation for audit purposes

Authorization Before Action#

For high-risk tool invocations, human authorization before action is essential. The Guard Mode pattern works for tool use too: when an agent attempts to invoke a tool in a category above a risk threshold, pause the action and route it to a human reviewer.

Risk thresholds for tool use might include:

  • Transaction value — Any financial operation above $X requires human approval
  • Data scope — Any query touching more than Y records requires review
  • External communication — Any email to addresses outside a specified domain is flagged
  • Irreversible actions — Any operation that cannot be easily undone (record deletion, sent email) is paused
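A minimal sketch of such a pre-action check follows. The threshold values, field names, and the `requires_human_approval` function are illustrative assumptions, not a fixed schema; in practice these limits would come from policy configuration, not constants.

```python
# Illustrative thresholds -- in production these come from policy config.
TRANSACTION_LIMIT_USD = 1_000
RECORD_SCOPE_LIMIT = 10_000
ALLOWED_EMAIL_DOMAIN = "example.com"
IRREVERSIBLE_TOOLS = {"record.delete", "email.send"}

def requires_human_approval(action: dict) -> bool:
    """Return True if this tool call should pause for human review."""
    # Transaction value: financial operations above the limit
    if action.get("amount_usd", 0) > TRANSACTION_LIMIT_USD:
        return True
    # Data scope: queries touching too many records
    if action.get("records_touched", 0) > RECORD_SCOPE_LIMIT:
        return True
    # External communication: recipients outside the allowed domain
    recipient = action.get("email_to", "")
    if recipient and not recipient.endswith("@" + ALLOWED_EMAIL_DOMAIN):
        return True
    # Irreversible actions: deletions, sent email
    if action.get("tool") in IRREVERSIBLE_TOOLS:
        return True
    return False
```

The key design choice is that the check runs between the agent's decision and the tool's execution, so a paused action costs nothing to cancel.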

Tool Call Logging and Audit#

Every tool call an agent makes must be logged with:

  1. The tool invoked
  2. The exact parameters passed
  3. The response received
  4. The agent's subsequent decision based on the response
  5. The timestamp and agent identity

This creates an audit trail that answers the critical question in any AI incident: "What external systems did the agent interact with, and what did it do with what it received?"

Without this logging, reconstructing an AI incident involving tool use is nearly impossible — you may know the final output, but not the chain of external operations that led to it.
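The five fields above map naturally onto a structured, append-only log record. This is a hedged sketch of one possible schema; the function name and field layout are assumptions for illustration.

```python
import json
import time

def log_tool_call(tool: str, params: dict, response: dict,
                  decision: str, agent_id: str) -> str:
    """Serialize one tool invocation as a JSON log line with all five
    audit fields: tool, parameters, response, decision, identity + time."""
    record = {
        "timestamp": time.time(),
        "agent_id": agent_id,
        "tool": tool,
        "params": params,
        "response": response,
        "decision": decision,
    }
    return json.dumps(record, sort_keys=True)
```

Emitting one self-contained JSON line per call means the audit trail can be replayed end to end: filter by `agent_id`, sort by `timestamp`, and the chain of external operations reconstructs itself.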

Sandboxing for Code Execution#

Code execution is the highest-risk tool category. An agent with the ability to run Python or JavaScript in production can cause damage ranging from trivial to catastrophic, depending on its environment access.

Best practices for agent code execution:

  • Run code in stateless, isolated containers with no persistent storage
  • Network isolation — prevent outbound calls from the execution environment
  • Resource limits — cap CPU, memory, and execution time
  • Whitelist allowed libraries and imported modules
  • Log all code generated and executed before running
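At its simplest, isolation means never running agent code in the host process. The sketch below runs a snippet in a separate interpreter with a time cap and a scrubbed environment. It is a minimal illustration only: a production sandbox should use containers with network isolation and cgroup-enforced CPU and memory limits, which a bare subprocess does not provide.

```python
import os
import subprocess
import sys
import tempfile

def run_sandboxed(code: str, timeout_s: float = 5.0) -> tuple[int, str]:
    """Run agent-generated Python in a separate, short-lived process.
    Returns (exit_code, stdout)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "snippet.py")
        with open(path, "w") as f:
            f.write(code)                  # persist the code for the audit log
        proc = subprocess.run(
            [sys.executable, "-I", path],  # -I: isolated mode, no user site dirs
            capture_output=True, text=True,
            timeout=timeout_s,             # hard cap on execution time
            cwd=tmp,                       # working dir vanishes afterwards
            env={},                        # empty env: no credentials leak in
        )
        return proc.returncode, proc.stdout
```

Note that the snippet is written to disk before execution, which doubles as the "log all code before running" control from the list above.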

Monitoring Unusual Tool Patterns#

Beyond logging individual calls, governance systems should monitor tool use patterns for anomalies:

  • Tool call frequency — An agent suddenly making 10x more database queries than its baseline
  • Novel tool sequences — An agent combining tools in patterns not seen before
  • External communication spikes — An agent sending significantly more outbound emails than usual
  • Repetitive failed calls — An agent repeatedly attempting tool invocations that fail (potential exploitation attempt)

Anchorate's behavioral monitoring tracks per-agent tool use baselines and alerts on significant deviations, enabling early detection of both failures and adversarial manipulation.
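Frequency-based detection, the first pattern above, can be sketched with a rolling baseline and a deviation threshold. `ToolRateMonitor` and its parameters are hypothetical illustrations of the idea, not a description of any product's implementation.

```python
import statistics
from collections import deque

class ToolRateMonitor:
    """Flag an interval's tool-call count that deviates sharply
    from this agent's own recent baseline."""

    def __init__(self, window: int = 30, z_threshold: float = 3.0):
        self.counts: deque[int] = deque(maxlen=window)  # rolling baseline
        self.z_threshold = z_threshold

    def observe(self, calls_this_interval: int) -> bool:
        """Record one interval's count; return True if it is anomalous."""
        anomalous = False
        if len(self.counts) >= 5:          # need some history first
            mean = statistics.fmean(self.counts)
            stdev = statistics.pstdev(self.counts) or 1.0  # avoid div by zero
            z = (calls_this_interval - mean) / stdev
            anomalous = z > self.z_threshold
        self.counts.append(calls_this_interval)
        return anomalous
```

Because the baseline is per-agent, a busy agent's normal volume never masks a quiet agent's sudden spike, which is the failure mode of a single global threshold.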

Building a Tool Governance Framework#

For organizations deploying agents with external tool access:

  1. Maintain a tool registry — Document every tool available to each agent class
  2. Classify tools by risk tier — Apply different authorization requirements based on blast radius
  3. Implement tool-level RBAC — Access controls at the individual tool level, not just agent level
  4. Log everything — Tool calls, parameters, responses, and downstream decisions
  5. Review unusual patterns — Automated alerting on behavioral anomalies in tool use
  6. Audit quarterly — Review which agents have which tool access and remove unnecessary permissions
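Steps 1-3 above can live in a single registry mapping tools to risk tiers and tiers to authorization requirements. The tool names, tier labels, and `authorization_for` helper below are hypothetical, shown only to make the structure concrete.

```python
# Illustrative risk tiers (step 2): each tier carries an authorization rule.
RISK_TIERS = {
    "low":    "none",
    "medium": "post-hoc review",
    "high":   "human pre-approval",
}

# Illustrative tool registry (step 1): every available tool, classified.
TOOL_REGISTRY = {
    "web.search":     "low",
    "calendar.read":  "low",
    "email.send":     "medium",
    "db.write":       "high",
    "payment.charge": "high",
    "code.execute":   "high",
}

def authorization_for(tool: str) -> str:
    """Step 3 in miniature: unregistered tools are denied by default."""
    tier = TOOL_REGISTRY.get(tool)
    if tier is None:
        return "deny"
    return RISK_TIERS[tier]
```

A quarterly audit (step 6) then reduces to diffing this registry against what each agent can actually invoke, and removing any grant with no matching entry.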

Agent tool use will only expand as capabilities mature. The organizations that build governance infrastructure for it now will be positioned to scale safely; those that don't will face incidents that are increasingly difficult to explain and remediate.

Ready to govern your AI agents?

Deploy production-grade governance, compliance, and forensic analysis in under 24 hours.
