Tool Authority and Approval

Agent Tool Permission Protocol

A practical protocol for AI agents that can use tools. Prompt hardening tells an agent what to treat as authority. Tool permissioning decides what the agent can actually do when it gets confused, hijacked, or overconfident.

Agent security is a privilege-boundary problem. A language model without tools can mislead. A language model with tools can send, delete, spend, publish, modify, expose, or execute.

The institution should therefore treat every agent as a junior operator with a written job description, limited keys, an audit trail, and mandatory approval for consequential actions.

The Rule

No agent receives more authority than the smallest task requires.

Do not give an agent broad access because it is convenient. Convenience is how prompt injection becomes a data breach, a bad publication, a deleted archive file, or an unauthorized public statement.

Permission Classes

Every tool belongs to one of six classes.

Class Examples Default
Read Public public web search, public documents, public site files allowed with logging
Read Internal unpublished drafts, internal notes, working docs approval required
Read Restricted testimony, donor records, incident notes, private contact records prohibited unless specifically approved
Write Draft local draft edits, private notes, non-public summaries allowed only in scoped workspace
Write Consequential publish, send email, update CRM, change permissions, delete files, make purchases human approval required
Execute shell, code execution, API mutation, deployment, database writes, MCP server actions prohibited by default; approval and sandbox required

The permission class is assigned before the agent runs. The agent should not decide its own class while working.

Action Gates

Some actions always require a human gate, even if the agent is trusted.

Require explicit approval before an agent:

The approval request should state:

Action:
Tool:
Data involved:
Destination:
Reason:
Risk:
Rollback plan:

Approval should not be a vague “continue?” button. It should describe the operation in human language.

Tool Allowlist

Each workflow should have an allowlist.

Example: public research article.

Allowed:

Not allowed:

The allowlist should be written in the task brief, not inferred from the agent’s capabilities.

Tool Pairing Risks

Some tools are safe alone and dangerous together.

High-risk pairings:

If an agent has both read access and outbound action, assume exfiltration is possible. If an agent has both untrusted input and code execution, assume command injection is possible.

New connectors, MCP servers, plugins, external services, and agent tools must also pass Vendor and Platform Governance before they become institutional infrastructure.

MCP and Plugin Rule

Model Context Protocol servers, plugins, connectors, and skills are tools with their own supply chains. Treat them like software dependencies plus delegated authority.

Before enabling one, ask:

  1. Who maintains it?
  2. What permissions does it request?
  3. Can it read local files?
  4. Can it execute commands?
  5. Can it send data externally?
  6. Can it install or load other tools?
  7. Does it log actions?
  8. Does it support approval gates?
  9. Does it expose secrets in descriptions, examples, or errors?
  10. How is it removed?

OpenAI’s developer guidance for MCP apps warns that unsafe or untrusted MCP servers can increase exposure to risks including prompt injection. OWASP’s MCP Top 10 similarly treats tool poisoning, command injection, context spoofing, and insecure memory references as core concerns. The correct default is narrow enablement, not curiosity.

Agent Identity

Agents should not borrow human accounts unless there is no alternative.

Preferred:

Avoid:

An agent should be attributable. If no one can say which agent did what, the tooling is not ready.

Memory and Persistence

Memory is a tool. Treat it as one.

Rules:

Prompt injection can persist if malicious instructions are stored as memory, notes, comments, issue descriptions, or retrieval documents. A clean prompt is not enough if the agent keeps retrieving poisoned context.

Spiralism Tool Register

Every institutional agent should have a register entry:

Agent name:
Owner:
Purpose:
Model/provider:
Tools enabled:
Permission classes:
Data allowed:
Data prohibited:
Can read:
Can write:
Can send:
Can execute:
Can deploy:
Approval gates:
Logging location:
Retention:
Review date:
Shutdown process:

No register, no production use.

Default Spiralism Policy

During the founding period:

Review Questions

Before giving an agent a new tool, answer:

  1. What harm can happen if the agent follows an instruction from untrusted content?

  2. What private data could the tool expose?

  3. What external action could the tool take?
  4. Can the agent complete the task without this tool?
  5. Can the permission be temporary?
  6. Is there a human approval gate?
  7. Is there a log?
  8. Can the action be undone?
  9. Who owns the risk?
  10. When will access be removed?

If these questions feel excessive, the tool is probably too powerful for the workflow.

Sources Checked