Tool Authority and Approval

Agent Tool Permission Protocol

A practical protocol for AI agents that can use tools. Prompt hardening tells an agent what to treat as authority. Tool permissioning decides what the agent can actually do when it gets confused, hijacked, or overconfident.

Agent security is a privilege-boundary problem. A language model without tools can mislead. A language model with tools can send, delete, spend, publish, modify, expose, or execute.

The institution should therefore treat every agent as a junior operator with a written job description, limited keys, an audit trail, and mandatory approval for consequential actions.

The Rule

No agent receives more authority than the smallest task requires.

Do not give an agent broad access because it is convenient. Convenience is how prompt injection becomes a data breach, a bad publication, a deleted archive file, or an unauthorized public statement.

Permission Classes

Every tool belongs to one of six classes.

Class	Examples	Default
Read Public	public web search, public documents, public site files	allowed with logging
Read Internal	unpublished drafts, internal notes, working docs	approval required
Read Restricted	testimony, donor records, incident notes, private contact records	prohibited unless specifically approved
Write Draft	local draft edits, private notes, non-public summaries	allowed only in scoped workspace
Write Consequential	publish, send email, update CRM, change permissions, delete files, make purchases	human approval required
Execute	shell, code execution, API mutation, deployment, database writes, MCP server actions	prohibited by default; approval and sandbox required

The permission class is assigned before the agent runs. The agent should not decide its own class while working.

Action Gates

Some actions always require a human gate, even if the agent is trusted.

Require explicit approval before an agent:

sends a message;
publishes or deploys;
deletes or overwrites files;
changes permissions;
shares private or internal data;
calls a payment, donation, banking, or purchasing tool;
modifies a database or CRM;
opens a ticket, complaint, report, or legal record;
runs code from untrusted content;
installs packages, plugins, skills, extensions, or MCP servers;
connects to a new external service;
contacts a person outside the institution;
summarizes restricted data for an external destination.

The approval request should state:

Action:
Tool:
Data involved:
Destination:
Reason:
Risk:
Rollback plan:

Approval should not be a vague “continue?” button. It should describe the operation in human language.

Tool Allowlist

Each workflow should have an allowlist.

Example: public research article.

Allowed:

public web search;
opening public sources;
reading existing public docs;
editing a draft file;
local build;
link check.

Not allowed:

restricted archive access;
donor records;
incident records;
email sending;
social posting;
payments;
credentials;
shell commands beyond build and verification;
deployment without explicit request.

The allowlist should be written in the task brief, not inferred from the agent’s capabilities.

Tool Pairing Risks

Some tools are safe alone and dangerous together.

High-risk pairings:

private data reader + external sender;
browser + logged-in accounts;
web reader + shell execution;
repository reader + package installer;
CRM reader + email sender;
archive reader + summarizer connected to external service;
incident notes + public publishing tool;
MCP server discovery + automatic tool approval;
memory writer + untrusted content reader.

If an agent has both read access and outbound action, assume exfiltration is possible. If an agent has both untrusted input and code execution, assume command injection is possible.

New connectors, MCP servers, plugins, external services, and agent tools must also pass Vendor and Platform Governance before they become institutional infrastructure.

MCP and Plugin Rule

Model Context Protocol servers, plugins, connectors, and skills are tools with their own supply chains. Treat them like software dependencies plus delegated authority.

Before enabling one, ask:

Who maintains it?
What permissions does it request?
Can it read local files?
Can it execute commands?
Can it send data externally?
Can it install or load other tools?
Does it log actions?
Does it support approval gates?
Does it expose secrets in descriptions, examples, or errors?
How is it removed?

OpenAI’s developer guidance for MCP apps warns that unsafe or untrusted MCP servers can increase exposure to risks including prompt injection. OWASP’s MCP Top 10 similarly treats tool poisoning, command injection, context spoofing, and insecure memory references as core concerns. The correct default is narrow enablement, not curiosity.

Agent Identity

Agents should not borrow human accounts unless there is no alternative.

Preferred:

named service account;
least-privilege role;
no shared passwords;
short-lived token where possible;
separate read and write credentials;
logs tied to the agent identity;
human owner named in the register.

Avoid:

founder personal account;
shared chapter login;
long-lived admin token;
personal browser session;
hidden credentials in prompt text;
credentials visible to the model when a tool wrapper can hold them instead.

An agent should be attributable. If no one can say which agent did what, the tooling is not ready.

Memory and Persistence

Memory is a tool. Treat it as one.

Rules:

do not let untrusted content write durable memory;
do not let an agent store secrets, credentials, restricted records, or vulnerable testimony in memory;
separate user preference memory from institutional record memory;
review memory writes in high-risk workflows;
expire temporary task memory;
log what memory was used for a consequential action.

Prompt injection can persist if malicious instructions are stored as memory, notes, comments, issue descriptions, or retrieval documents. A clean prompt is not enough if the agent keeps retrieving poisoned context.

Spiralism Tool Register

Every institutional agent should have a register entry:

Agent name:
Owner:
Purpose:
Model/provider:
Tools enabled:
Permission classes:
Data allowed:
Data prohibited:
Can read:
Can write:
Can send:
Can execute:
Can deploy:
Approval gates:
Logging location:
Retention:
Review date:
Shutdown process:

No register, no production use.

Default Spiralism Policy

During the founding period:

agents may assist with public-source research, drafts, link checks, local builds, and non-sensitive documentation;
agents may not access restricted testimony, companion chat logs, minor material, incident records, donor records, care-circle notes, credentials, or legal records by default;
agents may not publish, deploy, email, message, pay, delete, or change permissions without explicit approval;
agents may not install or enable MCP servers, plugins, connectors, skills, or browser extensions without tool review;
agents working on rabbit-hole reports follow Forum Rabbit-Hole Response Protocol;
agents with tool access follow Agent Prompt Hardening;
consequential agent runs leave reviewable records under Agent Audit and Incident Review.

Review Questions

Before giving an agent a new tool, answer:

What harm can happen if the agent follows an instruction from untrusted content?
What private data could the tool expose?
What external action could the tool take?
Can the agent complete the task without this tool?
Can the permission be temporary?
Is there a human approval gate?
Is there a log?
Can the action be undone?
Who owns the risk?
When will access be removed?

If these questions feel excessive, the tool is probably too powerful for the workflow.

Sources Checked

OpenAI, Safety in building agents, accessed May 2026.
OpenAI, Developer mode, and MCP apps in ChatGPT, accessed May 2026.
OpenAI, Designing AI agents to resist prompt injection, March 11, 2026.
OWASP Foundation, OWASP MCP Top 10, accessed May 2026.
OWASP GenAI Security Project, OWASP GenAI Security Project Releases Top 10 Risks and Mitigations for Agentic AI Security, December 9, 2025.
NIST, AI Risk Management Framework, accessed May 2026.