Governance-as-code: capability mappings, forbidden rules, and enforcement modes, all declared in the alignment card’s capabilities + enforcement sections
The Policy Engine is Phase 1 of CLPI (Card Lifecycle & Policy Intelligence) — the governance layer that transforms alignment cards into lifecycle-managed artifacts with policy enforcement, trust recovery, risk intelligence, and on-chain anchoring.
The policy engine translates Alignment Card declarations into enforceable rules over concrete tools. An alignment card says an agent may perform web_fetch. The capabilities section of the same card says web_fetch means mcp__browser__navigate and mcp__browser__click — but not mcp__filesystem__delete. The card declares intent. The enforcement section enforces it.
Policy is part of the alignment card, not a separate artifact. The capabilities, enforcement, and per-capability forbidden rules live as sections of the unified card. There is no standalone policy YAML file, no PUT /v1/agents/:id/policy endpoint, and no mnemom policy CLI group. Use mnemom card evaluate + PUT /v1/alignment/agent/{id} instead. See the policy management guide for the customer workflow.
The policy engine is a parallel enforcement layer to alignment enforcement (observe/nudge/enforce). Alignment enforcement checks agent behavior against card values. Policy enforcement checks tool usage against the card’s capabilities + enforcement sections. Both run independently and produce separate verdicts.
Static evaluation runs in pipelines before deployment. It validates the card against the unified schema and evaluates its policy sections against a declared tool list.
Commands:
# Validate the card (schema + structure)
mnemom card validate card.yaml
# Evaluate the card's policy against a tool list
mnemom card evaluate card.yaml --tools mcp__browser__navigate,mcp__slack__post_message
What it checks:
Card YAML conforms to the unified schema (capability glob validity, enforcement-mode enums, forbidden-rule structure).
Capability card_actions reference actions that exist in autonomy.bounded_actions.
Each tool in the --tools list matches a capability, hits a forbidden rule, or falls through to the unmapped_tool_action default.
Coverage report identifies card actions with no backing capability mapping.
Use case: pre-deploy gates. See the CI/CD policy gates guide for GitHub Actions + GitLab CI templates.
Live evaluation runs in real-time as requests pass through the Mnemom gateway. The policy engine extracts tool names from the request body, checks each tool against the agent’s canonical card, and returns a verdict.
How it works:
Request arrives at the gateway.
Gateway reads the canonical alignment card (KV-cached, 5-min TTL).
Each tool in the request is checked against enforcement.forbidden first.
Remaining tools are matched against capabilities[*].tools globs.
Unmatched tools fall through to enforcement.unmapped_tool_action.
A verdict is returned via the X-Policy-Verdict response header.
Verdict headers:
| Header Value | Meaning |
| --- | --- |
| X-Policy-Verdict: pass | All tools passed policy checks |
| X-Policy-Verdict: warn | One or more tools triggered warnings (logged, not blocked) |
| X-Policy-Verdict: fail | One or more tools violated policy (blocked in enforce mode) |
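One way to read the verdict table is that the request-level verdict is the most severe outcome among the tools in the request. The sketch below assumes that aggregation rule; `aggregate_verdict` and the severity ordering are hypothetical names for illustration, not the gateway's documented internals.

```python
# Severity order assumed for illustration: pass < warn < fail.
SEVERITY = {"pass": 0, "warn": 1, "fail": 2}


def aggregate_verdict(per_tool: dict) -> str:
    """Collapse per-tool outcomes into one request-level verdict.

    Assumption: the request verdict is the worst outcome among its tools,
    which matches the "one or more tools" wording of the verdict table.
    """
    if not per_tool:
        return "pass"  # no tools in the request, so nothing to violate
    return max(per_tool.values(), key=SEVERITY.__getitem__)


# Hypothetical per-tool outcomes for a single request.
headers = {
    "X-Policy-Verdict": aggregate_verdict(
        {"mcp__browser__navigate": "pass", "mcp__slack__post_message": "warn"}
    )
}
print(headers)
```

A client that wants to surface policy warnings can then branch on the single header value instead of re-deriving per-tool results.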
Enforcement interaction: enforcement.default_mode determines how the gateway responds to these verdicts.
Post-action evaluation runs after an action completes. It analyzes what actually happened against what the card expected.
What it detects:
card_gap violations — tools the agent used successfully but that are not represented in any capability mapping. These are tools that should be in the card but are not.
Frequency anomalies — tools used at rates that exceed expected patterns.
New tool discovery — tools appearing for the first time, subject to grace-period handling.
Feeds into reclassification: observer findings feed the reclassification pipeline at POST /v1/agents/{id}/reclassify. A card_gap detection is a candidate for reclassification + an amendment to the card’s capabilities section; see Trust Recovery.
Feeds into proving: when the gateway’s policy evaluation detects card_gap signals alongside a boundary_violation, ZK proving is deferred instead of immediately dispatched to GPUs. This prevents stale alignment cards from driving up proving costs during rapid iteration.
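The card_gap check described above can be pictured as a set difference between the tools an agent actually used and the tools any capability glob covers. This is a minimal sketch under that assumption; `find_card_gaps` is a hypothetical helper, and Python's `fnmatch` stands in for the engine's glob matcher.

```python
from fnmatch import fnmatchcase


def find_card_gaps(used_tools, capability_globs):
    """Return tools that were used but match no capability glob.

    Each returned tool is a candidate amendment to the card's
    capabilities section.
    """
    return sorted(
        tool
        for tool in set(used_tools)
        if not any(fnmatchcase(tool, glob) for glob in capability_globs)
    )


# Hypothetical observation log and capability globs.
used = ["mcp__browser__navigate", "mcp__slack__post_message", "mcp__browser__click"]
globs = ["mcp__browser__*"]
print(find_card_gaps(used, globs))
```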
Capability mappings are the core of the card’s policy sections. They bridge the gap between what the card declares (abstract actions like web_fetch) and what agents actually invoke (concrete tool names like mcp__browser__navigate).
Tool patterns support standard glob syntax for flexible matching:
| Pattern | Matches |
| --- | --- |
| mcp__browser__* | All browser tools (navigate, click, screenshot, etc.) |
| mcp__filesystem__read* | read_file, read_directory, read_metadata |
| mcp__*__list* | Any MCP server’s list operations |
| custom_tool_v? | custom_tool_v1, custom_tool_v2, etc. |
Start with broad globs during initial card development, then tighten them as you understand which specific tools your agent uses. A mapping like mcp__browser__* is fine for week one. By month two, enumerate the specific tools.
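To check what a pattern will and will not match before putting it in a card, a quick local experiment helps. The sketch below uses Python's standard `fnmatch` module as a stand-in for the engine's matcher; that the engine follows the same glob semantics is an assumption, and the tool names are illustrative.

```python
from fnmatch import fnmatchcase

# Patterns from the pattern table; fnmatchcase is case-sensitive.
PATTERNS = [
    "mcp__browser__*",
    "mcp__filesystem__read*",
    "mcp__*__list*",
    "custom_tool_v?",
]


def matching_patterns(tool: str) -> list:
    """Return every pattern that matches the given tool name."""
    return [pattern for pattern in PATTERNS if fnmatchcase(tool, pattern)]


for tool in (
    "mcp__browser__navigate",
    "mcp__filesystem__list_dir",
    "custom_tool_v2",
    "custom_tool_v10",  # "?" matches exactly one character, so no match here
):
    print(tool, "->", matching_patterns(tool))
```

Note the last case: under standard glob rules `?` matches a single character, so a two-digit version suffix would need `*` or a second pattern.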
When the policy engine evaluates a tool, it follows this order:
Forbidden check: does the tool match any enforcement.forbidden[].pattern? If yes, the tool is a violation regardless of capability mappings.
Capability match: does the tool match any capabilities[*].tools glob? If yes, the tool is allowed and mapped to the corresponding card actions.
Default fallback: if neither forbidden nor mapped, apply enforcement.unmapped_tool_action.
A tool can match multiple capabilities. This isn’t an error — it means the tool satisfies multiple card actions (e.g., a file-read tool satisfying both read_file and read_source_code).
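The evaluation order above can be sketched in a few lines. The card fragment here is a hypothetical in-memory dict: the field names follow the text (capabilities with tools and card_actions, enforcement.forbidden, enforcement.unmapped_tool_action), but the keyed-by-name layout and the `"warn"` value for unmapped_tool_action are assumptions, and `fnmatch` stands in for the engine's glob matcher.

```python
from fnmatch import fnmatchcase

# Hypothetical card fragment for illustration only.
CARD = {
    "capabilities": {
        "web_browsing": {"tools": ["mcp__browser__*"], "card_actions": ["web_fetch"]},
        "file_reading": {"tools": ["mcp__filesystem__*"], "card_actions": ["read_file"]},
        "source_review": {
            "tools": ["mcp__filesystem__read_file"],
            "card_actions": ["read_source_code"],
        },
    },
    "enforcement": {
        "forbidden": [
            {
                "pattern": "mcp__filesystem__delete*",
                "reason": "File deletion not permitted",
                "severity": "critical",
            }
        ],
        "unmapped_tool_action": "warn",  # assumed value
    },
}


def evaluate_tool(card: dict, tool: str) -> dict:
    """Apply the three-step order: forbidden, capability match, default."""
    enforcement = card["enforcement"]

    # 1. Forbidden check wins regardless of capability mappings.
    for rule in enforcement.get("forbidden", []):
        if fnmatchcase(tool, rule["pattern"]):
            return {"tool": tool, "outcome": "forbidden",
                    "reason": rule["reason"], "card_actions": []}

    # 2. Capability match: a tool may satisfy several capabilities.
    matched, actions = [], []
    for name, cap in card["capabilities"].items():
        if any(fnmatchcase(tool, glob) for glob in cap["tools"]):
            matched.append(name)
            actions.extend(a for a in cap["card_actions"] if a not in actions)
    if matched:
        return {"tool": tool, "outcome": "allowed",
                "reason": None, "card_actions": actions}

    # 3. Default fallback for tools that are neither forbidden nor mapped.
    return {"tool": tool, "outcome": enforcement["unmapped_tool_action"],
            "reason": "unmapped tool", "card_actions": []}


for name in ("mcp__filesystem__delete_file", "mcp__filesystem__read_file",
             "mcp__slack__post_message"):
    print(evaluate_tool(CARD, name))
```

In this sketch the delete tool matches the broad `mcp__filesystem__*` capability glob yet is still reported as forbidden, because the forbidden list is consulted first.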
enforcement.default_mode is the card’s policy-enforcement posture:
| Mode | Behavior |
| --- | --- |
| Warn | Log violations but do not block. X-Policy-Verdict: warn header returned. Default mode. |
| Enforce | Block requests with policy violations. X-Policy-Verdict: fail header returned. HTTP 403 for non-streaming requests. |
| Off | Skip policy evaluation entirely. No X-Policy-Verdict header. No performance overhead. |
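The three modes can be summarized as a small decision function. This is a simplified sketch, not the gateway's implementation: `gateway_decision` is a hypothetical helper, only violations are modeled, and grace-period downgrades and streaming responses are left out.

```python
VALID_MODES = {"warn", "enforce", "off"}


def gateway_decision(mode: str, has_violation: bool) -> dict:
    """Map enforcement.default_mode plus an evaluation result to a response.

    "status" is None when the sketch does not override the upstream status.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown enforcement mode: {mode!r}")
    if mode == "off":
        # Evaluation is skipped, so no X-Policy-Verdict header is emitted.
        return {"header": None, "blocked": False, "status": None}
    if not has_violation:
        return {"header": "pass", "blocked": False, "status": None}
    if mode == "enforce":
        # HTTP 403 applies to non-streaming requests, per the mode descriptions.
        return {"header": "fail", "blocked": True, "status": 403}
    # warn mode: the violation is logged and the request proceeds.
    return {"header": "warn", "blocked": False, "status": None}


for mode in ("warn", "enforce", "off"):
    print(mode, gateway_decision(mode, has_violation=True))
```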
Policy enforcement (enforcement.default_mode), action-policing alignment (autonomy_mode), and values-policing alignment (integrity_mode) are three parallel systems with three independent master switches. Setting one to enforce does not affect the others — an agent can sit at autonomy_mode: observe + integrity_mode: enforce + enforcement.default_mode: nudge simultaneously, each on its own independent ladder.
enforcement.forbidden defines tools that must never be used, regardless of capability mappings. They’re always checked first in the evaluation pipeline.
enforcement:
  forbidden:
    - pattern: "mcp__filesystem__delete*"
      reason: "File deletion not permitted"
      severity: critical
    - pattern: "mcp__shell__*"
      reason: "Shell execution not permitted"
      severity: high
    - pattern: "mcp__*__drop_table"
      reason: "Table deletion not permitted"
      severity: critical
    - pattern: "mcp__email__send_bulk*"
      reason: "Bulk email sending restricted"
      severity: medium
Each rule has three fields:
| Field | Type | Description |
| --- | --- | --- |
| pattern | string (glob) | Tool name pattern to match |
| reason | string | Human-readable explanation for auditing |
| severity | enum | critical, high, medium, or low |
Policy enforcement.forbidden rules complement alignment card autonomy.forbidden_actions. Card forbidden actions declare intent (“this agent must never delete files”). Policy forbidden rules enforce that intent at the tool level (“block all tools matching mcp__filesystem__delete*”). Both are checked — card-level by alignment enforcement, tool-level by policy enforcement.
New tools appear when agents gain new MCP server connections or when tool providers add capabilities. The grace period prevents these newly discovered tools from immediately becoming violations.
enforcement:
  grace_period_hours: 24
How it works:
The policy engine tracks when each tool is first seen via tool_first_seen records.
When an unmapped or forbidden tool is encountered, the engine checks how long ago it was first seen.
If the tool was first seen within the grace period window, the violation is downgraded to a warning (the verdict drops from fail to warn), and the request proceeds. Under enforce mode, this means the request is not blocked.
After the grace period expires, the tool falls back to the configured unmapped_tool_action (or its forbidden severity, for forbidden-pattern matches).
The window is per-(agent, tool): each agent’s first observation of each tool starts its own clock. The clock cannot be back-dated.
This gives operators time to amend the card’s capabilities section after adding new tools or MCP servers, without immediately triggering violations in enforce mode.
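The downgrade rule can be sketched as a pure function of the tool's first-seen timestamp. `apply_grace` is a hypothetical helper; treating the window boundary as exclusive (strictly less than the grace period) is an assumption, since the text only says "within the grace period window".

```python
from datetime import datetime, timedelta, timezone


def apply_grace(verdict: str, first_seen: datetime, now: datetime,
                grace_period_hours: int) -> str:
    """Downgrade fail to warn while a tool is inside its grace window.

    A grace period of 0 disables the downgrade path entirely.
    """
    if verdict != "fail" or grace_period_hours <= 0:
        return verdict
    window = timedelta(hours=grace_period_hours)
    return "warn" if now - first_seen < window else "fail"


# Illustrative timestamps for one (agent, tool) pair.
first_seen = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)

print(apply_grace("fail", first_seen, first_seen + timedelta(hours=6), 24))
print(apply_grace("fail", first_seen, first_seen + timedelta(hours=25), 24))
print(apply_grace("fail", first_seen, first_seen + timedelta(hours=6), 0))
```

The third call shows why setting the grace period to 0 makes enforce mode strict from the first observation: the downgrade branch is never reached.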
Security implication. With the default 24h grace, brand-new tools (including ones introduced by an attacker via prompt injection, MCP server compromise, or tool-name overlap) get a 24-hour pass on enforce mode. Mature agents with stable tool inventories aren’t exposed; agents that add tools dynamically, run untrusted MCP servers, or accept tool definitions from user input absolutely are.
If your threat model includes adversarial tool introduction, set grace_period_hours: 0 on the alignment card to disable the grace path entirely. There is no API to back-date a tool_first_seen record, so 0 is the only way to make enforce strict from the moment a card is published. See Enforcement § Grace period.
| Use case | Recommended grace_period_hours |
| --- | --- |
| Adversarial threat model (untrusted user input, untrusted MCP servers, agents accepting tool defs from users) | 0 |
| Test harness / CI matrix | 0 |
| Production with stable tool inventory + mature card | 0 (no operational benefit when the tool list is stable) |
| Production with frequent tool additions, trusted operators | |
In organizations with multiple agents, the capabilities and enforcement sections compose from platform → org → agent scopes per card composition rules. These are merged at storage time, not request time: every gateway read hits the pre-composed canonical card.
Transaction-scoped cards can further restrict the composed enforcement via intersection semantics. A transaction guardrail can only narrow what’s permitted — never expand it.
# Transaction-level override: restrict to read-only tools for this operation
transaction_guardrails:
  allowed_capabilities:
    - file_reading
    - database_read
    # All other capabilities are denied for this transaction
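Intersection semantics mean a transaction guardrail can remove capabilities but never add them. A minimal sketch, assuming capabilities are identified by name and that an absent guardrail leaves the composed set unchanged; `effective_capabilities` is a hypothetical helper.

```python
def effective_capabilities(composed, allowed_capabilities=None):
    """Intersect the composed card's capabilities with a transaction guardrail.

    Naming a capability the composed card lacks has no effect, so the
    guardrail can only narrow what is permitted.
    """
    if allowed_capabilities is None:
        return set(composed)
    return set(composed) & set(allowed_capabilities)


# Hypothetical composed capability names for one agent.
composed = {"file_reading", "database_read", "web_browsing"}

# "shell_exec" is not in the composed card, so listing it grants nothing.
guardrail = ["file_reading", "database_read", "shell_exec"]
print(sorted(effective_capabilities(composed, guardrail)))
```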
Every mnemom card evaluate run produces a coverage report that quantifies how well the card’s capabilities section maps to its autonomy.bounded_actions. This identifies gaps between what the card declares and what the policy actually covers.
A coverage percentage below 100% means some card actions have no backing capability mapping. Tools implementing those actions will fall through to enforcement.unmapped_tool_action. Aim for 100% coverage in production cards.
# Evaluate a card and fail if any card action is unmapped (sub-100% coverage)
mnemom card evaluate \
  card.yaml \
  --tools mcp__browser__navigate,mcp__filesystem__read_file \
  --strict
# Exit code 1 on warnings (any unmapped action) as well as failures
card evaluate always prints the coverage percentage and lists unmapped actions. Coverage gating is binary: without --strict only hard policy violations exit non-zero; with --strict any unmapped action (i.e. coverage below 100%) is treated as a warning that also exits 1. This integrates naturally into pre-deploy gates: a card change that introduces an unmapped action blocks the merge. See CI/CD policy gates for the full pipeline template.
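The coverage figure and the exit-code behavior can be approximated as follows. This is a sketch of the arithmetic only, under the assumption that coverage is the share of autonomy.bounded_actions named by at least one capability's card_actions; `coverage_report` and `exit_code` are hypothetical helpers, not the CLI's code.

```python
def coverage_report(bounded_actions, capabilities):
    """Compute the share of card actions backed by a capability mapping."""
    mapped = {
        action
        for cap in capabilities.values()
        for action in cap.get("card_actions", [])
    }
    unmapped = [a for a in bounded_actions if a not in mapped]
    total = len(bounded_actions)
    # An empty envelope has no denominator and is reported as 0% coverage.
    percent = 0.0 if total == 0 else 100.0 * (total - len(unmapped)) / total
    return {"coverage_percent": percent, "unmapped_actions": unmapped}


def exit_code(report, has_policy_violation, strict):
    """Hard violations always fail; --strict also fails on unmapped actions."""
    if has_policy_violation:
        return 1
    if strict and report["unmapped_actions"]:
        return 1
    return 0


# Hypothetical card fragment: send_message has no backing capability.
actions = ["web_fetch", "read_file", "send_message"]
caps = {
    "web_browsing": {"tools": ["mcp__browser__*"], "card_actions": ["web_fetch"]},
    "file_reading": {"tools": ["mcp__filesystem__read*"], "card_actions": ["read_file"]},
}
report = coverage_report(actions, caps)
print(round(report["coverage_percent"], 1), report["unmapped_actions"])
print(exit_code(report, has_policy_violation=False, strict=True))
```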
Policy evaluation adds latency to gateway requests (typically under 5 ms for cards with fewer than 100 capability patterns).
Glob patterns match tool names only, not tool arguments. A tool can be permitted by policy but still violate alignment constraints based on how it’s called.
Grace periods are tracked per-agent, not per-card-version. Updating a card doesn’t reset grace-period timers for previously seen tools.
Coverage reports require a valid autonomy.bounded_actions list. Agents with an empty envelope get a coverage report with 0% coverage (no denominator).
The Policy Engine is Phase 1 of CLPI; AEGIS Managed Rules are a separate (but composable) layer. When a Managed Rule promotes, the gateway loads it via the tiered KV → R2 → in-isolate read substrate. The policy engine still enforces card-defined capability mappings; the Managed Rule adds detection thresholds that screen the inputs and outputs the policy engine then allows or denies. Both compose through the same cards composition primitive — the recipe (detection content) and the rule (control-plane state) flow into the cards cascade Platform → Org → Team → Agent under strictest-wins composition.