How to work with the protection card — the YAML document that tells Safe House how to defend an agent at runtime. Same CRUD shape as the alignment card, different semantics.

Viewing the current card

Via CLI

mnemom protection show            # canonical protection card (YAML)
mnemom protection show always renders the canonical (composed) card. To inspect the agent-scope raw card pre-composition, use the API ?include=sources envelope (see below) or the dashboard Security tab.

Via API

# Canonical (composed) protection card
curl -H "X-Mnemom-Api-Key: $MNEMOM_API_KEY" \
  https://api.mnemom.ai/v1/protection/agent/{agent_id}

# Raw agent-scope card pre-composition (five-row sources envelope:
# Platform / Org / Teams / Agent / Composed)
curl -H "X-Mnemom-Api-Key: $MNEMOM_API_KEY" \
  "https://api.mnemom.ai/v1/protection/agent/{agent_id}?include=sources"
Or visit the Security tab on the agent detail page in the dashboard — it shows both raw (agent-scope) and canonical (composed) side by side.

Authoring a protection card

Start from the protection card schema. A minimal card:
# protection.card.yaml
card_version: protection/2026-04-26
agent_id: mnm-xxxxxxxx
issued_at: 2026-04-26T00:00:00Z

mode: enforce            # "off" | "observe" | "nudge" | "enforce"

thresholds:
  warn: 0.60             # informational threshold
  quarantine: 0.80       # quarantine in enforce mode
  block: 0.95            # hard block in enforce mode

screen_surfaces:
  incoming: true
  outgoing: true
  tool_calls: true
  tool_responses: true

trusted_sources:
  domains: ["internal.acme.com"]
  agent_ids: []
  ip_ranges: ["10.0.0.0/8"]
Most fields are optional. Omitted fields inherit from the org template (if any), which inherits from the platform default.
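The threshold comments in the minimal card imply a score-to-action ladder. The following is a minimal sketch of that ladder, not the Safe House implementation. It assumes scores are compared with `>=`, that only `enforce` mode quarantines or blocks, and that every other active mode records at most a warning. The function name `action_for` and the return labels are hypothetical.

```python
def action_for(score: float, mode: str, thresholds: dict) -> str:
    """Map a detector score to an action label (illustrative only)."""
    if mode == "off":
        return "none"  # assumption: nothing is recorded when protection is off
    if mode == "enforce":
        # Quarantine and block apply only in enforce mode, per the card comments.
        if score >= thresholds["block"]:
            return "block"
        if score >= thresholds["quarantine"]:
            return "quarantine"
    # observe, nudge, and sub-quarantine enforce scores are informational.
    if score >= thresholds["warn"]:
        return "warn"
    return "none"


thresholds = {"warn": 0.60, "quarantine": 0.80, "block": 0.95}
print(action_for(0.85, "enforce", thresholds))
```

With the thresholds from the minimal card, a score of 0.85 lands between `quarantine` and `block`, so the sketch reports a quarantine in enforce mode and only a warning in observe mode.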

Publishing

mnemom protection publish protection.card.yaml
Or via API:
curl -X PUT https://api.mnemom.ai/v1/protection/agent/{agent_id} \
  -H "X-Mnemom-Api-Key: $MNEMOM_API_KEY" \
  -H "Content-Type: text/yaml" \
  -H "Idempotency-Key: <uuid>" \
  --data-binary @protection.card.yaml
Publishing triggers compose_protection_card(agent_id), which generates the new canonical card within a second.
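For scripted publishing, the same PUT can be assembled in Python. This is a sketch that mirrors the curl call, using only the standard library. The helper name `build_publish_request` is hypothetical, and generating a fresh UUID per publish is an assumption about how the Idempotency-Key is meant to be used.

```python
import urllib.request
import uuid


def build_publish_request(agent_id: str, api_key: str, card_yaml: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that publishes a protection card."""
    return urllib.request.Request(
        url=f"https://api.mnemom.ai/v1/protection/agent/{agent_id}",
        data=card_yaml.encode("utf-8"),
        method="PUT",
        headers={
            "X-Mnemom-Api-Key": api_key,
            "Content-Type": "text/yaml",
            # Assumption: one fresh UUID per logical publish attempt.
            "Idempotency-Key": str(uuid.uuid4()),
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. If a publish is retried after a timeout, reusing the same request object keeps the Idempotency-Key stable across attempts.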

Validating without publishing

mnemom protection validate protection.card.yaml
Runs the full schema validator + applies an inline composition with current platform/org templates so you can see the canonical output without writing anything.

Understanding composition

Protection-card composition follows the three-scope model (platform > org > agent). The per-field rules:
| Field | Composition |
| --- | --- |
| mode | Strictest wins (enforce > nudge > observe > off). Agent can go stricter, not looser. |
| thresholds.* | Min across scopes: lowest = strictest wins. An agent can tighten further than the platform/org but not loosen. |
| screen_surfaces.* | OR per field: true wins. Any scope can require scanning a surface; agents cannot turn off scanning the org or platform requires. |
| trusted_sources.{domains,agent_ids,ip_ranges} | Platform intersection, org+agent union: platform allowlist is the compliance ceiling; downstream scopes can only add from inside that ceiling. |
Publishing an org protection template propagates to all agents in the org via mark_agents_for_recompose — the same mechanism as alignment templates. See Managing Card Composition for the full flow.
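The per-field rules can be sketched as a small merge function. This is an illustration of the rules as stated, not the actual compose_protection_card. It assumes each scope is a plain dict shaped like the card YAML, reads the trusted_sources rule as (org ∪ agent) ∩ platform, and compares ip_ranges as exact strings rather than by CIDR containment. The function name `compose` is hypothetical.

```python
MODE_RANK = {"off": 0, "observe": 1, "nudge": 2, "enforce": 3}
SOURCE_KEYS = ("domains", "agent_ids", "ip_ranges")


def compose(platform: dict, org: dict, agent: dict) -> dict:
    """Merge three scopes into one card using the per-field rules (illustrative)."""
    scopes = (platform, org, agent)

    # mode: strictest wins.
    mode = max((s["mode"] for s in scopes if "mode" in s),
               key=MODE_RANK.__getitem__, default="off")

    # thresholds.*: minimum across the scopes that set the key.
    thresholds: dict = {}
    for scope in scopes:
        for key, value in scope.get("thresholds", {}).items():
            thresholds[key] = min(value, thresholds.get(key, value))

    # screen_surfaces.*: logical OR, so any scope can require a surface.
    surfaces: dict = {}
    for scope in scopes:
        for key, value in scope.get("screen_surfaces", {}).items():
            surfaces[key] = surfaces.get(key, False) or value

    # trusted_sources.*: org and agent entries are unioned, then capped
    # by the platform allowlist (assumed reading of the rule).
    trusted = {}
    for key in SOURCE_KEYS:
        ceiling = set(platform.get("trusted_sources", {}).get(key, []))
        requested = (set(org.get("trusted_sources", {}).get(key, []))
                     | set(agent.get("trusted_sources", {}).get(key, [])))
        trusted[key] = sorted(requested & ceiling)

    return {"mode": mode, "thresholds": thresholds,
            "screen_surfaces": surfaces, "trusted_sources": trusted}
```

Under this reading, an agent that sets `tool_responses: false` still ends up with `true` if the platform or org requires it, and a domain the platform does not list never reaches the composed card.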

Common tuning patterns

Production-grade strictness

For high-stakes agents (financial, health, compliance):
mode: enforce

thresholds:
  warn: 0.50           # tighter than the 0.60 platform default
  quarantine: 0.70
  block: 0.85

screen_surfaces:
  incoming: true
  outgoing: true
  tool_calls: true
  tool_responses: true

Observe-first for a new agent

Before committing to enforcement, run in observe mode to gather a baseline:
mode: observe          # all detectors run, nothing is blocked

thresholds:
  warn: 0.50           # lower = more sensitive (more events logged)
  quarantine: 0.70
  block: 0.90
Review the event stream for 7 to 14 days, then adjust thresholds based on the false-positive rate. Promote to nudge or enforce once that rate is stable. nudge is a useful intermediate stage: the model receives an advisory annotation but the request still proceeds, so you can validate that the security signal reaches the agent before committing to hard blocks.
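One way to compare candidate thresholds during the observe period is to replay the logged scores against each candidate. The sketch below assumes you have exported events as dicts with a `score` and a reviewer-assigned `false_positive` flag. Both field names are hypothetical and are not part of the documented event format.

```python
def false_positive_share(events: list, threshold: float) -> float:
    """Share of events at or above `threshold` that reviewers marked as false positives."""
    flagged = [e for e in events if e["score"] >= threshold]
    if not flagged:
        return 0.0  # nothing would have been flagged at this threshold
    return sum(1 for e in flagged if e["false_positive"]) / len(flagged)


# Hypothetical reviewed events from an observe-mode baseline.
events = [
    {"score": 0.55, "false_positive": True},
    {"score": 0.65, "false_positive": True},
    {"score": 0.75, "false_positive": False},
    {"score": 0.92, "false_positive": False},
]
for candidate in (0.50, 0.70):
    print(candidate, false_positive_share(events, candidate))
```

A threshold whose flagged events are mostly false positives is a candidate for raising before promotion to nudge or enforce.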

Performance-sensitive agent (tight tool-response window)

If an agent’s tool responses contain large payloads and per-request latency matters:
screen_surfaces:
  incoming: true
  outgoing: true
  tool_calls: true
  tool_responses: false             # skip tool-response scanning

mode: enforce
Every detection event logs which surfaces were inspected, so auditors can see what was not scanned. Document the reason in your internal runbook.
If your org requires tool_responses: true, you cannot turn it off at agent scope (strictest wins). You’ll need a section-specific exemption with a documented reason.

Trusted internal backend

If your agent pulls from a known-safe internal API, add the domain to trusted_sources so Safe House skips detector runs on content from that source:
trusted_sources:
  domains:
    - internal-kb.acme.example
    - vendor-api.example.com
Trusted content still emits a low-priority trace entry. If your internal KB ever gets compromised, the trusted-source entry in the trace makes the blast radius auditable.
Security reminder: never add a public DNS resolver, a user-controllable domain, or a public LLM API to trusted_sources. The API validates against a static deny-list and rejects obvious mistakes, but the risk model is on you.
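To reason about which traffic a trusted_sources block would exempt, a lookup can be sketched locally. This sketch assumes exact, case-insensitive domain matching (no subdomain wildcards) and CIDR containment for ip_ranges. Safe House's real matching rules are not specified here, and the helper name `is_trusted` is hypothetical.

```python
from ipaddress import ip_address, ip_network
from typing import Optional


def is_trusted(trusted: dict, domain: Optional[str] = None, ip: Optional[str] = None) -> bool:
    """Return True if the domain or IP matches the trusted_sources block (illustrative)."""
    if domain is not None:
        # Assumption: exact match only, so subdomains are not implicitly trusted.
        if domain.lower() in {d.lower() for d in trusted.get("domains", [])}:
            return True
    if ip is not None:
        addr = ip_address(ip)
        return any(addr in ip_network(cidr) for cidr in trusted.get("ip_ranges", []))
    return False


trusted = {"domains": ["internal.acme.com"], "ip_ranges": ["10.0.0.0/8"]}
print(is_trusted(trusted, domain="internal.acme.com"))
print(is_trusted(trusted, ip="192.168.1.1"))
```

Running candidate hosts through a check like this before publishing makes it easier to spot an entry that is broader than intended, such as a wide CIDR range.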

Alerting

Protection-card violations emit webhook events if your org has webhooks configured:
POST https://your-webhook.example/safe-house
{
  "event_type": "safe_house.violation",
  "agent_id": "mnm-xxxxxxxx",
  "detector": "injection_score",
  "score": 0.83,
  "threshold": 0.70,
  "surface": "incoming",
  "action_taken": "block",
  "trace_id": "trace-...",
  "timestamp": "2026-04-17T18:23:41Z"
}
See Safe House Webhooks for the full event catalog.
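A receiver for the payload above might reduce each event to a one-line alert. This is a minimal sketch that relies only on the fields shown in the sample payload. The function name `summarize_violation` is hypothetical, and other event types from the catalog are simply rejected.

```python
import json


def summarize_violation(body: str) -> str:
    """Turn a safe_house.violation webhook body into a one-line alert string."""
    event = json.loads(body)
    if event.get("event_type") != "safe_house.violation":
        raise ValueError(f"unexpected event_type: {event.get('event_type')!r}")
    return (
        f"{event['agent_id']}: {event['detector']} scored {event['score']:.2f} "
        f"on {event['surface']} (threshold {event['threshold']:.2f}, "
        f"action {event['action_taken']}, trace {event['trace_id']})"
    )
```

Including the trace_id in the alert lets an on-call engineer jump from the notification to the matching trace entry.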

Validating changes before deploy

For CI pipelines that publish card changes, validate the card client-side before the API call:
mnemom protection validate protection.card.yaml
echo $?   # 0 = valid, 1 = validation errors
mnemom protection validate checks the card against the schema locally (no network call). To check that the card composes cleanly with the current org template (no conflicts under strictest-wins), POST it to the /v1/protection/agent/{agent_id}/preview-compose endpoint, which returns the composed result plus any conflicts.
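In a Python-based pipeline, the exit-code check can be wrapped in a small gate. This sketch relies only on the documented exit codes (0 = valid, 1 = validation errors). The helper name `card_is_valid` is hypothetical, and the `command` parameter exists only so the gate can be exercised without the CLI installed.

```python
import subprocess
from typing import Sequence


def card_is_valid(card_path: str,
                  command: Sequence[str] = ("mnemom", "protection", "validate")) -> bool:
    """Run the validator on `card_path` and report whether it exited with code 0."""
    result = subprocess.run([*command, card_path], capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the validator's own output so the CI log shows the errors.
        print(result.stdout + result.stderr)
    return result.returncode == 0
```

A pipeline step would call `card_is_valid("protection.card.yaml")` and stop before the publish call when it returns False.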

Rolling back a change

There’s no first-class rollback endpoint. To revert:
  1. Fetch the amendment history to find the prior shape:
    curl -H "X-Mnemom-Api-Key: $MNEMOM_API_KEY" \
      "https://api.mnemom.ai/v1/agents/{agent_id}/card-amendments"
    
    Each entry records the field_changed, old_value, new_value, reason, triggered_by, and created_at — so you can reconstruct any past field value.
  2. Reconstruct the card you want from the amendment history, then publish it as a new PUT /v1/protection/agent/{agent_id}.
All publishes are amendments — the history is preserved. A “rollback” is just another amendment referencing the prior shape.
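Reconstructing a prior shape from the amendment history can be sketched as undoing every amendment newer than a cutoff. The sketch assumes the history is a list of dicts with the documented field_changed, old_value, and created_at keys, that created_at is an ISO 8601 timestamp, and that field_changed is a flat key such as `thresholds.warn`. The response envelope and the helper name `fields_as_of` are assumptions.

```python
from datetime import datetime


def _parse(ts: str) -> datetime:
    # fromisoformat() on older Python versions does not accept a trailing "Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def fields_as_of(current: dict, amendments: list, cutoff: str) -> dict:
    """Return the field values as they stood at `cutoff` by undoing later amendments."""
    restored = dict(current)
    limit = _parse(cutoff)
    # Undo newest first so the oldest undone amendment's old_value wins.
    newest_first = sorted(amendments, key=lambda a: _parse(a["created_at"]), reverse=True)
    for amendment in newest_first:
        if _parse(amendment["created_at"]) > limit:
            restored[amendment["field_changed"]] = amendment["old_value"]
    return restored
```

The restored values still need to be written back into a card file and published with the usual PUT, which records the revert as one more amendment.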
