Top-level structure
§mode
Top-level action policy for Safe House on this agent. Mirrors the alignment card’s integrity.enforcement_mode enum plus an off value for explicit opt-out.
| Value | Behavior |
|---|---|
| off | Detection skipped entirely. No telemetry. For cost / latency / non-applicability cases. |
| observe | Detectors run, signals logged asynchronously, no request-path action. |
| nudge | Detectors run synchronously; matches attach an advisory annotation to the agent’s prompt context (and an X-Safe-House-Advisory response header) but the request proceeds. |
| enforce | Detectors run synchronously; matches block the request (quarantine ≥ quarantine threshold, hard block ≥ block threshold). |
enforce implies synchronous verdict — to block a request, the gateway must wait for the verdict before delivering the message. There is no separate enforce_sync mode.
Composition: strictest wins across enforce > nudge > observe > off. An agent cannot drop below the platform/org floor.
nudge is the load-bearing middle ground: the model receives the advisory as part of its prompt context, so the security signal reaches the model without blocking the request. Customers running long-tail-confidence detectors typically run nudge rather than enforce until thresholds settle.
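The strictest-wins rule for mode can be sketched as follows. This is a minimal illustration, not the composer's actual code: `composeMode` is a hypothetical helper, and falling back to `off` when no scope sets a mode is an assumption the text above does not confirm.

```typescript
// Modes ordered from least to most strict, per "enforce > nudge > observe > off".
const MODE_ORDER = ["off", "observe", "nudge", "enforce"] as const;
type Mode = (typeof MODE_ORDER)[number];

// Hypothetical helper: returns the strictest mode among the scopes that set one.
// Assumption: if no scope sets a mode, the result is "off".
function composeMode(...scopes: Array<Mode | undefined>): Mode {
  let strictest: Mode = "off";
  for (const mode of scopes) {
    if (mode !== undefined && MODE_ORDER.indexOf(mode) > MODE_ORDER.indexOf(strictest)) {
      strictest = mode;
    }
  }
  return strictest;
}

// Platform floor is "observe", org is silent, agent asks for "off": the floor holds.
const effective = composeMode("observe", undefined, "off");
console.log(effective); // prints: observe
```

Because the result is the maximum over an ordered enum, an agent-scope value can raise the effective mode but never lower it below what a wider scope sets.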
§thresholds
Three-band escalation ladder for Safe House detector scores. All values are floats in [0, 1].
| Field | Range | Meaning |
|---|---|---|
| warn | [0, 1] | Score at-or-above triggers a warn-level annotation in observe/nudge mode (and a soft annotation in enforce). |
| quarantine | [0, 1] | Score at-or-above triggers a quarantine in enforce mode (message held for review); informational in observe/nudge. |
| block | [0, 1] | Score at-or-above triggers a hard block in enforce mode. |
warn ≤ quarantine ≤ block. The validator rejects any out-of-order combination at write time.
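A minimal sketch of the range and ordering checks, plus the at-or-above band lookup from the table. The names `validateThresholds` and `bandFor` and the error strings are hypothetical; the real validator's output format is not described here.

```typescript
interface Thresholds {
  warn: number;
  quarantine: number;
  block: number;
}

type Band = "none" | "warn" | "quarantine" | "block";

// Hypothetical validator: returns a list of problems; an empty list means valid.
function validateThresholds(t: Thresholds): string[] {
  const errors: string[] = [];
  for (const key of ["warn", "quarantine", "block"] as const) {
    const value = t[key];
    if (!Number.isFinite(value) || value < 0 || value > 1) {
      errors.push(`${key} must be a float in [0, 1]`);
    }
  }
  if (t.warn > t.quarantine) errors.push("warn must be <= quarantine");
  if (t.quarantine > t.block) errors.push("quarantine must be <= block");
  return errors;
}

// Highest band whose threshold the score meets ("at-or-above" semantics).
function bandFor(score: number, t: Thresholds): Band {
  if (score >= t.block) return "block";
  if (score >= t.quarantine) return "quarantine";
  if (score >= t.warn) return "warn";
  return "none";
}

const sample: Thresholds = { warn: 0.4, quarantine: 0.7, block: 0.9 };
console.log(JSON.stringify(validateThresholds(sample))); // []
console.log(bandFor(0.7, sample)); // prints: quarantine
```

What happens for a given band still depends on the mode: per the table, a quarantine-band score holds the message only in enforce mode.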
Composition: min across scopes. The lowest threshold wins, since lower = stricter (matches sooner). An agent cannot loosen a stricter platform/org threshold; it can only tighten further.
Three bands map cleanly onto the SOC severity ladder familiar to most operators. Per-detector tuning is an internal calibration concern and is not exposed in the schema.
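The min-across-scopes rule can be illustrated with a short sketch. `composeThresholds` is a hypothetical name, and the sketch assumes every scope supplies all three fields; how unset fields are defaulted is not covered here.

```typescript
interface Thresholds {
  warn: number;
  quarantine: number;
  block: number;
}

// Hypothetical composer: takes the lowest (strictest) value per field.
// Assumption: each scope passed in provides all three thresholds.
function composeThresholds(first: Thresholds, ...rest: Thresholds[]): Thresholds {
  const all = [first, ...rest];
  return {
    warn: Math.min(...all.map((t) => t.warn)),
    quarantine: Math.min(...all.map((t) => t.quarantine)),
    block: Math.min(...all.map((t) => t.block)),
  };
}

const platformThresholds: Thresholds = { warn: 0.5, quarantine: 0.8, block: 0.95 };
const agentThresholds: Thresholds = { warn: 0.3, quarantine: 0.9, block: 0.99 };

// The agent tightens warn (0.3 wins) but cannot loosen quarantine or block.
console.log(JSON.stringify(composeThresholds(platformThresholds, agentThresholds)));
// {"warn":0.3,"quarantine":0.8,"block":0.95}
```

If every input satisfies warn ≤ quarantine ≤ block, the field-wise minimum satisfies it too, so composition cannot produce an out-of-order card.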
§screen_surfaces
Which request surfaces Safe House inspects, named by direction (incoming/outgoing) and tool relationship.
| Field | Default | Meaning |
|---|---|---|
| incoming | true | Inbound prompts: user messages, webhook triggers, queue messages, API calls; anything entering the agent. |
| outgoing | true | The agent’s generated response leaving the agent. |
| tool_calls | true | Arguments the agent sends to tool invocations (outbound tool side). |
| tool_responses | true | Return values from tool calls reaching the agent (inbound tool side). |
If any scope sets a surface to true, it’s scanned. Agents cannot disable scanning that org or platform requires. Phrased in alignment-card vocabulary: strictest wins (with true = scan being the more restrictive choice).
Direction-based naming is durable across transport changes: an agent receiving a webhook trigger is “incoming” whether it’s a user message, an API event, or a queue payload. Differentiating tool_calls from tool_responses reflects that they have different threat models — outgoing tool args may exfiltrate; incoming tool responses may inject.
Turning off a surface emits a low-priority audit trace so reviewers can see what was not scanned. If you need to disable a surface for a specific agent, the recommended path is an exemption with a documented reason rather than a raw false in the agent card.
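As a rough sketch of surface composition: a surface is scanned when any scope sets it to true. `composeSurfaces` is a hypothetical helper. Treating an unset field as "no opinion" (and applying the default of true only when no scope sets the field at all) is an assumption; the composer may instead resolve defaults per scope.

```typescript
const SURFACES = ["incoming", "outgoing", "tool_calls", "tool_responses"] as const;
type Surface = (typeof SURFACES)[number];
type ScreenSurfaces = Record<Surface, boolean>;

// Hypothetical composer: logical OR over the scopes that set a surface explicitly.
// Assumption: a surface nobody sets falls back to the documented default, true.
function composeSurfaces(...scopes: Array<Partial<ScreenSurfaces>>): ScreenSurfaces {
  const composed = {} as ScreenSurfaces;
  for (const surface of SURFACES) {
    const explicit = scopes
      .map((scope) => scope[surface])
      .filter((value): value is boolean => value !== undefined);
    composed[surface] = explicit.length === 0 ? true : explicit.some((value) => value);
  }
  return composed;
}

// The platform requires tool_responses scanning; the agent tries to turn it off.
const surfaces = composeSurfaces(
  { tool_responses: true },                  // platform
  {},                                        // org: no opinion
  { tool_responses: false, outgoing: false } // agent
);
console.log(JSON.stringify(surfaces));
// {"incoming":true,"outgoing":false,"tool_calls":true,"tool_responses":true}
```

In this sketch the agent's `outgoing: false` takes effect only because no wider scope asked for that surface; `tool_responses` stays on because the platform requires it.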
§trusted_sources
Per-bucket allowlist of upstream sources whose content Safe House skips detection for. The buckets are typed so the validator can apply per-bucket deny-lists and the composer can apply per-bucket intersection rules.
| Field | Type | Validation |
|---|---|---|
| domains | string[] | DNS name (or host:port); deny-listed against public LLM endpoints (api.openai.com, api.anthropic.com, etc.) and public DNS-over-HTTPS providers. |
| agent_ids | string[] | Mnemom agent IDs (mnm-* format). No wildcards. |
| ip_ranges | string[] (CIDR) | IPv4 or IPv6 CIDR; deny-listed against 0.0.0.0/0, ::/0, and public DNS resolver ranges (8.8.8.0/24, 1.1.1.0/24, 9.9.9.0/24). |
- Platform → agent: intersection. The platform list is the compliance ceiling: downstream scopes (org, agent) cannot widen trust beyond what the platform allows. If the platform sets ip_ranges: [10.0.0.0/8], an agent cannot add 192.168.0.0/16 to its own list and have it take effect.
- Org + agent: union within the ceiling. Either scope can add trust within the platform-imposed ceiling.
- Empty platform list = unconstrained ceiling. When the platform doesn’t specify a bucket, downstream entries pass through without intersection.
Each skip emits an sh_trusted_source_skip audit trace so reviewers can see what was waved through.
Security note: the validator’s deny-list is non-exhaustive — adding a publicly-routable IP range or a customer-controllable domain is a critical misconfiguration even if it passes the deny-list. Treat trusted_sources as a sharp tool.
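The ceiling-and-union rules for the ip_ranges bucket can be sketched for IPv4 as below. Everything here is illustrative: `parseCidr`, `contains`, and `composeIpRanges` are hypothetical names, IPv6 is omitted, input validation is minimal, and modelling "intersection" as "keep a downstream range only if it lies entirely inside some platform range" is an assumption about the composer.

```typescript
interface Cidr {
  base: number;   // IPv4 address as an unsigned 32-bit integer
  prefix: number; // prefix length, 0..32
}

// Parse an IPv4 CIDR such as "10.0.0.0/8". IPv6 is out of scope for this sketch.
function parseCidr(text: string): Cidr {
  const [addr, prefixText] = text.split("/");
  const octets = addr.split(".").map(Number);
  const prefix = Number(prefixText);
  if (octets.length !== 4 || octets.some((o) => !Number.isInteger(o) || o < 0 || o > 255)) {
    throw new Error(`invalid IPv4 address in ${text}`);
  }
  if (!Number.isInteger(prefix) || prefix < 0 || prefix > 32) {
    throw new Error(`invalid prefix in ${text}`);
  }
  return { base: octets.reduce((acc, o) => acc * 256 + o, 0), prefix };
}

// True when every address in `inner` is also in `outer`.
function contains(outer: Cidr, inner: Cidr): boolean {
  if (inner.prefix < outer.prefix) return false; // inner is wider than outer
  const blockSize = 2 ** (32 - outer.prefix);    // addresses covered by outer
  return Math.floor(inner.base / blockSize) === Math.floor(outer.base / blockSize);
}

// Union of org and agent entries, filtered by the platform ceiling.
// An empty platform list means the ceiling is unconstrained.
function composeIpRanges(platform: string[], org: string[], agent: string[]): string[] {
  const requested = Array.from(new Set([...org, ...agent]));
  if (platform.length === 0) return requested;
  const ceiling = platform.map(parseCidr);
  return requested.filter((range) => {
    const candidate = parseCidr(range);
    return ceiling.some((allowed) => contains(allowed, candidate));
  });
}

console.log(JSON.stringify(
  composeIpRanges(["10.0.0.0/8"], ["10.1.0.0/16"], ["192.168.0.0/16", "10.2.3.0/24"])
));
// ["10.1.0.0/16","10.2.3.0/24"]
```

A downstream range wider than the platform range (for example 10.0.0.0/7 against a 10.0.0.0/8 ceiling) is dropped by this sketch rather than clipped to the overlap; a true set intersection would behave differently.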
§extensions
Free-form Record<string, unknown> for protocol-specific or user-defined additions. Mnemom reserves mnemom.*.
§_composition (canonical-only)
Present on the canonical protection card produced by the composer; absent on raw agent-scope cards written by PUT /v1/agents/:id/protection-card.
_composition is read-only on the wire.
YAML safe schema
All yaml.load() calls use { schema: yaml.CORE_SCHEMA }, so Node-specific tags are rejected. Plain scalars, maps, and sequences only.
Body-size limits
- Full protection card payload: 64 KB max (enforced via Content-Length + body-length double-check).
- thresholds, screen_surfaces, and trusted_sources are bounded by the 64 KB envelope.
- Oversize bodies are rejected with 413 Payload Too Large.
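A minimal sketch of the Content-Length plus body-length double-check. `checkBodySize` and `MAX_CARD_BYTES` are hypothetical names, and reading "64 KB" as 65,536 bytes is an assumption. `TextEncoder` is a standard global in Node and browsers.

```typescript
const MAX_CARD_BYTES = 64 * 1024; // assumption: "64 KB" means 65,536 bytes

interface SizeCheck {
  ok: boolean;
  status: number; // 413 when rejected, 200 otherwise (illustrative only)
}

// Hypothetical double-check: reject on the declared Content-Length first,
// then on the measured body length, since the header can be missing or wrong.
function checkBodySize(contentLength: string | undefined, body: string): SizeCheck {
  const declared = contentLength === undefined ? NaN : Number(contentLength);
  if (Number.isFinite(declared) && declared > MAX_CARD_BYTES) {
    return { ok: false, status: 413 };
  }
  const actual = new TextEncoder().encode(body).length; // UTF-8 byte length
  if (actual > MAX_CARD_BYTES) {
    return { ok: false, status: 413 };
  }
  return { ok: true, status: 200 };
}

// The header understates the size; the measured length still triggers a 413.
console.log(checkBodySize("10", "x".repeat(70000)).status); // prints: 413
```

Checking the header first lets a gateway refuse an oversize upload before reading it, while the second check covers clients that send an inaccurate header.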
Versioning
card_version currently:
- protection/2026-04-26: current. ADR-037 canonical form. All canonical cards emit this version.

protection/2026-04-15 cards stored before ADR-037 were transformed in place by migration 140; the schema version was rolled forward at the same time.
See also
- Protection Card — conceptual overview
- Safe House — the detection pipeline this card configures
- Alignment Card Schema — companion spec
- Card Composition — three-scope composition rules + exemptions