Safe House gateway integration
This page explains how Safe House integrates with the Mnemom gateway at a technical level. If you are new to Safe House, start with the concept overview.
Request pipeline
Safe House runs as Phase 0.5, after agent identification has resolved the agent config and policy but before quota enforcement or message forwarding. This placement is intentional: the gateway already knows which agent is handling the request (so the Safe House config can be loaded), but no downstream resources have been consumed yet.
Phase-by-phase breakdown
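As an overview of the ordering described above, the following is a minimal sketch of a short-circuiting phase runner. The `Phase`, `PhaseResult`, `runPipeline`, and `buildPhases` names are hypothetical, and the default `200` status is a placeholder. Only the phase order and the idea that a rejection at Phase 0.5 stops later phases come from this page.

```typescript
// Hypothetical phase runner: names and shapes are illustrative, not the gateway's API.
type PhaseResult = { kind: "continue" } | { kind: "stop"; status: number };

interface Phase {
  id: string; // "0", "0.5", "1", ...
  name: string;
  run: () => PhaseResult;
}

// Runs phases in order and stops at the first phase that returns "stop".
// Returns the ids of the phases that executed plus the final status.
function runPipeline(phases: Phase[]): { executed: string[]; status: number } {
  const executed: string[] = [];
  for (const phase of phases) {
    executed.push(phase.id);
    const result = phase.run();
    if (result.kind === "stop") {
      return { executed, status: result.status };
    }
  }
  return { executed, status: 200 }; // placeholder success status (assumption)
}

const proceed = (): PhaseResult => ({ kind: "continue" });

// Builds the documented ordering. The Safe House step is injected so the
// example can simulate different screening outcomes.
function buildPhases(safeHouse: () => PhaseResult): Phase[] {
  return [
    { id: "0", name: "Agent identification", run: proceed },
    { id: "0.5", name: "Safe House screening", run: safeHouse },
    { id: "1", name: "Quota enforcement", run: proceed },
    { id: "2", name: "Policy evaluation", run: proceed },
    { id: "3", name: "Forward to AI provider", run: proceed },
    { id: "4", name: "AIP conscience analysis", run: proceed },
  ];
}

// A rejection at Phase 0.5 means quota enforcement (Phase 1) never runs.
const blocked = runPipeline(buildPhases(() => ({ kind: "stop", status: 403 })));
const passed = runPipeline(buildPhases(proceed));
```

Here `blocked.executed` contains only Phase 0 and Phase 0.5, which is the property the placement is designed to guarantee.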
Phase 0 — Agent identification
The gateway resolves the Authorization header to an agent record and loads the agent’s alignment card, Safe House configuration, and policy bundle from KV. If Safe House mode is disabled (the default), Phase 0.5 is skipped entirely with no performance cost.
Phase 0.5 — Safe House screening
Behavior depends on the configured mode:
- enforce
- observe
- simulate
In enforce mode, Safe House runs synchronously. The gateway awaits the full L1→L2→L3 verdict before deciding whether to continue:
- pass or warn: the pipeline continues to Phase 1
- quarantine: the request is stored in the quarantine queue with the full message payload; the gateway returns HTTP 400 with a response body
- block: the request is dropped; the gateway returns HTTP 403 with a response body
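The verdict handling above can be sketched as a small decision function. This is an illustration under stated assumptions, not the gateway's implementation: `EnforceVerdict`, `GatewayDecision`, and `decideEnforce` are hypothetical names, and using the verdict string as the value of the X-Safe-House-Verdict header is an assumption. Response bodies are left out because their payloads are not specified here.

```typescript
type EnforceVerdict = "pass" | "warn" | "quarantine" | "block";

interface GatewayDecision {
  continueToPhase1: boolean;
  status?: number; // set only when the request is rejected
  headers: Record<string, string>;
}

// Maps an enforce-mode verdict to the gateway's next step.
// quarantineId is only used for "quarantine" verdicts.
function decideEnforce(verdict: EnforceVerdict, quarantineId?: string): GatewayDecision {
  // Assumption: the header value is the verdict string itself.
  const headers: Record<string, string> = { "X-Safe-House-Verdict": verdict };

  if (verdict === "pass" || verdict === "warn") {
    return { continueToPhase1: true, headers };
  }
  if (verdict === "quarantine") {
    if (quarantineId !== undefined) {
      headers["X-Safe-House-Quarantine-Id"] = quarantineId;
    }
    return { continueToPhase1: false, status: 400, headers };
  }
  // Remaining case: "block".
  return { continueToPhase1: false, status: 403, headers };
}
```

Keeping the mapping in one pure function makes it easy to unit test each verdict without a running gateway.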
The X-Safe-House-Verdict header is present on all responses in enforce mode. For quarantine verdicts, X-Safe-House-Quarantine-Id contains the entry ID.
Phase 1 — Quota enforcement
Standard token and request quota checks run here. In enforce mode, this phase is only reached by messages that passed Safe House screening, so quota is not consumed by quarantined or blocked attempts.
Phase 2 — Policy evaluation
The three-layer policy merge (org → agent → transaction) runs here. Safe House operates independently of the policy engine: a message can pass Safe House and fail policy, or vice versa.
Phase 3 — Forward to AI provider
The message is forwarded to the configured AI provider (Anthropic, OpenAI, Gemini, etc.). In enforce mode, only messages with a Safe House pass or warn verdict reach this stage.
Phase 4 — AIP conscience analysis
The streaming response is teed through the AIP analysis pipeline. If Safe House scored the inbound message with a non-zero threat score (even if the verdict was pass), that score and the specific threat categories detected are injected into the conscience prompt. This gives AIP additional signal to scrutinize the resulting reasoning.
The enrichment looks like:
KV caching
Safe House configuration and session state are cached in the BILLING_CACHE KV binding (the same binding used for quota state). Cache TTLs:
| Item | TTL |
|---|---|
| Safe House agent config | 5 minutes |
| Session risk score | Duration of session (30-min idle expiry) |
| Canary patterns | 5 minutes |
| Quarantine metadata | 72 hours |
The BILLING_CACHE binding must be present in wrangler.toml for Safe House to operate. If the binding is absent and SAFE_HOUSE_ENABLED=true, the gateway logs a warning and skips Safe House analysis for all requests.
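The TTLs in the table above can be expressed as constants and applied on write. The sketch below assumes a KV store whose `put` accepts an `expirationTtl` option in seconds, as Workers KV does; `KvLike` is a local stand-in for that interface. The key layout and the `cachePut` helper are hypothetical, and modelling the 30-minute idle expiry by rewriting the session entry on each update is an assumption rather than documented behavior.

```typescript
// Minimal stand-in for the part of the KV API used in this sketch.
interface KvLike {
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// TTLs from the cache table, in seconds.
const SAFE_HOUSE_TTL_SECONDS = {
  agentConfig: 5 * 60,
  sessionRisk: 30 * 60, // idle expiry: assumed to be refreshed on every write
  canaryPatterns: 5 * 60,
  quarantineMetadata: 72 * 60 * 60,
} as const;

type CacheItem = keyof typeof SAFE_HOUSE_TTL_SECONDS;

// Writes a JSON value with the TTL for its item type and returns the key used.
// The "safe-house:<item>:<id>" key layout is hypothetical.
async function cachePut(kv: KvLike, item: CacheItem, id: string, value: unknown): Promise<string> {
  const key = `safe-house:${item}:${id}`;
  await kv.put(key, JSON.stringify(value), {
    expirationTtl: SAFE_HOUSE_TTL_SECONDS[item],
  });
  return key;
}
```

Centralizing the TTLs in one object keeps the cache behavior aligned with the table if a value changes.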
Pre-emptive nudge injection
When a message passes Safe House screening but its L2 score is ≥ 0.6, Safe House writes a record to the enforcement_nudges table. On the agent’s next request, the gateway’s standard nudge injection logic picks this up and prepends a notice to the system prompt:
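The threshold rule that triggers a nudge can be sketched as follows. This covers only the decision to write a record and does not reproduce the notice wording. The `NudgeRecord` fields and the `buildNudge` helper are hypothetical, since the enforcement_nudges schema is not described here, and treating both pass and warn as "passing" is an assumption based on the enforce-mode behavior.

```typescript
type ScreeningVerdict = "pass" | "warn" | "quarantine" | "block";

const NUDGE_L2_THRESHOLD = 0.6;

// Hypothetical shape of a row in enforcement_nudges.
interface NudgeRecord {
  agentId: string;
  l2Score: number;
  createdAt: string; // ISO 8601 timestamp
}

// Returns a record to persist when the message passed screening with an
// L2 score at or above the threshold, otherwise null.
function buildNudge(
  agentId: string,
  verdict: ScreeningVerdict,
  l2Score: number,
  now: Date,
): NudgeRecord | null {
  const passed = verdict === "pass" || verdict === "warn"; // assumption
  if (!passed || l2Score < NUDGE_L2_THRESHOLD) {
    return null;
  }
  return { agentId, l2Score, createdAt: now.toISOString() };
}
```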
Attestation
Safe House scores are included in the attestation inputCommitment hash. When proof generation runs for a checkpoint, the hash commits to:
- The message content
- The Safe House L1 score, L2 score, and verdict
- The session risk level at time of message
- Whether any canary patterns were matched
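To illustrate what committing to these fields means, the sketch below serializes them in a fixed order and passes the result to a caller-supplied hash function. The hash algorithm and the serialization format used for inputCommitment are not specified on this page, so the field names, the `commitmentPreimage` layout, and the pluggable `hash` parameter are all assumptions.

```typescript
// Hypothetical field names for the values listed above.
interface CommitmentInput {
  messageContent: string;
  l1Score: number;
  l2Score: number;
  verdict: string;
  sessionRiskLevel: string;
  canaryMatched: boolean;
}

// Serializes the fields in a fixed order so that equal inputs always
// produce the same string, regardless of object key order.
function commitmentPreimage(input: CommitmentInput): string {
  return JSON.stringify([
    input.messageContent,
    input.l1Score,
    input.l2Score,
    input.verdict,
    input.sessionRiskLevel,
    input.canaryMatched,
  ]);
}

// The hash function is injected because the real algorithm is not documented here.
function inputCommitment(input: CommitmentInput, hash: (preimage: string) => string): string {
  return hash(commitmentPreimage(input));
}
```

Because every listed field is part of the preimage, changing a score or the verdict after the fact would change the resulting commitment.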
Sovereign agent template
For agents operating in high-trust, high-risk environments (financial automation, infrastructure management, regulated data handling), the sovereign agent template provides a hardened Safe House configuration as a starting point:
Environment requirements
Safe House requires the following to be present in the gateway environment:
| Requirement | Purpose |
|---|---|
| SAFE_HOUSE_ENABLED=true | Activates the Safe House code path (feature flag) |
| BILLING_CACHE KV binding | Session state and config caching |
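A startup or per-request check of these two requirements might look like the sketch below. `GatewayEnv`, `SafeHouseStatus`, and `safeHouseStatus` are hypothetical names, and treating an unset flag as disabled is an assumption. The warn-and-skip branch mirrors the fallback described in the KV caching section.

```typescript
// Hypothetical view of the gateway environment: only the two documented entries.
interface GatewayEnv {
  SAFE_HOUSE_ENABLED?: string;
  BILLING_CACHE?: unknown;
}

type SafeHouseStatus = "disabled" | "active" | "skipped-missing-binding";

// Decides whether Safe House analysis should run for this environment.
// The warn callback defaults to console.warn and can be replaced in tests.
function safeHouseStatus(
  env: GatewayEnv,
  warn: (message: string) => void = console.warn,
): SafeHouseStatus {
  if (env.SAFE_HOUSE_ENABLED !== "true") {
    return "disabled"; // assumption: any value other than "true" leaves the flag off
  }
  if (env.BILLING_CACHE === undefined || env.BILLING_CACHE === null) {
    warn("SAFE_HOUSE_ENABLED=true but BILLING_CACHE binding is missing; skipping Safe House analysis");
    return "skipped-missing-binding";
  }
  return "active";
}
```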
See also
- Safe House concept — What Safe House is and how the threat categories work
- Safe House Quickstart — Enable and test Safe House in 5 minutes
- Enforcement Modes — How the gateway handles downstream violations
- Policy Engine — The parallel policy evaluation system