Safe House API reference
The Safe House API covers six functional areas: configuration, quarantine management, observability and metrics, pattern and intelligence management, canary credentials, and compliance exports. All endpoints require a Bearer token or API key unless otherwise noted. Base URL:https://api.mnemom.ai
Configuration
Control how Safe House behaves — globally for the org, per-agent, or in bulk.| Method | Endpoint | Description |
|---|---|---|
GET | /v1/safe-house/config | Retrieve org-level Safe House defaults (thresholds, enforcement mode, enabled threat types) |
PUT | /v1/safe-house/config | Update org-level defaults — applies to all agents that don’t have a per-agent override |
GET | /v1/agents/:id/safe-house/config | Retrieve per-agent Safe House config (shows effective config after inheritance from org) |
PUT | /v1/agents/:id/safe-house/config | Update per-agent config — overrides org defaults for the specified fields only |
POST | /v1/safe-house/config/bulk-apply | Apply a config patch to multiple agents at once |
Quarantine management
Quarantined turns are held pending human review. Reviewers can release (with or without a false-positive flag) or confirm as a genuine threat.| Method | Endpoint | Description |
|---|---|---|
GET | /v1/safe-house/quarantine | List quarantined items — filter by status, agent_id, threat_type, date range |
GET | /v1/safe-house/quarantine/:id | Retrieve a single quarantine record with full evaluation detail |
DELETE | /v1/safe-house/quarantine/:id | Delete a quarantine record (admin only; irreversible) |
POST | /v1/safe-house/quarantine/:id/release | Release the quarantined turn to the agent; optionally mark as false positive |
POST | /v1/safe-house/quarantine/:id/report | Confirm the quarantined turn as a genuine threat |
Query & observability
Query the full evaluation history, aggregate metrics, and access a live SSE stream for real-time monitoring.| Method | Endpoint | Description |
|---|---|---|
GET | /v1/safe-house/evaluations | Full evaluation log — filter by agent_id, verdict, threat_type, from, to, min_risk |
GET | /v1/safe-house/metrics/summary | Aggregated counts: total evaluations, block rate, warn rate, false positive rate |
GET | /v1/safe-house/metrics/timeseries | Time-bucketed metrics for charts — specify bucket (hour, day, week) |
GET | /v1/safe-house/metrics/threats | Top threat types by volume and confidence over a time window |
GET | /v1/safe-house/feed | SSE stream of live Safe House events — connect once and receive events as they happen |
GET | /v1/safe-house/sessions | List active sessions with elevated session risk (medium or high) |
safe_house.evaluation.*, safe_house.canary.*, safe_house.session.*, and safe_house.campaign.* events as they occur. Reconnect with Last-Event-ID to replay missed events (replays up to 10 minutes back).
Patterns & intelligence
Manage the threat pattern library and retrieve adaptive threshold recommendations.| Method | Endpoint | Description |
|---|---|---|
GET | /v1/safe-house/patterns | List active and candidate threat patterns — filter by status, threat_type |
POST | /v1/safe-house/patterns | Submit a candidate pattern for review and potential promotion |
GET | /v1/safe-house/threshold-suggestions | Adaptive threshold recommendations based on your false-positive and miss rate |
candidate status. The arena evaluation pipeline tests them against labeled benign and malicious message sets. Patterns that exceed precision/recall thresholds are promoted to active.
Canary credentials
Canary credentials are honeypot API keys, tokens, or other secrets deliberately planted in the agent’s context. If an attacker extracts and uses them, Safe House detects the use and fires asafe_house.canary.triggered event.
| Method | Endpoint | Description |
|---|---|---|
POST | /v1/safe-house/canaries | Create a canary credential and associate it with an agent |
GET | /v1/safe-house/canaries?agent_id= | List canaries for an agent |
GET | /v1/safe-house/canaries/:id/status | Check whether a specific canary has been triggered |
credential value is returned only at creation time. Safe House monitors for its appearance in outbound requests or inbound message content.
Check canary status:
Special endpoints
Sovereign agent setup
One-call configuration for sovereign agents — applies hardened defaults, creates initial canaries, sets enforcement mode to block, and enables all threat types.Cross-Agent campaign detection
List detected attack campaigns — groups of related attacks targeting multiple agents from the same infrastructure.EU AI Act compliance export
Export Safe House evaluation data in EU AI Act Article 50 compliance format.Accept: text/csv for spreadsheet-compatible export.
Error responses
All Safe House endpoints return standard Mnemom error objects:| HTTP Status | Meaning |
|---|---|
400 | Invalid request body or parameters |
401 | Missing or invalid authentication |
403 | Insufficient permissions for the requested operation |
404 | Resource not found |
429 | Rate limit exceeded |
500 | Internal server error |
See also
- Safe House Threat Model — What each threat type means and how detection works
- Safe House Webhooks — React to Safe House events in real-time
- Safe House Monitoring — Security Observatory and alert management
- Policy Overview — Policy enforcement runs alongside Safe House