CFD Monitoring
Once CFD is running, the Security Observatory gives you a live view of the threat landscape across your agents. This guide covers what each panel shows, which metrics matter, and how to respond to the alerts you will encounter.
The Security Observatory
The Security Observatory is the CFD section of the Mnemom dashboard at app.mnemom.com/cfd. It has five main panels.
Threat Feed
A real-time stream of evaluation events across all agents in your org, newest first. Each row shows:
- Agent ID and session ID — which agent, which conversation
- Verdict —
warn, quarantine, or block (color-coded amber/orange/red)
- Top threat type — the highest-confidence threat category
- Confidence score — L2 confidence percentage
- Detection layer — whether the verdict came from L1 (pattern match), L2 (LLM analysis), or L3 (session model)
- Time — seconds or minutes ago
Click any row to open the full evaluation detail: the turn content, all threat signals, the full L2 reasoning, and the quarantine management actions.
What to look for: A sudden spike in prompt_injection or indirect_injection events on a specific agent often means its data sources (search results, email, API responses) have been poisoned. Investigate the source, not just the individual turn.
Metrics Overview
Four KPI cards with 7-day and 30-day comparisons:
- Block Rate —
block_count / total_evaluations — healthy baseline is below 0.5% for most agents. A spike above 2% warrants immediate investigation.
- Warn Rate —
warn_count / total_evaluations — 1–5% is typical. Sustained elevation above 10% suggests either a real threat campaign or thresholds that need calibration.
- Quarantine Queue Depth — count of items awaiting human review. If this grows faster than your review capacity, consider adjusting thresholds or enabling automatic release for low-confidence quarantines.
- False Positive Rate — computed over resolved quarantine items. Target below 15%. Above 20% means thresholds need adjustment for the dominant false-positive threat type.
Session Risk Panel
Lists all active sessions with elevated risk (medium or high). A session enters medium risk when any single turn scores above the warn threshold. It escalates to high when:
- Two or more turns in the session have scored above
warn, or
- A
hijack_attempt pattern has been detected (topic/scope pivot after benign history), or
- A
session.escalated event has fired
What to watch: A session that reaches high but has not been blocked means CFD is in quarantine or warn enforcement mode for that agent. The session is still live. Check whether human review is proceeding fast enough — the session may be waiting on you.
Campaign Tracker
Displays detected cross-agent attack campaigns — groups of agents that have received structurally similar attacks from what appears to be the same infrastructure within a rolling time window.
Campaign detection fires when:
- 3 or more agents in the same org receive turns that score ≥ 0.85 on the same threat type, and
- MinHash similarity between those turns is ≥ 0.88
What it means: A campaign detection is strong evidence of a coordinated attack rather than a coincidental spike. The attacker is probing multiple agents simultaneously, possibly to find the one with the weakest enforcement configuration. Respond by checking whether any of the targeted agents have higher thresholds than the others.
Pattern Activity
Shows recently promoted and demoted patterns in the threat library, and the current backlog of candidate patterns awaiting evaluation.
Pattern states:
candidate — submitted by user or generated by the arena, not yet evaluated
active — passed precision/recall thresholds and is in use
deprecated — retired due to excessive false positive rate or superseded by a better pattern
Auto-promotion conditions: A candidate pattern is promoted to active when it achieves precision ≥ 0.90 and recall ≥ 0.85 on the evaluation set of 200+ labeled messages. Arena evaluations run continuously; most candidates are evaluated within 24 hours.
Key Metrics to Watch
Block Rate by Agent
curl "https://api.mnemom.ai/v1/cfd/metrics/timeseries?metric=block_rate&bucket=hour&agent_id=smolt-18a228f3" \
-H "Authorization: Bearer $TOKEN"
A block rate above 2% on a single agent almost always means one of three things:
- The agent is under active attack
- A data source the agent reads has been compromised
- The threshold is too low for this agent’s legitimate workload
Compare the threat type breakdown to distinguish attack from false positive.
Warn Rate by Confidence Band
Break down warn events by confidence band to understand whether your threshold is set correctly:
| Band | Confidence | Interpretation |
|---|
| Low | 0.40–0.55 | High false positive probability — consider raising warn threshold |
| Medium | 0.55–0.70 | Mixed signal — review representative samples |
| High | 0.70–0.85 | Strong signal — these are approaching block territory |
| Very High | 0.85+ | Near-certain threat — if these are in warn mode, consider raising to block |
False Positive Rate by Confidence Band
This is the most useful metric for threshold calibration. If your false positive rate is high in the 0.55–0.70 band but low in the 0.70–0.85 band, you should raise your warn threshold to 0.70 and leave your block threshold where it is.
Session Threat Escalation
The X-CFD-Session-Risk response header on every gateway response reflects the current session risk level for that session. Your application code can read this header and take action:
const response = await fetch(SMOLTBOT_GATEWAY_URL, {
method: 'POST',
headers: { /* ... */ },
body: JSON.stringify(requestBody),
});
const sessionRisk = response.headers.get('X-CFD-Session-Risk');
// 'low' | 'medium' | 'high'
if (sessionRisk === 'high') {
// Pause the automated workflow and route to human review
await workflowQueue.escalate(sessionId, { reason: 'CFD session risk elevated' });
}
Risk level progression:
| Level | Meaning | Recommended action |
|---|
low | No threats detected, clean session history | Continue normally |
medium | One or more warn-level events in session | Monitor; consider slowing automated action rate |
high | Block-level event, hijack pattern, or repeated warn events | Route to human review before further agent actions |
Sessions reset to low at the start of each new session. They do not carry over across sessions.
Campaign Detection
When the campaign tracker shows a new campaign, your response workflow should be:
- Identify the weakest enforcement link — which of the affected agents has the lowest block threshold for the campaign’s threat type? Tighten it first.
- Check for a common data source — if the affected agents all read from the same email inbox, API feed, or shared data store, that source may be compromised.
- Review the similarity evidence — the campaign detail view shows representative turns from each affected agent. The structural similarities are often informative about the attacker’s tooling.
- File a pattern candidate — if the campaign’s payload structure is novel, submit it as a candidate pattern so it enters the arena evaluation pipeline.
Query all active campaigns:
curl "https://api.mnemom.ai/v1/cfd/campaigns?status=active" \
-H "Authorization: Bearer $TOKEN"
Adaptive Threshold Suggestions
The calibration engine runs nightly and computes threshold suggestions for any agent whose false positive or miss rate deviates from targets. Retrieve them:
curl https://api.mnemom.ai/v1/cfd/threshold-suggestions \
-H "Authorization: Bearer $TOKEN"
How to interpret each suggestion:
rationale — the specific metric that triggered the suggestion (false positive rate, miss rate, or both)
confidence field on the suggestion — high means the pattern is consistent over 30 days; low means there is not yet enough data to be certain
scope — agent means the suggestion applies only to a specific agent; org means it applies org-wide
Apply a suggestion:
curl -X PUT https://api.mnemom.ai/v1/agents/smolt-18a228f3/cfd/config \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"thresholds": {
"bec_fraud": { "warn": 0.55, "block": 0.88 }
}
}'
When the calibration engine produces suggestions for multiple threat types on the same agent, apply them one at a time over successive days rather than all at once. This lets you observe the effect of each change in isolation.
Querying Evaluation History
The evaluation log supports rich filtering for forensic investigation and compliance review.
All blocks in the past 7 days:
curl "https://api.mnemom.ai/v1/cfd/evaluations?verdict=block&from=2026-03-23T00:00:00Z&limit=100" \
-H "Authorization: Bearer $TOKEN"
All BEC fraud signals above 0.7 confidence for a specific agent:
curl "https://api.mnemom.ai/v1/cfd/evaluations?agent_id=smolt-18a228f3&threat_type=bec_fraud&min_risk=0.7&limit=50" \
-H "Authorization: Bearer $TOKEN"
Timeseries for a specific threat type:
curl "https://api.mnemom.ai/v1/cfd/metrics/timeseries?threat_type=prompt_injection&bucket=day&from=2026-03-01T00:00:00Z" \
-H "Authorization: Bearer $TOKEN"
Available filter parameters:
| Parameter | Type | Description |
|---|
agent_id | string | Filter to a specific agent |
verdict | warn | quarantine | block | Filter by verdict |
threat_type | string | Filter by threat category |
min_risk | float | Minimum overall risk score (0–1) |
detection_layer | l1 | l2 | l3 | Which detection layer produced the verdict |
from | ISO 8601 | Start of time range |
to | ISO 8601 | End of time range |
limit | integer | Results per page (max 200) |
cursor | string | Pagination cursor from previous response |
The Audit Log
Every configuration change, quarantine decision, and pattern activation is recorded in the CFD audit log. The audit log is append-only and tamper-evident.
What the audit log contains:
| Event Class | Examples |
|---|
| Configuration changes | Threshold update, enforcement mode change, bulk-apply |
| Quarantine decisions | Released (with is_false_positive), confirmed as threat, deleted |
| Pattern events | Candidate submitted, promoted to active, deprecated |
| Canary events | Created, triggered, acknowledged |
| Campaign events | Detected, closed, false-positive dismissed |
Query the audit log:
curl "https://api.mnemom.ai/v1/cfd/audit?from=2026-03-01T00:00:00Z&limit=50" \
-H "Authorization: Bearer $TOKEN"
Audit log entries are included in the EU AI Act compliance export (GET /v1/compliance/cfd-report) and support the 90-day minimum retention required for Article 50 compliance.
Further Reading