CFD Monitoring

Once CFD is running, the Security Observatory gives you a live view of the threat landscape across your agents. This guide covers what each panel shows, which metrics matter, and how to respond to the alerts you will encounter.

The Security Observatory

The Security Observatory is the CFD section of the Mnemom dashboard at app.mnemom.com/cfd. It has five main panels.

Threat Feed

A real-time stream of evaluation events across all agents in your org, newest first. Each row shows:

Agent ID and session ID — which agent, which conversation
Verdict — warn, quarantine, or block (color-coded amber/orange/red)
Top threat type — the highest-confidence threat category
Confidence score — L2 confidence percentage
Detection layer — whether the verdict came from L1 (pattern match), L2 (LLM analysis), or L3 (session model)
Time — seconds or minutes ago

Click any row to open the full evaluation detail: the turn content, all threat signals, the full L2 reasoning, and the quarantine management actions.

What to look for: A sudden spike in prompt_injection or indirect_injection events on a specific agent often means its data sources (search results, email, API responses) have been poisoned. Investigate the source, not just the individual turn.

Metrics Overview

Four KPI cards with 7-day and 30-day comparisons:

Block Rate — block_count / total_evaluations — healthy baseline is below 0.5% for most agents. A spike above 2% warrants immediate investigation.
Warn Rate — warn_count / total_evaluations — 1–5% is typical. Sustained elevation above 10% suggests either a real threat campaign or thresholds that need calibration.
Quarantine Queue Depth — count of items awaiting human review. If this grows faster than your review capacity, consider adjusting thresholds or enabling automatic release for low-confidence quarantines.
False Positive Rate — computed over resolved quarantine items. Target below 15%. Above 20% means thresholds need adjustment for the dominant false-positive threat type.
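As a rough illustration of how the block and warn rate guidance above can be checked in code, the sketch below computes both rates from raw counts and flags values outside the ranges quoted in this guide. The function name and the input field names (`blockCount`, `warnCount`, `totalEvaluations`) are assumptions made for this example and are not part of the Mnemom API.

```javascript
// Hypothetical helper: the input field names are illustrative, not an official schema.
function assessRates({ blockCount, warnCount, totalEvaluations }) {
  if (totalEvaluations <= 0) {
    return { blockRate: 0, warnRate: 0, flags: [] };
  }
  const blockRate = blockCount / totalEvaluations;
  const warnRate = warnCount / totalEvaluations;
  const flags = [];

  // Guidance from this guide: baseline below 0.5%, investigate above 2%.
  if (blockRate > 0.02) flags.push('block_rate_spike');
  else if (blockRate >= 0.005) flags.push('block_rate_above_baseline');

  // Guidance from this guide: 1-5% is typical, above 10% needs attention.
  if (warnRate > 0.10) flags.push('warn_rate_elevated');

  return { blockRate, warnRate, flags };
}

const result = assessRates({ blockCount: 30, warnCount: 40, totalEvaluations: 1000 });
console.log(result.flags); // [ 'block_rate_spike' ]
```

The sketch evaluates a single window. The guide's warn rate guidance refers to sustained elevation, so in practice you would apply a check like this over several consecutive windows before acting.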

Session Risk Panel

Lists all active sessions with elevated risk (medium or high). A session enters medium risk when any single turn scores above the warn threshold. It escalates to high when:

Two or more turns in the session have scored above warn, or
A hijack_attempt pattern has been detected (topic/scope pivot after benign history), or
A session.escalated event has fired

What to watch: If a session reaches high but has not been blocked, CFD is in quarantine or warn enforcement mode for that agent, so the session is still live. Check whether human review is keeping pace; the session may be waiting on you.
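The escalation rules for this panel can be sketched as a small function. This is only an illustration of the rules as stated in this guide, not CFD's actual implementation; the function name, the list-of-scores input, and the `hijackDetected` and `escalatedEventFired` flags are assumptions for the example.

```javascript
// Sketch of the escalation rules described above.
// turnScores: risk score of each turn in the session (assumed shape).
function sessionRiskLevel(turnScores, warnThreshold, signals = {}) {
  const { hijackDetected = false, escalatedEventFired = false } = signals;
  const turnsAboveWarn = turnScores.filter((score) => score > warnThreshold).length;

  // High: two or more turns above warn, a hijack pattern, or an escalation event.
  if (turnsAboveWarn >= 2 || hijackDetected || escalatedEventFired) return 'high';
  // Medium: exactly one turn above the warn threshold.
  if (turnsAboveWarn === 1) return 'medium';
  return 'low';
}

console.log(sessionRiskLevel([0.1, 0.6, 0.2], 0.55)); // medium
console.log(sessionRiskLevel([0.6, 0.7], 0.55)); // high
console.log(sessionRiskLevel([0.1], 0.55, { hijackDetected: true })); // high
```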

Campaign Tracker

Displays detected cross-agent attack campaigns — groups of agents that have received structurally similar attacks from what appears to be the same infrastructure within a rolling time window. Campaign detection fires when:

3 or more agents in the same org receive turns that score ≥ 0.85 on the same threat type, and
MinHash similarity between those turns is ≥ 0.88

What it means: A campaign detection is strong evidence of a coordinated attack rather than a coincidental spike. The attacker is probing multiple agents simultaneously, possibly to find the one with the weakest enforcement configuration. Respond by checking whether any of the targeted agents have higher thresholds than the others.
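To make the two detection conditions concrete, here is a minimal sketch. It swaps in exact Jaccard similarity over word 3-shingles for MinHash (MinHash is an estimator of Jaccard similarity), and it compares every candidate turn against one seed turn rather than doing full clustering. The turn shape (`agentId`, `threatType`, `score`, `text`) and all function names are assumptions for illustration; CFD's real pipeline may differ.

```javascript
// Word k-shingles of a turn's text.
function shingles(text, k = 3) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const out = new Set();
  for (let i = 0; i + k <= words.length; i++) {
    out.add(words.slice(i, i + k).join(' '));
  }
  return out;
}

// Exact Jaccard similarity of two sets; MinHash approximates this value.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  for (const item of a) if (b.has(item)) shared++;
  return shared / (a.size + b.size - shared);
}

// turns: [{ agentId, threatType, score, text }] from one rolling window (assumed shape).
// Returns the sorted agent IDs of a detected campaign, or null.
function detectCampaign(turns, threatType, options = {}) {
  const { minScore = 0.85, minSimilarity = 0.88, minAgents = 3 } = options;
  const candidates = turns.filter((t) => t.threatType === threatType && t.score >= minScore);
  for (const seed of candidates) {
    const seedShingles = shingles(seed.text);
    const agents = new Set();
    for (const other of candidates) {
      if (jaccard(seedShingles, shingles(other.text)) >= minSimilarity) {
        agents.add(other.agentId);
      }
    }
    if (agents.size >= minAgents) return [...agents].sort();
  }
  return null;
}
```

Calling `detectCampaign(turns, 'prompt_injection')` on a window where three different agents received near-identical high-scoring turns returns those three agent IDs; with fewer than three matching agents it returns `null`.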

Pattern Activity

Shows recently promoted and demoted patterns in the threat library, and the current backlog of candidate patterns awaiting evaluation. Pattern states:

candidate — submitted by user or generated by the arena, not yet evaluated
active — passed precision/recall thresholds and is in use
deprecated — retired due to excessive false positive rate or superseded by a better pattern

Auto-promotion conditions: A candidate pattern is promoted to active when it achieves precision ≥ 0.90 and recall ≥ 0.85 on an evaluation set of at least 200 labeled messages. Arena evaluations run continuously; most candidates are evaluated within 24 hours.
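The promotion rule can be expressed as a short calculation. The sketch below assumes each labeled message yields a record with two booleans, `matched` (the pattern fired) and `isThreat` (the ground-truth label); those field names and the function name are hypothetical, and the arena's real evaluation logic is not documented here.

```javascript
// results: one entry per labeled message, [{ matched, isThreat }] (assumed shape).
function shouldPromote(results, options = {}) {
  const { minPrecision = 0.90, minRecall = 0.85, minSamples = 200 } = options;
  if (results.length < minSamples) {
    return { promote: false, reason: 'insufficient_samples' };
  }
  let truePositives = 0;
  let falsePositives = 0;
  let falseNegatives = 0;
  for (const r of results) {
    if (r.matched && r.isThreat) truePositives++;
    else if (r.matched && !r.isThreat) falsePositives++;
    else if (!r.matched && r.isThreat) falseNegatives++;
  }
  // Precision: share of pattern matches that were real threats.
  const matchedTotal = truePositives + falsePositives;
  const precision = matchedTotal === 0 ? 0 : truePositives / matchedTotal;
  // Recall: share of real threats that the pattern matched.
  const threatTotal = truePositives + falseNegatives;
  const recall = threatTotal === 0 ? 0 : truePositives / threatTotal;
  return { promote: precision >= minPrecision && recall >= minRecall, precision, recall };
}
```

For example, a pattern with 90 true positives, 5 false positives, and 10 false negatives on a 200-message set has precision 90/95 ≈ 0.947 and recall 90/100 = 0.90, so it would be promoted under these thresholds.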

Key Metrics to Watch

Block Rate by Agent

curl "https://api.mnemom.ai/v1/cfd/metrics/timeseries?metric=block_rate&bucket=hour&agent_id=smolt-18a228f3" \
  -H "Authorization: Bearer $TOKEN"

A block rate above 2% on a single agent almost always means one of three things:

The agent is under active attack
A data source the agent reads has been compromised
The threshold is too low for this agent’s legitimate workload

Compare the threat type breakdown to distinguish attack from false positive.

Warn Rate by Confidence Band

Break down warn events by confidence band to understand whether your threshold is set correctly:

| Band | Confidence | Interpretation |
| --- | --- | --- |
| Low | 0.40–0.55 | High false positive probability; consider raising the warn threshold |
| Medium | 0.55–0.70 | Mixed signal; review representative samples |
| High | 0.70–0.85 | Strong signal; these are approaching block territory |
| Very High | 0.85+ | Near-certain threat; if these are in `warn` mode, consider raising to `block` |

False Positive Rate by Confidence Band

This is the most useful metric for threshold calibration. If your false positive rate is high in the 0.55–0.70 band but low in the 0.70–0.85 band, you should raise your warn threshold to 0.70 and leave your block threshold where it is.
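A minimal sketch of this per-band breakdown follows. It assumes resolved quarantine items are available as objects with a `confidence` score and an `isFalsePositive` flag (both names are hypothetical), and it treats each band's lower bound as inclusive and its upper bound as exclusive, which is an assumption because the bands in the table share their endpoints.

```javascript
const BANDS = [
  { name: 'low', min: 0.40, max: 0.55 },
  { name: 'medium', min: 0.55, max: 0.70 },
  { name: 'high', min: 0.70, max: 0.85 },
  { name: 'very_high', min: 0.85, max: Infinity },
];

// items: resolved quarantine items, [{ confidence, isFalsePositive }] (assumed shape).
function falsePositiveRateByBand(items) {
  const stats = {};
  for (const band of BANDS) {
    stats[band.name] = { total: 0, falsePositives: 0, rate: null };
  }
  for (const item of items) {
    const band = BANDS.find((b) => item.confidence >= b.min && item.confidence < b.max);
    if (!band) continue; // below 0.40: outside the bands listed in the table
    stats[band.name].total++;
    if (item.isFalsePositive) stats[band.name].falsePositives++;
  }
  // rate stays null for bands with no resolved items.
  for (const s of Object.values(stats)) {
    if (s.total > 0) s.rate = s.falsePositives / s.total;
  }
  return stats;
}
```

If the `medium` band comes back with a high rate and the `high` band with a low one, that is the situation described above where raising the warn threshold to 0.70 is the natural adjustment.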

Session Threat Escalation

Every gateway response carries an X-CFD-Session-Risk header that reflects the current risk level of the session it belongs to. Your application code can read this header and take action:

// SMOLTBOT_GATEWAY_URL, requestBody, sessionId, and workflowQueue are
// defined by your application; they are not provided by CFD.
const response = await fetch(SMOLTBOT_GATEWAY_URL, {
  method: 'POST',
  headers: { /* ... */ },
  body: JSON.stringify(requestBody),
});

// Header value is one of 'low' | 'medium' | 'high'
const sessionRisk = response.headers.get('X-CFD-Session-Risk');

if (sessionRisk === 'high') {
  // Pause the automated workflow and route to human review
  await workflowQueue.escalate(sessionId, { reason: 'CFD session risk elevated' });
}

Risk level progression:

| Level | Meaning | Recommended action |
| --- | --- | --- |
| `low` | No threats detected, clean session history | Continue normally |
| `medium` | A single warn-level event in the session | Monitor; consider slowing automated action rate |
| `high` | Block-level event, hijack pattern, or repeated warn events | Route to human review before further agent actions |

The risk level resets to low at the start of each new session; it does not carry over from previous sessions.

Campaign Detection

When the campaign tracker shows a new campaign, your response workflow should be:

Identify the weakest enforcement link — which of the affected agents has the highest (most permissive) block threshold for the campaign’s threat type? Tighten it first.
Check for a common data source — if the affected agents all read from the same email inbox, API feed, or shared data store, that source may be compromised.
Review the similarity evidence — the campaign detail view shows representative turns from each affected agent. The structural similarities are often informative about the attacker’s tooling.
File a pattern candidate — if the campaign’s payload structure is novel, submit it as a candidate pattern so it enters the arena evaluation pipeline.

Query all active campaigns:

curl "https://api.mnemom.ai/v1/cfd/campaigns?status=active" \
  -H "Authorization: Bearer $TOKEN"
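Once you have the affected agents, finding the weakest enforcement link is a simple sort. The sketch below assumes a turn is blocked when its score meets or exceeds the block threshold, so the agent with the highest block threshold is the most permissive. The `thresholds` shape mirrors the config body used elsewhere in this guide; the `agentId` field, the function name, and the sample agent IDs are hypothetical.

```javascript
// agents: [{ agentId, thresholds: { [threatType]: { warn, block } } }] (assumed shape).
// Returns agents ordered from weakest enforcement (highest block threshold) to strongest.
function rankByEnforcementWeakness(agents, threatType) {
  return agents
    .filter((a) => a.thresholds && a.thresholds[threatType] !== undefined)
    .map((a) => ({ agentId: a.agentId, block: a.thresholds[threatType].block }))
    .sort((x, y) => y.block - x.block);
}

const ranked = rankByEnforcementWeakness(
  [
    { agentId: 'agent-a', thresholds: { bec_fraud: { warn: 0.55, block: 0.80 } } },
    { agentId: 'agent-b', thresholds: { bec_fraud: { warn: 0.60, block: 0.92 } } },
    { agentId: 'agent-c', thresholds: { bec_fraud: { warn: 0.55, block: 0.88 } } },
  ],
  'bec_fraud'
);
console.log(ranked[0].agentId); // agent-b
```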

Adaptive Threshold Suggestions

The calibration engine runs nightly and computes threshold suggestions for any agent whose false positive or miss rate deviates from targets. Retrieve them:

curl https://api.mnemom.ai/v1/cfd/threshold-suggestions \
  -H "Authorization: Bearer $TOKEN"

How to interpret each suggestion:

rationale — the specific metric that triggered the suggestion (false positive rate, miss rate, or both)
confidence field on the suggestion — high means the pattern is consistent over 30 days; low means there is not yet enough data to be certain
scope — agent means the suggestion applies only to a specific agent; org means it applies org-wide

Apply a suggestion:

curl -X PUT https://api.mnemom.ai/v1/agents/smolt-18a228f3/cfd/config \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "thresholds": {
      "bec_fraud": { "warn": 0.55, "block": 0.88 }
    }
  }'

When the calibration engine produces suggestions for multiple threat types on the same agent, apply them one at a time over successive days rather than all at once. This lets you observe the effect of each change in isolation.
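One way to stage the changes is to turn the list of suggestions into a dated plan, with one request body per day. The sketch below only builds the plan and does not call the API. The suggestion fields (`threatType`, `warn`, `block`) and the function name are assumptions for illustration, since the response schema of the suggestions endpoint is not shown in this guide; the request body follows the `thresholds` shape from the example above.

```javascript
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// suggestions: [{ threatType, warn, block }] for a single agent (assumed shape).
// Returns one PUT body per suggestion, spaced one day apart starting at startDate.
function planRollout(suggestions, startDate) {
  return suggestions.map((s, index) => {
    const applyOn = new Date(startDate.getTime() + index * ONE_DAY_MS);
    return {
      applyOn: applyOn.toISOString().slice(0, 10), // UTC date, YYYY-MM-DD
      body: { thresholds: { [s.threatType]: { warn: s.warn, block: s.block } } },
    };
  });
}

const plan = planRollout(
  [
    { threatType: 'bec_fraud', warn: 0.55, block: 0.88 },
    { threatType: 'prompt_injection', warn: 0.60, block: 0.90 },
  ],
  new Date('2026-03-23T00:00:00Z')
);
console.log(plan.map((step) => step.applyOn)); // [ '2026-03-23', '2026-03-24' ]
```

Each `body` can then be sent with the PUT request shown above on its scheduled day, leaving a full day of metrics between changes.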

Querying Evaluation History

The evaluation log supports rich filtering for forensic investigation and compliance review.

All blocks in the past 7 days (set from to the timestamp seven days before the current date):

curl "https://api.mnemom.ai/v1/cfd/evaluations?verdict=block&from=2026-03-23T00:00:00Z&limit=100" \
  -H "Authorization: Bearer $TOKEN"

All BEC fraud signals above 0.7 confidence for a specific agent:

curl "https://api.mnemom.ai/v1/cfd/evaluations?agent_id=smolt-18a228f3&threat_type=bec_fraud&min_risk=0.7&limit=50" \
  -H "Authorization: Bearer $TOKEN"

Timeseries for a specific threat type:

curl "https://api.mnemom.ai/v1/cfd/metrics/timeseries?threat_type=prompt_injection&bucket=day&from=2026-03-01T00:00:00Z" \
  -H "Authorization: Bearer $TOKEN"

Available filter parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| `agent_id` | string | Filter to a specific agent |
| `verdict` | `warn` \| `quarantine` \| `block` | Filter by verdict |
| `threat_type` | string | Filter by threat category |
| `min_risk` | float | Minimum overall risk score (0–1) |
| `detection_layer` | `l1` \| `l2` \| `l3` | Which detection layer produced the verdict |
| `from` | ISO 8601 | Start of time range |
| `to` | ISO 8601 | End of time range |
| `limit` | integer | Results per page (max 200) |
| `cursor` | string | Pagination cursor from previous response |
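If you build these queries in application code, a small URL builder avoids hand-editing timestamps. The sketch below uses the standard `URLSearchParams` API and the filter parameters from the table above. The `lastDays` option is a convenience invented for this example (it is converted to a `from` timestamp and never sent to the API), and the function name is likewise hypothetical.

```javascript
const EVALUATIONS_URL = 'https://api.mnemom.ai/v1/cfd/evaluations';
const DAY_MS = 24 * 60 * 60 * 1000;

// filters: any parameters from the table, plus the hypothetical lastDays shortcut.
function buildEvaluationsUrl(filters = {}, now = new Date()) {
  const { lastDays, ...rest } = filters;
  const params = new URLSearchParams();
  if (lastDays !== undefined) {
    // Translate "past N days" into the documented from parameter.
    params.set('from', new Date(now.getTime() - lastDays * DAY_MS).toISOString());
  }
  for (const [key, value] of Object.entries(rest)) {
    if (value !== undefined && value !== null) params.set(key, String(value));
  }
  // The table documents a maximum of 200 results per page.
  if (params.has('limit') && Number(params.get('limit')) > 200) {
    throw new RangeError('limit must be 200 or less');
  }
  return `${EVALUATIONS_URL}?${params.toString()}`;
}

console.log(buildEvaluationsUrl({ verdict: 'block', lastDays: 7, limit: 100 }));
```

To page through results, pass the cursor from the previous response as the `cursor` filter on the next call; the name of the response field that carries it is not shown in this guide.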

The Audit Log

Every configuration change, quarantine decision, and pattern activation is recorded in the CFD audit log. The audit log is append-only and tamper-evident.

What the audit log contains:

| Event Class | Examples |
| --- | --- |
| Configuration changes | Threshold update, enforcement mode change, bulk-apply |
| Quarantine decisions | Released (with `is_false_positive`), confirmed as threat, deleted |
| Pattern events | Candidate submitted, promoted to active, deprecated |
| Canary events | Created, triggered, acknowledged |
| Campaign events | Detected, closed, false-positive dismissed |

Query the audit log:

curl "https://api.mnemom.ai/v1/cfd/audit?from=2026-03-01T00:00:00Z&limit=50" \
  -H "Authorization: Bearer $TOKEN"

Audit log entries are included in the EU AI Act compliance export (GET /v1/compliance/cfd-report) and support the 90-day minimum retention required for Article 50 compliance.
