Enforcement Modes

Smoltbot supports three enforcement modes that control how the system responds when alignment or integrity violations are detected. You can choose the level of intervention appropriate for your use case — from passive observation to active blocking.

Modes Overview

Observe

Detect violations, record them, take no action. This is the default mode. Ideal for initial deployment and monitoring.

Nudge

Detect violations and inject feedback into the agent’s next request via system prompt. The agent sees it and can self-correct.

Enforce

Hard block with 403 for non-streaming requests. Falls back to nudge for streaming requests.

Mode Details

Observe Mode (Default)

In observe mode, smoltbot detects and records all violations but takes no action to modify agent behavior. This is the default mode for all new agents.

Behavior:

All API calls pass through unchanged
Violations are detected and recorded in the trace database
Integrity checkpoints are created for every interaction
Drift alerts are generated when behavioral patterns shift
No modification to agent requests or responses

When to use:

During initial deployment to establish behavioral baselines
When you want to monitor without affecting agent behavior
For compliance auditing where you need a record but not intervention
When evaluating whether to enable more active enforcement

Configuration:

curl -X PUT https://api.mnemom.ai/v1/agents/:id/enforcement \
  -H "Content-Type: application/json" \
  -d '{"mode": "observe"}'

Nudge Mode

In nudge mode, when a violation is detected, smoltbot creates a pending nudge record. On the agent’s next request, the gateway injects an integrity notice into the system prompt. The agent sees the notice, can review its approach, and self-correct.

Behavior:

Violations are detected and recorded
A pending nudge record is created for each violation
On the next API request, an integrity notice is injected into the system prompt
The agent receives the nudge and can self-correct
Nudge delivery is tracked in the conscience timeline
The original request passes through (not blocked)

How nudges work:

The gateway detects a violation in real-time via streaming response analysis (tee() + background AIP check, ~3s latency)
A pending nudge record is created immediately — ready before the agent’s next turn

On the agent’s next API call, the gateway injects a system prompt addition:

[INTEGRITY NOTICE] Your previous action was flagged for:
UNBOUNDED_ACTION (HIGH) — action "delete_all_records" is not
in your declared bounded_actions. Please review your approach.

The agent processes the notice alongside its normal instructions
The nudge is marked as delivered in the conscience timeline
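
The steps above can be sketched as a small in-memory model. This is an illustration only, not the gateway's implementation: the `PendingNudge` and `NudgeQueue` names and the record fields are hypothetical, and the notice text simply follows the example shown above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PendingNudge:
    # Hypothetical record shape; the real nudge schema is not documented here.
    violation_type: str
    severity: str
    detail: str
    delivered: bool = False


@dataclass
class NudgeQueue:
    pending: List[PendingNudge] = field(default_factory=list)

    def record_violation(self, violation_type: str, severity: str, detail: str) -> None:
        """A detected violation immediately becomes a pending nudge."""
        self.pending.append(PendingNudge(violation_type, severity, detail))

    def inject(self, system_prompt: str) -> str:
        """Append a notice for each undelivered nudge and mark it delivered."""
        undelivered = [n for n in self.pending if not n.delivered]
        if not undelivered:
            return system_prompt
        notices = []
        for nudge in undelivered:
            notices.append(
                "[INTEGRITY NOTICE] Your previous action was flagged for:\n"
                f"{nudge.violation_type} ({nudge.severity}) \u2014 {nudge.detail} "
                "Please review your approach."
            )
            nudge.delivered = True
        return system_prompt + "\n\n" + "\n\n".join(notices)
```

Calling `inject` a second time without a new violation returns the prompt unchanged, which mirrors the idea that each nudge is delivered once on the agent's next request.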

When to use:

When you want the agent to self-correct without hard blocking
For agents in production where availability is critical
When building trust in the enforcement system before enabling hard blocks
For advisory or delegated_authority principal relationships

Configuration:

curl -X PUT https://api.mnemom.ai/v1/agents/:id/enforcement \
  -H "Content-Type: application/json" \
  -d '{"mode": "nudge"}'

Enforce Mode

In enforce mode, smoltbot actively blocks requests when CRITICAL or HIGH severity violations are detected. The behavior depends on whether the request is streaming or non-streaming.

Behavior:

Non-streaming requests: Hard block with HTTP 403 response. The request does not reach the AI provider.
Streaming requests: Falls back to nudge mode (injects integrity notice into system prompt). This is because streaming connections cannot be cleanly terminated mid-stream.

When to use:

For high-security environments where violations must be prevented
When operating under fail_closed AIP failure policy
For agents handling sensitive data or financial transactions
When regulatory requirements demand active intervention

Configuration:

curl -X PUT https://api.mnemom.ai/v1/agents/:id/enforcement \
  -H "Content-Type: application/json" \
  -d '{"mode": "enforce"}'

Enforce mode will block non-streaming API calls when violations are detected. Ensure your application handles 403 responses gracefully before enabling this mode.

Setting Enforcement Mode

Set enforcement mode via the API:

PUT /v1/agents/:id/enforcement

Request body:

{
  "mode": "observe" | "nudge" | "enforce"
}

Example:

# Enable nudge mode
curl -X PUT https://api.mnemom.ai/v1/agents/agent_abc123/enforcement \
  -H "Content-Type: application/json" \
  -d '{"mode": "nudge"}'
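
For callers that prefer Python over curl, the same request can be built with the standard library. This is a minimal sketch: `build_enforcement_request` is a hypothetical helper name, and it sends only the `Content-Type` header because that is all the curl example shows. Add whatever authentication your deployment requires.

```python
import json
import urllib.request

API_BASE = "https://api.mnemom.ai/v1"
VALID_MODES = ("observe", "nudge", "enforce")


def build_enforcement_request(agent_id: str, mode: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that sets an agent's mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    body = json.dumps({"mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/agents/{agent_id}/enforcement",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Validating the mode locally avoids a round trip for an obviously invalid value.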

Nudge Strategy

When enforcement mode is nudge or enforce, you can further control when nudges are created using the nudge strategy setting:

Strategy	Behavior
`always`	Every boundary violation creates a nudge (default)
`sampling`	Nudge on a percentage of violations (uses `proof_rate` or dedicated `nudge_rate`)
`threshold`	Only nudge after N violations in the current session
`off`	No nudging — violations are recorded but no correction is injected

Configuration:

curl -X PUT https://api.mnemom.ai/v1/agents/:id/settings \
  -H "Content-Type: application/json" \
  -d '{"nudge_strategy": "always"}'

Use the threshold strategy to avoid alert fatigue. The agent only receives a nudge after repeated violations in the same session, giving it a chance to self-correct naturally first.
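
To make the four strategies concrete, here is a hedged sketch of the decision each one implies. The function name, the `threshold` and `nudge_rate` parameters, and the choice to start nudging on the Nth violation (rather than the one after it) are assumptions for illustration. The source does not specify these details.

```python
import random
from typing import Optional


def should_nudge(
    strategy: str,
    violations_in_session: int,
    threshold: int = 3,       # hypothetical N for the "threshold" strategy
    nudge_rate: int = 100,    # percentage used by the "sampling" strategy
    rng: Optional[random.Random] = None,
) -> bool:
    """Decide whether a newly detected violation should create a nudge.

    violations_in_session is assumed to include the current violation.
    """
    if strategy == "always":
        return True
    if strategy == "off":
        return False
    if strategy == "threshold":
        return violations_in_session >= threshold
    if strategy == "sampling":
        rng = rng or random.Random()
        # random() is in [0, 1), so a rate of 100 always nudges and 0 never does.
        return rng.random() * 100 < nudge_rate
    raise ValueError(f"unknown nudge strategy: {strategy!r}")
```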

Per-Agent Feature Toggles

Team operations require the team_reputation feature flag, available on Team and Enterprise plans. See Pricing for plan details.

Each agent has independent controls for the transparency and integrity pipeline:

Setting	Type	Default	Description
`aap_enabled`	boolean	`true`	Enable AAP action traces
`aip_enabled`	boolean	`true`	Enable AIP integrity analysis
`proof_enabled`	boolean	`true`	Enable cryptographic attestation (Ed25519 + Merkle)
`proof_rate`	integer	`100`	% of non-violation checkpoints that get full attestation
`proof_boundary_cap`	integer	`100`	Max % of boundary violations to prove (when card-gap detection is unavailable). Boundary violations with detected card gaps are automatically deferred.
`nudge_strategy`	string	`always`	When to create nudges (`always`, `sampling`, `threshold`, `off`)

Configuration:

# Disable AIP for an agent (AAP traces still flow)
curl -X PUT https://api.mnemom.ai/v1/agents/:id/settings \
  -H "Content-Type: application/json" \
  -d '{"aip_enabled": false}'

# Set proof sampling to 50%
curl -X PUT https://api.mnemom.ai/v1/agents/:id/settings \
  -H "Content-Type: application/json" \
  -d '{"proof_rate": 50}'

These settings are also available in the Agent Settings panel on the web dashboard for claimed agents.
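
A client can check a settings payload against the table above before sending it. The sketch below is illustrative: `validate_settings` is a hypothetical helper, and the 0 to 100 range check is an assumption based on the two integer settings being described as percentages.

```python
from typing import Any, Dict, List

SETTING_TYPES = {
    "aap_enabled": bool,
    "aip_enabled": bool,
    "proof_enabled": bool,
    "proof_rate": int,
    "proof_boundary_cap": int,
    "nudge_strategy": str,
}
NUDGE_STRATEGIES = {"always", "sampling", "threshold", "off"}


def validate_settings(patch: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a settings payload (empty if none)."""
    errors: List[str] = []
    for key, value in patch.items():
        expected = SETTING_TYPES.get(key)
        if expected is None:
            errors.append(f"{key}: unknown setting")
        elif expected is bool:
            if not isinstance(value, bool):
                errors.append(f"{key}: expected boolean")
        elif expected is int:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                errors.append(f"{key}: expected integer")
            elif not 0 <= value <= 100:
                errors.append(f"{key}: expected a percentage between 0 and 100")
        elif not isinstance(value, str) or value not in NUDGE_STRATEGIES:
            errors.append(f"{key}: expected one of {sorted(NUDGE_STRATEGIES)}")
    return errors
```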

Violation Types and Enforcement

Enforcement applies to all violation types detected by the AAP verification engine, AIP integrity checks, and the policy engine:

Alignment Violations

Violation Type	Severity	Enforcement Behavior
`FORBIDDEN_ACTION`	CRITICAL	Blocked in enforce mode; nudged in nudge mode
`CARD_MISMATCH`	CRITICAL	Blocked in enforce mode; nudged in nudge mode
`UNBOUNDED_ACTION`	HIGH	Blocked in enforce mode; nudged in nudge mode
`MISSED_ESCALATION`	HIGH	Blocked in enforce mode; nudged in nudge mode
`CARD_EXPIRED`	HIGH	Blocked in enforce mode; nudged in nudge mode
`UNDECLARED_VALUE`	MEDIUM	Nudged in nudge/enforce mode (not blocked)

Policy Violations

Policy enforcement operates as a parallel system alongside alignment enforcement. While alignment enforcement checks agent behavior against card values, policy enforcement checks tool usage against governance rules.

Violation Type	Severity	Enforcement Behavior
`POLICY_VIOLATION`	HIGH	Blocked when policy enforcement mode is `enforce`; logged when `warn`
`UNMAPPED_TOOL`	MEDIUM	Logged as warning; behavior depends on `defaults.unmapped_tool_action`
`CAPABILITY_MISMATCH`	HIGH	Blocked when policy enforcement mode is `enforce`; logged when `warn`

Policy enforcement is controlled independently via the enforcement_mode field in the Policy DSL:

warn — Log violations, return X-Policy-Verdict: warn header, allow request to proceed
enforce — Block requests with violations (HTTP 403), return X-Policy-Verdict: fail header
off — Skip policy evaluation entirely

The X-Policy-Verdict response header is always present when a policy is active:

Header Value	Meaning
`pass`	All tools mapped and permitted
`warn`	Violations detected but not blocking
`fail`	Violations detected and request blocked (enforce mode only)

Alignment enforcement (observe/nudge/enforce) and policy enforcement (off/warn/enforce) can be configured independently. For example, you might use nudge for alignment violations while using enforce for policy violations, or vice versa.
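
The relationship between the policy enforcement mode and the `X-Policy-Verdict` header can be summarized in a few lines. This is a sketch under stated assumptions: `policy_outcome` is a hypothetical name, the violations are assumed to be of a blocking-eligible severity, and returning no header for `off` is an assumption (the source only says the header is present when a policy is active).

```python
from typing import Optional, Tuple

POLICY_MODES = ("off", "warn", "enforce")


def policy_outcome(enforcement_mode: str, has_violations: bool) -> Tuple[bool, Optional[str]]:
    """Return (blocked, verdict), where verdict is the X-Policy-Verdict value.

    blocked=True corresponds to an HTTP 403 response.
    """
    if enforcement_mode not in POLICY_MODES:
        raise ValueError(f"unknown policy enforcement mode: {enforcement_mode!r}")
    if enforcement_mode == "off":
        return False, None      # evaluation skipped; no verdict assumed
    if not has_violations:
        return False, "pass"
    if enforcement_mode == "warn":
        return False, "warn"    # logged, request proceeds
    return True, "fail"         # enforce: request blocked
```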

See Policy Engine for full details on how policies are evaluated, and Policy Management for setup instructions.

In enforce mode, only CRITICAL and HIGH severity violations trigger hard blocks on non-streaming requests. MEDIUM severity violations are always handled via nudge, even in enforce mode. This applies to both alignment and policy violations.
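
Putting the alignment rules together, the action taken for a single violation depends on the mode, the severity, and whether the request is streaming. The following sketch encodes that decision. The function name is hypothetical, the return values reuse the action names recorded in the conscience timeline, and the nudge strategy setting is deliberately ignored to keep the example small.

```python
BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}
MODES = ("observe", "nudge", "enforce")


def enforcement_action(mode: str, severity: str, streaming: bool) -> str:
    """Return 'observed', 'nudged', or 'blocked' for one alignment violation."""
    if mode not in MODES:
        raise ValueError(f"unknown enforcement mode: {mode!r}")
    if mode == "observe":
        return "observed"
    if mode == "nudge":
        return "nudged"
    # enforce: hard block only for CRITICAL/HIGH on non-streaming requests;
    # streaming requests and MEDIUM violations fall back to a nudge.
    if severity in BLOCKING_SEVERITIES and not streaming:
        return "blocked"
    return "nudged"
```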

Conscience Timeline

All enforcement actions are tracked in the conscience timeline, accessible via the API and the web dashboard at mnemom.ai. The timeline records:

When a violation was detected
What type and severity
What enforcement action was taken (observed, nudged, blocked)
Whether the agent self-corrected after a nudge
Drift patterns across enforcement events

Provider Compatibility

Enforcement works across all providers where AIP is supported:

Provider	Observe	Nudge	Enforce (non-streaming)	Enforce (streaming)
Anthropic	Yes	Yes	Yes	Falls back to nudge
OpenAI	Yes	Yes	Yes	Falls back to nudge
Gemini	Yes	Yes	Yes	Falls back to nudge

Agent Containment

Agent containment is a separate enforcement layer that operates above the per-request enforcement modes. While enforcement modes (observe/nudge/enforce) control individual request handling, containment controls whether the agent can make requests at all.

Containment States

State	Meaning	Gateway Behavior
`active`	Normal operation (default)	Requests proceed normally
`paused`	Temporarily stopped	All requests blocked with HTTP 403
`killed`	Permanently stopped	All requests blocked with HTTP 403

Paused agents can be resumed by an org owner or admin. Killed agents can be reactivated only by an owner. The distinction matters for audit: pause means “we need to investigate,” kill means “this agent is compromised.”

Containment API

# Pause an agent
curl -X POST https://api.mnemom.ai/v1/orgs/{org_id}/agents/{agent_id}/pause \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"reason": "Investigating boundary violations"}'

# Resume a paused agent
curl -X POST https://api.mnemom.ai/v1/orgs/{org_id}/agents/{agent_id}/resume \
  -H "Authorization: Bearer $TOKEN"

# Kill an agent (owner only)
curl -X POST https://api.mnemom.ai/v1/orgs/{org_id}/agents/{agent_id}/kill \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"reason": "Agent compromised"}'

# Reactivate a killed agent (owner only)
curl -X POST https://api.mnemom.ai/v1/orgs/{org_id}/agents/{agent_id}/reactivate \
  -H "Authorization: Bearer $TOKEN"

# Get containment status and audit log
curl https://api.mnemom.ai/v1/orgs/{org_id}/agents/{agent_id}/containment \
  -H "Authorization: Bearer $TOKEN"

Error Response

When a contained agent attempts an API request through the gateway, it receives:

{
  "error": "Agent contained",
  "type": "containment_error",
  "reason": "agent_paused"
}

The HTTP status code is 403 Forbidden (distinct from 402 Payment Required used for billing enforcement).
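
Because enforce mode, policy enforcement, and containment can all return 403, a client may want to tell a containment block apart from the others. The sketch below inspects the body shown above. `containment_reason` is a hypothetical helper, and it assumes the response body is the JSON object documented here.

```python
import json
from typing import Optional


def containment_reason(status: int, body: str) -> Optional[str]:
    """Return the containment reason if the response is a containment block."""
    if status != 403:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if isinstance(payload, dict) and payload.get("type") == "containment_error":
        return payload.get("reason")
    return None
```

A non-`None` result suggests retrying is pointless until someone resumes or reactivates the agent, whereas other 403 responses relate to a specific request.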

Auto-Containment

Agents can be configured to automatically pause after consecutive boundary violations:

# Enable auto-containment after 3 consecutive violations
curl -X PUT https://api.mnemom.ai/v1/orgs/{org_id}/agents/{agent_id}/containment-policy \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"auto_containment_threshold": 3}'

# Disable auto-containment
curl -X PUT https://api.mnemom.ai/v1/orgs/{org_id}/agents/{agent_id}/containment-policy \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"auto_containment_threshold": null}'

When auto-containment triggers, it:

Sets the agent status to paused with actor system
Logs the action in the containment audit log
Emits an agent.paused webhook event
Purges the gateway cache so the block takes effect immediately
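
The "consecutive violations" rule can be modeled as a streak counter. This is a sketch, not the service's logic: the `AutoContainment` class is hypothetical, and it assumes that an interaction without a boundary violation resets the streak, which the source implies with "consecutive" but does not state outright.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AutoContainment:
    threshold: Optional[int]   # None disables auto-containment
    consecutive: int = 0
    status: str = "active"

    def record(self, boundary_violation: bool) -> str:
        """Update the streak for one interaction and return the agent status."""
        if self.status != "active":
            return self.status
        if not boundary_violation:
            self.consecutive = 0   # assumed: a clean interaction breaks the streak
            return self.status
        self.consecutive += 1
        if self.threshold is not None and self.consecutive >= self.threshold:
            self.status = "paused"
        return self.status
```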

RBAC Requirements

Action	Required Role
Pause	`owner` or `admin`
Resume	`owner` or `admin`
Kill	`owner` only
Reactivate	`owner` only
View status	Any org role
Set auto-containment policy	`owner` or `admin`
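
The role table and the containment states combine into a small state machine. The sketch below is an illustration under assumptions: `apply_action` is a hypothetical name, and allowing a paused agent to be killed directly is an assumption the source does not confirm.

```python
REQUIRED_ROLES = {
    "pause": {"owner", "admin"},
    "resume": {"owner", "admin"},
    "kill": {"owner"},
    "reactivate": {"owner"},
}

# (current status, action) -> new status
TRANSITIONS = {
    ("active", "pause"): "paused",
    ("paused", "resume"): "active",
    ("active", "kill"): "killed",
    ("paused", "kill"): "killed",      # assumed to be allowed
    ("killed", "reactivate"): "active",
}


def apply_action(status: str, action: str, role: str) -> str:
    """Return the new containment status, or raise if the action is not allowed."""
    allowed = REQUIRED_ROLES.get(action)
    if allowed is None:
        raise ValueError(f"unknown action: {action!r}")
    if role not in allowed:
        raise PermissionError(f"role {role!r} may not {action} an agent")
    new_status = TRANSITIONS.get((status, action))
    if new_status is None:
        raise ValueError(f"cannot {action} an agent that is {status}")
    return new_status
```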

Webhook Events

Three webhook events are emitted for containment actions:

agent.paused — Agent was paused (manually or automatically)
agent.resumed — Agent was resumed or reactivated
agent.killed — Agent was killed

Each event includes:

{
  "agent_id": "smolt-xxxxxxxx",
  "org_id": "org-xxx",
  "action": "pause",
  "actor": "user-xxx",
  "reason": "Investigating boundary violations",
  "previous_status": "active",
  "new_status": "paused"
}
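
A webhook consumer can parse this payload into a typed object before acting on it. The sketch below is a minimal example: the `ContainmentEvent` class and `parse_containment_event` function are hypothetical names, `reason` is treated as optional because the resume and reactivate calls shown earlier send no reason, and any delivery envelope or signature check is out of scope because the source does not describe one.

```python
import json
from dataclasses import dataclass
from typing import Optional

REQUIRED_FIELDS = (
    "agent_id", "org_id", "action", "actor", "previous_status", "new_status",
)


@dataclass(frozen=True)
class ContainmentEvent:
    agent_id: str
    org_id: str
    action: str
    actor: str
    previous_status: str
    new_status: str
    reason: Optional[str] = None


def parse_containment_event(raw: str) -> ContainmentEvent:
    """Parse a containment webhook payload, rejecting incomplete ones."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    fields = {name: data[name] for name in REQUIRED_FIELDS}
    return ContainmentEvent(reason=data.get("reason"), **fields)
```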

Containment Audit Log

Every containment action is recorded in a tamper-evident audit log. Each entry includes:

The action taken (pause, resume, kill, reactivate, auto_pause)
Who triggered it (user ID or system for auto-containment)
The reason provided
Previous and new containment states
Timestamp
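
The source does not say how the log is made tamper-evident. One common technique is a hash chain, where each entry's hash covers the previous entry's hash, so editing any earlier entry invalidates everything after it. The sketch below illustrates that general idea only. It is not a description of how this audit log is implemented, and all names in it are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, entry: Dict[str, str]) -> str:
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict[str, str]], entry: Dict[str, str]) -> None:
    """Append an entry, chaining it to the current tail of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({**entry, "hash": entry_hash(prev, entry)})


def verify(log: List[Dict[str, str]]) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev = GENESIS
    for record in log:
        entry = {k: v for k, v in record.items() if k != "hash"}
        if record["hash"] != entry_hash(prev, entry):
            return False
        prev = record["hash"]
    return True
```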
