OWASP Agentic Top 10 mapping

Safe House enforces eight named threat patterns at the Mnemom gateway. Buyers evaluating agentic security posture against the OWASP Top 10 for Agentic Applications (OWASP Gen AI Security Project, released 2025-12-09; ASI01–ASI10) need a cross-reference that makes the shipped enforcement legible, and this page provides it. The discipline is the one applied throughout this documentation: where Safe House covers an OWASP threat class, the coverage mechanism is named; where coverage is partial or absent, the gap is stated rather than papered over.

Source of truth. The ASI identifiers and titles below are pinned to the official OWASP Gen AI Security Project release (announcement, resource page). The full taxonomy: ASI01 Agent Goal Hijack, ASI02 Tool Misuse, ASI03 Identity & Privilege Abuse, ASI04 Agentic Supply Chain Vulnerabilities, ASI05 Unexpected Code Execution, ASI06 Memory & Context Poisoning, ASI07 Insecure Inter-Agent Communication, ASI08 Cascading Failures, ASI09 Human-Agent Trust Exploitation, ASI10 Rogue Agents.

Safe House threat patterns

Safe House detects eight threat categories via its L1 pattern library, L2 LLM analysis layer, and L3 session model. Each pattern has a stable `threat_type` identifier used in API responses, webhook events, and threshold configuration:

Pattern	`threat_type`	Checkpoint
Business Email Compromise	`bec_fraud`	Front door
Prompt injection	`prompt_injection`	Front door
Indirect injection	`indirect_injection`	Front door
Social engineering	`social_engineering`	Front door
Agent spoofing	`agent_spoofing`	Front door
Multi-turn hijack	`hijack_attempt`	Front door (L3 session model)
Data exfiltration	`data_exfiltration`	Back door
Privilege escalation	`privilege_escalation`	Front door + back door

Pattern-to-OWASP mapping

Each Safe House pattern maps to one or more OWASP ASI entries. A pattern can map to multiple entries because OWASP threat classes are defined by attacker goal; Safe House detects by observable signal.

Safe House pattern	OWASP entry	Coverage level	Enforcement mechanism
`prompt_injection`	ASI01 Agent Goal Hijack	Full — direct variant	L1 regex families (override phrases, role reassignment, jailbreak openers, authority spoofing); L2 compound confidence scoring; L3 session model. Direct injection is the canonical goal-hijack vector
`indirect_injection`	ASI01 Agent Goal Hijack	Partial — indirect variant	L1 instruction-delimiter detection; MinHash similarity against known payloads; L2 instruction-structure detection in tool results. Novel payloads with no similarity to known patterns score lower — see limits
`hijack_attempt`	ASI01 Agent Goal Hijack	Partial — multi-turn variant	L3 session model: topic coherence tracking, escalating action scope, identity drift, pivot detection after a trust-building sequence. Substantial multi-turn coverage, with a known residual on novel multi-turn and multi-vector sequences (active recall work). Threshold: ≥ 0.7 confidence (routes to human review — see calibration note)
`bec_fraud`	ASI09 Human-Agent Trust Exploitation, ASI01 Agent Goal Hijack	Full — conjunction detection	Requires co-occurring: financial action term + authority claim + urgency marker + secrecy instruction. The authority/urgency/secrecy manipulation is trust exploitation; the intent to redirect the agent into an unauthorized action is goal hijack
`social_engineering`	ASI09 Human-Agent Trust Exploitation	Full	Authority + urgency signal pair, absent the financial-action component. Regulatory threat framing, role-authority claims, urgency escalation — manipulating the agent’s trust to change its behavior
`agent_spoofing`	ASI07 Insecure Inter-Agent Communication, ASI09 Human-Agent Trust Exploitation	Full — inbound signal	Claims of override authority or elevated permissions arriving as runtime messages; fake admin agent identifiers; credential-presentation patterns in message content. Detects unauthenticated identity/authority claims in the inter-agent message stream
`data_exfiltration`	ASI02 Tool Misuse	Full — outbound	Back-door checkpoint: external destination patterns vs. declared `bounded_actions`, bulk-data request language, covert-channel patterns. Exfiltration is misuse of connected tools/connectors to route data out. Also enforced independently by Policy Engine
`privilege_escalation`	ASI03 Identity & Privilege Abuse	Full	Runtime permission claims; requests outside declared `bounded_actions`; attempts to disable safety mechanisms mid-session. Front-door detection + Policy Engine independent check
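
The conjunction logic described for `bec_fraud` and `social_engineering` in the table above can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped L1 pattern library: the regular expressions, the signal names, and the `classify` function are hypothetical and exist only to show how "all four signals co-occur" differs from "authority plus urgency without a financial action".

```python
import re
from typing import Optional, Set

# Hypothetical signal patterns for illustration only; the shipped L1 pattern
# library is not published on this page.
SIGNALS = {
    "financial_action": re.compile(r"\b(wire|transfer|invoice|payment|gift cards?)\b", re.I),
    "authority_claim": re.compile(r"\b(ceo|cfo|director|legal department|compliance officer)\b", re.I),
    "urgency_marker": re.compile(r"\b(urgent|immediately|right away|within the hour|asap)\b", re.I),
    "secrecy_instruction": re.compile(r"\b(confidential|do not tell|don't tell|keep this between us)\b", re.I),
}


def detect_signals(message: str) -> Set[str]:
    """Return the names of all signal families that match the message."""
    return {name for name, pattern in SIGNALS.items() if pattern.search(message)}


def classify(message: str) -> Optional[str]:
    """Apply the conjunction rules from the mapping table.

    bec_fraud requires all four signals to co-occur; social_engineering is the
    authority + urgency pair without the financial-action component.
    """
    found = detect_signals(message)
    if found == set(SIGNALS):
        return "bec_fraud"
    if {"authority_claim", "urgency_marker"} <= found and "financial_action" not in found:
        return "social_engineering"
    return None
```

Requiring the full conjunction keeps a single weak signal (an ordinary invoice request, for instance) from being classified on its own, which is the point of detecting by co-occurrence rather than by keyword.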

OWASP-to-coverage mapping

The reverse view — starting from each OWASP entry and tracing what ships in enforcement. Coverage is asserted only where there is a concrete shipped mechanism to point to; everything else is stated as a gap.

OWASP entry	Safe House pattern	Additional coverage	Status
ASI01 Agent Goal Hijack	`prompt_injection`, `indirect_injection`, `hijack_attempt`, `bec_fraud` (secondary mapping)	—	Shipped (direct); partial (multi-turn / indirect). Direct injection is fully covered. Multi-turn goal redirection is substantially covered by the L3 session model, with a known residual on novel multi-turn and multi-vector sequences (Safe House recall work is ongoing). Indirect injection is partial: novel payloads with no payload-library similarity may score below the block threshold
ASI02 Tool Misuse	`data_exfiltration`	Policy Engine: `bounded_actions` enforcement, forbidden-rule Managed Rules, tool-capability mappings	Partial — policy layer + outbound screen. Tool execution is constrained by the Policy Engine before it reaches Safe House; Safe House’s back-door checkpoint catches data-exfiltration-via-tool. Mnemom does not intercept every unsafe tool invocation at the gateway — declared-scope enforcement is the primary control
ASI03 Identity & Privilege Abuse	`privilege_escalation`	AAP alignment cards declare the autonomy envelope; CLPI policy engine enforces	Shipped. Runtime privilege claims blocked at the front door. Declared envelope enforced at the policy layer
ASI04 Agentic Supply Chain Vulnerabilities	—	AEGIS substrate fingerprinting + L1 cross-tenant aggregator	Covered at the AEGIS layer, not a Safe House pattern. Runtime-behavior deviation consistent with a compromised dependency/substrate is detected cross-tenant. Does not replace build-time package provenance — see supply-chain trust
ASI05 Unexpected Code Execution	—	Policy Engine: `bounded_actions` constrains which tools/actions an agent may invoke	Gap (front-door). Safe House has no pattern that intercepts code-execution payloads. The Policy Engine limits the action surface (an agent can only invoke declared tools), which reduces blast radius, but Mnemom does not sandbox or screen executed code. Pair with an application-layer execution sandbox
ASI06 Memory & Context Poisoning	—	—	Gap. Persistent memory/context store attacks are not in the shipped Safe House pattern library. AIP thinking-block analysis and AAP drift detection give partial observability of downstream effects, but not upstream interception — see limits
ASI07 Insecure Inter-Agent Communication	`agent_spoofing`	—	Partial. Safe House treats unauthenticated authority/identity claims arriving as inbound runtime messages as suspicious by design. Legitimate agent-to-agent authority must be encoded in alignment cards at configuration time. This screens the content of inter-agent messages; it is not a transport-authentication scheme
ASI08 Cascading Failures	—	AAP drift detection + CLPI lifecycle governance (observability of degraded agent behavior)	Gap. No shipped Safe House pattern targets multi-agent cascading failure. AAP/CLPI provide some observability of an individual agent drifting, but Mnemom does not model or circuit-break failure propagation across an agent fleet. This is an application-architecture concern (timeouts, bulkheads, circuit breakers)
ASI09 Human-Agent Trust Exploitation	`bec_fraud`, `social_engineering`, `agent_spoofing`	—	Shipped. Authority/urgency/secrecy manipulation and impersonated-authority claims from inbound messages are treated as suspicious by design
ASI10 Rogue Agents	—	AAP alignment cards declare the autonomy envelope; CLPI lifecycle governance; AEGIS reputation	Covered at the governance layer, not a Safe House pattern. A “rogue” agent operating outside its declared envelope is constrained by Policy Engine `bounded_actions` enforcement and surfaced by CLPI lifecycle + reputation signals. Safe House screens inbound/outbound message content; it does not by itself decommission a misbehaving agent
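
The reverse view can be derived mechanically from the pattern-to-OWASP rows, which is a useful consistency check when either table changes. The sketch below is an illustration, not a Mnemom API: the dictionary is transcribed by hand from the pattern table on this page, and the helper names `invert` and `entries_without_pattern` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Transcribed from the pattern-to-OWASP table on this page.
PATTERN_TO_ASI: Dict[str, List[str]] = {
    "prompt_injection": ["ASI01"],
    "indirect_injection": ["ASI01"],
    "hijack_attempt": ["ASI01"],
    "bec_fraud": ["ASI09", "ASI01"],
    "social_engineering": ["ASI09"],
    "agent_spoofing": ["ASI07", "ASI09"],
    "data_exfiltration": ["ASI02"],
    "privilege_escalation": ["ASI03"],
}

ALL_ASI = [f"ASI{n:02d}" for n in range(1, 11)]  # ASI01 .. ASI10


def invert(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build the OWASP-entry -> patterns index from the pattern -> entries map."""
    reverse: Dict[str, List[str]] = defaultdict(list)
    for pattern, entries in mapping.items():
        for entry in entries:
            reverse[entry].append(pattern)
    return dict(reverse)


def entries_without_pattern(mapping: Dict[str, List[str]]) -> List[str]:
    """List OWASP entries that no Safe House pattern maps to."""
    covered = {entry for entries in mapping.values() for entry in entries}
    return [entry for entry in ALL_ASI if entry not in covered]
```

Because `bec_fraud` maps to both ASI09 and ASI01 in the pattern table, it appears under both entries in the derived index. The entries returned by `entries_without_pattern` are the ones whose coverage, if any, comes from another layer (Policy Engine, AEGIS, AAP/CLPI) rather than from a Safe House pattern.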

Gaps and limits

ASI06 — Memory & Context Poisoning

No shipped Safe House pattern targets memory/context store attacks. An attacker who can write to an agent’s persistent memory — conversation history, vector store, retrieved documents — can influence future turns without a detectable inbound signal. The AIP thinking-block analysis layer provides partial mitigation: if the poisoned memory causes the agent to reason in ways inconsistent with its alignment card, AIP may flag it before the action lands. That is detection of downstream effect, not upstream interception. Recommended defense-in-depth: treat memory stores as an untrusted input boundary, apply the same L1/L2 screening to memory-retrieved content as to external tool results, and enable AIP to catch the reasoning anomalies that poisoned memory produces.
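
Treating the memory store as an untrusted input boundary can be sketched as a wrapper around retrieval. This is a minimal sketch under stated assumptions: `fetch` stands in for whatever retrieves items from your memory or vector store, and `screen` stands in for whatever screening call your integration exposes. Neither name, nor the `Verdict` shape, is taken from the Safe House SDK.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Verdict:
    """Result of screening one piece of content (hypothetical shape)."""
    blocked: bool
    threat_type: str = ""


def screened_retrieve(
    fetch: Callable[[str], Iterable[str]],
    screen: Callable[[str], Verdict],
    query: str,
) -> Tuple[List[str], List[Tuple[str, Verdict]]]:
    """Fetch memory items and keep only those the screen does not block.

    Returns (safe_items, quarantined) so rejected items can be logged or
    reviewed instead of silently disappearing.
    """
    safe: List[str] = []
    quarantined: List[Tuple[str, Verdict]] = []
    for item in fetch(query):
        verdict = screen(item)
        if verdict.blocked:
            quarantined.append((item, verdict))
        else:
            safe.append(item)
    return safe, quarantined
```

Only the `safe` list should be placed into the agent context. Keeping the quarantined items with their verdicts gives an audit trail for investigating how the poisoned content reached the store.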

ASI05 — Unexpected Code Execution

Safe House does not intercept code-execution payloads at the gateway. The Policy Engine’s `bounded_actions` enforcement limits which tools an agent may invoke (reducing the surface that can reach an executor), but Mnemom neither sandboxes nor statically screens executed code. For agents that can run code, pair Mnemom with an application-layer execution sandbox and treat the executor as an untrusted boundary.
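
The declared-scope idea can be illustrated with a simple allow-list check. This is a sketch of the concept only: the field name `bounded_actions` comes from this page, but representing it as a flat list of tool names, along with the `enforce_bounded_actions` function and the tool names used here, is an assumption and not the Policy Engine's actual schema or API.

```python
from typing import Iterable


class ActionNotPermitted(Exception):
    """Raised when an agent requests a tool outside its declared scope."""


def enforce_bounded_actions(requested: str, bounded_actions: Iterable[str]) -> str:
    """Allow the requested tool only if it is in the declared set.

    Assumes bounded_actions is a flat collection of tool names (hypothetical).
    """
    allowed = set(bounded_actions)
    if requested not in allowed:
        raise ActionNotPermitted(f"{requested!r} is outside the declared scope")
    return requested


# A support agent that never needs to run code simply does not declare the
# executor, so a code-execution request is refused before reaching it.
SUPPORT_AGENT_ACTIONS = ["search_docs", "send_email"]
```

An allow-list of this kind reduces the surface that can reach an executor. It does not inspect what the executor runs, which is why a sandbox is still needed for agents that do declare code execution.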

ASI08 — Cascading Failures

Mnemom screens per-agent message content and surfaces individual-agent drift (AAP/CLPI), but it does not model failure propagation across a multi-agent fleet. Cascading-failure resilience is an application-architecture responsibility: apply per-agent timeouts, bulkheads, and circuit breakers between agents so one degraded agent cannot fan out.
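
As one example of the application-layer controls named above, a circuit breaker placed between agents might look like the following. It is a generic sketch of the pattern, not a Mnemom component: the class and parameter names are illustrative, and timeouts and bulkheads would be added separately by the orchestration layer.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    """Stop calling a downstream agent after repeated failures."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("downstream agent is isolated")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the breaker
            raise
        self.failures = 0  # a success closes the breaker fully
        return result
```

Wrapping each agent-to-agent call in its own breaker means a degraded agent is isolated after a bounded number of failures, and callers fail fast instead of queueing work behind it.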

ASI01 — indirect injection, novel payloads

Indirect injection via tool results is partially covered. MinHash similarity matching compares tool results against a library of known injection payloads. Sufficiently novel payloads that bear no structural or semantic similarity to known patterns will score below L1 thresholds. L2 analysis provides a second pass, but detection accuracy is bounded by the analysis model’s capability. The arena flywheel closes this gap over time: canary credentials and cross-agent campaign detection surface novel attack infrastructure, and confirmed patterns promote to active detection. For high-sensitivity environments, consider adding application-layer validation of tool results before they re-enter the agent context.
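
To make the similarity step concrete, here is a generic MinHash sketch using only the Python standard library. It illustrates the technique in general, not Safe House's implementation: the word-shingle size, the number of hash functions, and the function names are all assumptions chosen for readability.

```python
import hashlib
from typing import List, Set


def shingles(text: str, k: int = 3) -> Set[str]:
    """Split text into overlapping k-word shingles (lowercased)."""
    tokens = text.lower().split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def minhash_signature(items: Set[str], num_hashes: int = 64) -> List[int]:
    """For each salted hash function, keep the minimum hash over all items."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a distinct salt gives a distinct hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """Fraction of matching positions, which estimates Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)
```

A lightly edited copy of a known payload shares most of its shingles with the original and scores high, while a payload written from scratch shares almost none and scores near zero. That is the residual described above: similarity matching only helps against payloads that resemble something already in the library.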

ASI01 — multi-turn hijack, human escalation at default threshold

The `hijack_attempt` pattern routes to human review (not autonomous block) at the default 0.7 confidence threshold. This is intentional, because legitimate multi-topic conversations produce similar L3 signals. If your use case can tolerate more aggressive autonomous blocking, lower the `hijack_attempt` threshold in the protection card.

Pairing Safe House with application-layer controls

OWASP guidance recommends pairing runtime substrate controls with application-layer controls. For the gaps above:

ASI06 (Memory & Context Poisoning): Apply Safe House’s inbound screening to memory-retrieved content as well as direct inbound messages. This is not the default: configure the SDK to route memory fetches through the Safe House evaluation layer.
ASI05 (Unexpected Code Execution): Run agent-executed code in an application-layer sandbox; scope `bounded_actions` so the agent can only reach the executor when its function genuinely requires it.
ASI04 (Agentic Supply Chain): AEGIS covers the runtime-behavior dimension. Pair with SLSA/Sigstore package provenance for the build-time dimension. See supply-chain trust.
ASI08 (Cascading Failures): Add per-agent timeouts, bulkheads, and circuit breakers at the application/orchestration layer so a degraded agent cannot cascade across the fleet.
ASI10 (Rogue Agents): Card design is the primary defense. Scope `bounded_actions` as narrowly as the agent’s function permits; CLPI enforces declared scope and surfaces lifecycle/reputation signals for an agent operating outside its envelope.
