Documentation Index
Fetch the complete documentation index at: https://docs.mnemom.ai/llms.txt
Use this file to discover all available pages before exploring further.
Provider support
Mnemom sits in front of three upstream model providers: Anthropic, OpenAI, and Gemini. Safe House, AIP integrity checkpoints, CLPI policy enforcement, and DLP all route through a single gateway, but the quality of each feature varies by provider — because the underlying APIs differ in what they expose. This page is the honest accounting: what Mnemom guarantees on each provider, what it partially guarantees, and what it cannot guarantee.Where coverage is partial — particularly OpenAI’s integrity-checkpoint coverage — Mnemom does not claim parity. The v1 commitment is honest per-provider differentiation, not uniform coverage. Marketing materials, the status page, and the trust center all reflect this.
Supported models
Models the v1 promise applies to. Listed in the gateway’s/models.json registry; routed end-to-end through Safe House, AIP, CLPI, and DLP per the matrix below.
Anthropic
Claude Opus 4.7
Claude Sonnet 4.6
Claude Haiku 4.5
OpenAI
GPT-5
GPT-5 Codex
o3
o3-mini
Gemini
Gemini 2.5 Pro
Gemini 2.5 Flash
model: on a Mnemom API call):
Feature coverage matrix
Each cell describes the v1 commitment level. Symbols:- ✓ Fully supported. Tested in CI; Safe House features work identically to the Anthropic baseline.
- ⚠ Partial — Provider exposes the feature, but with a documented limitation. See the supporting note.
- N/A — Provider does not expose this capability. Not a Mnemom limitation.
| Capability | Anthropic | OpenAI | Gemini |
|---|---|---|---|
| Streaming (SSE) | ✓ | ✓ | ✓ |
| Tool use / function calling | ✓ | ✓ | ✓ |
| Thinking-trace inspection (AIP) | ✓ [1] | ⚠ Partial [2] | ✓ [3] |
| Multimodal (image inputs) | ✓ | ✓ | ✓ |
| Prompt caching | ✓ [4] | ⚠ Partial [5] | ⚠ Partial [6] |
| Batch API | ✓ | ✓ | ✓ |
[1] Anthropic — full extended thinking
Anthropic models expose full extended thinking blocks through the response API. AIP reads completed thinking blocks post-response, pre-delivery, and the verifier (Claude Haiku 4.5) has full chain-of-thought visibility. This is the most complete AIP coverage Mnemom offers — and the baseline against which other providers are compared.[2] OpenAI — reasoning summaries only
OpenAI’s reasoning APIs (o3, o3-mini) expose reasoning summaries, not raw chain-of-thought. AIP can inspect these summaries, but the depth of analysis is bounded by what OpenAI chose to summarize. Non-reasoning models (gpt-5, gpt-5-codex, gpt-4o) expose no internal reasoning at all — AIP degrades to surface-only analysis on these.
What this means in practice:
- Boundary violations that surface in the model’s final response are caught equally well across all providers.
- Boundary violations that would have surfaced in the model’s hidden reasoning, but never reach the final response, are caught with lower confidence on OpenAI than on Anthropic or Gemini.
- The AIP integrity-checkpoint verdict is still emitted on OpenAI; the certainty of that verdict is provider-conditional.
[3] Gemini — full thoughts exposure
Gemini 2.5 models expose a thoughts field on response candidates. Coverage is uniform across 2.5 Pro and 2.5 Flash; AIP reads thoughts through the gateway’s response normalizer and treats it as equivalent to Anthropic extended thinking.
[4] Anthropic — explicit cache_control
Anthropic supports explicit cache_control block markers — customers control which prompt segments are cached. Mnemom passes cache_control through transparently. Safe House still evaluates the full request (it does not assume cached prefixes are safe just because they were previously seen). Cache hits do not bypass any checkpoint.
[5] OpenAI — automatic caching, no customer control
OpenAI’s prompt caching is automatic — the API decides what to cache based on request shape. Customers cannot reason about cache hit rates the way they can on Anthropic. Mnemom passes requests through unchanged; cache decisions are OpenAI’s. Safe House dispatch remains idempotent across cache hits and misses — the same prompt routed through Mnemom twice produces the same verdict regardless of whether OpenAI cached it.[6] Gemini — separate CachedContent API
Gemini exposes prompt caching as a separate CachedContent resource (explicit cache lifecycle, named caches with TTL). The gateway today does not surface or use this API; requests are sent without referencing cached content. Customers using Gemini’s cache outside Mnemom will see lower latency than they see through Mnemom — this is a known gap, tracked in the Track 3 hardening plan.
AIP coverage by provider — the headline commitment
Of all per-provider gaps, integrity-checkpoint coverage is the load-bearing one. Mnemom’s public SLO commitment is:| Provider | Target AIP coverage [*] |
|---|---|
| Anthropic (Opus 4.7, Sonnet 4.6, Haiku 4.5) | ≥ 99% |
| Gemini (2.5 Pro, 2.5 Flash) | ≥ 95% |
OpenAI o-series (o3, o3-mini) | ≥ 50% |
OpenAI non-thinking (gpt-5, gpt-5-codex, gpt-4o) | Excluded — AIP runs in degraded “surface-only” mode |
Latency expectations
Per-provider P50/P95 latency for the Safe House dispatch path and AIP analysis is published on the status page and committed via SLI-P2 and SLI-P3. In aggregate:- Safe House dispatch adds ~15 ms P50 / ~60 ms P95 across all providers (the dispatch path is provider-agnostic).
- AIP analysis runs post-response, pre-delivery. Cost varies by upstream-response token volume — Anthropic Opus with full extended thinking emits the largest traces and therefore the longest AIP analysis tails (P95 up to 2.5 seconds). Gemini 2.5 Flash and OpenAI non-thinking models emit thin traces and complete AIP analysis in P50 ≤ 800 ms.
How we test against each provider
Mnemom’s gateway adapter — the code that parses each provider’s response format, extracts thinking blocks, and routes tool calls — is tested in two layers:| Layer | Cadence | Catches |
|---|---|---|
| Static-shape | Every PR to the gateway | ”Did our parser break?” Asserts the adapter handles captured response fixtures correctly. |
| Live | Nightly | ”Did the upstream provider ship a breaking change?” Real upstream call → real response → real parse. |
Streaming OpenAI thinking visibility: OpenAI’s STREAMING Chat Completions API does not emit reasoning summaries via
delta.reasoning_content for o-series models. Reasoning happens server-side and is reported in usage.completion_tokens_details.reasoning_tokens, but the reasoning text itself does not reach the streamed response. Customers using OpenAI’s non-streaming Responses API see reasoning summaries; streaming customers do not.This is upstream behavior, not a Mnemom limitation. The AIP coverage commitment of ≥50% on OpenAI o-series applies to non-streaming consumers; streaming OpenAI o-series degrades to surface-only treatment. Mnemom’s gateway parser correctly reports empty thinking content for streaming OpenAI responses regardless of model.Deprecation policy
Models the gateway routes are classified into two tiers:Supported
Listed in the supported models section above. v1 promise applies. Safe House, AIP, CLPI, DLP all work to their per-provider commitment. Tested in CI. Deprecation requires 90 days’ notice via this page and the changelog.
Passthrough
Routed by the gateway but not in the supported tier. Inference works; Safe House features are best-effort. Not tested in CI. No deprecation notice — model availability tracks the provider’s lifecycle.
| Model | Status | Sunset date | Migration target |
|---|---|---|---|
claude-3-opus-20240229 | Passthrough | Tracks Anthropic’s deprecation | claude-opus-4-7 |
claude-3-5-sonnet-20241022 | Passthrough | Tracks Anthropic’s deprecation | claude-sonnet-4-6 |
claude-sonnet-4-20250514 | Passthrough | 2026-Q3 (recommend migrating now) | claude-sonnet-4-6 |
gpt-4o | Passthrough | Tracks OpenAI’s deprecation | gpt-5 |
gemini-3-pro, gemini-3-flash | Preview | When Google releases stable | Stay on gemini-3-* once supported |
/models.json registry; we do not maintain shims.
Out of scope
Provider expansion beyond Anthropic + OpenAI + Gemini is not in scope for v1. Cohere, Mistral, Together, Groq, and other providers are not supported. Adding a provider is a multi-quarter effort (Safe House dispatch, AIP adapter, CLPI policy schema mapping, harness coverage, docs) — tracked separately. BYOK (bring-your-own-key) for upstream providers is also out of v1. v1 ships with Mnemom-only key custody.Related
- Integrity Checkpoints — the AIP analysis machinery
- Safe House — pre-screening layer for inbound messages
- CLPI — Continuous Local Policy Interpretation for tool use
- Webhook contract — event delivery for operator surfaces (provider-agnostic)