
Self-Hosted deployment

Run the Mnemom Gateway on your own infrastructure for full data residency control. Prompt and response content is never sent to Mnemom’s cloud, though prompts are forwarded to your configured LLM providers — see Data residency for exact traffic boundaries. The self-hosted gateway is a Node.js adapter that runs the same code as the managed Cloudflare Workers service — identical behavior, your infrastructure.
Self-hosted deployment requires an Enterprise license. Contact us to obtain a license key. Enterprise includes hybrid analysis mode, SSO/SAML integration, and dedicated support.

Deployment options

| | Managed (Cloud) | Docker Compose | Kubernetes (Helm) |
|---|---|---|---|
| Best for | Most teams | Small teams, eval, dev | Production at scale |
| Infrastructure | None (Mnemom hosts) | Single VM or server | K8s cluster |
| Setup time | Minutes | ~10 minutes | ~30 minutes |
| Scaling | Automatic | Manual | HPA auto-scaling |
| Data residency | Mnemom cloud | Your infrastructure | Your infrastructure |
| High availability | Built-in | Single node | Multi-replica, PDB |
| Monitoring | Dashboard | Prometheus + logs | Prometheus + ServiceMonitor |

Prerequisites

  • An Enterprise license JWT from mnemom.ai/dashboard
  • An Anthropic API key (required for AIP integrity analysis)
  • Optional: OpenAI and Gemini API keys for multi-provider tracing
AIP defaults to fail-open mode. If the analysis LLM is unreachable, integrity checks will silently pass. For production deployments handling sensitive operations, set failure_policy: { mode: "fail_closed" } in your AIP configuration.
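The difference between the two policies can be modelled in a few lines. This is an illustrative Python sketch, not the AIP implementation: `resolve_verdict`, `AnalysisUnavailable`, the verdict strings, and the `"fail_open"` mode string are hypothetical names chosen for the example. Only the `fail_closed` mode value and the fail-open default come from the note above.

```python
from typing import Callable


class AnalysisUnavailable(Exception):
    """Hypothetical error: the analysis LLM could not be reached."""


def resolve_verdict(run_analysis: Callable[[], str], mode: str = "fail_open") -> str:
    """Return "pass" or "block" for one integrity check.

    run_analysis stands in for the call to the analysis LLM. It returns a
    verdict string or raises AnalysisUnavailable.
    """
    if mode not in ("fail_open", "fail_closed"):
        raise ValueError(f"unknown failure policy mode: {mode}")
    try:
        return run_analysis()
    except AnalysisUnavailable:
        # fail_open lets the request through silently; fail_closed blocks it.
        return "pass" if mode == "fail_open" else "block"


def unreachable() -> str:
    """Simulate an outage of the analysis LLM."""
    raise AnalysisUnavailable("analysis LLM timed out")


print(resolve_verdict(unreachable, "fail_open"))
print(resolve_verdict(unreachable, "fail_closed"))
```

With the same simulated outage, the first call lets the request through and the second blocks it, which is why fail-closed is the safer choice for sensitive operations.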

Quick start: Docker Compose

The fastest way to get a self-hosted gateway running. Includes PostgreSQL, Redis, and automatic database migrations.

Requirements

  • Docker 24+ and Docker Compose v2+
  • 2 GB RAM minimum, 4 GB recommended
  • 10 GB disk space

1. Clone the repository

git clone https://github.com/mnemom/mnemom-platform.git
cd mnemom-platform/deploy/docker

2. Configure environment

Copy the example environment file and fill in your credentials:
cp .env.example .env
Edit .env and set the required values:
# Required
POSTGRES_PASSWORD=<strong-password>
REDIS_PASSWORD=<strong-password>
SUPABASE_URL=https://<your-project-ref>.supabase.co
SUPABASE_SECRET_KEY=<your-supabase-service-role-key>
SUPABASE_JWT_SECRET=<your-supabase-jwt-secret>
INTERNAL_API_KEY=<strong-random-string>
MNEMOM_LICENSE_JWT=<your-enterprise-license-jwt>
ANTHROPIC_API_KEY=<your-anthropic-api-key>

# Optional: additional providers
OPENAI_API_KEY=<your-openai-key>
GEMINI_API_KEY=<your-gemini-key>

# Optional: heartbeat override (for EU/air-gapped deployments)
# HEARTBEAT_URL=https://api.mnemom.ai/v1/deployments/heartbeat
If your .env.example shows SMOLTBOT_ROLE, rename it to MNEMOM_ROLE — the file carries a stale branding name but the entrypoint reads MNEMOM_ROLE.
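Before starting the stack, it can help to confirm that no required value is missing or still a `<placeholder>`. The following is a minimal sketch, assuming the simple `KEY=value` format shown above. The function names are hypothetical and the check is not part of the Mnemom tooling.

```python
import re

# Required keys as listed in the example .env above.
REQUIRED_KEYS = [
    "POSTGRES_PASSWORD",
    "REDIS_PASSWORD",
    "SUPABASE_URL",
    "SUPABASE_SECRET_KEY",
    "SUPABASE_JWT_SECRET",
    "INTERNAL_API_KEY",
    "MNEMOM_LICENSE_JWT",
    "ANTHROPIC_API_KEY",
]

# Matches template markers such as <strong-password> or <your-project-ref>.
PLACEHOLDER = re.compile(r"<[^<>]+>")


def parse_env(text: str) -> dict:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unset_required_keys(text: str) -> list:
    """Return required keys that are absent, empty, or still placeholders."""
    values = parse_env(text)
    return [
        key
        for key in REQUIRED_KEYS
        if not values.get(key) or PLACEHOLDER.search(values[key])
    ]


def uses_stale_role_name(text: str) -> bool:
    """True if the file still sets SMOLTBOT_ROLE instead of MNEMOM_ROLE."""
    return "SMOLTBOT_ROLE" in parse_env(text)
```

A typical call would be `unset_required_keys(open(".env").read())`. An empty list means every required key has a concrete value.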

3. Start the stack

docker compose up -d
This starts four services in order:
  1. PostgreSQL — database with health check
  2. Redis — caching layer with persistence
  3. Gateway — HTTP proxy on port 8787 (applies database migrations on startup)
  4. Observer — background scheduler for trace processing

4. Verify health

Wait about 30 seconds, then check the gateway health:
curl http://localhost:8787/health/ready
Expected response
{
  "status": "ok",
  "checks": {
    "redis": { "ok": true },
    "supabase": { "ok": true },
    "license": { "ok": true }
  }
}
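For scripted setups, the fixed 30-second wait can be replaced by polling until every check reports `ok`. The sketch below uses only the Python standard library and assumes the response shape shown above. How a failing check is reported (for example `"ok": false`) is an assumption, and the helper names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable


def is_ready(payload: dict) -> bool:
    """True when status is "ok" and every listed check has ok == true."""
    checks = payload.get("checks", {})
    return (
        payload.get("status") == "ok"
        and bool(checks)
        and all(check.get("ok") is True for check in checks.values())
    )


def failing_checks(payload: dict) -> list:
    """Names of checks that did not report ok == true."""
    return sorted(
        name
        for name, check in payload.get("checks", {}).items()
        if check.get("ok") is not True
    )


def fetch_health(url: str = "http://localhost:8787/health/ready") -> dict:
    """GET the readiness endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_ready(fetch: Callable[[], dict], attempts: int = 10, delay: float = 3.0) -> bool:
    """Poll fetch() until the payload is ready or attempts run out."""
    for attempt in range(attempts):
        try:
            if is_ready(fetch()):
                return True
        except (OSError, ValueError):
            pass  # not listening yet, non-2xx status, or body was not JSON
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_ready(fetch_health)` after `docker compose up -d` returns `True` once Redis, Supabase, and the license all check out.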

5. Connect an agent

Point the mnemom CLI at your self-hosted gateway:
npm install -g @mnemom/mnemom
# Point the CLI at your self-hosted stack, then authenticate
export MNEMOM_ENV=local   # resolves API + gateway to http://localhost:8787
mnemom login
Make a test request:
curl http://localhost:8787/anthropic/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
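The same test request can be built from application code. This is a minimal standard-library sketch that mirrors the curl command above. The helper name `build_messages_request` is hypothetical, and the gateway address assumes the local Docker Compose stack.

```python
import json
import urllib.request


def build_messages_request(
    api_key: str,
    prompt: str,
    gateway: str = "http://localhost:8787",
) -> urllib.request.Request:
    """Build a POST to the gateway's Anthropic route, mirroring the curl call."""
    body = {
        "model": "claude-haiku-4-5-20251001",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{gateway}/anthropic/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )


# To send it against a running gateway:
#   import os
#   req = build_messages_request(os.environ["ANTHROPIC_API_KEY"], "Hello")
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       print(json.load(resp))
```

Only the base URL differs from a direct provider call, so an existing client can usually be pointed at the gateway by changing its base URL.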
Verify the agent is connected:
mnemom status

Production: Kubernetes with Helm

For production deployments with auto-scaling, high availability, and monitoring.

Requirements

  • Kubernetes 1.27+
  • Helm 3.12+
  • kubectl configured for your cluster

1. Add the Helm chart

cd mnemom-platform/deploy/helm  # navigate to the helm chart

2. Create a Kubernetes Secret

Store sensitive credentials in a Secret:
kubectl create secret generic mnemom-secrets \
  --from-literal=SUPABASE_URL=<your-supabase-url> \
  --from-literal=SUPABASE_SECRET_KEY=<your-service-role-key> \
  --from-literal=SUPABASE_JWT_SECRET=<your-supabase-jwt-secret> \
  --from-literal=INTERNAL_API_KEY=<strong-random-string> \
  --from-literal=ANTHROPIC_API_KEY=<your-anthropic-key> \
  --from-literal=MNEMOM_LICENSE_JWT=<your-license-jwt> \
  --from-literal=REDIS_URL=<your-redis-url> \
  --from-literal=DATABASE_URL=<your-postgres-url>
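The `<strong-random-string>` for INTERNAL_API_KEY can be generated with any cryptographically secure source. One option, sketched here with Python's standard `secrets` module, is shown below. The 32-byte length is an assumption, because the source does not state a required length or format.

```python
import secrets


def generate_internal_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of entropy."""
    return secrets.token_urlsafe(num_bytes)


print(generate_internal_api_key())
```

The output contains only letters, digits, `-`, and `_`, so it can be passed to `--from-literal` or pasted into `.env` without extra quoting.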

3. Install the chart

# Bracketed keys are quoted so the shell (zsh in particular) does not treat [0] as a glob pattern
helm install mnemom ./mnemom-gateway \
  --set secrets.existingSecret=mnemom-secrets \
  --set ingress.enabled=true \
  --set 'ingress.hosts[0].host=gateway.yourcompany.com' \
  --set 'ingress.hosts[0].paths[0].path=/' \
  --set 'ingress.hosts[0].paths[0].pathType=Prefix'

4. Verify the deployment

kubectl get pods -l app.kubernetes.io/name=mnemom-gateway
helm test mnemom

What the chart deploys

  • Gateway Deployment (2 replicas by default) — HTTP proxy with liveness, readiness, and startup probes
  • Observer Deployment (1 replica) — background scheduler for trace processing
  • Migration Job — Helm pre-install/pre-upgrade hook that applies database migrations
  • Service — ClusterIP on port 8787
  • NetworkPolicy — deny-all default with explicit allows for ingress, Redis, PostgreSQL, and upstream LLM APIs
  • PodDisruptionBudget — ensures at least 1 replica during rolling updates
  • Optional: Ingress with TLS, HPA, ServiceMonitor for Prometheus

Scaling

Enable the HorizontalPodAutoscaler for automatic scaling:
# values.yaml
hpa:
  enabled: true
  minReplicas: 2
  maxReplicas: 20
  targetCPU: 70
  targetMemory: 80
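To estimate how these values translate into replica counts, the standard Kubernetes HPA rule can be applied: desired = ceil(current replicas × current utilization / target utilization), clamped to the min/max bounds, with no change when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below assumes the chart's `targetCPU` and `targetMemory` map to average-utilization percentages, which the source does not confirm.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int = 2,
    max_replicas: int = 20,
    tolerance: float = 0.1,
) -> int:
    """Apply the HPA scaling rule for a single metric."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas  # close enough to target: no change
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas averaging 90% CPU against the 70% target
cpu = desired_replicas(4, 90, 70)
# the same 4 replicas averaging 60% memory against the 80% target
memory = desired_replicas(4, 60, 80)
# With several metrics, the HPA uses the largest proposal.
print(max(cpu, memory))
```

Here CPU proposes 6 replicas and memory proposes 3, so the autoscaler would move to 6.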

Architecture

In self-hosted mode, a Node.js adapter layer replaces Cloudflare-specific APIs while running the exact same gateway code:
Your App / Agents
  │
  ▼
Self-Hosted Gateway (Node.js, port 8787)
  │ ── KV adapter ──▶ Redis (or in-memory)
  │ ── fetch interceptor ──▶ Anthropic / OpenAI / Gemini (direct)
  │
  ├──▶ Observer (cron scheduler)
  │     ── builds AP-Traces
  │     ── runs AAP verification
  │     ── runs AIP integrity checks
  │
  ▼
PostgreSQL (Supabase or self-managed)
  │
  ├──▶ CLI (mnemom status / logs)
  └──▶ Dashboard (mnemom.ai or self-hosted)
Adaptation layer — zero modifications to gateway source code:
| Cloudflare API | Self-Hosted Replacement |
|---|---|
| KV Namespace | Redis (with in-memory fallback) |
| ctx.waitUntil() | Promise collection with drain after response |
| AI Gateway URL routing | Fetch interceptor rewriting to upstream APIs |
| ExecutionContext | Node.js shim with fire-and-forget semantics |
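The "promise collection with drain after response" idea can be illustrated with a small analogy. The real adapter is Node.js and its internals are not shown here, so the following Python asyncio sketch is only a model of the pattern. `ExecutionContextShim`, `wait_until`, and `drain` are hypothetical names.

```python
import asyncio


class ExecutionContextShim:
    """Collects background work so it can finish after the response is sent."""

    def __init__(self) -> None:
        self._pending = []

    def wait_until(self, awaitable) -> None:
        # Schedule the work now, but do not block the request handler on it.
        self._pending.append(asyncio.ensure_future(awaitable))

    async def drain(self) -> None:
        # Await everything collected so far; failures must not crash the server.
        while self._pending:
            batch, self._pending = self._pending, []
            await asyncio.gather(*batch, return_exceptions=True)


async def handle_request(ctx: ExecutionContextShim, log: list) -> str:
    async def record_trace() -> None:
        await asyncio.sleep(0)  # stands in for a database write
        log.append("trace stored")

    ctx.wait_until(record_trace())
    return "response body"


async def serve_once():
    ctx = ExecutionContextShim()
    log = []
    response = await handle_request(ctx, log)
    responded_first = log == []  # background work has not run yet
    await ctx.drain()
    return response, responded_first, log


print(asyncio.run(serve_once()))
```

The handler returns before the trace is stored, and the drain step then completes the background work, which mirrors the fire-and-forget semantics described in the table.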

Data residency

Prompt and response content is never sent to Mnemom’s cloud. However, prompts are forwarded to your configured LLM providers — see the table below for exact traffic boundaries.
| Traffic | Destination | How to keep in-region |
|---|---|---|
| LLM provider calls | Anthropic / OpenAI / Gemini APIs (port 443) | Route through a VPC-peered API proxy or use a provider’s regional endpoint |
| Heartbeat | https://api.mnemom.ai/v1/deployments/heartbeat | Set HEARTBEAT_URL to an internal endpoint or regional relay |
| Agent creation (s2s) | https://api.mnemom.ai/v1/agents (sends agent_hash + API-key prefix only; no prompt content) | Contact Mnemom support for air-gapped options |
Traces, integrity checkpoints, and all prompt/response content remain in your database and are never sent to Mnemom’s cloud.

Configuration reference

Required

| Variable | Description |
|---|---|
| SUPABASE_URL | Supabase project URL (https://<ref>.supabase.co) or self-hosted PostgREST endpoint |
| SUPABASE_SECRET_KEY | Supabase service-role key |
| SUPABASE_JWT_SECRET | JWT secret for verifying Supabase auth tokens (observer hard-fails without this) |
| REDIS_PASSWORD | Password for the Redis instance (required when Redis is used) |
| INTERNAL_API_KEY | Internal service-to-service secret for agent-creation calls |
| MNEMOM_LICENSE_JWT | Enterprise license JWT from mnemom.ai/dashboard |
| ANTHROPIC_API_KEY | Anthropic API key (required for AIP analysis) |

Optional: Providers

| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | | OpenAI API key for multi-provider routing |
| GEMINI_API_KEY | | Google Gemini API key for multi-provider routing |

Optional: Hybrid analysis

| Variable | Default | Description |
|---|---|---|
| MNEMOM_ANALYZE_URL | | Delegate AIP analysis to Mnemom cloud (https://api.mnemom.ai/v1/analyze) |
| MNEMOM_API_KEY | | Mnemom API key with analyze scope (required when MNEMOM_ANALYZE_URL is set) |
In hybrid mode, only thinking/reasoning blocks are sent for analysis — raw prompts and responses never leave your infrastructure.
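To make the boundary concrete, the sketch below separates reasoning blocks from the rest of a provider response. It assumes the Anthropic Messages response format, where `content` is a list of blocks and reasoning appears as blocks of type `thinking`. The payload shape and the function names are hypothetical, because the actual request sent to MNEMOM_ANALYZE_URL is not documented here.

```python
def extract_thinking(response: dict) -> list:
    """Return the text of every thinking block in a Messages API response."""
    return [
        block.get("thinking", "")
        for block in response.get("content", [])
        if block.get("type") == "thinking"
    ]


def build_analysis_payload(response: dict) -> dict:
    """Hypothetical payload: reasoning only, no prompt or answer text."""
    return {"thinking_blocks": extract_thinking(response)}


example_response = {
    "role": "assistant",
    "content": [
        {"type": "thinking", "thinking": "The user greeted me; reply briefly."},
        {"type": "text", "text": "Hello! How can I help?"},
    ],
}

print(build_analysis_payload(example_response))
```

The `text` block, which carries the actual answer, is never copied into the payload. Only the reasoning string is.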

Optional: Infrastructure

| Variable | Default | Description |
|---|---|---|
| REDIS_URL | | Redis connection URL. Without Redis, an in-memory KV adapter is used (single-node only). |
| PORT | 8787 | HTTP listen port |
| HOST | 0.0.0.0 | HTTP bind address |
| MNEMOM_ROLE | all | gateway (HTTP only), scheduler (cron only), or all (both) |
| LOG_LEVEL | info | debug, info, warn, or error. Structured JSON to stdout. |
| HEARTBEAT_URL | https://api.mnemom.ai/v1/deployments/heartbeat | Override the phone-home heartbeat endpoint. Set to an internal relay for EU or air-gapped deployments. |
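The effect of MNEMOM_ROLE can be summarized as a mapping from role to the components a process starts. The following is a sketch of that mapping only. The component labels `http` and `cron` and the error raised on unknown values are assumptions, because the source does not say how the entrypoint handles an invalid role.

```python
# Role values and their meaning, as given in the table above.
ROLE_COMPONENTS = {
    "gateway": {"http"},
    "scheduler": {"cron"},
    "all": {"http", "cron"},
}


def components_for(env: dict) -> set:
    """Resolve which components start, defaulting MNEMOM_ROLE to "all"."""
    role = env.get("MNEMOM_ROLE", "all")
    if role not in ROLE_COMPONENTS:
        # Assumption: reject unknown roles instead of guessing.
        raise ValueError(f"unsupported MNEMOM_ROLE: {role!r}")
    return set(ROLE_COMPONENTS[role])


print(sorted(components_for({})))
print(sorted(components_for({"MNEMOM_ROLE": "scheduler"})))
```

Splitting roles this way is what lets the Helm chart run the gateway and the observer as separate Deployments from the same image.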

Health endpoints

Three Kubernetes-standard probes:
| Endpoint | Purpose | Behavior |
|---|---|---|
| /health/live | Liveness probe | Always 200 unless deadlocked |
| /health/ready | Readiness probe | Checks Redis, PostgreSQL, and license validity |
| /health/startup | Startup probe | Returns 503 until initialization complete |
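The three probes can be thought of as a small decision table. The sketch below models the documented behavior. The 503 status for a failed readiness check and the 404 for unknown paths are assumptions, and `probe_status` is a hypothetical helper, not gateway code.

```python
def probe_status(path: str, initialized: bool, dependencies_ok: bool) -> int:
    """Return the HTTP status a probe would see for the given process state."""
    if path == "/health/live":
        return 200  # the process can answer, so it is alive
    if path == "/health/startup":
        return 200 if initialized else 503
    if path == "/health/ready":
        # Assumption: a failed dependency or license check yields 503.
        return 200 if (initialized and dependencies_ok) else 503
    return 404


# During boot: alive, but neither started nor ready.
print([probe_status(p, False, False) for p in ("/health/live", "/health/startup", "/health/ready")])
# Running, but Redis is down: alive and started, not ready.
print([probe_status(p, True, False) for p in ("/health/live", "/health/startup", "/health/ready")])
```

In Kubernetes terms, the second state removes the pod from Service endpoints without restarting it, because only the readiness probe fails.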

Prometheus metrics

The gateway exposes a /metrics endpoint with:
  • gateway_requests_total{provider,status} — request counter
  • gateway_request_duration_seconds{provider} — latency histogram
  • gateway_aip_checks_total{verdict} — integrity check counter
  • gateway_cache_operations_total{operation,result} — cache hit/miss
  • Standard process_* and nodejs_* metrics
For Kubernetes, enable the ServiceMonitor in values.yaml:
metrics:
  serviceMonitor:
    enabled: true
    interval: 30s
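Outside Prometheus, the `/metrics` output can also be inspected directly. The sketch below parses `gateway_requests_total` samples in the Prometheus text exposition format and computes a per-provider share of 5xx responses. The sample lines and the assumption that `status` carries an HTTP status code are illustrative, because the source lists only the label names.

```python
import re
from collections import defaultdict

# One sample line: metric name, {labels}, value.
REQUEST_LINE = re.compile(r'^gateway_requests_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL = re.compile(r'(\w+)="([^"]*)"')

# Hypothetical scrape output for illustration.
SAMPLE_METRICS = """\
# HELP gateway_requests_total Total proxied requests
# TYPE gateway_requests_total counter
gateway_requests_total{provider="anthropic",status="200"} 95
gateway_requests_total{provider="anthropic",status="500"} 5
gateway_requests_total{provider="openai",status="200"} 10
"""


def requests_by_provider(metrics_text: str) -> dict:
    """Group request counts as {provider: {status: count}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for line in metrics_text.splitlines():
        match = REQUEST_LINE.match(line.strip())
        if not match:
            continue  # comments and other metrics are ignored
        labels = dict(LABEL.findall(match.group("labels")))
        provider = labels.get("provider", "unknown")
        status = labels.get("status", "unknown")
        totals[provider][status] += float(match.group("value"))
    return {provider: dict(statuses) for provider, statuses in totals.items()}


def error_ratio(statuses: dict) -> float:
    """Share of requests whose status label starts with 5."""
    total = sum(statuses.values())
    errors = sum(count for status, count in statuses.items() if status.startswith("5"))
    return errors / total if total else 0.0


counts = requests_by_provider(SAMPLE_METRICS)
print({provider: error_ratio(statuses) for provider, statuses in counts.items()})
```

In a Prometheus setup the same ratio would normally be computed with a query over `rate(gateway_requests_total[5m])`. The script is mainly useful for a quick check with curl output.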

Upgrading

Docker Compose

cd mnemom-platform && git pull
cd deploy/docker
docker compose build
docker compose up -d
Migrations run automatically as part of the gateway startup.

Helm

helm upgrade mnemom ./deploy/helm/mnemom-gateway \
  --set secrets.existingSecret=mnemom-secrets
The migration job runs as a pre-upgrade Helm hook.
Always back up your database before upgrading. For Docker: docker compose exec -T postgres pg_dump -U mnemom mnemom > backup.sql (the -T flag disables TTY allocation so the redirected dump is written unmodified). For Kubernetes: use your standard PostgreSQL backup procedure.

Troubleshooting

Missing environment variable:
  • A required environment variable is missing. Check the error message for which variable, then verify your .env file or Kubernetes Secret.

Redis connection problems:
  • Docker Compose: ensure the redis service is healthy (docker compose ps)
  • Kubernetes: verify REDIS_URL in your Secret points to a reachable Redis instance
  • Without Redis, the gateway falls back to in-memory KV (single-node only)

License errors:
  • Verify MNEMOM_LICENSE_JWT is set and not expired
  • Check /health/ready for the specific license error
  • Contact Mnemom support for license reissuance

LLM provider request failures:
  • Verify your API keys are correct and have sufficient credits
  • The gateway proxies directly to provider APIs, so ensure outbound HTTPS (port 443) is allowed
  • In Kubernetes, check the NetworkPolicy allows egress to 0.0.0.0/0:443

High memory usage or out-of-memory restarts:
  • Increase container memory limits (512Mi minimum, 1Gi recommended for high traffic)
  • If using in-memory KV, switch to Redis to reduce memory pressure
  • Set NODE_OPTIONS=--max-old-space-size=768 for fine-grained heap control
