Self-Hosted deployment
Run the Mnemom Gateway on your own infrastructure for full data residency control. Prompt and response content is never sent to Mnemom’s cloud, though prompts are forwarded to your configured LLM providers; see Data residency for exact traffic boundaries. The self-hosted gateway is a Node.js adapter that runs the same code as the managed Cloudflare Workers service: identical behavior, on your infrastructure.
Self-hosted deployment requires an Enterprise license. Contact us to obtain a license key. Enterprise includes hybrid analysis mode, SSO/SAML integration, and dedicated support.
Deployment options
| | Managed (Cloud) | Docker Compose | Kubernetes (Helm) |
|---|---|---|---|
| Best for | Most teams | Small teams, eval, dev | Production at scale |
| Infrastructure | None (Mnemom hosts) | Single VM or server | K8s cluster |
| Setup time | Minutes | ~10 minutes | ~30 minutes |
| Scaling | Automatic | Manual | HPA auto-scaling |
| Data residency | Mnemom cloud | Your infrastructure | Your infrastructure |
| High availability | Built-in | Single node | Multi-replica, PDB |
| Monitoring | Dashboard | Prometheus + logs | Prometheus + ServiceMonitor |
Prerequisites
- An Enterprise license JWT from mnemom.ai/dashboard
- An Anthropic API key (required for AIP integrity analysis)
- Optional: OpenAI and Gemini API keys for multi-provider tracing
Quick start: Docker Compose
The fastest way to get a self-hosted gateway running. The stack includes PostgreSQL, Redis, and automatic database migrations.
Requirements
- Docker 24+ and Docker Compose v2+
- 2 GB RAM minimum, 4 GB recommended
- 10 GB disk space
Configure environment
Copy the example environment file and fill in your credentials.
Edit .env and set the required values.
If your .env.example shows SMOLTBOT_ROLE, rename it to MNEMOM_ROLE. The file carries a stale branding name, but the entrypoint reads MNEMOM_ROLE.
Start the stack
Starting the stack brings up four services:
- PostgreSQL — database with health check
- Redis — caching layer with persistence
- Gateway — HTTP proxy on port 8787 (applies database migrations on startup)
- Observer — background scheduler for trace processing
Production: Kubernetes with Helm
For production deployments with auto-scaling, high availability, and monitoring.
Requirements
- Kubernetes 1.27+
- Helm 3.12+
- kubectl configured for your cluster
What the chart deploys
- Gateway Deployment (2 replicas by default) — HTTP proxy with liveness, readiness, and startup probes
- Observer Deployment (1 replica) — background scheduler for trace processing
- Migration Job — Helm pre-install/pre-upgrade hook that applies database migrations
- Service — ClusterIP on port 8787
- NetworkPolicy — deny-all default with explicit allows for ingress, Redis, PostgreSQL, and upstream LLM APIs
- PodDisruptionBudget — ensures at least 1 replica during rolling updates
- Optional: Ingress with TLS, HPA, ServiceMonitor for Prometheus
Scaling
Enable the HorizontalPodAutoscaler for automatic scaling.
Architecture
In self-hosted mode, a Node.js adapter layer replaces Cloudflare-specific APIs while running the exact same gateway code:
| Cloudflare API | Self-Hosted Replacement |
|---|---|
| KV Namespace | Redis (with in-memory fallback) |
| ctx.waitUntil() | Promise collection with drain after response |
| AI Gateway URL routing | Fetch interceptor rewriting to upstream APIs |
| ExecutionContext | Node.js shim with fire-and-forget semantics |
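
The "promise collection with drain after response" pattern in the table can be illustrated with a short sketch. The gateway itself is Node.js; this Python asyncio version is only an analogy, and the names BackgroundTasks, wait_until, and drain are hypothetical rather than part of the gateway's API.

```python
import asyncio


class BackgroundTasks:
    """Collects fire-and-forget work so it can be drained after the response."""

    def __init__(self) -> None:
        self._pending = []

    def wait_until(self, coro) -> None:
        # Schedule the work now, but do not block the caller on it.
        self._pending.append(asyncio.ensure_future(coro))

    async def drain(self) -> list:
        # Await everything that was registered. Exceptions are returned
        # instead of raised, so one failed task cannot break the others.
        pending, self._pending = self._pending, []
        return await asyncio.gather(*pending, return_exceptions=True)


async def handle_request(log: list) -> str:
    ctx = BackgroundTasks()

    async def record_trace() -> None:
        await asyncio.sleep(0)  # stand-in for slow trace I/O
        log.append("trace stored")

    ctx.wait_until(record_trace())
    log.append("response sent")  # the response goes out first
    await ctx.drain()            # background work finishes afterwards
    return "ok"
```

Running `asyncio.run(handle_request([]))` shows the ordering: the response is recorded before the background trace work completes, and nothing is lost when the handler returns.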
Data residency
Prompt and response content is never sent to Mnemom’s cloud. However, prompts are forwarded to your configured LLM providers; the table below lists the exact traffic boundaries.
| Traffic | Destination | How to keep in-region |
|---|---|---|
| LLM provider calls | Anthropic / OpenAI / Gemini APIs (port 443) | Route through a VPC-peered API proxy or use a provider’s regional endpoint |
| Heartbeat | https://api.mnemom.ai/v1/deployments/heartbeat | Set HEARTBEAT_URL to an internal endpoint or regional relay |
| Agent creation (s2s) | https://api.mnemom.ai/v1/agents (sends agent_hash + API-key prefix only; no prompt content) | Contact Mnemom support for air-gapped options |
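
When writing firewall or egress rules, it can help to derive the expected outbound hosts from the environment. The sketch below is one way to do that. The provider hostnames are assumptions based on each provider's public API domain (a regional endpoint or proxy would replace them), and egress_hosts is a hypothetical helper, not something the gateway ships.

```python
from urllib.parse import urlparse

DEFAULT_HEARTBEAT_URL = "https://api.mnemom.ai/v1/deployments/heartbeat"

# Assumption: public API hostnames of each provider.
PROVIDER_HOSTS = {
    "ANTHROPIC_API_KEY": "api.anthropic.com",
    "OPENAI_API_KEY": "api.openai.com",
    "GEMINI_API_KEY": "generativelanguage.googleapis.com",
}


def egress_hosts(env: dict) -> set:
    """Return the hostnames the gateway is expected to reach over port 443."""
    # A provider counts as configured when its API key is set.
    hosts = {host for key, host in PROVIDER_HOSTS.items() if env.get(key)}
    # The heartbeat goes to HEARTBEAT_URL if overridden, else the default.
    heartbeat = env.get("HEARTBEAT_URL") or DEFAULT_HEARTBEAT_URL
    hosts.add(urlparse(heartbeat).hostname)
    # Agent creation (s2s) always targets api.mnemom.ai per the table above.
    hosts.add("api.mnemom.ai")
    return hosts
```

For example, with only ANTHROPIC_API_KEY set and HEARTBEAT_URL pointed at an internal relay, the result contains the Anthropic host, the relay host, and api.mnemom.ai.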
Configuration reference
Required
| Variable | Description |
|---|---|
| SUPABASE_URL | Supabase project URL (https://<ref>.supabase.co) or self-hosted PostgREST endpoint |
| SUPABASE_SECRET_KEY | Supabase service-role key |
| SUPABASE_JWT_SECRET | JWT secret for verifying Supabase auth tokens (observer hard-fails without this) |
| REDIS_PASSWORD | Password for the Redis instance (required when Redis is used) |
| INTERNAL_API_KEY | Internal service-to-service secret for agent-creation calls |
| MNEMOM_LICENSE_JWT | Enterprise license JWT from mnemom.ai/dashboard |
| ANTHROPIC_API_KEY | Anthropic API key (required for AIP analysis) |
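
Before starting the gateway, the required variables can be checked with a small pre-flight script. This is a minimal sketch that assumes a plain KEY=VALUE .env file; parse_env and missing_required are hypothetical helpers, and the gateway's own validation (which raises EnvValidationError) may apply different rules. REDIS_PASSWORD is treated as always required here, although the table notes it only matters when Redis is used.

```python
REQUIRED_VARS = [
    "SUPABASE_URL",
    "SUPABASE_SECRET_KEY",
    "SUPABASE_JWT_SECRET",
    "REDIS_PASSWORD",
    "INTERNAL_API_KEY",
    "MNEMOM_LICENSE_JWT",
    "ANTHROPIC_API_KEY",
]


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_required(env: dict) -> list:
    """Return required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

A typical use would be `missing_required(parse_env(open(".env").read()))`, which returns an empty list when every required value is present.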
Optional: Providers
| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | none | OpenAI API key for multi-provider routing |
| GEMINI_API_KEY | none | Google Gemini API key for multi-provider routing |
Optional: Hybrid analysis
| Variable | Default | Description |
|---|---|---|
| MNEMOM_ANALYZE_URL | none | Delegate AIP analysis to Mnemom cloud (https://api.mnemom.ai/v1/analyze) |
| MNEMOM_API_KEY | none | Mnemom API key with analyze scope (required when MNEMOM_ANALYZE_URL is set) |
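
The dependency between the two hybrid-analysis variables can be expressed as a simple conditional check. The function below is a hypothetical sketch of that rule, not the gateway's actual validation code.

```python
def hybrid_config_errors(env: dict) -> list:
    """Check that the hybrid-analysis variables are set consistently."""
    errors = []
    # MNEMOM_API_KEY only becomes mandatory once MNEMOM_ANALYZE_URL is set.
    if env.get("MNEMOM_ANALYZE_URL") and not env.get("MNEMOM_API_KEY"):
        errors.append("MNEMOM_API_KEY is required when MNEMOM_ANALYZE_URL is set")
    return errors
```

An empty list means either hybrid analysis is off or both values are present.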
Optional: Infrastructure
| Variable | Default | Description |
|---|---|---|
| REDIS_URL | none | Redis connection URL. Without Redis, an in-memory KV adapter is used (single-node only). |
| PORT | 8787 | HTTP listen port |
| HOST | 0.0.0.0 | HTTP bind address |
| MNEMOM_ROLE | all | gateway (HTTP only), scheduler (cron only), or all (both) |
| LOG_LEVEL | info | debug, info, warn, or error. Structured JSON to stdout. |
| HEARTBEAT_URL | https://api.mnemom.ai/v1/deployments/heartbeat | Override the phone-home heartbeat endpoint. Set to an internal relay for EU or air-gapped deployments. |
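
To make the defaults and the role semantics concrete, the sketch below resolves these variables the way the table describes them. resolve_settings and the keys of the returned dictionary are hypothetical; only the variable names, defaults, and allowed values come from the table.

```python
VALID_ROLES = ("gateway", "scheduler", "all")
VALID_LOG_LEVELS = ("debug", "info", "warn", "error")


def resolve_settings(env: dict) -> dict:
    """Apply the documented defaults and derive which components run."""
    role = env.get("MNEMOM_ROLE", "all")
    if role not in VALID_ROLES:
        raise ValueError(f"MNEMOM_ROLE must be one of {VALID_ROLES}, got {role!r}")
    log_level = env.get("LOG_LEVEL", "info")
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"LOG_LEVEL must be one of {VALID_LOG_LEVELS}, got {log_level!r}")
    return {
        "host": env.get("HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "8787")),
        # gateway = HTTP only, scheduler = cron only, all = both
        "serve_http": role in ("gateway", "all"),
        "run_scheduler": role in ("scheduler", "all"),
        "log_level": log_level,
        # Without REDIS_URL the in-memory KV adapter is used (single-node only).
        "shared_kv": bool(env.get("REDIS_URL")),
    }
```

With an empty environment this yields port 8787, bind address 0.0.0.0, both components enabled, and no shared KV.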
Health endpoints
Three Kubernetes-standard probes:
| Endpoint | Purpose | Behavior |
|---|---|---|
| /health/live | Liveness probe | Always 200 unless deadlocked |
| /health/ready | Readiness probe | Checks Redis, PostgreSQL, and license validity |
| /health/startup | Startup probe | Returns 503 until initialization complete |
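
Outside Kubernetes, for example after starting the Docker Compose stack, the same endpoints can be polled from a script. The sketch below waits for the startup probe and then the readiness probe. It assumes both return 200 on success (the table only states the 503 startup behavior), and probe_status and wait_until_ready are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def probe_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status for url, or 0 if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 503 while the gateway is still initializing
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_ready(base_url: str, attempts: int = 30, delay: float = 1.0,
                     probe=probe_status) -> bool:
    """Poll /health/startup, then /health/ready, until each returns 200."""
    for path in ("/health/startup", "/health/ready"):
        for _ in range(attempts):
            if probe(base_url + path) == 200:
                break
            time.sleep(delay)
        else:
            return False  # this probe never succeeded within the budget
    return True
```

Calling `wait_until_ready("http://localhost:8787")` uses the default port from the configuration reference. The probe parameter exists so the polling logic can be exercised without a running gateway.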
Prometheus metrics
The gateway exposes a /metrics endpoint with:
- gateway_requests_total{provider,status}: request counter
- gateway_request_duration_seconds{provider}: latency histogram
- gateway_aip_checks_total{verdict}: integrity check counter
- gateway_cache_operations_total{operation,result}: cache hit/miss
- Standard process_* and nodejs_* metrics
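
For a quick check without a Prometheus server, the text returned by /metrics can be parsed directly. The sketch below sums gateway_requests_total by status label. It handles only the simple `name{labels} value` form (no timestamps or escaped quotes), and requests_by_status is a hypothetical helper.

```python
import re
from collections import defaultdict

# Matches "metric_name{label="x",...} value" or "metric_name value".
_SAMPLE = re.compile(r'^(\w+)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def requests_by_status(metrics_text: str) -> dict:
    """Sum gateway_requests_total across providers, grouped by status."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if not match or match.group(1) != "gateway_requests_total":
            continue
        labels = dict(_LABEL.findall(match.group(2) or ""))
        totals[labels.get("status", "unknown")] += float(match.group(3))
    return dict(totals)
```

Feeding it the body of a GET to /metrics gives a per-status request count, which is enough to spot a spike in error responses.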
On Kubernetes, enable the optional ServiceMonitor in values.yaml so that Prometheus scrapes /metrics.
Upgrading
Docker Compose
Helm
Troubleshooting
Gateway won't start: EnvValidationError
A required environment variable is missing. Check the error message to see which variable, then verify your .env file or Kubernetes Secret.
Redis connection refused
- Docker Compose: ensure the redis service is healthy (docker compose ps)
- Kubernetes: verify REDIS_URL in your Secret points to a reachable Redis instance
- Without Redis, the gateway falls back to in-memory KV (single-node only)
License validation failed
- Verify MNEMOM_LICENSE_JWT is set and not expired
- Check /health/ready for the specific license error
- Contact Mnemom support for license reissuance
Upstream LLM API errors (401/403)
- Verify your API keys are correct and have sufficient credits
- The gateway proxies directly to provider APIs — ensure outbound HTTPS (port 443) is allowed
- In Kubernetes, check that the NetworkPolicy allows egress to 0.0.0.0/0:443
High memory / OOMKilled
- Increase container memory limits (512Mi minimum, 1Gi recommended for high traffic)
- If using in-memory KV, switch to Redis to reduce memory pressure
- Set NODE_OPTIONS=--max-old-space-size=768 for fine-grained heap control
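
The 768 in the example above equals 75% of the 1Gi recommended limit (1024 MiB). The sketch below applies that same ratio to other limits. The 75% figure is an assumption inferred from those two numbers, not a documented rule, and node_heap_option is a hypothetical helper.

```python
def node_heap_option(limit_mib: int, fraction: float = 0.75) -> str:
    """Build a NODE_OPTIONS value that keeps the V8 heap below the limit."""
    if limit_mib <= 0 or not 0 < fraction < 1:
        raise ValueError("limit_mib must be positive and fraction between 0 and 1")
    # Leave the remainder for buffers, native memory, and the runtime itself.
    heap_mib = int(limit_mib * fraction)
    return f"--max-old-space-size={heap_mib}"
```

For a 1Gi limit this reproduces the value shown in the list; for the 512Mi minimum it gives a 384 MiB heap.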
Next steps
- Mnemom Gateway overview — architecture and components
- Enforcement modes — observe, nudge, and enforce
- Observability guide — dashboards and alerting
- Security model — trust boundaries and threat model