# Self-Hosted Deployment

> Run the Mnemom Gateway on your own infrastructure with Docker or Kubernetes

Run the Mnemom Gateway on your own infrastructure for full data residency control. Prompt and response content is never sent to Mnemom's cloud, though prompts are forwarded to your configured LLM providers — see [Data residency](#data-residency) for exact traffic boundaries. The self-hosted gateway is a Node.js adapter that runs the same code as the managed Cloudflare Workers service — identical behavior, your infrastructure.

<Note>
  Self-hosted deployment requires an **Enterprise license**. [Contact us](https://mnemom.ai/contact) to obtain a license key. Enterprise includes hybrid analysis mode, SSO/SAML integration, and dedicated support.
</Note>

## Deployment options

|                       | **Managed (Cloud)** | **Docker Compose**     | **Kubernetes (Helm)**       |
| --------------------- | ------------------- | ---------------------- | --------------------------- |
| **Best for**          | Most teams          | Small teams, eval, dev | Production at scale         |
| **Infrastructure**    | None (Mnemom hosts) | Single VM or server    | K8s cluster                 |
| **Setup time**        | Minutes             | \~10 minutes           | \~30 minutes                |
| **Scaling**           | Automatic           | Manual                 | HPA auto-scaling            |
| **Data residency**    | Mnemom cloud        | Your infrastructure    | Your infrastructure         |
| **High availability** | Built-in            | Single node            | Multi-replica, PDB          |
| **Monitoring**        | Dashboard           | Prometheus + logs      | Prometheus + ServiceMonitor |

## Prerequisites

* An **Enterprise license JWT** from [mnemom.ai/dashboard](https://mnemom.ai/dashboard)
* An **Anthropic API key** (required for AIP integrity analysis)
* Optional: OpenAI and Gemini API keys for multi-provider tracing

<Warning>
  **AIP defaults to fail-open mode.** If the analysis LLM is unreachable, integrity checks will silently pass. For production deployments handling sensitive operations, set `failure_policy: { mode: "fail_closed" }` in your AIP configuration.
</Warning>

***

## Quick start: Docker Compose

Docker Compose is the fastest way to get a self-hosted gateway running. The stack includes PostgreSQL, Redis, and automatic database migrations.

### Requirements

* Docker 24+ and Docker Compose v2+
* 2 GB RAM minimum, 4 GB recommended
* 10 GB disk space

<Steps>
  <Step title="Clone the repository">
    ```bash theme={null}
    git clone https://github.com/mnemom/mnemom-platform.git
    cd mnemom-platform/deploy/docker
    ```
  </Step>

  <Step title="Configure environment">
    Copy the example environment file and fill in your credentials:

    ```bash theme={null}
    cp .env.example .env
    ```

    Edit `.env` and set the required values:

    ```bash theme={null}
    # Required
    POSTGRES_PASSWORD=<strong-password>
    REDIS_PASSWORD=<strong-password>
    SUPABASE_URL=https://<your-project-ref>.supabase.co
    SUPABASE_SECRET_KEY=<your-supabase-service-role-key>
    SUPABASE_JWT_SECRET=<your-supabase-jwt-secret>
    INTERNAL_API_KEY=<strong-random-string>
    MNEMOM_LICENSE_JWT=<your-enterprise-license-jwt>
    ANTHROPIC_API_KEY=<your-anthropic-api-key>

    # Optional: additional providers
    OPENAI_API_KEY=<your-openai-key>
    GEMINI_API_KEY=<your-gemini-key>

    # Optional: heartbeat override (for EU/air-gapped deployments)
    # HEARTBEAT_URL=https://api.mnemom.ai/v1/deployments/heartbeat
    ```

    <Note>
      If your `.env.example` shows `SMOLTBOT_ROLE`, rename it to `MNEMOM_ROLE` — the file carries a stale branding name but the entrypoint reads `MNEMOM_ROLE`.
    </Note>
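
    The "strong" placeholders above can be filled from the kernel's random source. A minimal sketch, assuming a Unix-like host with `/dev/urandom`; the `gen_secret` helper name is illustrative and not part of the Mnemom tooling:

    ```bash
    # Print 32 random bytes as 64 hex characters (no trailing newline).
    gen_secret() {
      head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    }

    # Print candidate values to paste into .env
    echo "POSTGRES_PASSWORD=$(gen_secret)"
    echo "REDIS_PASSWORD=$(gen_secret)"
    echo "INTERNAL_API_KEY=$(gen_secret)"
    ```

    Hex output avoids characters that would need quoting in `.env` files or connection URLs.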
  </Step>

  <Step title="Start the stack">
    ```bash theme={null}
    docker compose up -d
    ```

    This starts four services in order:

    1. **PostgreSQL** — database with health check
    2. **Redis** — caching layer with persistence
    3. **Gateway** — HTTP proxy on port 8787 (applies database migrations on startup)
    4. **Observer** — background scheduler for trace processing
  </Step>

  <Step title="Verify health">
    Wait about 30 seconds, then check the gateway health:

    ```bash theme={null}
    curl http://localhost:8787/health/ready
    ```

    ```json Expected response theme={null}
    {
      "status": "ok",
      "checks": {
        "redis": { "ok": true },
        "supabase": { "ok": true },
        "license": { "ok": true }
      }
    }
    ```
  </Step>

  <Step title="Connect an agent">
    Point the mnemom CLI at your self-hosted gateway:

    ```bash theme={null}
    npm install -g @mnemom/mnemom
    # Point the CLI at your self-hosted stack, then authenticate
    export MNEMOM_ENV=local   # resolves API + gateway to http://localhost:8787
    mnemom login
    ```

    Make a test request:

    ```bash theme={null}
    curl http://localhost:8787/anthropic/v1/messages \
      -H "x-api-key: $ANTHROPIC_API_KEY" \
      -H "anthropic-version: 2023-06-01" \
      -H "content-type: application/json" \
      -d '{
        "model": "claude-haiku-4-5-20251001",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello"}]
      }'
    ```

    Verify the agent is connected:

    ```bash theme={null}
    mnemom status
    ```
  </Step>
</Steps>
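
The fixed 30-second wait in the health step can be replaced with a polling loop, which is handy in CI or provisioning scripts. A minimal sketch, assuming `/health/ready` returns a non-2xx status while any check is failing; `wait_for_ready` is an illustrative helper name:

```bash
# Poll a readiness URL until it returns HTTP 2xx or the timeout expires.
# Usage: wait_for_ready [url] [timeout_seconds]
wait_for_ready() {
  url="${1:-http://localhost:8787/health/ready}"
  timeout="${2:-120}"
  waited=0
  # curl -f turns HTTP errors (such as 503) into a non-zero exit code
  until curl -fsS -o /dev/null "$url"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "gateway not ready after ${timeout}s" >&2
      return 1
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "gateway ready after ~${waited}s"
}
```

Call `wait_for_ready` right after `docker compose up -d`; a non-zero exit status means the gateway did not become ready within the timeout.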

***

## Production: Kubernetes with Helm

Use the Helm chart for production deployments that need auto-scaling, high availability, and monitoring.

### Requirements

* Kubernetes 1.27+
* Helm 3.12+
* `kubectl` configured for your cluster

<Steps>
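  <Step title="Get the Helm chart">
    The chart lives in the `deploy/helm` directory of the platform repository:

    ```bash theme={null}
    git clone https://github.com/mnemom/mnemom-platform.git  # skip if already cloned
    cd mnemom-platform/deploy/helm
    ```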
  </Step>

  <Step title="Create a Kubernetes Secret">
    Store sensitive credentials in a Secret:

    ```bash theme={null}
    kubectl create secret generic mnemom-secrets \
      --from-literal=SUPABASE_URL=<your-supabase-url> \
      --from-literal=SUPABASE_SECRET_KEY=<your-service-role-key> \
      --from-literal=SUPABASE_JWT_SECRET=<your-supabase-jwt-secret> \
      --from-literal=INTERNAL_API_KEY=<strong-random-string> \
      --from-literal=ANTHROPIC_API_KEY=<your-anthropic-key> \
      --from-literal=MNEMOM_LICENSE_JWT=<your-license-jwt> \
      --from-literal=REDIS_URL=<your-redis-url> \
      --from-literal=DATABASE_URL=<your-postgres-url>
    ```
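
    Values passed with `--from-literal` end up in shell history. An alternative sketch uses kubectl's `--from-env-file` flag. The file name `k8s-secrets.env` and the `create_mnemom_secret` helper are hypothetical, and the file is assumed to hold the same eight `KEY=value` pairs as the command above:

    ```bash
    # Create the Secret from a KEY=value file instead of inline literals.
    # Usage: create_mnemom_secret <env-file> [namespace]
    create_mnemom_secret() {
      env_file="$1"
      namespace="${2:-default}"
      if [ ! -r "$env_file" ]; then
        echo "cannot read $env_file" >&2
        return 1
      fi
      kubectl create secret generic mnemom-secrets \
        --namespace "$namespace" \
        --from-env-file="$env_file"
    }
    ```

    Restrict the file first (`chmod 600 k8s-secrets.env`), run `create_mnemom_secret k8s-secrets.env`, then delete the file or move it into your secret manager.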
  </Step>

  <Step title="Install the chart">
    ```bash theme={null}
    helm install mnemom ./mnemom-gateway \
      --set secrets.existingSecret=mnemom-secrets \
      --set ingress.enabled=true \
      --set 'ingress.hosts[0].host=gateway.yourcompany.com' \
      --set 'ingress.hosts[0].paths[0].path=/' \
      --set 'ingress.hosts[0].paths[0].pathType=Prefix'
    ```
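
    The same settings can live in a values file, which is easier to review and to reuse on upgrade than a chain of `--set` flags. This fragment is a direct translation of the flags above; the file name `values-prod.yaml` is an assumption:

    ```yaml
    # values-prod.yaml
    secrets:
      existingSecret: mnemom-secrets
    ingress:
      enabled: true
      hosts:
        - host: gateway.yourcompany.com
          paths:
            - path: /
              pathType: Prefix
    ```

    Install with `helm install mnemom ./mnemom-gateway -f values-prod.yaml`.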
  </Step>

  <Step title="Verify the deployment">
    ```bash theme={null}
    kubectl get pods -l app.kubernetes.io/name=mnemom-gateway
    helm test mnemom
    ```
  </Step>
</Steps>
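
In a script, `kubectl get pods` only shows a snapshot. The sketch below blocks until the gateway pods report Ready and then runs the chart tests. It reuses the label selector and release name from the steps above; the `verify_gateway_release` helper name and the 180-second default are assumptions:

```bash
# Wait for gateway pods to become Ready, then run the Helm test hooks.
# Usage: verify_gateway_release [timeout]
verify_gateway_release() {
  timeout="${1:-180s}"
  kubectl wait pod \
    --selector=app.kubernetes.io/name=mnemom-gateway \
    --for=condition=Ready \
    --timeout="$timeout" \
    && helm test mnemom
}
```

`kubectl wait` exits non-zero on timeout, so `helm test` only runs once the pods are Ready.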

### What the chart deploys

* **Gateway Deployment** (2 replicas by default) — HTTP proxy with liveness, readiness, and startup probes
* **Observer Deployment** (1 replica) — background scheduler for trace processing
* **Migration Job** — Helm pre-install/pre-upgrade hook that applies database migrations
* **Service** — ClusterIP on port 8787
* **NetworkPolicy** — deny-all default with explicit allows for ingress, Redis, PostgreSQL, and upstream LLM APIs
* **PodDisruptionBudget** — ensures at least 1 replica during rolling updates
* **Optional**: Ingress with TLS, HPA, ServiceMonitor for Prometheus

### Scaling

Enable the HorizontalPodAutoscaler for automatic scaling:

```yaml theme={null}
# values.yaml
hpa:
  enabled: true
  minReplicas: 2
  maxReplicas: 20
  targetCPU: 70
  targetMemory: 80
```
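
As a rough guide to how these targets translate into replica counts, Kubernetes documents the core HPA rule as desired = ceil(current replicas × current metric / target metric). The sketch below applies that rule to CPU only, assuming `targetCPU: 70` means 70% average utilization. It ignores the controller's tolerance band and stabilization window, and `hpa_desired_replicas` is an illustrative helper, not part of the chart:

```bash
# Usage: hpa_desired_replicas <current_replicas> <current_cpu_pct> <target_cpu_pct> <min> <max>
hpa_desired_replicas() {
  # Integer ceiling of replicas * current / target
  desired=$(( ($1 * $2 + $3 - 1) / $3 ))
  # Clamp to the configured minReplicas / maxReplicas
  if [ "$desired" -lt "$4" ]; then desired=$4; fi
  if [ "$desired" -gt "$5" ]; then desired=$5; fi
  echo "$desired"
}

# 4 replicas averaging 90% CPU against a 70% target
hpa_desired_replicas 4 90 70 2 20   # prints 6
```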

***

## Architecture

In self-hosted mode, a Node.js adapter layer replaces Cloudflare-specific APIs while running the exact same gateway code:

```
Your App / Agents
  │
  ▼
Self-Hosted Gateway (Node.js, port 8787)
  │ ── KV adapter ──▶ Redis (or in-memory)
  │ ── fetch interceptor ──▶ Anthropic / OpenAI / Gemini (direct)
  │
  ├──▶ Observer (cron scheduler)
  │     ── builds AP-Traces
  │     ── runs AAP verification
  │     ── runs AIP integrity checks
  │
  ▼
PostgreSQL (Supabase or self-managed)
  │
  ├──▶ CLI (mnemom status / logs)
  └──▶ Dashboard (mnemom.ai or self-hosted)
```

**Adaptation layer** — zero modifications to gateway source code:

| Cloudflare API         | Self-Hosted Replacement                      |
| ---------------------- | -------------------------------------------- |
| KV Namespace           | Redis (with in-memory fallback)              |
| `ctx.waitUntil()`      | Promise collection with drain after response |
| AI Gateway URL routing | Fetch interceptor rewriting to upstream APIs |
| `ExecutionContext`     | Node.js shim with fire-and-forget semantics  |

***

## Data residency

Prompt and response content is never sent to Mnemom's cloud. However, prompts are forwarded to your configured LLM providers — see the table below for exact traffic boundaries.

| Traffic                  | Destination                                                                                      | How to keep in-region                                                        |
| ------------------------ | ------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------- |
| **LLM provider calls**   | Anthropic / OpenAI / Gemini APIs (port 443)                                                      | Route through a VPC-peered API proxy or use a provider's regional endpoint   |
| **Heartbeat**            | `https://api.mnemom.ai/v1/deployments/heartbeat`                                                 | Set `HEARTBEAT_URL` to an internal endpoint or regional relay                |
| **Agent creation (s2s)** | `https://api.mnemom.ai/v1/agents` (sends `agent_hash` + API-key prefix only — no prompt content) | Contact [support@mnemom.ai](mailto:support@mnemom.ai) for air-gapped options |

Traces, integrity checkpoints, and all prompt/response content remain in your database and are never sent to Mnemom's cloud.

***

## Configuration reference

### Required

| Variable              | Description                                                                          |
| --------------------- | ------------------------------------------------------------------------------------ |
| `SUPABASE_URL`        | Supabase project URL (`https://<ref>.supabase.co`) or self-hosted PostgREST endpoint |
| `SUPABASE_SECRET_KEY` | Supabase service-role key                                                            |
| `SUPABASE_JWT_SECRET` | JWT secret for verifying Supabase auth tokens (observer hard-fails without this)     |
| `REDIS_PASSWORD`      | Password for the Redis instance (required when Redis is used)                        |
| `INTERNAL_API_KEY`    | Internal service-to-service secret for agent-creation calls                          |
| `MNEMOM_LICENSE_JWT`  | Enterprise license JWT from [mnemom.ai/dashboard](https://mnemom.ai/dashboard)       |
| `ANTHROPIC_API_KEY`   | Anthropic API key (required for AIP analysis)                                        |
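
A preflight check can catch a missing variable before the gateway starts. A minimal sketch: the variable list is copied from the table above (`REDIS_PASSWORD` is left out because it only applies when Redis is used), and the `check_required_env` helper is hypothetical rather than part of the gateway:

```bash
REQUIRED_VARS="SUPABASE_URL SUPABASE_SECRET_KEY SUPABASE_JWT_SECRET INTERNAL_API_KEY MNEMOM_LICENSE_JWT ANTHROPIC_API_KEY"

# Report every required variable that is unset or empty.
# Returns 0 when all are present, 1 otherwise.
check_required_env() {
  missing=0
  for name in $REQUIRED_VARS; do
    # Look up the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

To check a Docker Compose `.env` file, load it into the environment first: `set -a; . ./.env; set +a; check_required_env`.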

### Optional: Providers

| Variable         | Default | Description                                      |
| ---------------- | ------- | ------------------------------------------------ |
| `OPENAI_API_KEY` | --      | OpenAI API key for multi-provider routing        |
| `GEMINI_API_KEY` | --      | Google Gemini API key for multi-provider routing |

### Optional: Hybrid analysis

| Variable             | Default | Description                                                                     |
| -------------------- | ------- | ------------------------------------------------------------------------------- |
| `MNEMOM_ANALYZE_URL` | --      | Delegate AIP analysis to Mnemom cloud (`https://api.mnemom.ai/v1/analyze`)      |
| `MNEMOM_API_KEY`     | --      | Mnemom API key with `analyze` scope (required when `MNEMOM_ANALYZE_URL` is set) |

<Tip>
  In hybrid mode, only thinking/reasoning blocks are sent for analysis — raw prompts and responses never leave your infrastructure.
</Tip>

### Optional: Infrastructure

| Variable        | Default                                          | Description                                                                                            |
| --------------- | ------------------------------------------------ | ------------------------------------------------------------------------------------------------------ |
| `REDIS_URL`     | --                                               | Redis connection URL. Without Redis, an in-memory KV adapter is used (single-node only).               |
| `PORT`          | `8787`                                           | HTTP listen port                                                                                       |
| `HOST`          | `0.0.0.0`                                        | HTTP bind address                                                                                      |
| `MNEMOM_ROLE`   | `all`                                            | `gateway` (HTTP only), `scheduler` (cron only), or `all` (both)                                        |
| `LOG_LEVEL`     | `info`                                           | `debug`, `info`, `warn`, or `error`. Structured JSON to stdout.                                        |
| `HEARTBEAT_URL` | `https://api.mnemom.ai/v1/deployments/heartbeat` | Override the phone-home heartbeat endpoint. Set to an internal relay for EU or air-gapped deployments. |

***

## Health endpoints

The gateway exposes three health endpoints that map to the standard Kubernetes probe types:

| Endpoint          | Purpose         | Behavior                                                                  |
| ----------------- | --------------- | ------------------------------------------------------------------------- |
| `/health/live`    | Liveness probe  | Always 200 unless deadlocked                                              |
| `/health/ready`   | Readiness probe | Checks Redis, the database (reported as `supabase`), and license validity |
| `/health/startup` | Startup probe   | Returns 503 until initialization is complete                              |

## Prometheus metrics

The gateway exposes a `/metrics` endpoint with:

* `gateway_requests_total{provider,status}` — request counter
* `gateway_request_duration_seconds{provider}` — latency histogram
* `gateway_aip_checks_total{verdict}` — integrity check counter
* `gateway_cache_operations_total{operation,result}` — cache hit/miss
* Standard `process_*` and `nodejs_*` metrics
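
For a quick check without a Prometheus server, the text output of `/metrics` can be summed from the shell. A minimal sketch, assuming the standard Prometheus text exposition format with no per-sample timestamps; `sum_metric` is an illustrative helper:

```bash
# Sum every sample of one metric, across all label combinations.
# Reads Prometheus text format on stdin. Usage: sum_metric <metric_name>
sum_metric() {
  awk -v name="$1" '
    # Skip "# HELP" / "# TYPE" lines; match "name value" or "name{labels} value"
    $0 !~ /^#/ && ($1 == name || index($1, name "{") == 1) { total += $NF }
    END { printf "%d\n", total }
  '
}
```

For example, `curl -s http://localhost:8787/metrics | sum_metric gateway_requests_total` prints the total number of proxied requests since the process started.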

For Kubernetes, enable the ServiceMonitor in `values.yaml`:

```yaml theme={null}
metrics:
  serviceMonitor:
    enabled: true
    interval: 30s
```

***

## Upgrading

### Docker Compose

```bash theme={null}
cd mnemom-platform && git pull
cd deploy/docker
docker compose build
docker compose up -d
```

Migrations run automatically as part of the gateway startup.

### Helm

```bash theme={null}
# Run from the repository root after pulling the new chart version
helm upgrade mnemom ./deploy/helm/mnemom-gateway \
  --set secrets.existingSecret=mnemom-secrets
```

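The migration job runs as a pre-upgrade Helm hook. Because this command passes `--set`, Helm rebuilds the release values from the chart defaults plus the flags given, so repeat any install-time settings (such as the ingress flags) or add `--reuse-values`.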

<Warning>
  Always back up your database before upgrading. For Docker: `docker compose exec -T postgres pg_dump -U mnemom mnemom > backup.sql` (`-T` disables TTY allocation so the dump is written unmodified). For Kubernetes: use your standard PostgreSQL backup procedure.
</Warning>
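
For Docker Compose, the backup and upgrade can be chained so that the upgrade never runs without a usable dump. A sketch under the same assumptions as the commands above (service `postgres`, user and database `mnemom`); the `backup_then_upgrade` helper and the timestamped file name are illustrative:

```bash
# Dump the database, then upgrade only if the dump succeeded and is non-empty.
# Run from mnemom-platform/deploy/docker.
backup_then_upgrade() {
  backup="backup-$(date +%Y%m%d-%H%M%S).sql"
  # -T disables TTY allocation so a pseudo-terminal cannot alter the dump
  if ! docker compose exec -T postgres pg_dump -U mnemom mnemom > "$backup"; then
    echo "pg_dump failed; upgrade aborted" >&2
    return 1
  fi
  if [ ! -s "$backup" ]; then
    echo "backup file is empty; upgrade aborted" >&2
    return 1
  fi
  git pull && docker compose build && docker compose up -d
}
```

Keep the resulting `backup-*.sql` file until the new version has passed its health checks.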

***

## Troubleshooting

<AccordionGroup>
  <Accordion title="Gateway won't start — EnvValidationError">
    A required environment variable is missing. Check the error message for which variable, then verify your `.env` file or Kubernetes Secret.
  </Accordion>

  <Accordion title="Redis connection refused">
    * Docker Compose: ensure the `redis` service is healthy (`docker compose ps`)
    * Kubernetes: verify `REDIS_URL` in your Secret points to a reachable Redis instance
    * Without Redis, the gateway falls back to in-memory KV (single-node only)
  </Accordion>

  <Accordion title="License validation failed">
    * Verify `MNEMOM_LICENSE_JWT` is set and not expired
    * Check `/health/ready` for the specific license error
    * Contact [support@mnemom.ai](mailto:support@mnemom.ai) for license reissuance
  </Accordion>

  <Accordion title="Upstream LLM API errors (401/403)">
    * Verify your API keys are correct and have sufficient credits
    * The gateway proxies directly to provider APIs — ensure outbound HTTPS (port 443) is allowed
    * In Kubernetes, check the NetworkPolicy allows egress to `0.0.0.0/0:443`
  </Accordion>

  <Accordion title="High memory / OOMKilled">
    * Increase container memory limits (512Mi minimum, 1Gi recommended for high traffic)
    * If using in-memory KV, switch to Redis to reduce memory pressure
    * Set `NODE_OPTIONS=--max-old-space-size=768` to cap the V8 old-generation heap at 768 MiB, leaving headroom under a 1Gi container limit
  </Accordion>
</AccordionGroup>
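
When license validation fails, it can help to read the claims inside `MNEMOM_LICENSE_JWT`. The middle segment of a JWT is base64url-encoded JSON, so it can be decoded locally. This sketch assumes the license carries a standard `exp` (expiry, Unix seconds) claim, which is not confirmed here; `jwt_payload` is an illustrative helper and does not verify the signature:

```bash
# Print the decoded JSON payload of a JWT (no signature verification).
# Usage: jwt_payload <token>
jwt_payload() {
  # Take the middle segment and convert base64url to standard base64
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that JWTs omit
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

Run `jwt_payload "$MNEMOM_LICENSE_JWT"` and, if an `exp` claim is present, compare it with `date +%s`. Older macOS releases spell the decode flag `base64 -D`.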

***

## Next steps

* [Mnemom Gateway overview](/gateway/overview) — architecture and components
* [Enforcement modes](/gateway/enforcement) — observe, nudge, and enforce
* [Observability guide](/guides/observability) — dashboards and alerting
* [Security model](/guides/security-trust-model) — trust boundaries and threat model
