Kubernetes

Deploy Orsa to any Kubernetes cluster. This guide covers Helm chart configuration, resource requirements, scaling, and ingress setup.

Overview

Orsa's architecture maps cleanly to Kubernetes:

┌─────────────────────────────────────────────────┐
│             Ingress (nginx/traefik)             │
│          api.orsa.dev  /  app.orsa.dev          │
└────────┬──────────────────────┬─────────────────┘
         │                      │
   ┌─────▼──────┐         ┌─────▼──────┐
   │ API Deploy │         │ Web Deploy │
   │ (2+ pods)  │         │ (2+ pods)  │
   └─────┬──────┘         └────────────┘
         │
   ┌─────▼──────────┐
   │ Browser Worker │     ┌──────────────┐
   │ (DaemonSet or  │────▶│ Redis        │
   │  Deployment,   │     │ (StatefulSet │
   │  high-memory   │     │  or managed) │
   │  nodes)        │     └──────────────┘
   └─────┬──────────┘
         │
   ┌─────▼────────┐
   │ PostgreSQL   │
   │ (Supabase or │
   │  managed PG) │
   └──────────────┘

Prerequisites

  • Kubernetes cluster 1.28+
  • kubectl configured for your cluster
  • Helm 3.12+
  • Container registry access (GHCR, ECR, Docker Hub)
  • External PostgreSQL (Supabase Cloud or self-hosted)
  • External Redis (Upstash or in-cluster)

Helm Chart

The Helm chart lives in infrastructure/k8s/ (or you can create it from the templates below).

Install

# Add the Orsa Helm repo (when published)
helm repo add orsa https://charts.orsa.dev
helm repo update
 
# Or install from local chart
helm install orsa ./infrastructure/k8s/orsa \
  --namespace orsa \
  --create-namespace \
  --values values.yaml

values.yaml

# ─── Global ─────────────────────────────────────────────────
global:
  domain: orsa.yourdomain.com
  env: production
 
# ─── API ────────────────────────────────────────────────────
api:
  replicas: 2
  image:
    repository: ghcr.io/paragonhq/orsa-api
    tag: latest
    pullPolicy: IfNotPresent
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: '1'
      memory: 1Gi
  env:
    NODE_ENV: production
    PORT: '3001'
  envFrom:
    - secretRef:
        name: orsa-secrets
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilization: 70
    targetMemoryUtilization: 80
 
# ─── Web Dashboard ──────────────────────────────────────────
web:
  replicas: 2
  image:
    repository: ghcr.io/paragonhq/orsa-web
    tag: latest
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 5
    targetCPUUtilization: 70
 
# ─── Browser Worker ─────────────────────────────────────────
browserWorker:
  replicas: 3
  image:
    repository: ghcr.io/paragonhq/orsa-browser-worker
    tag: latest
  resources:
    requests:
      cpu: '1'
      memory: 2Gi
    limits:
      cpu: '2'
      memory: 4Gi
  env:
    POOL_SIZE: '3'
    MAX_CONTEXTS: '10'
    PAGE_TIMEOUT: '30000'
  # Shared memory for Chromium — mount as emptyDir with medium: Memory
  shmSize: 512Mi
  # Node affinity — schedule on high-memory nodes
  nodeSelector:
    orsa.dev/role: browser
  tolerations:
    - key: orsa.dev/browser
      operator: Exists
      effect: NoSchedule
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
    targetCPUUtilization: 60
 
# ─── Redis (in-cluster) ────────────────────────────────────
redis:
  enabled: true  # Set to false if using Upstash or external Redis
  image:
    repository: redis
    tag: 7-alpine
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi
  persistence:
    enabled: true
    size: 2Gi
    storageClass: ''  # Uses default StorageClass
 
# ─── Ingress ───────────────────────────────────────────────
ingress:
  enabled: true
  className: nginx  # or traefik, alb, etc.
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/rate-limit-connections: '50'
    nginx.ingress.kubernetes.io/proxy-body-size: '10m'
  hosts:
    - host: api.orsa.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: api
    - host: app.orsa.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: web
  tls:
    - secretName: orsa-tls
      hosts:
        - api.orsa.yourdomain.com
        - app.orsa.yourdomain.com
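Keep environment-specific settings in a separate override file rather than editing the chart defaults, and pin image tags instead of relying on latest so rollbacks are reproducible. A minimal sketch (the release tag shown is hypothetical):

```yaml
# values-prod.yaml — overrides layered on top of the chart defaults
api:
  image:
    tag: v1.4.2          # hypothetical release tag; avoid `latest` in production
browserWorker:
  image:
    tag: v1.4.2
  replicas: 5
global:
  domain: orsa.example.com
```

Pass it alongside the base file: helm upgrade orsa ./infrastructure/k8s/orsa -f values.yaml -f values-prod.yaml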

Secrets

Create the secrets before deploying:

kubectl create namespace orsa
 
kubectl create secret generic orsa-secrets \
  --namespace orsa \
  --from-literal=NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co \
  --from-literal=NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key \
  --from-literal=SUPABASE_SERVICE_ROLE_KEY=your-service-role-key \
  --from-literal=UPSTASH_REDIS_REST_URL=https://your-redis.upstash.io \
  --from-literal=UPSTASH_REDIS_REST_TOKEN=your-token \
  --from-literal=STRIPE_SECRET_KEY=sk_live_... \
  --from-literal=STRIPE_WEBHOOK_SECRET=whsec_... \
  --from-literal=OPENAI_API_KEY=sk-... \
  --from-literal=ANTHROPIC_API_KEY=sk-ant-... \
  --from-literal=CLOUDFLARE_R2_ACCESS_KEY=your-key \
  --from-literal=CLOUDFLARE_R2_SECRET_KEY=your-secret \
  --from-literal=CLOUDFLARE_R2_ENDPOINT=https://your-account.r2.cloudflarestorage.com \
  --from-literal=CLOUDFLARE_R2_BUCKET=orsa-assets \
  --from-literal=PROXY_DATACENTER_URL=http://user:pass@dc-proxy:port \
  --from-literal=PROXY_RESIDENTIAL_URL=http://user:pass@res-proxy:port \
  --from-literal=PROXY_ISP_URL=http://user:pass@isp-proxy:port

Or use a sealed-secrets / external-secrets controller for production.
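With the External Secrets Operator, for example, an ExternalSecret can materialize orsa-secrets from a cloud secret manager instead of kubectl-created literals. A sketch, assuming a ClusterSecretStore named aws-secrets already exists and a secret-manager entry holds all the key/value pairs:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: orsa-secrets
  namespace: orsa
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets          # assumed store name
  target:
    name: orsa-secrets         # the Secret the chart reads via envFrom
  dataFrom:
    - extract:
        key: orsa/production   # assumed secret-manager key
```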

Browser Worker Node Requirements

The browser worker runs headless Chromium and is the most resource-intensive component. Each pod needs:

Resource        Minimum   Recommended   Notes
CPU             1 vCPU    2 vCPU        Chromium is CPU-intensive during rendering
Memory          2 GB      4 GB          Each browser instance uses ~200-400 MB
Shared memory   256 MB    512 MB        Required for Chromium IPC; mount as emptyDir with medium: Memory
Disk            1 GB      5 GB          Temporary browser profiles, download cache
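The disk requirement can be enforced through ephemeral-storage requests, so the kubelet evicts pods that fill the node disk with browser profiles; a sketch for the worker container spec:

```yaml
# Add to the browser worker container's resources alongside cpu/memory
resources:
  requests:
    ephemeral-storage: 1Gi
  limits:
    ephemeral-storage: 5Gi   # pod is evicted if temp profiles/downloads exceed this
```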

Shared Memory Mount

Chromium requires a large /dev/shm. The Helm chart handles this, but if deploying manually:

# In your browser worker Deployment spec:
spec:
  containers:
    - name: browser-worker
      volumeMounts:
        - name: dshm
          mountPath: /dev/shm
  volumes:
    - name: dshm
      emptyDir:
        medium: Memory
        sizeLimit: 512Mi

Dedicated Node Pool

For production, create a dedicated node pool for browser workers:

# GKE example
gcloud container node-pools create browser-pool \
  --cluster=orsa \
  --machine-type=e2-standard-4 \
  --num-nodes=3 \
  --min-nodes=1 \
  --max-nodes=10 \
  --enable-autoscaling \
  --node-labels=orsa.dev/role=browser \
  --node-taints=orsa.dev/browser=true:NoSchedule
 
# EKS example
eksctl create nodegroup \
  --cluster=orsa \
  --name=browser-pool \
  --node-type=m5.xlarge \
  --nodes=3 \
  --nodes-min=1 \
  --nodes-max=10 \
  --node-labels=orsa.dev/role=browser \
  --node-taints=orsa.dev/browser=true:NoSchedule
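Dedicated pools are resized and upgraded more often than general-purpose pools, so pair them with a PodDisruptionBudget that keeps some workers alive through node drains. A sketch, assuming the pods carry an app: orsa-browser-worker label:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: orsa-browser-worker
  namespace: orsa
spec:
  minAvailable: 1              # keep at least one worker up during voluntary drains
  selector:
    matchLabels:
      app: orsa-browser-worker # assumed pod label
```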

Scaling Configuration

Horizontal Pod Autoscaler

The Helm chart creates HPAs for each service. Key tuning parameters:

# API — scales on CPU (request handling)
api:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilization: 70
 
# Browser Worker — scales on CPU (Chromium rendering)
browserWorker:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
    targetCPUUtilization: 60  # Lower threshold — Chromium is bursty
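Because Chromium load is bursty, the default HPA scale-down behavior can thrash. If the chart values don't expose it, the behavior stanza of an autoscaling/v2 HorizontalPodAutoscaler lets you slow scale-down directly; a sketch:

```yaml
# Fragment of an autoscaling/v2 HorizontalPodAutoscaler spec
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300   # wait 5 min of low load before shrinking
    policies:
      - type: Pods
        value: 2                      # remove at most 2 pods per minute
        periodSeconds: 60
```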

KEDA (Event-Driven Scaling)

For more precise scaling, use KEDA to scale browser workers based on queue depth:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: browser-worker-scaler
  namespace: orsa
spec:
  scaleTargetRef:
    name: orsa-browser-worker
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: redis
      metadata:
        address: redis:6379
        listName: orsa:browser:queue
        listLength: '5'
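If your Redis requires a password, KEDA reads it through a TriggerAuthentication rather than inline trigger metadata. A sketch, assuming the password lives in orsa-secrets under a REDIS_PASSWORD key (hypothetical):

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: redis-auth
  namespace: orsa
spec:
  secretTargetRef:
    - parameter: password
      name: orsa-secrets
      key: REDIS_PASSWORD      # hypothetical key in the secret
```

Reference it from the trigger with an authenticationRef pointing at redis-auth.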

Ingress / TLS Setup

nginx-ingress + cert-manager

# Install nginx-ingress
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace
 
# Install cert-manager
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true
 
# Create ClusterIssuer for Let's Encrypt
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your-email@domain.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
EOF

Traefik

If using Traefik as your ingress controller:

ingress:
  enabled: true
  className: traefik
  annotations:
    traefik.ingress.kubernetes.io/router.tls.certresolver: letsencrypt
  hosts:
    - host: api.orsa.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: api

AWS ALB

ingress:
  enabled: true
  className: alb
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:...
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
    alb.ingress.kubernetes.io/ssl-redirect: '443'

Deploy

# Deploy
helm install orsa ./infrastructure/k8s/orsa \
  --namespace orsa \
  --create-namespace \
  --values values.yaml
 
# Verify
kubectl get pods -n orsa
kubectl get ingress -n orsa
 
# Check logs
kubectl logs -n orsa -l app=orsa-api --tail=50
kubectl logs -n orsa -l app=orsa-browser-worker --tail=50
 
# Run database migrations (one-time Job; `kubectl run` has no --env-from flag)
kubectl apply -n orsa -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata: {name: orsa-migrate}
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: ghcr.io/paragonhq/orsa-api:latest
          command: ['npx', 'supabase', 'db', 'push']
          envFrom: [{secretRef: {name: orsa-secrets}}]
EOF

Health Checks

All services expose health endpoints:

livenessProbe:
  httpGet:
    path: /health
    port: 3001
  initialDelaySeconds: 10
  periodSeconds: 15
  failureThreshold: 3
 
readinessProbe:
  httpGet:
    path: /health
    port: 3001
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
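The browser worker can take noticeably longer to become healthy while it warms its Chromium pool; a startupProbe holds off liveness checks until the first successful response, avoiding restart loops on slow nodes. A sketch, assuming the same /health endpoint:

```yaml
startupProbe:
  httpGet:
    path: /health
    port: 3001
  periodSeconds: 5
  failureThreshold: 30   # allow up to ~150s before liveness checks begin
```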

Monitoring

Prometheus metrics

Export metrics from the API and browser worker for Prometheus scraping:

podAnnotations:
  prometheus.io/scrape: 'true'
  prometheus.io/port: '3001'
  prometheus.io/path: '/metrics'
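If you run the Prometheus Operator, a ServiceMonitor replaces annotation-based scraping. A sketch, assuming the API Service carries an app: orsa-api label and names its port http:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: orsa-api
  namespace: orsa
spec:
  selector:
    matchLabels:
      app: orsa-api     # assumed Service label
  endpoints:
    - port: http        # assumed named port on the Service
      path: /metrics
      interval: 30s
```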

Key metrics to monitor:

  • orsa_requests_total — Total API requests by endpoint and status
  • orsa_credits_consumed_total — Credits consumed by endpoint
  • orsa_browser_pool_active — Active browser instances
  • orsa_browser_pool_queue_depth — Queued browser requests
  • orsa_cache_hit_ratio — Cache hit/miss ratio
  • orsa_proxy_errors_total — Proxy failures by tier
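These metrics translate directly into alerts; sustained queue depth, for instance, means workers can't keep up with demand. A PrometheusRule sketch (the threshold is an assumption to tune against your traffic):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: orsa-alerts
  namespace: orsa
spec:
  groups:
    - name: orsa
      rules:
        - alert: BrowserQueueBacklog
          expr: orsa_browser_pool_queue_depth > 50   # assumed threshold
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: Browser request queue is backing up; consider raising maxReplicas
```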

Next Steps