Kubernetes
Deploy Orsa to any Kubernetes cluster. This guide covers Helm chart configuration, resource requirements, scaling, and ingress setup.
Overview
Orsa's architecture maps cleanly to Kubernetes:
┌─────────────────────────────────────────────────┐
│             Ingress (nginx/traefik)             │
│           api.orsa.dev / app.orsa.dev           │
└────────┬──────────────────────┬─────────────────┘
         │                      │
   ┌─────▼──────┐        ┌─────▼──────┐
   │ API Deploy │        │ Web Deploy │
   │ (2+ pods)  │        │ (2+ pods)  │
   └─────┬──────┘        └────────────┘
         │
  ┌──────▼─────────┐
  │ Browser Worker │     ┌──────────────┐
  │ DaemonSet or   │────▶│ Redis        │
  │ Deployment     │     │ (StatefulSet │
  │ (high-memory   │     │  or managed) │
  │  nodes)        │     └──────────────┘
  └──────┬─────────┘
         │
   ┌─────▼──────┐
   │ PostgreSQL │
   │ (Supabase  │
   │ or managed │
   │  PG)       │
   └────────────┘

Prerequisites
- Kubernetes cluster 1.28+
- kubectl configured for your cluster
- Helm 3.12+
- Container registry access (GHCR, ECR, Docker Hub)
- External PostgreSQL (Supabase Cloud or self-hosted)
- External Redis (Upstash or in-cluster)
Helm Chart
The Helm chart lives in infrastructure/k8s/ (or you can create it from the templates below).
Install
# Add the Orsa Helm repo (when published)
helm repo add orsa https://charts.orsa.dev
helm repo update
# Or install from local chart
helm install orsa ./infrastructure/k8s/orsa \
  --namespace orsa \
  --create-namespace \
  --values values.yaml

values.yaml
# ─── Global ─────────────────────────────────────────────────
global:
  domain: orsa.yourdomain.com
  env: production

# ─── API ────────────────────────────────────────────────────
api:
  replicas: 2
  image:
    repository: ghcr.io/paragonhq/orsa-api
    tag: latest
    pullPolicy: IfNotPresent
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: '1'
      memory: 1Gi
  env:
    NODE_ENV: production
    PORT: '3001'
  envFrom:
    - secretRef:
        name: orsa-secrets
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilization: 70
    targetMemoryUtilization: 80

# ─── Web Dashboard ──────────────────────────────────────────
web:
  replicas: 2
  image:
    repository: ghcr.io/paragonhq/orsa-web
    tag: latest
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 5
    targetCPUUtilization: 70

# ─── Browser Worker ─────────────────────────────────────────
browserWorker:
  replicas: 3
  image:
    repository: ghcr.io/paragonhq/orsa-browser-worker
    tag: latest
  resources:
    requests:
      cpu: '1'
      memory: 2Gi
    limits:
      cpu: '2'
      memory: 4Gi
  env:
    POOL_SIZE: '3'
    MAX_CONTEXTS: '10'
    PAGE_TIMEOUT: '30000'
  # Shared memory for Chromium — mount as emptyDir with medium: Memory
  shmSize: 512Mi
  # Node affinity — schedule on high-memory nodes
  nodeSelector:
    orsa.dev/role: browser
  tolerations:
    - key: orsa.dev/browser
      operator: Exists
      effect: NoSchedule
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
    targetCPUUtilization: 60

# ─── Redis (in-cluster) ────────────────────────────────────
redis:
  enabled: true # Set to false if using Upstash or external Redis
  image:
    repository: redis
    tag: 7-alpine
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi
  persistence:
    enabled: true
    size: 2Gi
    storageClass: '' # Uses default StorageClass

# ─── Ingress ───────────────────────────────────────────────
ingress:
  enabled: true
  className: nginx # or traefik, alb, etc.
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/rate-limit-connections: '50'
    nginx.ingress.kubernetes.io/proxy-body-size: '10m'
  hosts:
    - host: api.orsa.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: api
    - host: app.orsa.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: web
  tls:
    - secretName: orsa-tls
      hosts:
        - api.orsa.yourdomain.com
        - app.orsa.yourdomain.com

Secrets
Create the secrets before deploying:
kubectl create namespace orsa
kubectl create secret generic orsa-secrets \
  --namespace orsa \
  --from-literal=NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co \
  --from-literal=NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key \
  --from-literal=SUPABASE_SERVICE_ROLE_KEY=your-service-role-key \
  --from-literal=UPSTASH_REDIS_REST_URL=https://your-redis.upstash.io \
  --from-literal=UPSTASH_REDIS_REST_TOKEN=your-token \
  --from-literal=STRIPE_SECRET_KEY=sk_live_... \
  --from-literal=STRIPE_WEBHOOK_SECRET=whsec_... \
  --from-literal=OPENAI_API_KEY=sk-... \
  --from-literal=ANTHROPIC_API_KEY=sk-ant-... \
  --from-literal=CLOUDFLARE_R2_ACCESS_KEY=your-key \
  --from-literal=CLOUDFLARE_R2_SECRET_KEY=your-secret \
  --from-literal=CLOUDFLARE_R2_ENDPOINT=https://your-account.r2.cloudflarestorage.com \
  --from-literal=CLOUDFLARE_R2_BUCKET=orsa-assets \
  --from-literal=PROXY_DATACENTER_URL=http://user:pass@dc-proxy:port \
  --from-literal=PROXY_RESIDENTIAL_URL=http://user:pass@res-proxy:port \
  --from-literal=PROXY_ISP_URL=http://user:pass@isp-proxy:port

Or use a sealed-secrets / external-secrets controller for production.
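If you take the external-secrets route, a minimal sketch looks like this (assuming the External Secrets Operator is installed; the ClusterSecretStore name and provider-side key below are illustrative, not part of Orsa):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: orsa-secrets
  namespace: orsa
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager   # illustrative store name
  target:
    name: orsa-secrets          # the Secret that the chart's envFrom references
  dataFrom:
    - extract:
        key: orsa/production    # illustrative key in your secret backend
```

The operator materializes a regular `orsa-secrets` Secret in-cluster, so the chart values above need no changes.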
Browser Worker Node Requirements
The browser worker runs headless Chromium and is the most resource-intensive component. Each pod needs:
| Resource | Minimum | Recommended | Notes |
|---|---|---|---|
| CPU | 1 vCPU | 2 vCPU | Chromium is CPU-intensive during rendering |
| Memory | 2 GB | 4 GB | Each browser instance uses ~200-400 MB |
| Shared Memory | 256 MB | 512 MB | Required for Chromium IPC. Mount as emptyDir with medium: Memory |
| Disk | 1 GB | 5 GB | Temporary browser profiles, download cache |
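As a rough sanity check against this table, you can estimate peak per-pod memory from the pool size. This is a sketch: the 512 MB base overhead for the worker runtime is an assumption, not a measured figure.

```shell
#!/bin/sh
# Rough browser-worker memory estimate (upper bound).
POOL_SIZE=3            # browser instances per pod (values.yaml POOL_SIZE)
MEM_PER_INSTANCE=400   # MB, upper bound per Chromium instance (table above)
BASE_OVERHEAD=512      # MB, assumed worker runtime overhead (illustrative)
TOTAL=$((POOL_SIZE * MEM_PER_INSTANCE + BASE_OVERHEAD))
echo "Estimated peak memory per pod: ${TOTAL} MB"
```

At the default pool size of 3 this lands around 1.7 GB, comfortably inside the 2 Gi request / 4 Gi limit in values.yaml.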
Shared Memory Mount
Chromium requires a large /dev/shm. The Helm chart handles this, but if deploying manually:
# In your browser worker Deployment spec:
spec:
  containers:
    - name: browser-worker
      volumeMounts:
        - name: dshm
          mountPath: /dev/shm
  volumes:
    - name: dshm
      emptyDir:
        medium: Memory
        sizeLimit: 512Mi

Dedicated Node Pool
For production, create a dedicated node pool for browser workers:
# GKE example
gcloud container node-pools create browser-pool \
  --cluster=orsa \
  --machine-type=e2-standard-4 \
  --num-nodes=3 \
  --min-nodes=1 \
  --max-nodes=10 \
  --enable-autoscaling \
  --node-labels=orsa.dev/role=browser \
  --node-taints=orsa.dev/browser=true:NoSchedule

# EKS example
eksctl create nodegroup \
  --cluster=orsa \
  --name=browser-pool \
  --node-type=m5.xlarge \
  --nodes=3 \
  --nodes-min=1 \
  --nodes-max=10 \
  --node-labels=orsa.dev/role=browser \
  --node-taints=orsa.dev/browser=true:NoSchedule

Scaling Configuration
Horizontal Pod Autoscaler
The Helm chart creates HPAs for each service. Key tuning parameters:
# API — scales on CPU (request handling)
api:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilization: 70

# Browser Worker — scales on CPU (Chromium rendering)
browserWorker:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
    targetCPUUtilization: 60 # Lower threshold — Chromium is bursty

KEDA (Event-Driven Scaling)
For more precise scaling, use KEDA to scale browser workers based on queue depth:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: browser-worker-scaler
  namespace: orsa
spec:
  scaleTargetRef:
    name: orsa-browser-worker
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: redis
      metadata:
        address: redis:6379
        listName: orsa:browser:queue
        listLength: '5'

Ingress / TLS Setup
nginx-ingress + cert-manager
# Install nginx-ingress
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace

# Install cert-manager
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true

# Create ClusterIssuer for Let's Encrypt
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your-email@domain.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
EOF

Traefik
If using Traefik as your ingress controller:
ingress:
  enabled: true
  className: traefik
  annotations:
    traefik.ingress.kubernetes.io/router.tls.certresolver: letsencrypt
  hosts:
    - host: api.orsa.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
          service: api

AWS ALB
ingress:
  enabled: true
  className: alb
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:...
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
    alb.ingress.kubernetes.io/ssl-redirect: '443'

Deploy
# Deploy
helm install orsa ./infrastructure/k8s/orsa \
  --namespace orsa \
  --create-namespace \
  --values values.yaml

# Verify
kubectl get pods -n orsa
kubectl get ingress -n orsa

# Check logs
kubectl logs -n orsa -l app=orsa-api --tail=50
kubectl logs -n orsa -l app=orsa-browser-worker --tail=50

# Run database migrations (one-time job)
# kubectl run has no --env-from flag, so inject the secret via a pod override
kubectl run orsa-migrate \
  --namespace orsa \
  --image=ghcr.io/paragonhq/orsa-api:latest \
  --restart=Never \
  --overrides='{"spec":{"containers":[{"name":"orsa-migrate","image":"ghcr.io/paragonhq/orsa-api:latest","envFrom":[{"secretRef":{"name":"orsa-secrets"}}],"command":["npx","supabase","db","push"]}]}}'

Health Checks
All services expose health endpoints:
livenessProbe:
  httpGet:
    path: /health
    port: 3001
  initialDelaySeconds: 10
  periodSeconds: 15
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health
    port: 3001
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3

Monitoring
Prometheus metrics
Export metrics from the API and browser worker for Prometheus scraping:
podAnnotations:
  prometheus.io/scrape: 'true'
  prometheus.io/port: '3001'
  prometheus.io/path: '/metrics'

Key metrics to monitor:
- orsa_requests_total — Total API requests by endpoint and status
- orsa_credits_consumed_total — Credits consumed by endpoint
- orsa_browser_pool_active — Active browser instances
- orsa_browser_pool_queue_depth — Queued browser requests
- orsa_cache_hit_ratio — Cache hit/miss ratio
- orsa_proxy_errors_total — Proxy failures by tier
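If you run the Prometheus Operator, these metrics can back alert rules. A sketch, assuming the metric names are exported exactly as listed above; the thresholds are illustrative starting points, not tuned values:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: orsa-alerts
  namespace: orsa
spec:
  groups:
    - name: orsa
      rules:
        # Queue depth growing faster than browser workers can drain it
        - alert: BrowserQueueBacklog
          expr: orsa_browser_pool_queue_depth > 20
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: Browser request queue is backing up
        # Sustained proxy failures across any tier
        - alert: ProxyErrorSpike
          expr: rate(orsa_proxy_errors_total[5m]) > 1
          for: 10m
          labels:
            severity: critical
          annotations:
            summary: Sustained proxy failures
```

Pair the queue-depth alert with the KEDA scaler above so scaling, not paging, is the first response.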
Next Steps
- External Providers — Configure all third-party services
- Configuration — Full environment variable reference
- Upgrading — Version upgrades and migration strategy