Self-Hosting Orsa
Run Orsa on your own infrastructure. Full control over your data, no usage limits, no vendor lock-in.
Self-Hosted vs Managed Cloud
| Feature | Self-Hosted | Managed Cloud (orsa.dev) |
|---|---|---|
| Data residency | Your servers, your rules | US cloud regions |
| Usage limits | None — limited only by your infra | Credit-based billing |
| Custom proxies | Bring any proxy provider | Pre-configured providers |
| LLM providers | Any provider, including local (Ollama) | OpenAI + Anthropic |
| Browser pool | Scale to any size | Shared pool with fair-use limits |
| Updates | Manual (pull + migrate) | Automatic |
| Support | Community (GitHub Issues) | Priority support |
| SSO / Audit logs | Full access to Enterprise tables | Enterprise plan only |
Architecture
┌─────────────────────────────────────────────────────────────────┐
│                          Client / SDK                           │
│                 (TypeScript, Python, MCP, cURL)                 │
└──────────────────────────┬──────────────────────────────────────┘
                           │ HTTPS
┌──────────────────────────▼──────────────────────────────────────┐
│                     API (Next.js on Vercel)                     │
│                       apps/api: /api/v1/*                       │
│  ┌──────────┐   ┌──────────┐  ┌───────────┐  ┌──────────────┐   │
│  │ Scraping │   │  Brand   │  │    AI     │  │  Screenshot  │   │
│  │  Routes  │   │  Routes  │  │  Routes   │  │    Routes    │   │
│  └────┬─────┘   └────┬─────┘  └─────┬─────┘  └──────┬───────┘   │
│       │              │              │               │           │
│  ┌────▼──────────────▼──────────────▼───────────────▼───────┐   │
│  │                        @orsa/core                        │   │
│  │  Extraction engine: scraping, brand pipeline, AI, etc.   │   │
│  └──────┬──────────────────┬────────────────────┬───────────┘   │
└─────────┼──────────────────┼────────────────────┼───────────────┘
          │                  │                    │
    ┌─────▼─────┐      ┌─────▼─────┐        ┌─────▼──────┐
    │ Supabase  │      │  Upstash  │        │  Browser   │
    │ (Postgres │      │   Redis   │        │   Worker   │
    │  + Auth + │      │ (Cache +  │        │ (Playwright│
    │  Storage) │      │   Rate    │        │  on Fly.io)│
    └───────────┘      │  Limits)  │        └─────┬──────┘
                       └───────────┘              │
                                            ┌─────▼──────┐
                                            │ Proxy Pool │
                                            │ (DC/Resi/  │
                                            │  ISP)      │
                                            └────────────┘
    ┌─────────────┐   ┌──────────────┐   ┌──────────────┐
    │ Trigger.dev │   │  Cloudflare  │   │    Stripe    │
    │ (Queues:    │   │  R2 (Asset   │   │  (Billing +  │
    │  crawl jobs)│   │  CDN/Storage)│   │   Credits)   │
    └─────────────┘   └──────────────┘   └──────────────┘
Service Boundaries
| Service | Role | Runs On |
|---|---|---|
| API (apps/api) | All /api/v1/* endpoints. Auth, rate limiting, credit deduction, request routing. | Vercel (or any Node.js host) |
| Web (apps/web) | Dashboard + marketing site. User management, API key creation, usage analytics. | Vercel |
| Docs (apps/docs) | Nextra documentation site (docs.orsa.dev in managed cloud). | Vercel |
| Core (packages/core) | Extraction engine. Scraping, brand pipeline, AI extraction, screenshots, classification. Shared library; not deployed independently. | Bundled with API |
| DB (packages/db) | Supabase client, generated types, query helpers. | Bundled with API/Web |
| Browser Worker (services/browser-worker) | Playwright browser pool. Renders pages, takes screenshots, executes JavaScript. | Fly.io (or Docker) |
| Trigger Jobs (services/trigger) | Background job definitions: full-site crawls, brand extraction queues, AI queries. | Trigger.dev (cloud or self-hosted) |
Prerequisites
Required
| Dependency | Minimum Version | Purpose |
|---|---|---|
| Node.js | 22.0+ | Runtime for API, Web, and all packages |
| pnpm | 9.0+ | Package manager (monorepo workspaces) |
| Docker | 24.0+ | Browser worker, Redis, local development |
| PostgreSQL | 16+ | Primary database (via Supabase) |
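A preflight check against these minimums can be sketched as follows. This is an illustrative helper, not part of Orsa: the names `MINIMUMS`, `meetsMinimum`, and `findOutdated` are hypothetical, and the version strings are assumed to be collected by the caller (for example from `process.version` for Node.js).

```typescript
// Minimum versions copied from the Required table; the keys are illustrative names.
const MINIMUMS: Record<string, string> = {
  node: "22.0",
  pnpm: "9.0",
  docker: "24.0",
  postgres: "16",
};

// Parse "v22.3.1" or "16" into numeric parts; unparseable parts count as 0.
function parseVersion(version: string): number[] {
  return version
    .replace(/^v/, "")
    .split(".")
    .map((part) => Number.parseInt(part, 10) || 0);
}

// True when `installed` is greater than or equal to `minimum`.
function meetsMinimum(installed: string, minimum: string): boolean {
  const a = parseVersion(installed);
  const b = parseVersion(minimum);
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true;
}

// Names of tools that are missing or older than the documented minimum.
function findOutdated(installed: Record<string, string>): string[] {
  return Object.entries(MINIMUMS)
    .filter(([name, minimum]) => {
      const version = installed[name];
      return version === undefined || !meetsMinimum(version, minimum);
    })
    .map(([name]) => name);
}
```

Calling `findOutdated({ node: process.version })` would report every dependency except Node.js as missing, so a real script would gather each tool's version first.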
Recommended
| Dependency | Purpose |
|---|---|
| Supabase CLI | Local development, migrations, type generation |
| flyctl | Browser worker deployment to Fly.io |
| Vercel CLI | API/Web/Docs deployment |
| Stripe CLI | Webhook testing in development |
System Requirements
API Server:
- 1 vCPU, 1 GB RAM minimum
- Scales horizontally (stateless)
Browser Worker:
- 2 vCPU, 4 GB RAM minimum per instance
- 512 MB shared memory (/dev/shm) for Chromium
- Scales horizontally; each instance handles POOL_SIZE concurrent browsers
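Because each instance handles POOL_SIZE concurrent browsers, the number of instances for a target concurrency is a ceiling division. The sketch below is a rough sizing aid under stated assumptions: `planWorkers` and `WorkerPlan` are hypothetical names, and the resource totals simply multiply the documented per-instance minimums, which may not be enough for large POOL_SIZE values.

```typescript
// Per-instance minimums from the Browser Worker requirements.
const MIN_VCPU_PER_INSTANCE = 2;
const MIN_RAM_GB_PER_INSTANCE = 4;

interface WorkerPlan {
  instances: number;
  totalConcurrentBrowsers: number;
  minVcpu: number;
  minRamGb: number;
}

// Estimate how many worker instances cover a target concurrency,
// given that each instance runs `poolSize` concurrent browsers.
function planWorkers(targetConcurrency: number, poolSize: number): WorkerPlan {
  if (!Number.isInteger(poolSize) || poolSize < 1) {
    throw new RangeError("poolSize must be a positive integer");
  }
  if (!Number.isFinite(targetConcurrency) || targetConcurrency < 0) {
    throw new RangeError("targetConcurrency must be a non-negative number");
  }
  const instances = Math.ceil(targetConcurrency / poolSize);
  return {
    instances,
    totalConcurrentBrowsers: instances * poolSize,
    minVcpu: instances * MIN_VCPU_PER_INSTANCE,
    minRamGb: instances * MIN_RAM_GB_PER_INSTANCE,
  };
}
```

For example, `planWorkers(25, 4)` yields 7 instances with room for 28 concurrent browsers, and at least 14 vCPU and 28 GB RAM in total.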
Database:
- PostgreSQL 16+ with extensions: pgcrypto, pg_trgm, vector
- 10 GB storage minimum for brand cache
- Supabase (managed or self-hosted) recommended
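One way to verify the extension requirement is to compare the installed extension names (which can be read from the PostgreSQL catalog with `SELECT extname FROM pg_extension`) against the required list. The helper below is a minimal sketch with hypothetical function names; it only builds the SQL statements and leaves running them to whatever database client or migration tool is in use.

```typescript
// Extensions listed in the Database requirements.
const REQUIRED_EXTENSIONS = ["pgcrypto", "pg_trgm", "vector"] as const;

// Return the required extensions that are absent from `installed`.
function missingExtensions(installed: readonly string[]): string[] {
  const present = new Set(installed.map((name) => name.toLowerCase()));
  return REQUIRED_EXTENSIONS.filter((name) => !present.has(name));
}

// Build one CREATE EXTENSION statement per missing extension.
// Only pass names from REQUIRED_EXTENSIONS; the input is not escaped.
function enableStatements(missing: readonly string[]): string[] {
  return missing.map((name) => `CREATE EXTENSION IF NOT EXISTS ${name};`);
}
```

Given `["pgcrypto", "plpgsql"]` as the installed list, `missingExtensions` returns `pg_trgm` and `vector`, and `enableStatements` turns those into two `CREATE EXTENSION IF NOT EXISTS` statements.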
Redis:
- 256 MB RAM minimum
- Persistent storage recommended (AOF enabled)
Next Steps
- Docker Compose — Fastest way to get running locally
- Kubernetes — Production deployment with Helm
- External Providers — Configure Supabase, Redis, proxies, LLMs, and more
- Configuration — Complete environment variable reference