Webhooks

Receive real-time notifications when asynchronous operations complete — crawl jobs, prefetch results, and more.

Setting Up Webhooks

Crawl Webhooks

Pass a webhookUrl when starting a crawl job:

curl -X POST "https://api.orsa.dev/v1/web/crawl" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://stripe.com",
    "maxPages": 50,
    "webhookUrl": "https://yourapp.com/webhooks/orsa/crawl"
  }'

When the crawl completes, Orsa sends a POST to your webhook URL.
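
The same request can be built from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper name build_crawl_request is illustrative and not part of any Orsa SDK, and nothing is sent until you pass the request to urllib.request.urlopen.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
CRAWL_ENDPOINT = "https://api.orsa.dev/v1/web/crawl"


def build_crawl_request(
    api_key: str,
    url: str,
    webhook_url: str,
    max_pages: int = 50,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a crawl job."""
    body = json.dumps(
        {"url": url, "maxPages": max_pages, "webhookUrl": webhook_url}
    ).encode("utf-8")
    return urllib.request.Request(
        CRAWL_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


crawl_request = build_crawl_request(
    "YOUR_API_KEY",
    "https://stripe.com",
    "https://yourapp.com/webhooks/orsa/crawl",
)
# To actually start the crawl (requires a real API key):
# with urllib.request.urlopen(crawl_request) as response:
#     print(response.status)
```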

Webhook Payload

Crawl Complete

{
  "event": "crawl.completed",
  "timestamp": "2024-12-15T10:32:45Z",
  "data": {
    "job_id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
    "status": "completed",
    "root_url": "https://stripe.com",
    "pages_found": 48,
    "pages_completed": 48,
    "credits_charged": 48,
    "started_at": "2024-12-15T10:30:00Z",
    "completed_at": "2024-12-15T10:32:45Z"
  }
}

Crawl Failed

{
  "event": "crawl.failed",
  "timestamp": "2024-12-15T10:31:00Z",
  "data": {
    "job_id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
    "status": "failed",
    "root_url": "https://example.com",
    "error": "Root URL returned 403 Forbidden",
    "pages_completed": 0,
    "credits_charged": 1
  }
}

Event Types

Event              Description
crawl.completed    Crawl job finished successfully
crawl.failed       Crawl job encountered a fatal error
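
One way to act on these event types is a small dispatch table keyed on the event field. This is only a sketch: the handler names and the summary strings are made up for illustration, and the field names are taken from the sample payloads above.

```python
from typing import Any, Callable, Dict, Optional


def on_crawl_completed(data: Dict[str, Any]) -> str:
    # Field names follow the "Crawl Complete" sample payload.
    return (
        f"crawl {data['job_id']} completed: "
        f"{data['pages_completed']}/{data['pages_found']} pages"
    )


def on_crawl_failed(data: Dict[str, Any]) -> str:
    # Field names follow the "Crawl Failed" sample payload.
    return f"crawl {data['job_id']} failed: {data['error']}"


# Hypothetical routing table: one handler per documented event type.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "crawl.completed": on_crawl_completed,
    "crawl.failed": on_crawl_failed,
}


def dispatch(event: Dict[str, Any]) -> Optional[str]:
    """Route a parsed webhook body to its handler.

    Returns None for event types this code does not know about, so the
    endpoint can still acknowledge the delivery.
    """
    handler = HANDLERS.get(event.get("event", ""))
    if handler is None:
        return None
    return handler(event["data"])
```

Returning None for unknown event types lets the endpoint keep acknowledging deliveries if new events are introduced later.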

Webhook Headers

Every webhook request includes:

Header                Description
Content-Type          application/json
X-Orsa-Event          Event type (e.g., crawl.completed)
X-Orsa-Signature      HMAC-SHA256 signature for verification
X-Orsa-Timestamp      Unix timestamp of the webhook
X-Orsa-Delivery-Id    Unique delivery ID for idempotency

Signature Verification

Verify webhook signatures to ensure requests come from Orsa.

TypeScript

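import crypto from 'crypto';
import express from 'express';

function verifyWebhook(
  payload: string,
  signature: string,
  timestamp: string,
  secret: string,
): boolean {
  // The signed content is the timestamp and the raw body joined by a dot.
  const signedContent = `${timestamp}.${payload}`;
  const expectedSignature = crypto
    .createHmac('sha256', secret)
    .update(signedContent)
    .digest('hex');

  const received = Buffer.from(signature);
  const expected = Buffer.from(expectedSignature);

  // timingSafeEqual throws if the buffers differ in length, so check first.
  return (
    received.length === expected.length &&
    crypto.timingSafeEqual(received, expected)
  );
}

const app = express();

// Express handler. express.raw() keeps the body as the exact bytes received;
// re-serializing parsed JSON can change those bytes and break the signature.
app.post(
  '/webhooks/orsa/crawl',
  express.raw({ type: 'application/json' }),
  (req, res) => {
    const signature = req.header('x-orsa-signature') ?? '';
    const timestamp = req.header('x-orsa-timestamp') ?? '';
    const payload = Buffer.isBuffer(req.body) ? req.body.toString('utf8') : '';

    if (!verifyWebhook(payload, signature, timestamp, process.env.ORSA_WEBHOOK_SECRET!)) {
      return res.status(401).json({ error: 'Invalid signature' });
    }

    // Process the event (parse only after the signature checks out)
    const event = JSON.parse(payload);
    console.log(`Crawl ${event.data.status}: ${event.data.job_id}`);

    res.status(200).json({ received: true });
  },
);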

Python

import hashlib
import hmac
import os

from flask import Flask, request

app = Flask(__name__)


def verify_webhook(payload: str, signature: str, timestamp: str, secret: str) -> bool:
    # The signed content is the timestamp and the raw body joined by a dot.
    signed_content = f"{timestamp}.{payload}"
    expected = hmac.new(
        secret.encode(),
        signed_content.encode(),
        hashlib.sha256,
    ).hexdigest()
    # Compare as bytes: compare_digest rejects non-ASCII str arguments.
    return hmac.compare_digest(signature.encode(), expected.encode())


# Flask handler
@app.route("/webhooks/orsa/crawl", methods=["POST"])
def handle_webhook():
    # Default to "" so a missing header fails verification instead of raising.
    signature = request.headers.get("X-Orsa-Signature", "")
    timestamp = request.headers.get("X-Orsa-Timestamp", "")
    payload = request.get_data(as_text=True)

    if not verify_webhook(payload, signature, timestamp, os.environ["ORSA_WEBHOOK_SECRET"]):
        return {"error": "Invalid signature"}, 401

    event = request.json
    print(f"Crawl {event['data']['status']}: {event['data']['job_id']}")

    return {"received": True}, 200
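
To exercise a handler locally, you can sign a test payload the same way the verification code expects: HMAC-SHA256 over the timestamp, a dot, and the raw body, hex-encoded. The sketch below is a test helper under that assumption and not an Orsa tool; sign_payload, the secret, and the sample values are hypothetical.

```python
import hashlib
import hmac
import json


def sign_payload(payload: str, timestamp: str, secret: str) -> str:
    """Compute the hex HMAC-SHA256 of "<timestamp>.<payload>"."""
    signed_content = f"{timestamp}.{payload}".encode()
    return hmac.new(secret.encode(), signed_content, hashlib.sha256).hexdigest()


def verify_webhook(payload: str, signature: str, timestamp: str, secret: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(payload, timestamp, secret)
    return hmac.compare_digest(signature.encode(), expected.encode())


# Hypothetical values for a local test only.
secret = "test-secret"
timestamp = "1734258765"
payload = json.dumps(
    {"event": "crawl.completed", "data": {"job_id": "test", "status": "completed"}}
)
signature = sign_payload(payload, timestamp, secret)

# Send `payload` as the request body with these headers to your local endpoint:
test_headers = {
    "Content-Type": "application/json",
    "X-Orsa-Signature": signature,
    "X-Orsa-Timestamp": timestamp,
}
```

Changing a single byte of the body or the timestamp after signing should make verification fail, which is a quick way to confirm that the handler checks the raw body and not a re-serialized copy.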

Best Practices

  • Always verify signatures. Never trust webhook payloads without verification.
  • Respond with 200 quickly. Process the webhook asynchronously — Orsa will retry on non-2xx responses.
  • Use X-Orsa-Delivery-Id for idempotency. Store processed delivery IDs to avoid handling duplicates.
  • Reject stale timestamps. Discard webhooks where X-Orsa-Timestamp is more than 5 minutes old.
  • Use HTTPS endpoints only. Webhook URLs must use HTTPS in production.
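
The timestamp and idempotency checks from the list above can be sketched as follows. This assumes X-Orsa-Timestamp is expressed in whole seconds since the Unix epoch (the header table only says "Unix timestamp"), and it uses an in-memory set where a real deployment would more likely use a database or cache shared across processes. The names is_fresh and DeliveryLog are illustrative.

```python
import time
from typing import Optional, Set

MAX_AGE_SECONDS = 5 * 60  # the 5-minute window from the list above


def is_fresh(timestamp_header: Optional[str], now: Optional[float] = None) -> bool:
    """Return True if the header parses and falls inside the allowed window.

    Assumes the header value is whole seconds since the Unix epoch.
    """
    try:
        sent_at = int(timestamp_header)
    except (TypeError, ValueError):
        return False  # missing or malformed header
    current = time.time() if now is None else now
    # abs() also rejects timestamps that lie far in the future.
    return abs(current - sent_at) <= MAX_AGE_SECONDS


class DeliveryLog:
    """In-memory record of X-Orsa-Delivery-Id values already handled."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def mark_if_new(self, delivery_id: str) -> bool:
        """Record the ID and return True, or return False for a duplicate."""
        if delivery_id in self._seen:
            return False
        self._seen.add(delivery_id)
        return True
```

In a handler, both checks would run after signature verification: reject the request if is_fresh returns False, and acknowledge it without reprocessing if mark_if_new returns False.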

Retry Policy

If your endpoint returns a non-2xx status code, Orsa retries up to 3 times with exponential backoff:

Attempt      Delay
1st retry    30 seconds
2nd retry    2 minutes
3rd retry    10 minutes

If all 3 retries fail, the webhook is abandoned. Check the crawl status endpoint for results.
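
As a quick worked example, the delays in the table add up as follows if each delay is counted from the previous failed attempt. That counting rule is an assumption: the table does not say whether delays are measured from the previous attempt or from the original delivery.

```python
from itertools import accumulate
from typing import List

# Delays for the 1st, 2nd, and 3rd retry, taken from the table above.
RETRY_DELAYS_SECONDS = [30, 2 * 60, 10 * 60]


def retry_offsets(delays: List[int]) -> List[int]:
    """Seconds after the first failed delivery at which each retry would fire,
    assuming each delay starts when the previous attempt fails."""
    return list(accumulate(delays))


offsets = retry_offsets(RETRY_DELAYS_SECONDS)
print(offsets)  # [30, 150, 750]
```

Under that assumption the last retry fires about 12.5 minutes after the first failed delivery, so a fallback check of the crawl status endpoint is most useful once that window has passed.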