Webhooks
Receive real-time notifications when asynchronous operations complete — crawl jobs, prefetch results, and more.
Setting Up Webhooks
Crawl Webhooks
Pass a webhookUrl when starting a crawl job:
curl -X POST "https://api.orsa.dev/v1/web/crawl" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://stripe.com",
    "maxPages": 50,
    "webhookUrl": "https://yourapp.com/webhooks/orsa/crawl"
  }'
When the crawl finishes, whether it succeeds or fails, Orsa sends a POST request to your webhook URL.
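The same request can be built from Python. The sketch below uses only the standard library and mirrors the curl example; the function name `build_crawl_request` and its defaults are illustrative assumptions, not part of an Orsa SDK.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
CRAWL_ENDPOINT = "https://api.orsa.dev/v1/web/crawl"


def build_crawl_request(
    api_key: str,
    url: str,
    webhook_url: str,
    max_pages: int = 50,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a crawl job."""
    body = json.dumps(
        {"url": url, "maxPages": max_pages, "webhookUrl": webhook_url}
    ).encode("utf-8")
    return urllib.request.Request(
        CRAWL_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Building it separately keeps the payload easy to inspect before any network call is made.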
Webhook Payload
Crawl Complete
{
  "event": "crawl.completed",
  "timestamp": "2024-12-15T10:32:45Z",
  "data": {
    "job_id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
    "status": "completed",
    "root_url": "https://stripe.com",
    "pages_found": 48,
    "pages_completed": 48,
    "credits_charged": 48,
    "started_at": "2024-12-15T10:30:00Z",
    "completed_at": "2024-12-15T10:32:45Z"
  }
}
Crawl Failed
{
  "event": "crawl.failed",
  "timestamp": "2024-12-15T10:31:00Z",
  "data": {
    "job_id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
    "status": "failed",
    "root_url": "https://example.com",
    "error": "Root URL returned 403 Forbidden",
    "pages_completed": 0,
    "credits_charged": 1
  }
}
Event Types
| Event | Description |
|---|---|
| crawl.completed | Crawl job finished successfully |
| crawl.failed | Crawl job encountered a fatal error |
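A handler typically branches on the `event` field. The following sketch summarizes the two payload shapes shown above; `summarize_event` is a hypothetical helper, and it assumes the `data` fields match the sample payloads exactly.

```python
def summarize_event(event: dict) -> str:
    """Return a one-line summary for a crawl webhook payload."""
    kind = event.get("event")
    data = event.get("data", {})
    if kind == "crawl.completed":
        return (
            f"job {data['job_id']} completed: "
            f"{data['pages_completed']}/{data['pages_found']} pages, "
            f"{data['credits_charged']} credits"
        )
    if kind == "crawl.failed":
        return f"job {data['job_id']} failed: {data['error']}"
    # Report unknown event types instead of raising, so that event types
    # added later do not crash the handler.
    return f"unhandled event type: {kind!r}"
```

Falling through to a default branch is a design choice: it lets the endpoint keep returning 2xx for events it does not yet understand.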
Webhook Headers
Every webhook request includes:
| Header | Description |
|---|---|
| Content-Type | application/json |
| X-Orsa-Event | Event type (e.g., crawl.completed) |
| X-Orsa-Signature | HMAC-SHA256 signature for verification |
| X-Orsa-Timestamp | Unix timestamp of the webhook |
| X-Orsa-Delivery-Id | Unique delivery ID for idempotency |
Signature Verification
Verify webhook signatures to ensure requests come from Orsa.
TypeScript
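import crypto from 'crypto';
import express from 'express';

function verifyWebhook(
  payload: string,
  signature: string,
  timestamp: string,
  secret: string,
): boolean {
  const signedContent = `${timestamp}.${payload}`;
  const expectedSignature = crypto
    .createHmac('sha256', secret)
    .update(signedContent)
    .digest('hex');
  const received = Buffer.from(signature);
  const expected = Buffer.from(expectedSignature);
  // timingSafeEqual throws when the buffers differ in length, so check first.
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}

const app = express();

// Express handler. express.raw() keeps the body as the exact bytes that were sent;
// re-serializing parsed JSON may not reproduce the signed payload.
app.post('/webhooks/orsa/crawl', express.raw({ type: 'application/json' }), (req, res) => {
  const signature = req.header('x-orsa-signature');
  const timestamp = req.header('x-orsa-timestamp');
  const payload = Buffer.isBuffer(req.body) ? req.body.toString('utf8') : '';
  // Reject requests with missing headers before attempting verification.
  if (!signature || !timestamp ||
      !verifyWebhook(payload, signature, timestamp, process.env.ORSA_WEBHOOK_SECRET!)) {
    return res.status(401).json({ error: 'Invalid signature' });
  }
  // Process the event
  const event = JSON.parse(payload);
  console.log(`Crawl ${event.data.status}: ${event.data.job_id}`);
  res.status(200).json({ received: true });
});
Python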
import hashlib
import hmac
import os

from flask import Flask, request

app = Flask(__name__)


def verify_webhook(payload: str, signature: str, timestamp: str, secret: str) -> bool:
    signed_content = f"{timestamp}.{payload}"
    expected = hmac.new(
        secret.encode(),
        signed_content.encode(),
        hashlib.sha256,
    ).hexdigest()
    # Compare as bytes in constant time; str inputs must be ASCII-only.
    return hmac.compare_digest(signature.encode(), expected.encode())


# Flask handler
@app.route("/webhooks/orsa/crawl", methods=["POST"])
def handle_webhook():
    signature = request.headers.get("X-Orsa-Signature")
    timestamp = request.headers.get("X-Orsa-Timestamp")
    # Use the raw request body: the signature covers the exact bytes sent.
    payload = request.get_data(as_text=True)
    # Reject missing headers up front instead of passing None to the verifier.
    if not signature or not timestamp or not verify_webhook(
        payload, signature, timestamp, os.environ["ORSA_WEBHOOK_SECRET"]
    ):
        return {"error": "Invalid signature"}, 401
    event = request.json
    print(f"Crawl {event['data']['status']}: {event['data']['job_id']}")
    return {"received": True}, 200
Best Practices
- Always verify signatures. Never trust webhook payloads without verification.
- Respond with 200 quickly. Process the webhook asynchronously; Orsa retries on non-2xx responses.
- Use X-Orsa-Delivery-Id for idempotency. Store processed delivery IDs to avoid handling duplicates.
- Reject stale timestamps. Discard webhooks where X-Orsa-Timestamp is more than 5 minutes old.
- Use HTTPS endpoints only. Webhook URLs must use HTTPS in production.
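The timestamp and idempotency checks from the list above can be sketched as follows. This is a minimal illustration under two assumptions: that X-Orsa-Timestamp is expressed in Unix seconds (the headers table says only "Unix timestamp"), and that an in-memory set is acceptable storage. `is_fresh` and `DeliveryLog` are hypothetical names; a production service would more likely keep delivery IDs in a shared store with an expiry.

```python
import time
from typing import Optional

MAX_AGE_SECONDS = 5 * 60  # the 5-minute window from the best practices list


def is_fresh(
    timestamp_header: Optional[str],
    now: Optional[float] = None,
    max_age: int = MAX_AGE_SECONDS,
) -> bool:
    """Return True if the timestamp is within max_age seconds of now.

    Assumes the header holds Unix seconds. Timestamps too far in the
    future are rejected as well, since abs() is used for the comparison.
    """
    try:
        sent_at = int(timestamp_header)
    except (TypeError, ValueError):
        return False  # missing or non-numeric header
    current = time.time() if now is None else now
    return abs(current - sent_at) <= max_age


class DeliveryLog:
    """In-memory record of processed delivery IDs (illustrative only)."""

    def __init__(self) -> None:
        self._seen = set()

    def first_time(self, delivery_id: str) -> bool:
        """Record the ID and return True only the first time it is seen."""
        if delivery_id in self._seen:
            return False
        self._seen.add(delivery_id)
        return True
```

In a handler, both checks would run after signature verification: discard the request if `is_fresh` returns False, and skip processing (while still returning 200) if `first_time` returns False.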
Retry Policy
If your endpoint returns a non-2xx status code, Orsa retries up to 3 times with exponential backoff:
| Attempt | Delay |
|---|---|
| 1st retry | 30 seconds |
| 2nd retry | 2 minutes |
| 3rd retry | 10 minutes |
If all 3 retries fail, Orsa abandons the webhook. Check the crawl status endpoint for the results instead.
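To decide when to fall back to polling the status endpoint, it can help to know how long the retry window lasts. The sketch below adds up the delays from the table, assuming each delay is measured from the previous failed attempt; the source does not state this, so treat the totals as approximate.

```python
# Delays from the retry table, in seconds: 30 s, 2 min, 10 min.
RETRY_DELAYS_SECONDS = [30, 120, 600]


def retry_offsets(delays=RETRY_DELAYS_SECONDS):
    """Return cumulative seconds after the first failed delivery for each retry.

    Assumes each delay starts when the previous attempt fails.
    """
    offsets = []
    elapsed = 0
    for delay in delays:
        elapsed += delay
        offsets.append(elapsed)
    return offsets


print(retry_offsets())  # [30, 150, 750]
```

Under that assumption the last retry fires about 12.5 minutes after the first failure, so a job with no webhook received well past that point is a reasonable candidate for a status check.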