Powering AI Agents with Web Context

Give your AI agents real-time access to website content, brand data, and structured product information through Orsa's API.

The Problem

LLMs have training data cutoffs and cannot access live web content on their own. AI agents need fresh, structured data to make decisions and take actions.

The Solution

Use Orsa as the web context layer for your AI agents:

  • Scrape pages for up-to-date content (markdown format, LLM-ready)
  • Extract brand data for company research and analysis
  • Discover products for competitive intelligence and comparison
  • Query websites with natural language for custom data extraction

Template-driven scraping for agents

For search-engine, marketplace, social, and AI-tool scraping flows, use the template endpoint:

curl --request GET "https://api.orsa.dev/api/v1/web/scrape/template?template=amazon-search&query=laptop&domain=com&mode=markdown" \
  --header "Authorization: Bearer $ORSA_API_KEY"

This endpoint builds the target URL from the template, query, and domain parameters, then runs the request through the same browser-pool pipeline as the standard scraping routes.
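The curl request above can also be assembled in TypeScript. The following is a minimal sketch, not part of the Orsa SDK: the helper name buildTemplateScrapeUrl and the TemplateScrapeParams type are hypothetical, and the four parameters are simply the ones that appear in the curl example. Whether any of them is optional is not stated here, so the sketch treats all four as required.

```typescript
// Endpoint copied from the curl example above.
const TEMPLATE_ENDPOINT = "https://api.orsa.dev/api/v1/web/scrape/template";

// Hypothetical parameter type: mirrors the query string in the curl example.
interface TemplateScrapeParams {
  template: string; // template ID, e.g. "amazon-search"
  query: string;    // search terms, e.g. "laptop"
  domain: string;   // e.g. "com"
  mode: string;     // e.g. "markdown"
}

// Hypothetical helper: builds the GET URL with URL-encoded query parameters.
function buildTemplateScrapeUrl(params: TemplateScrapeParams): string {
  const search = new URLSearchParams();
  search.set("template", params.template);
  search.set("query", params.query);
  search.set("domain", params.domain);
  search.set("mode", params.mode);
  return `${TEMPLATE_ENDPOINT}?${search.toString()}`;
}

const exampleUrl = buildTemplateScrapeUrl({
  template: "amazon-search",
  query: "laptop",
  domain: "com",
  mode: "markdown",
});
console.log(exampleUrl);
```

Pass the resulting URL to any HTTP client together with the Authorization: Bearer header from the curl example. Using URLSearchParams rather than string concatenation keeps multi-word queries correctly encoded.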

Common template IDs

  • google-search-ai-overview
  • amazon-search
  • web (direct URL mode)
  • bing-search
  • walmart-search
  • target-search
  • youtube-search
  • reddit-subreddit
  • chatgpt
  • perplexity

Use dashboard-only templates for provider-specific verticals where the dashboard applies extra parser behavior and post-processing.

Agent Tool Definitions

OpenAI Function Calling

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "scrape_website",
        "description": "Scrape a website and return its content as clean markdown. Use this when you need current information from a specific URL.",
        "parameters": {
          "type": "object",
          "properties": {
            "url": {
              "type": "string",
              "description": "The URL to scrape"
            }
          },
          "required": ["url"]
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "get_brand_info",
        "description": "Get comprehensive brand data for a company including name, logo, colors, industry, and social links.",
        "parameters": {
          "type": "object",
          "properties": {
            "domain": {
              "type": "string",
              "description": "Company domain (e.g., stripe.com)"
            }
          },
          "required": ["domain"]
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "extract_custom_data",
        "description": "Extract specific data from a website using natural language. Describe what you need and the AI will find it.",
        "parameters": {
          "type": "object",
          "properties": {
            "domain": {
              "type": "string",
              "description": "Domain to analyze"
            },
            "query": {
              "type": "string",
              "description": "What data to extract (e.g., 'pricing plans with features and prices')"
            }
          },
          "required": ["domain", "query"]
        }
      }
    }
  ]
}

Tool Implementation

import { Orsa } from 'orsa';

// Read the API key from the environment rather than hard-coding it.
const orsa = new Orsa({ apiKey: process.env.ORSA_API_KEY });

// One handler per tool name declared in the function-calling schema.
const toolHandlers = {
  // Returns the page as markdown plus basic metadata.
  async scrape_website({ url }: { url: string }) {
    const result = await orsa.web.scrapeMarkdown({ url });
    return {
      content: result.markdown,
      title: result.title,
      wordCount: result.word_count,
    };
  },

  // Returns a trimmed brand profile so the model only sees the fields it needs.
  async get_brand_info({ domain }: { domain: string }) {
    const brand = await orsa.brand.retrieve({ domain });
    return {
      name: brand.name,
      domain: brand.domain,
      description: brand.description,
      industry: brand.industry,
      logo: brand.logos[0]?.url, // first logo, or undefined if the list is empty
      colors: brand.colors,
      socials: brand.socials,
    };
  },

  // Passes the natural-language query through and returns the JSON result.
  async extract_custom_data({ domain, query }: { domain: string; query: string }) {
    const result = await orsa.ai.query({
      domain,
      dataToExtract: query,
      responseFormat: 'json',
    });
    return result.result;
  },
};
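To connect these handlers to the model, each tool call in the response has to be routed to the matching handler and its result sent back as a tool message. The sketch below shows one way to do that, assuming the OpenAI Chat Completions tool-call shape (a JSON-encoded arguments string and a tool_call_id on the reply). The names dispatchToolCall, ToolCall, ToolMessage, and ToolHandler are hypothetical and not part of the Orsa SDK.

```typescript
// Tool call as it appears on an assistant message in a Chat Completions response.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

// Message sent back to the model with the tool's output.
interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

type ToolHandler = (args: any) => Promise<unknown>;

// Hypothetical dispatcher: finds the handler by name, parses the arguments,
// and turns any failure into an error payload the model can read and react to.
async function dispatchToolCall(
  call: ToolCall,
  handlers: Record<string, ToolHandler>,
): Promise<ToolMessage> {
  let payload: unknown;
  try {
    const name = call.function.name;
    // Own-property check so inherited keys like "toString" are not treated as tools.
    if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
      throw new Error(`Unknown tool: ${name}`);
    }
    const args = JSON.parse(call.function.arguments);
    payload = await handlers[name](args);
  } catch (err) {
    payload = { error: err instanceof Error ? err.message : String(err) };
  }
  // JSON.stringify(undefined) is not a string, so fall back to null.
  return {
    role: "tool",
    tool_call_id: call.id,
    content: JSON.stringify(payload ?? null),
  };
}
```

In an agent loop this would be called once per tool call, for example dispatchToolCall(call, toolHandlers), with each returned message appended to the conversation before the next model request. Returning errors as content rather than throwing lets the model retry with different arguments.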

MCP Server Integration

For Claude Desktop and Cursor, use the Orsa MCP server to give the assistant direct access to Orsa tools.

If you are using OpenClaw, follow the dedicated OpenClaw setup guide for the dashboard key-generation flow, CLI examples, and MCP config.

Use Cases

Company Research Agent

// Agent prompt: "Research Vercel and summarize their product offering"
// Agent calls: get_brand_info("vercel.com") + extract_custom_data("vercel.com", "all products with pricing")

Competitive Analysis Agent

// Agent prompt: "Compare pricing between Linear and Jira"
// Agent calls: extract_custom_data("linear.app", "pricing") + extract_custom_data("atlassian.com", "Jira pricing")

Content Generation Agent

// Agent prompt: "Write a blog post about the latest updates from Stripe"
// Agent calls: scrape_website("https://stripe.com/blog") → generates content from live data

Credit Budget

Tool Call     Endpoint                      Credits
Scrape page   GET /v1/web/scrape/markdown   1
Brand data    GET /v1/brand/retrieve        5
AI query      POST /v1/brand/ai/query       10
Products      GET /v1/brand/ai/products     10

Tip: Set credit budgets per agent run to prevent runaway costs. A typical research task uses 15-30 credits.
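One way to enforce a per-run budget is a small tracker that is charged before each tool call. This is a sketch only: CreditBudget and CREDIT_COSTS are hypothetical names, not part of the Orsa SDK, and the mapping of tool names to table rows assumes the handlers from the Tool Implementation section (scrape_website uses the scrape endpoint, get_brand_info the brand endpoint, extract_custom_data the AI query endpoint).

```typescript
// Credit costs per tool, taken from the table above.
// The Products endpoint is omitted because no tool in this guide calls it.
const CREDIT_COSTS = {
  scrape_website: 1,
  get_brand_info: 5,
  extract_custom_data: 10,
} as const;

type ToolName = keyof typeof CREDIT_COSTS;

// Hypothetical tracker: one instance per agent run.
class CreditBudget {
  private readonly limit: number;
  private spent = 0;

  constructor(limit: number) {
    this.limit = limit;
  }

  get remaining(): number {
    return this.limit - this.spent;
  }

  // Records the cost of a call, or throws before the call would exceed the limit.
  charge(tool: ToolName): void {
    const cost = CREDIT_COSTS[tool];
    if (cost > this.remaining) {
      throw new Error(
        `Credit budget exceeded: ${tool} costs ${cost}, ${this.remaining} of ${this.limit} left`,
      );
    }
    this.spent += cost;
  }
}

// Example run with a 30-credit limit: one brand lookup and two AI queries.
const runBudget = new CreditBudget(30);
runBudget.charge("get_brand_info");      // 5 credits
runBudget.charge("extract_custom_data"); // 10 credits
runBudget.charge("extract_custom_data"); // 10 credits
console.log(runBudget.remaining);        // 5
```

Calling charge before the Orsa request, rather than after, means a run that is out of budget fails without spending further credits.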

Tips

  • Prefer markdown scraping over HTML for LLM consumption — it's cleaner and uses fewer tokens.
  • Cache aggressively. If your agent might query the same domain twice in one session, cache the first result.
  • Use the AI Query endpoint for complex extractions instead of scraping + parsing in your agent logic.
  • Set timeout budgets — Orsa calls can take 10-60s for uncached data. Plan your agent's timeout accordingly.
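The caching and timeout tips can be combined in two small wrappers around any async tool handler. This is a sketch under stated assumptions: withTimeout and cachedBy are hypothetical helpers, not Orsa SDK functions. The cache is in-memory and lives only as long as the process, and the timeout rejects the caller's promise without cancelling the underlying request.

```typescript
// Hypothetical helper: rejects if the wrapped promise has not settled within ms.
// The underlying work is not cancelled; only the caller stops waiting.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Hypothetical helper: caches results per key for the lifetime of the session.
// Storing the promise (not the value) also de-duplicates concurrent calls.
function cachedBy<A, R>(
  keyOf: (args: A) => string,
  fn: (args: A) => Promise<R>,
): (args: A) => Promise<R> {
  const cache = new Map<string, Promise<R>>();
  return (args: A) => {
    const key = keyOf(args);
    const hit = cache.get(key);
    if (hit) return hit;
    const pending = fn(args).catch((error) => {
      cache.delete(key); // failures are not cached, so a later call can retry
      throw error;
    });
    cache.set(key, pending);
    return pending;
  };
}
```

A handler from the Tool Implementation section could be wrapped as cachedBy(({ url }) => url, (args) => withTimeout(toolHandlers.scrape_website(args), 60_000)), where 60 seconds matches the upper end of the latency range given in the tip.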