Scrape Images

Extract all images from any web page with dimensions, alt text, and a heuristic role classification (logo, hero, product, icon, decorative).

Endpoint: GET /v1/web/scrape/images
Credits: 1 per request

Parameters

Parameter	Type	Required	Description
`url`	string	Yes	The URL to extract images from

Response Schema

{
  "data": {
    "url": "https://stripe.com",
    "images": [
      {
        "url": "https://stripe.com/img/v3/home/social.png",
        "alt": "Stripe payment processing",
        "width": 1200,
        "height": 630,
        "role": "hero"
      },
      {
        "url": "https://stripe.com/favicon.svg",
        "alt": "",
        "width": null,
        "height": null,
        "role": "logo"
      }
    ],
    "count": 47
  },
  "_meta": { "timing": { "total_ms": 1820 }, "cache": { "hit": false } }
}
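The sample response shows that `width` and `height` can be `null`, so client code should not assume dimensions are always present. The following is a minimal sketch of loading the `data.images` array into typed records. The `ScrapedImage` class and the `parse_images` and `pixel_area` helpers are hypothetical names introduced for illustration; they are not part of any SDK.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScrapedImage:
    """One entry of `data.images` (hypothetical client-side type)."""
    url: str
    alt: str
    width: Optional[int]   # may be null in the response
    height: Optional[int]  # may be null in the response
    role: str


def parse_images(payload: dict) -> List[ScrapedImage]:
    """Convert the `data.images` array of a response into typed records."""
    return [
        ScrapedImage(
            url=item["url"],
            alt=item.get("alt", ""),
            width=item.get("width"),
            height=item.get("height"),
            role=item["role"],
        )
        for item in payload["data"]["images"]
    ]


def pixel_area(image: ScrapedImage) -> Optional[int]:
    """Return width * height, or None when either dimension is unknown."""
    if image.width is None or image.height is None:
        return None
    return image.width * image.height


# Trimmed copy of the sample response above (two images only).
sample = {
    "data": {
        "url": "https://stripe.com",
        "images": [
            {"url": "https://stripe.com/img/v3/home/social.png",
             "alt": "Stripe payment processing",
             "width": 1200, "height": 630, "role": "hero"},
            {"url": "https://stripe.com/favicon.svg",
             "alt": "", "width": None, "height": None, "role": "logo"},
        ],
    }
}

images = parse_images(sample)
```

With the sample above, `pixel_area(images[0])` is 756000 and `pixel_area(images[1])` is `None`, so sorting or filtering by size needs an explicit rule for images without dimensions.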

Code Examples

cURL

curl -G "https://api.orsa.dev/v1/web/scrape/images" \
  --data-urlencode "url=https://stripe.com" \
  -H "Authorization: Bearer YOUR_API_KEY"

TypeScript

const { data } = await client.web.scrapeImages({
  url: 'https://stripe.com',
});

console.log(data.count);               // 47
console.log(data.images[0].url);
console.log(data.images[0].role);      // "hero" | "logo" | "product" | "icon" | "decorative"

// Keep only the images classified as logos
const logos = data.images.filter(i => i.role === 'logo');

Python

res = client.web.scrape_images(url="https://stripe.com")
data = res["data"]

print(data["count"])
print(data["images"][0]["url"])
print(data["images"][0]["role"])

# Keep only the images classified as logos
logos = [i for i in data["images"] if i["role"] == "logo"]

Error Codes

Code	Status	Description
`INPUT_VALIDATION_ERROR`	400	Invalid or missing URL
`UNAUTHORIZED`	401	Missing or invalid API key
`RATE_LIMITED`	429	Rate limit exceeded
`INTERNAL_ERROR`	500	Server error during extraction
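One possible way to act on these status codes is to retry with exponential backoff. The sketch below assumes that 429 and 500 are transient and worth retrying, while 400 and 401 are not; that policy is an assumption, not something the API specifies. `call_with_retry`, `ScrapeRequestError`, and the `send` callable are hypothetical names. `send` stands in for whatever performs the HTTP request and returns a `(status, body)` pair.

```python
import time
from typing import Any, Callable, Tuple

# Assumed retry policy: RATE_LIMITED (429) and INTERNAL_ERROR (500) are retried;
# INPUT_VALIDATION_ERROR (400) and UNAUTHORIZED (401) fail immediately.
RETRYABLE_STATUSES = {429, 500}


class ScrapeRequestError(Exception):
    """Raised when a request fails and will not be retried (hypothetical)."""

    def __init__(self, status: int):
        super().__init__(f"request failed with HTTP {status}")
        self.status = status


def call_with_retry(
    send: Callable[[], Tuple[int, Any]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `send` until it returns HTTP 200 or the attempts run out.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        is_last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or is_last_attempt:
            raise ScrapeRequestError(status)
        sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. In production the default `time.sleep` applies.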

Notes

Images are extracted after JavaScript rendering, so dynamically loaded images are included. If the browser pool is unreachable, the endpoint falls back to a plain fetch.
The role field is a heuristic: logo for images in <header>/<nav> or with logo-like alt text, icon for small square images (≤64×64), hero for large above-the-fold images, product for images inside product/item/card containers, and decorative otherwise.
<img src>, <img srcset> (highest-density variant), and <picture><source srcset> are all picked up. Data URIs are skipped.
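To make the role heuristic concrete, here is a rough client-side approximation of the rules described in the notes. It is not the service's actual implementation. The rule order, the 600-pixel width cutoff for "large", the strict width-equals-height test for "square", and the `ImageContext` and `classify_role` names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

HERO_MIN_WIDTH = 600  # assumed cutoff for "large"; the real threshold is not documented
ICON_MAX_SIDE = 64    # from the notes: icons are squares of at most 64x64


@dataclass
class ImageContext:
    """What a classifier might know about one image (hypothetical structure)."""
    alt: str = ""
    width: Optional[int] = None
    height: Optional[int] = None
    ancestor_tags: List[str] = field(default_factory=list)     # e.g. ["header", "div"]
    ancestor_classes: List[str] = field(default_factory=list)  # e.g. ["product-card"]
    above_the_fold: bool = False


def classify_role(ctx: ImageContext) -> str:
    """Apply the documented rules in an assumed order of precedence."""
    in_header_or_nav = any(tag in ("header", "nav") for tag in ctx.ancestor_tags)
    if in_header_or_nav or "logo" in ctx.alt.lower():
        return "logo"
    if ctx.width is not None and ctx.height is not None:
        if ctx.width == ctx.height and ctx.width <= ICON_MAX_SIDE:
            return "icon"
        if ctx.above_the_fold and ctx.width >= HERO_MIN_WIDTH:
            return "hero"
    in_product_container = any(
        keyword in css_class.lower()
        for css_class in ctx.ancestor_classes
        for keyword in ("product", "item", "card")
    )
    if in_product_container:
        return "product"
    return "decorative"
```

Because the classification is heuristic, treat `role` as a hint: filtering on it is convenient, but results for unusual page layouts may need a manual check.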
