
Transaction Categorization

Enrich financial transactions with brand data — logos, merchant names, and industry classification — to build better banking and expense management experiences.

The Problem

Bank transaction descriptors are cryptic. STRIPE* VERCEL INC 800-555-0199 CA means nothing to most users. Fintech apps need to decode these into recognizable brands.

The Solution

Use Orsa’s Transaction Identifier to match descriptors to brands, then enrich with logos, colors, and NAICS codes for categorization.

Pipeline

Raw Transaction → Orsa Transaction ID → Brand Data → Enriched Transaction
       ↓                    ↓                ↓               ↓
 "STRIPE* VERCEL"    domain: vercel.com   logo: ✓      "Vercel - Software"
                     similarity: 0.92     color: #000   category: "Technology"

Implementation

Single Transaction

import Orsa from '@orsa.dev/sdk';
 
const orsa = new Orsa({ apiKey: process.env.ORSA_API_KEY! });
 
async function enrichTransaction(descriptor: string) {
  const result = await orsa.brand.transactionIdentifier({
    transactionInfo: descriptor,
  });
 
  if (!result.match) {
    return { descriptor, cleaned: result.cleaned, match: null };
  }
 
  // Pull the full brand record for logo/colors/etc.
  const brand = await orsa.brand.retrieveByDomain({ domain: result.match.domain });
 
  return {
    descriptor,
    cleaned: result.cleaned,
    merchant: result.match.name,
    domain: result.match.domain,
    description: result.match.description,
    logo: brand.logos?.primary,
    primaryColor: brand.colors?.primary,
    industries: brand.industries,
    similarity: result.match.similarity,
  };
}
 
// Usage
const enriched = await enrichTransaction('STRIPE* VERCEL INC 800-555-0199 CA');

Python

import os

from orsa import Orsa
 
orsa = Orsa(api_key=os.environ["ORSA_API_KEY"])
 
def enrich_transaction(descriptor: str) -> dict:
    result = orsa.brand.transaction_identifier(transaction_info=descriptor)
    if not result.match:
        return {"descriptor": descriptor, "cleaned": result.cleaned, "match": None}
 
    brand = orsa.brand.retrieve_by_domain(domain=result.match.domain)
 
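    # Include the raw and cleaned descriptors so both branches share these keys.
    return {
        "descriptor": descriptor,
        "cleaned": result.cleaned,
        "merchant": result.match.name,
        "domain": result.match.domain,
        "logo": (brand.logos or {}).get("primary"),
        "primary_color": (brand.colors or {}).get("primary"),
        "industries": brand.industries,
        "similarity": result.match.similarity,
    }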

Batch Processing

async function enrichBatch(transactions: Array<{ id: string; descriptor: string }>) {
  const results = await Promise.allSettled(
    transactions.map(async (tx) => ({
      id: tx.id,
      ...(await enrichTransaction(tx.descriptor)),
    }))
  );
 
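  // Keep fulfilled results only; rejected lookups are dropped silently.
  // flatMap lets TypeScript narrow each result before reading r.value.
  return results.flatMap((r) => (r.status === 'fulfilled' ? [r.value] : []));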
}
 
// Process in chunks to respect rate limits
async function processAll(transactions: Array<{ id: string; descriptor: string }>) {
  const chunkSize = 20;
  const results = [];
 
  for (let i = 0; i < transactions.length; i += chunkSize) {
    const chunk = transactions.slice(i, i + chunkSize);
    const enriched = await enrichBatch(chunk);
    results.push(...enriched);
 
    if (i + chunkSize < transactions.length) {
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
 
  return results;
}
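
enrichBatch above discards rejected lookups, which is fine for display but loses track of which transactions still need enrichment. The following is a minimal sketch of one way to keep failures paired with their inputs so they can be retried later. The helper name settlePartition and its return shape are illustrative assumptions, not part of the Orsa SDK.

```typescript
// Hypothetical helper: run `work` for every input and split the outcomes.
// Failures stay paired with the input that caused them.
async function settlePartition<I, O>(
  inputs: I[],
  work: (input: I) => Promise<O>,
): Promise<{ succeeded: O[]; failed: Array<{ input: I; reason: unknown }> }> {
  // The async wrapper turns synchronous throws into rejections as well.
  const settled = await Promise.allSettled(inputs.map(async (input) => work(input)));

  const succeeded: O[] = [];
  const failed: Array<{ input: I; reason: unknown }> = [];

  settled.forEach((outcome, index) => {
    if (outcome.status === 'fulfilled') {
      succeeded.push(outcome.value);
    } else {
      failed.push({ input: inputs[index], reason: outcome.reason });
    }
  });

  return { succeeded, failed };
}
```

Inside processAll, a chunk could then be handled with settlePartition(chunk, (tx) => enrichTransaction(tx.descriptor)), and the failed inputs queued for a later pass.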

Adding NAICS Classification

For expense categorization, combine transaction identification with NAICS:

async function categorizeTransaction(descriptor: string) {
  const result = await orsa.brand.transactionIdentifier({ transactionInfo: descriptor });
  // Use the same expenseCategory key as the matched branch below.
  if (!result.match) return { descriptor, expenseCategory: 'Unknown' };
 
  const naics = await orsa.brand.naics({ domain: result.match.domain });
 
  return {
    merchant: result.match.name,
    industries: naics.industries,
    primaryIndustry: naics.primary_industry,
    expenseCategory: mapIndustryToCategory(naics.primary_industry),
  };
}
 
function mapIndustryToCategory(industry: string | null): string {
  if (!industry) return 'Other';
  const lower = industry.toLowerCase();
  if (lower.includes('software') || lower.includes('saas')) return 'Software & Technology';
  if (lower.includes('financial')) return 'Financial Services';
  if (lower.includes('restaurant') || lower.includes('food')) return 'Food & Dining';
  if (lower.includes('retail')) return 'Shopping';
  return 'Other';
}
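
Keyword matching on industry names is easy to read but brittle. If a numeric NAICS code is available for the merchant, a prefix lookup is one alternative. The sketch below assumes such a code string exists; the source does not show whether the NAICS response exposes one, so the input and the function name mapNaicsCodeToCategory are hypothetical. The prefixes follow the standard NAICS sector and subsector numbering, and the category labels reuse the ones above.

```typescript
// NAICS prefixes mapped to the expense categories used in this guide.
// Longer prefixes take precedence, so 722 (food services) can be singled
// out of sector 72, which also covers accommodation.
const PREFIX_CATEGORIES: Record<string, string> = {
  '44': 'Shopping',              // Retail Trade
  '45': 'Shopping',              // Retail Trade
  '51': 'Software & Technology', // Information (broader than software alone)
  '52': 'Financial Services',    // Finance and Insurance
  '722': 'Food & Dining',        // Food Services and Drinking Places
};

// Hypothetical input: a NAICS code string such as "722511".
function mapNaicsCodeToCategory(code: string | null | undefined): string {
  if (!code) return 'Other';
  const trimmed = code.trim();
  // Try the 3-digit subsector first, then fall back to the 2-digit sector.
  for (const length of [3, 2]) {
    const category = PREFIX_CATEGORIES[trimmed.slice(0, length)];
    if (category) return category;
  }
  return 'Other';
}
```

Keeping the mapping in a plain object means new categories can be added without touching the lookup logic.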

Credit Cost

Operation                          Endpoint                               Credits
Identify transaction               GET /v1/brand/transaction-identifier   10
Retrieve by domain (optional)      GET /v1/brand/retrieve-by-domain       10
NAICS classification (optional)    GET /v1/brand/naics                    5
Per fully-enriched transaction                                            25 credits
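
To budget for a batch, the per-call costs above can be multiplied out. This is a rough sketch that assumes every transaction triggers each selected call and that each call is billed at the listed rate; the function and option names are illustrative, not part of the SDK.

```typescript
// Credit costs copied from the table above.
const CREDITS = { identify: 10, retrieveByDomain: 10, naics: 5 } as const;

interface EnrichmentOptions {
  brandDetails?: boolean; // also call retrieve-by-domain
  naics?: boolean;        // also call the NAICS endpoint
}

// Upper-bound estimate: identification is always counted,
// the optional lookups only when requested.
function estimateCredits(transactionCount: number, options: EnrichmentOptions = {}): number {
  const perTransaction =
    CREDITS.identify +
    (options.brandDetails ? CREDITS.retrieveByDomain : 0) +
    (options.naics ? CREDITS.naics : 0);
  return transactionCount * perTransaction;
}

// 1,000 fully enriched transactions: 1000 * (10 + 10 + 5) = 25,000 credits.
const fullCost = estimateCredits(1000, { brandDetails: true, naics: true });
```

In the code samples above, unmatched descriptors return before the optional lookups, so actual usage for a real batch may come in below this estimate.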

Tips

  • Cache results by normalized descriptor in your database. The same merchant string will appear repeatedly — result.cleaned is the normalized form, perfect as a cache key.
  • Handle low-similarity matches (< 0.7) by showing the raw descriptor alongside the brand match, or by hiding the brand match entirely.
  • Use result.candidates to surface alternative matches when the top match is borderline.
  • Batch process historical transactions during off-peak hours to avoid rate limits.
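
The first two tips can be sketched as follows. The types list only the fields used here and are assumptions about the response shape; displayLabel and createBrandCache are hypothetical helpers, and the in-memory Map stands in for whatever database you use. Because the cleaned descriptor is only known after identification, this sketch caches the follow-up brand lookup rather than the identification call itself.

```typescript
// Hypothetical minimal shapes; the real SDK types may carry more fields.
interface Match { name: string; domain: string; similarity: number }
interface Identified { cleaned: string; match: Match | null }

const MIN_CONFIDENT_SIMILARITY = 0.7; // threshold from the tip above

// Choose the label to show: the brand name for confident matches, the raw
// descriptor with a hint for borderline ones, the raw descriptor otherwise.
function displayLabel(descriptor: string, result: Identified): string {
  if (!result.match) return descriptor;
  if (result.match.similarity < MIN_CONFIDENT_SIMILARITY) {
    return `${descriptor} (possibly ${result.match.name})`;
  }
  return result.match.name;
}

// Cache follow-up lookups keyed by the cleaned descriptor.
// `load` is whatever fetches brand data for a domain.
function createBrandCache<T>(load: (domain: string) => Promise<T>) {
  const store = new Map<string, Promise<T>>();
  return async function getBrand(result: Identified): Promise<T | null> {
    if (!result.match) return null;
    const key = result.cleaned;
    let pending = store.get(key);
    if (!pending) {
      pending = load(result.match.domain);
      store.set(key, pending);
      // Evict failed lookups so a later call can retry.
      pending.catch(() => store.delete(key));
    }
    return pending;
  };
}
```

Storing the promise rather than the resolved value means concurrent requests for the same descriptor share one lookup.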