Transaction Categorization
Enrich financial transactions with brand data — logos, merchant names, and industry classification — to build better banking and expense management experiences.
The Problem
Bank transaction descriptors are cryptic. A string like "STRIPE* VERCEL INC 800-555-0199 CA" means nothing to most users. Fintech apps need to decode these into recognizable brands.
The Solution
Use Orsa’s Transaction Identifier to match descriptors to brands, then enrich with logos, colors, and NAICS codes for categorization.
Pipeline
Raw Transaction   →  Orsa Transaction ID  →  Brand Data    →  Enriched Transaction
       ↓                      ↓                   ↓                     ↓
"STRIPE* VERCEL"      domain: vercel.com     logo: ✓           "Vercel - Software"
                      similarity: 0.92       color: #000       category: "Technology"
Implementation
Single Transaction
import Orsa from '@orsa.dev/sdk';
const orsa = new Orsa({ apiKey: process.env.ORSA_API_KEY! });
async function enrichTransaction(descriptor: string) {
  const result = await orsa.brand.transactionIdentifier({
    transactionInfo: descriptor,
  });
  if (!result.match) {
    return { descriptor, cleaned: result.cleaned, match: null };
  }
  // Pull the full brand record for logo/colors/etc.
  const brand = await orsa.brand.retrieveByDomain({ domain: result.match.domain });
  return {
    descriptor,
    cleaned: result.cleaned,
    merchant: result.match.name,
    domain: result.match.domain,
    description: result.match.description,
    logo: brand.logos?.primary,
    primaryColor: brand.colors?.primary,
    industries: brand.industries,
    similarity: result.match.similarity,
  };
}
// Usage
const enriched = await enrichTransaction('STRIPE* VERCEL INC 800-555-0199 CA');
Python
import os
from orsa import Orsa
orsa = Orsa(api_key=os.environ["ORSA_API_KEY"])
def enrich_transaction(descriptor: str) -> dict:
    result = orsa.brand.transaction_identifier(transaction_info=descriptor)
    if not result.match:
        return {"descriptor": descriptor, "cleaned": result.cleaned, "match": None}
    # Pull the full brand record for logo/colors/etc.
    brand = orsa.brand.retrieve_by_domain(domain=result.match.domain)
    return {
        "descriptor": descriptor,
        "cleaned": result.cleaned,
        "merchant": result.match.name,
        "domain": result.match.domain,
        "logo": (brand.logos or {}).get("primary"),
        "primary_color": (brand.colors or {}).get("primary"),
        "industries": brand.industries,
        "similarity": result.match.similarity,
    }
Batch Processing
async function enrichBatch(transactions: Array<{ id: string; descriptor: string }>) {
  const results = await Promise.allSettled(
    transactions.map(async (tx) => ({
      id: tx.id,
      ...(await enrichTransaction(tx.descriptor)),
    }))
  );
  // Keep only the successful lookups; rejected ones are dropped here.
  // The ternary narrows each result, so `r.value` type-checks.
  return results.flatMap((r) => (r.status === 'fulfilled' ? [r.value] : []));
}
// Process in chunks to respect rate limits
async function processAll(transactions: Array<{ id: string; descriptor: string }>) {
  const chunkSize = 20;
  const results: Awaited<ReturnType<typeof enrichBatch>> = [];
  for (let i = 0; i < transactions.length; i += chunkSize) {
    const chunk = transactions.slice(i, i + chunkSize);
    const enriched = await enrichBatch(chunk);
    results.push(...enriched);
    // Pause for one second between chunks, but not after the last one.
    if (i + chunkSize < transactions.length) {
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
  return results;
}
Adding NAICS Classification
For expense categorization, combine transaction identification with NAICS:
async function categorizeTransaction(descriptor: string) {
  const result = await orsa.brand.transactionIdentifier({ transactionInfo: descriptor });
  if (!result.match) return { descriptor, category: 'Unknown' };
  const naics = await orsa.brand.naics({ domain: result.match.domain });
  return {
    merchant: result.match.name,
    industries: naics.industries,
    primaryIndustry: naics.primary_industry,
    expenseCategory: mapIndustryToCategory(naics.primary_industry),
  };
}
function mapIndustryToCategory(industry: string | null): string {
  if (!industry) return 'Other';
  const lower = industry.toLowerCase();
  if (lower.includes('software') || lower.includes('saas')) return 'Software & Technology';
  if (lower.includes('financial')) return 'Financial Services';
  if (lower.includes('restaurant') || lower.includes('food')) return 'Food & Dining';
  if (lower.includes('retail')) return 'Shopping';
  return 'Other';
}
Credit Cost
| Operation | Endpoint | Credits |
|---|---|---|
| Identify transaction | GET /v1/brand/transaction-identifier | 10 |
| Retrieve by domain (optional) | GET /v1/brand/retrieve-by-domain | 10 |
| NAICS classification (optional) | GET /v1/brand/naics | 5 |
| Per fully enriched transaction (all three calls) | | 25 |
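To budget a backfill, the table can be turned into a rough estimate. The sketch below is only an illustration: the helper name estimateCredits and its options are hypothetical, and it assumes that the identify call is charged for every descriptor while the two optional calls are made only when a match is found (as in the enrichTransaction and categorizeTransaction examples).

```typescript
// Credit prices copied from the table above.
const CREDITS = { identify: 10, retrieveByDomain: 10, naics: 5 } as const;

// Hypothetical options for the estimate; none of these names come from the Orsa SDK.
interface EstimateOptions {
  transactions: number; // number of descriptors sent to the API
  matchRate?: number;   // assumed fraction (0..1) that resolve to a brand
  withBrand?: boolean;  // also call retrieve-by-domain on a match
  withNaics?: boolean;  // also call naics on a match
}

function estimateCredits(options: EstimateOptions): number {
  const { transactions, matchRate = 1, withBrand = true, withNaics = true } = options;
  if (matchRate < 0 || matchRate > 1) {
    throw new RangeError('matchRate must be between 0 and 1');
  }
  // Assumption: every descriptor costs one identify call, matched or not.
  const identifyCost = transactions * CREDITS.identify;
  // The follow-up calls only run for matched descriptors.
  const matched = Math.round(transactions * matchRate);
  const perMatch =
    (withBrand ? CREDITS.retrieveByDomain : 0) + (withNaics ? CREDITS.naics : 0);
  return identifyCost + matched * perMatch;
}

// 1,000 descriptors, 80% of which match, with both optional calls enabled.
console.log(estimateCredits({ transactions: 1000, matchRate: 0.8 })); // 22000
```

Caching repeated descriptors lowers the transactions figure directly, so it is usually the first number worth measuring.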
Tips
- Cache results by normalized descriptor in your database. The same merchant string will appear repeatedly; result.cleaned is the normalized form, which makes it a good cache key.
- Handle low-similarity matches (< 0.7) by showing the raw descriptor alongside the brand match, or by hiding the brand match entirely.
- Use result.candidates to surface alternative matches when the top match is borderline.
- Batch process historical transactions during off-peak hours to avoid rate limits.
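The first three tips can be sketched as a small display policy plus a cache keyed on the cleaned descriptor. Everything named here is hypothetical: the BrandMatch and IdentifyResult shapes only mirror the fields this guide reads from the API result, result.candidates is assumed to hold entries shaped like result.match, and DescriptorCache is an in-memory stand-in for a database table.

```typescript
// Hypothetical shapes, modeled on the fields used elsewhere in this guide.
interface BrandMatch {
  name: string;
  domain: string;
  similarity: number;
}

interface IdentifyResult {
  cleaned: string;
  match: BrandMatch | null;
  candidates?: BrandMatch[]; // assumed to share the shape of `match`
}

type Display =
  | { kind: 'brand'; title: string }
  | { kind: 'brand-with-raw'; title: string; raw: string; alternatives: string[] }
  | { kind: 'raw'; title: string };

const LOW_SIMILARITY = 0.7; // threshold from the tips above

// Decide what to show the user for one identified transaction.
function chooseDisplay(descriptor: string, result: IdentifyResult): Display {
  const { match } = result;
  if (!match) return { kind: 'raw', title: descriptor };
  if (match.similarity >= LOW_SIMILARITY) return { kind: 'brand', title: match.name };
  // Borderline: keep the raw descriptor visible and offer the other candidates.
  const alternatives = (result.candidates ?? [])
    .filter((candidate) => candidate.domain !== match.domain)
    .map((candidate) => candidate.name);
  return { kind: 'brand-with-raw', title: match.name, raw: descriptor, alternatives };
}

// In-memory stand-in for a table keyed by the normalized descriptor.
class DescriptorCache {
  private readonly entries = new Map<string, IdentifyResult>();

  get(cleaned: string): IdentifyResult | undefined {
    return this.entries.get(cleaned);
  }

  put(result: IdentifyResult): void {
    this.entries.set(result.cleaned, result);
  }
}
```

This version shows the raw descriptor next to a weak match. Hiding the brand entirely, the other option in the tips, would mean returning the 'raw' variant from the low-similarity branch instead.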