exec-job-board — multi-source data aggregation pipeline

Four divergent public APIs normalized behind one Pydantic schema, deduped by content hash, served as a zero-backend static site. Refreshed daily by cron.

Python · httpx · Pydantic v2 · Next.js 16 · Fuse.js · GitHub Actions · Dokku

Live demo

Sources: 4 public APIs, one schema
Pipeline cadence: daily 06:00 UTC
Search latency: < 200 ms client-side
Runtime cost: $0 (SSG)

The problem

Four third-party APIs, four response schemas, four rate-limit and auth models. This is the canonical glue code nobody wants to maintain, and the requirement of no always-on backend makes it harder.

The solution

One adapter per source isolates response-shape drift. Pydantic v2 enforces the unified schema at the seam. A SHA-256 hash over (title, employer, location, posted_date) identifies duplicates across sources. A GitHub Actions cron job runs the collector daily, commits the JSON, and redeploys the static site. The site falls back to a curated 30-row seed when the API output is missing, so the demo never breaks.
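The content-hash dedupe step can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Job` dataclass is a standard-library stand-in for the Pydantic model, and the `content_hash` and `dedupe` names, the lowercase/whitespace normalization, and the choice of field separator are all assumptions.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Stand-in for the unified schema (hypothetical field set)."""
    title: str
    employer: str
    location: str
    posted_date: str  # ISO 8601 date string, e.g. "2025-01-15"
    source: str       # not part of the hash, so cross-source copies collide


def content_hash(job: Job) -> str:
    """SHA-256 over (title, employer, location, posted_date)."""
    parts = (job.title, job.employer, job.location, job.posted_date)
    # Assumption: lowercase and collapse whitespace so trivial formatting
    # differences between sources do not produce different hashes.
    normalized = (" ".join(p.lower().split()) for p in parts)
    # Join with the ASCII unit separator so field boundaries stay unambiguous.
    canonical = "\x1f".join(normalized)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def dedupe(jobs: list[Job]) -> list[Job]:
    """Keep the first job seen for each content hash, preserving order."""
    seen: dict[str, Job] = {}
    for job in jobs:
        seen.setdefault(content_hash(job), job)
    return list(seen.values())
```

Leaving `source` out of the hashed tuple is what lets the same listing from two APIs collapse into one row. Which copy survives is a policy choice; this sketch keeps the first one encountered.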

Overview

An automated daily pipeline collects executive-tier listings from four public data APIs (JSearch, Adzuna, The Muse, USAJobs), normalizes them into a unified Pydantic schema, deduplicates them via SHA-256 content hashing, and emits a single `jobs.json` that a Next.js static site consumes at build time. In the browser, Fuse.js provides fuzzy search and 4-dimension filtering with sub-200 ms response and zero runtime backend cost.
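The adapter-per-source pattern might look something like the sketch below. The payload shapes are invented for illustration and are not the real JSearch, Adzuna, The Muse, or USAJobs responses. The `adapt_source_a`, `adapt_source_b`, `collect`, and `to_json` names are hypothetical, and plain dictionaries with a required-field check stand in for Pydantic validation.

```python
import json

# Fields every normalized row must carry with a non-empty value.
REQUIRED = ("title", "employer", "location", "posted_date")


def adapt_source_a(raw: dict) -> dict:
    """Map a hypothetical flat payload onto the unified shape."""
    return {
        "title": raw["job_title"],
        "employer": raw["company"],
        "location": raw["city"],
        "posted_date": raw["posted_at"][:10],  # keep only YYYY-MM-DD
        "source": "source_a",
    }


def adapt_source_b(raw: dict) -> dict:
    """Map a hypothetical nested payload onto the unified shape."""
    return {
        "title": raw["name"],
        "employer": raw["org"]["display_name"],
        "location": raw["where"],
        "posted_date": raw["created"],
        "source": "source_b",
    }


ADAPTERS = {"source_a": adapt_source_a, "source_b": adapt_source_b}


def collect(payloads: dict[str, list[dict]]) -> list[dict]:
    """Run each source's rows through its adapter, skipping malformed rows."""
    jobs = []
    for source, rows in payloads.items():
        adapt = ADAPTERS[source]
        for raw in rows:
            try:
                job = adapt(raw)
            except (KeyError, TypeError):
                continue  # response shape drifted; drop the row, keep the run
            if all(job[key] for key in REQUIRED):
                jobs.append(job)
    return jobs


def to_json(jobs: list[dict]) -> str:
    """Serialize deterministically so daily commits produce small diffs."""
    return json.dumps(jobs, indent=2, sort_keys=True)
```

Because each adapter is the only place that knows its source's field names, a schema change upstream touches one function. Writing the result of `to_json` to `jobs.json` would be the final step of such a collector.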