Logan Jones · updated jul 2026

PullFirst

2026 · live in production · pullfirst.com

Hiring a contractor in Minnesota means trusting scattered public records: state licensing, enforcement actions, OSHA history, permit systems in 50+ cities, EPA compliance. Each lives in its own silo, in its own format. PullFirst pulls them into one searchable profile per contractor. Consumer site in front, B2B API behind it.

What it holds

1.2M building permits from 50+ jurisdictions
432K permit inspections
270K contractor licenses (MN DLI)
265K OSHA violations across 122K inspections
1,143 DLI enforcement actions, $8M+ in penalties
5,145 Google Places profiles with review data

What it does

  • One profile per contractor: license status, enforcement history, OSHA record, permit activity, reviews.
  • Fuzzy search across every dataset. Related-contractor detection links relaunched businesses through shared addresses and phone numbers: the same crew under a new LLC stays visible.
  • Live permit queries against city ArcGIS APIs. Statewide permit map with time-window filters. Address lookup: every permit on file along a street, with the contractor behind each one.
  • Accounts: saved properties, favorites, saved searches.
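
A live permit query can be sketched as a URL builder. The layer URL and the ISSUE_DATE field name below are hypothetical; where, outFields, returnGeometry, resultRecordCount, and f are standard ArcGIS REST feature-layer query parameters. This is a minimal sketch under those assumptions, not PullFirst's actual client.

```python
from datetime import date
from urllib.parse import urlencode

def build_permit_query(layer_url: str, since: date,
                       date_field: str = "ISSUE_DATE", limit: int = 100) -> str:
    """Build a feature-layer query URL for permits issued on or after `since`."""
    params = {
        # date_field is a hypothetical column name; each city names its own.
        "where": f"{date_field} >= DATE '{since.isoformat()}'",
        "outFields": "*",
        "returnGeometry": "true",
        "resultRecordCount": limit,
        "f": "json",
    }
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"

# Hypothetical layer URL, for illustration only.
url = build_permit_query(
    "https://gis.example-city.gov/arcgis/rest/services/Permits/FeatureServer/0",
    since=date(2026, 1, 1),
)
```

Fetching a URL like this returns JSON with a features array; mapping each city's attribute names onto one permit shape is left to the per-platform code.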

The pipeline

50+ jurisdictions, no shared permit system. Cities buy from different vendors or run their own portals; each speaks its own dialect of HTML, JSON, or ArcGIS. Every platform gets its own scraper: 30+ jobs orchestrated by an ops server, chained so downstream imports fire when upstream collection lands, each streaming logs over SSE and leaving an audit trail.

accela · bs&a · citizenserve · cityview · energov · esuite · ims · iworq · logis · arcgis · custom city portals

Everything scraped flows through normalization: names, streets, cities, contacts. A hand-built address grammar knows Minnesota’s compound city names, so “St Paul” and “Saint Paul” resolve to the same place and “Inver Grove Heights” survives parsing intact.
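
A minimal sketch of the city-name step, assuming a small lookup of known variants. The real grammar is hand-built and far larger; CANONICAL, normalize_city, and split_street_city are hypothetical names, and the sample addresses are made up.

```python
# Hypothetical variant table: lowercase spelling -> canonical city name.
CANONICAL = {
    "st paul": "Saint Paul",
    "st. paul": "Saint Paul",
    "saint paul": "Saint Paul",
    "inver grove heights": "Inver Grove Heights",
    "brooklyn park": "Brooklyn Park",
}

# Longest variants first, so a compound name wins over any shorter suffix.
_BY_LENGTH = sorted(CANONICAL, key=len, reverse=True)

def normalize_city(raw: str) -> str:
    """Collapse case and whitespace, then map known variants to one spelling."""
    key = " ".join(raw.lower().split())
    return CANONICAL.get(key, key.title())

def split_street_city(line: str) -> tuple[str, str]:
    """Split 'street city' text by matching a known city at the end of the line."""
    cleaned = " ".join(line.split())
    lowered = cleaned.lower()
    for variant in _BY_LENGTH:
        if lowered.endswith(" " + variant):
            return cleaned[: -len(variant)].rstrip(), CANONICAL[variant]
    # Fallback assumption: an unknown city is the last single word.
    street, _, last = cleaned.rpartition(" ")
    return street, normalize_city(last)
```

Matching known names from the end of the line is what keeps a three-word city from being split at its last space.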

Identity is the hard part. The same contractor appears as a license number in one dataset, a business name in another, a phone number in a third. The resolver builds identity keys from normalized addresses and phone digits, then clusters records with union-find (path compression, union by rank). A match engine scores name candidates from exact through prefix-stripped; every match carries a confidence grade and the signals that produced it. The same shared-key graph links a fresh LLC back to the business that dissolved at the same address and phone number.
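
The clustering step can be sketched as follows. This is a minimal illustration, assuming records are plain dicts with id, phone, and address fields and that addresses have already been normalized; identity_keys and cluster are hypothetical names, and the sample records are invented.

```python
import re
from collections import defaultdict

class UnionFind:
    """Disjoint sets with path compression and union by rank."""
    def __init__(self):
        self.parent, self.rank = {}, {}

    def find(self, x):
        if x not in self.parent:
            self.parent[x], self.rank[x] = x, 0
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path straight at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1

def identity_keys(record):
    """Build keys from phone digits and a whitespace-collapsed address."""
    keys = []
    digits = re.sub(r"\D", "", record.get("phone") or "")
    if len(digits) >= 10:
        keys.append("phone:" + digits[-10:])
    address = " ".join((record.get("address") or "").lower().split())
    if address:
        keys.append("addr:" + address)
    return keys

def cluster(records):
    """Group record ids that share any identity key, directly or transitively."""
    uf = UnionFind()
    for rec in records:
        node = ("rec", rec["id"])
        uf.find(node)  # register records that have no keys at all
        for key in identity_keys(rec):
            uf.union(node, ("key", key))
    groups = defaultdict(set)
    for rec in records:
        groups[uf.find(("rec", rec["id"]))].add(rec["id"])
    return sorted(sorted(ids) for ids in groups.values())

records = [
    {"id": "lic-1", "phone": "(612) 555-0101", "address": "100 Main St, Saint Paul"},
    {"id": "permit-7", "phone": "612.555.0101", "address": ""},
    {"id": "llc-new", "phone": "", "address": "100 main st,  saint paul"},
    {"id": "other", "phone": "651-555-0199", "address": "9 Oak Ave, Duluth"},
]
clusters = cluster(records)
```

Here the new LLC shares no phone with the permit record, yet both land in one cluster because each shares a key with the license record.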

The ops layer

None of it runs by hand. A local ops server owns the fleet: a scheduler submits runs on per-job cadences, a chainer fires downstream imports the moment upstream collection lands, and retry policies decide what a failure means before a human has to.
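
A rough sketch of the chaining and retry decisions, assuming each job declares the upstream jobs it waits on. Job, Chainer, the job names, and the retry rule are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    upstream: tuple[str, ...] = ()  # jobs that must land before this one fires
    max_retries: int = 2

@dataclass
class Chainer:
    jobs: dict[str, Job]
    landed: set[str] = field(default_factory=set)
    attempts: dict[str, int] = field(default_factory=dict)

    def on_success(self, name: str) -> list[str]:
        """Record a landed run and return the downstream jobs that are now ready."""
        self.landed.add(name)
        return sorted(
            job.name for job in self.jobs.values()
            if name in job.upstream
            and job.name not in self.landed
            and all(up in self.landed for up in job.upstream)
        )

    def on_failure(self, name: str, transient: bool) -> str:
        """Decide what a failure means: retry automatically, or hold for a human."""
        self.attempts[name] = self.attempts.get(name, 0) + 1
        if transient and self.attempts[name] <= self.jobs[name].max_retries:
            return "retry"
        return "hold"

# Hypothetical job graph: two collectors feed one import, which feeds resolution.
jobs = {j.name: j for j in [
    Job("scrape_accela"),
    Job("scrape_arcgis"),
    Job("import_permits", upstream=("scrape_accela", "scrape_arcgis")),
    Job("resolve_entities", upstream=("import_permits",)),
]}
chainer = Chainer(jobs)
```

With this graph, import_permits becomes ready only after both collectors report success, and a non-transient failure goes straight to "hold".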

  • A dashboard over the whole fleet: every job, every run, logs streaming live over SSE, full history with audit trails.
  • Materialization tracking: every table traces back to the run that built it.
  • One briefing endpoint summarizes pipeline state: what ran, what failed, what’s stale. The first thing checked every morning.
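
The briefing in the last bullet could be computed along these lines. A minimal sketch, assuming each job's latest run is known as a (status, finished_at) pair and that "stale" means no run within twice the job's cadence; both the data shape and the staleness rule are assumptions, and the job names are invented.

```python
from datetime import datetime, timedelta

def briefing(last_runs, cadences, now):
    """Summarize pipeline state from each job's most recent run.

    last_runs: job name -> (status, finished_at)
    cadences:  job name -> expected interval between runs
    """
    report = {"ran": [], "failed": [], "stale": []}
    for job, cadence in sorted(cadences.items()):
        run = last_runs.get(job)
        # Assumed rule: never run, or last run older than 2x cadence, is stale.
        if run is None or now - run[1] > 2 * cadence:
            report["stale"].append(job)
        elif run[0] == "failed":
            report["failed"].append(job)
        else:
            report["ran"].append(job)
    return report

now = datetime(2026, 7, 1, 8, 0)
cadences = {
    "scrape_accela": timedelta(days=1),
    "import_permits": timedelta(days=1),
    "osha_sync": timedelta(days=7),
    "dli_licenses": timedelta(days=1),
}
last_runs = {
    "scrape_accela": ("ok", now - timedelta(hours=3)),
    "import_permits": ("failed", now - timedelta(hours=2)),
    "osha_sync": ("ok", now - timedelta(days=20)),
}
report = briefing(last_runs, cadences, now)
```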

The ETL layer underneath is Python end to end: collection, normalization, the address grammar, entity resolution, imports. Local tooling, production data; the same runs that build pullfirst.com.

How it’s built

.NET 10 API on Fly.io with auto-suspend. Next.js frontend on Vercel. Postgres on Neon. The pipeline is Python, driven from an ops dashboard with full job history.

Dynamic filtering and sorting on every endpoint. Rate limiting global and per-route. CI/CD through GitHub Actions.

The hard parts

  • Identity errors cut both ways: merge two unrelated contractors and one wears the other’s enforcement record; miss a merge and a bad actor’s history disappears behind a new LLC. Scoring, confidence grades, and match signals exist so every merge is explainable.
  • Keeping 30+ scrapers healthy against sources that change without notice. Hence the ops dashboard, streaming logs, and audit trails.
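
The explainable-merge idea from the first bullet can be sketched as a name matcher that returns a grade together with the signals behind it. The tiers, the letter grades, the prefix list, and the rule that a shared key lifts a weak match are all assumptions made for illustration; the source only says scoring runs from exact through prefix-stripped.

```python
import re
from dataclasses import dataclass

PREFIXES = {"the"}  # hypothetical list of leading tokens to ignore

@dataclass(frozen=True)
class Match:
    grade: str                 # "A" strongest, "C" weakest
    signals: tuple[str, ...]   # why the match was made

def _tokens(name: str) -> list[str]:
    """Lowercase the name and drop punctuation."""
    return re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()

def _strip_prefix(tokens: list[str]) -> list[str]:
    while tokens and tokens[0] in PREFIXES:
        tokens = tokens[1:]
    return tokens

def match_names(a: str, b: str, shared_keys: tuple[str, ...] = ()):
    """Return a Match with grade and signals, or None when the names differ."""
    ta, tb = _tokens(a), _tokens(b)
    if a == b:
        tier, grade = "exact", "A"
    elif ta == tb:
        tier, grade = "normalized", "B"
    elif _strip_prefix(ta) and _strip_prefix(ta) == _strip_prefix(tb):
        tier, grade = "prefix_stripped", "C"
    else:
        return None
    signals = [f"name:{tier}"] + [f"shared:{k}" for k in shared_keys]
    # Assumption: a shared address or phone key lifts a weaker match one grade.
    if shared_keys and grade != "A":
        grade = chr(ord(grade) - 1)
    return Match(grade, tuple(signals))

m = match_names("The Acme Roofing Co.", "ACME ROOFING CO", shared_keys=("phone",))
```

Because every Match carries its signals, a reviewer can see why two records were merged and undo the merge if the evidence is thin.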

fig. 1 · contractor profile: license status, enforcement actions, penalties, and permit counts for a Minnesota contractor
fig. 2 · statewide permit map, permits clustered by jurisdiction