The Build
749 commits. 40 days. 2,146,580 documents.
One person. One database. The story of building Epstein Exposed from scratch.
The Rhythm
A heatmap of every commit across the build timeline: late nights, early mornings, and the marathon sessions that shaped the database.
Commits by Hour of Day
Most commits happened between 10 PM and 4 AM.
The Chapters
Six phases of development, from the first schema sketch to 2.1 million documents online.
Genesis
On February 5, 2026, the first commit landed in an empty repository. Within hours it wasn't empty anymore. A Neon Postgres database spun up, Prisma schemas locked into place, and full-text search went live before most people had finished their morning coffee. By the end of day one, 28 commits had already laid the foundation for what would become the largest searchable archive of the Epstein case files ever built. The week that followed was relentless. A 10-phase UI/UX overhaul reshaped every surface of the application while, behind the scenes, bulk ingestion pipelines swallowed 8,186 court documents, 1,659 emails, and biographical data for 1,000 named persons. A Wikidata pipeline attached 211 headshot images to person profiles. The email browser, document browser, flight map, and media gallery all shipped in rapid succession — each one a fully interactive tool, not a static list. By February 11, the site had a newsletter system on Resend, a PayPal donation flow, GitHub Actions CI with a weekly DOJ document checker, and 560 EFTA documents with OCR text extracted from a 221,000-row dataset. Seven days. One developer. The foundation was poured and already curing.
- Neon Postgres + Prisma ORM + full-text search
- 8,186 documents, 1,659 emails, 1,000 persons ingested
- 211 headshot images via Wikidata pipeline
- 560 EFTA documents with OCR text
- 10-phase UI/UX overhaul
- Email browser, document browser, flight map, media gallery
- Resend newsletter, PayPal donations, GitHub Actions CI
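The full-text search mentioned above ran on Postgres in production. As a self-contained sketch of the underlying idea — not the production tsvector query path — here is a minimal inverted index in TypeScript; the documents and ids below are invented for the example:

```typescript
// Minimal sketch of full-text search: an inverted index mapping normalized
// tokens to document ids. Production used Postgres tsvector/tsquery; this
// only illustrates the concept.

type DocId = number;

const index = new Map<string, Set<DocId>>();

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function addDocument(id: DocId, text: string): void {
  for (const token of tokenize(text)) {
    if (!index.has(token)) index.set(token, new Set());
    index.get(token)!.add(id);
  }
}

// AND-match every query term against the postings lists.
function search(query: string): DocId[] {
  const terms = tokenize(query);
  if (terms.length === 0) return [];
  let hits: Set<DocId> | undefined;
  for (const term of terms) {
    const posting = index.get(term) ?? new Set<DocId>();
    hits = hits
      ? new Set(Array.from(hits).filter((id) => posting.has(id)))
      : posting;
  }
  return Array.from(hits ?? []).sort((a, b) => a - b);
}

addDocument(1, "Deposition transcript, flight logs attached");
addDocument(2, "Email regarding flight schedule");
search("flight logs"); // → [1]
```

Postgres does the same thing at scale, with stemming and ranking layered on top.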
Acceleration
February 12 through 18 was the most intense week of the entire build — 205 commits in seven days, peaking at 48 on February 15 alone. That's a commit every 30 minutes, sustained across an entire Sunday, most of them shipping real features rather than fixing typos. The commit graph for this week looks less like software development and more like a seismograph during an earthquake. The output matched the velocity. The Wexner deposition and Maxwell Fifth Amendment transcripts went live alongside six previously unredacted names. A DOJ investigation revealed 892 documents had been quietly deleted from government servers — that became a blog post that would later drive hundreds of thousands of page views. The connection auto-generation system crunched the data and surfaced 51,000 relationships between persons, documents, flights, and emails. A vector search pipeline embedded 2.67 million text chunks for semantic retrieval. By the end of the week, the site had a full public API with 27 documented endpoints, a RAG-powered AI chat that could answer questions about the case files via tool calls, and 79 new persons identified from analysis of the DOJ's Section 305 filings. The platform had evolved from a document archive into an investigative tool.
- 205 commits — highest-volume week
- Wexner deposition + Maxwell Fifth Amendment + 6 unredacted names
- DOJ deleted-documents investigation (892 files)
- 51,000 auto-generated connections
- 2.67M embedding chunks for vector search
- API v2 with 27 endpoints
- RAG-powered AI chat with tool use
- 79 persons identified from DOJ 305 analysis
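The retrieval half of the RAG pipeline above boils down to ranking pre-embedded chunks by similarity to a query vector. A hedged sketch of that step, with toy three-dimensional vectors standing in for real embeddings (the chunk ids are invented):

```typescript
// Sketch of RAG retrieval: rank pre-embedded text chunks by cosine
// similarity to a query vector. Embedding model and chunking are out of
// scope; the vectors below are toy stand-ins, not real embeddings.

type Chunk = { id: string; vector: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Return the ids of the k chunks most similar to the query vector.
function topK(query: number[], chunks: Chunk[], k: number): string[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosine(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((c) => c.id);
}

const chunks: Chunk[] = [
  { id: "doc-12:p3", vector: [0.9, 0.1, 0.0] },
  { id: "email-7:p1", vector: [0.1, 0.9, 0.2] },
  { id: "doc-44:p9", vector: [0.8, 0.2, 0.1] },
];
topK([1, 0, 0], chunks, 2); // → ["doc-12:p3", "doc-44:p9"]
```

At 2.67 million chunks a linear scan like this gives way to an index (e.g. pgvector), but the ranking logic is the same.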
The Fortress
Week three marked a sharp pivot. The first two weeks had been about building as fast as possible. This week was about building something that could survive. Survivor privacy became the top priority: a three-phase redaction system scrubbed 16,924 instances of victim names across every document, email, and flight record in the database. Redacted victim profiles returned a proper 410 Gone status with a styled explanation page. A dedicated redaction request form gave survivors a direct, private channel to flag content. The legal infrastructure went up in parallel. A comprehensive legal demand response system shipped with a public policy page, email templates for various demand types, and an internal tracking system. This wasn't theoretical — real legal demands had already arrived, and the system needed to handle them with precision and transparency. OFAC, EU, and UN sanctions checking went live alongside a Politically Exposed Persons database, ensuring the platform could flag sanctioned individuals automatically. The crown jewel of the week was the ICIJ Offshore Leaks cross-reference — over 810,000 entities from the Panama Papers, Paradise Papers, and Pandora Papers, matched against the Epstein network. Shell companies that had been opaque suddenly had provenance. The fortress wasn't just defensive. It was an observation tower.
- 16,924 survivor name redactions across all content
- 410 Gone pages for redacted victim profiles
- Redaction request form for survivors
- Legal demand response system with policy and templates
- OFAC, EU, UN sanctions checking + PEP database
- ICIJ Offshore Leaks cross-reference (810K+ entities)
- Entity/organization system buildout
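The core of a redaction pass like the one described above is a whole-word, case-insensitive substitution that also counts what it scrubbed. A minimal sketch — the real three-phase system spans documents, emails, and flight records, and the names and replacement token here are illustrative only:

```typescript
// Hedged sketch of a redaction pass: replace each protected name with a
// fixed token and count substitutions. Token text and example names are
// invented; the production pipeline is more elaborate.

const REDACTION_TOKEN = "[REDACTED]";

function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function redact(
  text: string,
  protectedNames: string[],
): { text: string; count: number } {
  let count = 0;
  let out = text;
  for (const name of protectedNames) {
    // Whole-word, case-insensitive match so partial names are left alone.
    const re = new RegExp(`\\b${escapeRegExp(name)}\\b`, "gi");
    out = out.replace(re, () => {
      count++;
      return REDACTION_TOKEN;
    });
  }
  return { text: out, count };
}

const result = redact(
  "Jane Doe met with staff. Notes reference JANE DOE twice.",
  ["Jane Doe"],
);
// result.count === 2; both mentions replaced
```

Serving a 410 Gone for the profile page itself is then a one-line status decision in the route handler, signaling the content is deliberately and permanently removed rather than missing.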
Expansion
The commit rate dropped from the frenzy of the previous weeks, but the ambition didn't. Week four was about depth over speed. The flight database doubled: 1,907 new flight logs brought the total to 3,615 records spanning 1991 to 2019, all migrated from flat files into Neon Postgres with proper indexing and geographic queries. A five-tab forensic calendar tool extracted daily schedule data from Epstein's own calendar documents, turning vague references into a structured timeline. The community tools matured. Investigator profiles shipped with streaks, ranks, and expertise tags — a gamification layer designed to reward the researchers actually doing the work. A court-ready PDF report generator meant any investigation thread could be exported as a properly formatted legal document. ICIJ and FinCEN integration added external cross-reference capabilities that could surface connections invisible in the Epstein files alone. A community document review system with consensus scoring went live, turning the platform from a read-only archive into a collaborative investigation tool. Every document could now be reviewed, annotated, and scored by multiple independent researchers, with consensus algorithms ensuring quality.
- 3,615 flights (1,907 new), migrated to Neon Postgres
- 5-tab forensic calendar from Epstein's schedule documents
- Investigator profiles with streaks, ranks, expertise tags
- Court-ready PDF report generator
- ICIJ/FinCEN external cross-reference integration
- Community document review with consensus scoring
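Consensus scoring of the kind described above can be reduced to a simple rule: accept a label once a supermajority of independent reviewers agree, otherwise keep the document in the queue. The threshold and label set below are assumptions for the sketch, not the live algorithm:

```typescript
// Sketch of consensus scoring for community document review. The 2/3
// threshold and the label vocabulary are invented for illustration.

type Review = {
  reviewer: string;
  label: "relevant" | "irrelevant" | "unsure";
};

function consensus(reviews: Review[], threshold = 2 / 3): string | null {
  const counts = new Map<string, number>();
  for (const r of reviews) counts.set(r.label, (counts.get(r.label) ?? 0) + 1);
  for (const [label, n] of counts) {
    if (n / reviews.length >= threshold) return label; // supermajority wins
  }
  return null; // no consensus yet; document stays in the review queue
}

consensus([
  { reviewer: "a", label: "relevant" },
  { reviewer: "b", label: "relevant" },
  { reviewer: "c", label: "unsure" },
]); // → "relevant" (2 of 3 ≥ 2/3)
```

Weighting votes by reviewer expertise or track record is a natural refinement, which is presumably where the platform's "adaptive" consensus comes in.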
Intelligence
By the second week of March, the site was receiving enough traffic to attract unwanted attention. The Sentinel anti-abuse system deployed a six-layer defense, including honeypot traps for automated scrapers, behavioral scoring to detect coordinated attacks, browser fingerprinting to track bad actors across sessions, tarpit endpoints that wasted bot resources, and an AbuseIPDB integration for community-sourced threat intelligence. The admin panel gave real-time visibility into every blocked request and scored session. The financial forensics module was the technical centerpiece of the week. Over $6.3 billion in financial flows were traced across Epstein's network of shell companies, trusts, and banking relationships. IRS Form 990 data for connected nonprofits was cross-referenced to expose grant flows, and Sankey diagrams visualized money moving through layers of entities. The network graph engine got a complete rewrite — from D3 SVG rendering to Sigma.js with WebGL acceleration — allowing the browser to render thousands of nodes and edges without choking. A DOJ document removal investigation system automated the comparison between government file listings and the archive's holdings, flagging removals in near real time. A citation monitoring agent tracked when external sources referenced specific documents, building an evidence graph that extended beyond the archive itself.
- Sentinel anti-abuse: 6-layer defense system
- $6.3B financial forensics across shell companies and trusts
- IRS Form 990 nonprofit cross-reference
- Network graph: D3 SVG to Sigma.js WebGL
- DOJ document removal investigation system
- Sankey diagrams for grant flows and entity networks
- Citation monitoring agent
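The behavioral-scoring layer described above typically works by accumulating weighted signals per session and blocking past a cutoff. Here is an illustrative sketch; the signals, weights, and cutoff are invented for the example and are not Sentinel's actual tuning:

```typescript
// Illustrative sketch of one anti-abuse layer: behavioral scoring.
// Every signal, weight, and threshold below is an assumption.

type Session = {
  requestsPerMinute: number;
  hitHoneypot: boolean;
  hasFingerprint: boolean;
};

function abuseScore(s: Session): number {
  let score = 0;
  if (s.requestsPerMinute > 120) score += 40; // scraper-speed traffic
  if (s.hitHoneypot) score += 50;             // touched a trap URL
  if (!s.hasFingerprint) score += 10;         // headless / no browser signals
  return score;
}

const BLOCK_AT = 60; // illustrative cutoff
function shouldBlock(s: Session): boolean {
  return abuseScore(s) >= BLOCK_AT;
}

shouldBlock({ requestsPerMinute: 300, hitHoneypot: true, hasFingerprint: false }); // → true
```

Keeping the score additive means no single noisy signal blocks a legitimate user, while several weak signals together still catch a scraper.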
Community
The final chapter — at least so far — shifted focus from building tools to building a community around them. SEO optimization work that had been accumulating across previous weeks began paying off: the site crossed one million monthly visitors, driven by organic search traffic from people looking for specific names, documents, and connections. The content was findable because it was structured, and it was structured because every previous week had invested in schema markup, metadata, and semantic HTML. Discord integration brought the community into real-time. A bot with 17 slash commands gave researchers instant access to search, flight lookups, and person profiles without leaving their conversation. The Discord Evidence Browser — an embedded single-page application running inside Discord's Activity framework — offered five different views of the archive data, making it possible to conduct research collaboratively in voice channels. Deep-dive investigations like the tuition pipeline scroll-driven page turned raw data into narrative journalism. The Smart Review Queue got a v2 upgrade with adaptive consensus algorithms, expertise-based routing, and an XP system that rewarded thorough, accurate reviews. The Flight Intelligence Center shipped with a 3D globe visualization and animated tactical map. Entity spotting in document reviews meant the system was now helping reviewers find connections they might have missed. Forty days after the first commit, the platform had 2.1 million documents, 1,500 persons, 3,600 flights, and the tools to make sense of all of it.
- 1M+ monthly visitors via SEO optimization
- Discord bot with 17 slash commands
- Discord Evidence Browser (embedded SPA, 5 views)
- Tuition pipeline deep-dive (scroll-driven investigation)
- Smart Review Queue v2 with XP system
- Flight Intelligence Center: 3D globe + tactical map
- Entity spotting in document reviews
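An XP system that "rewarded thorough, accurate reviews" suggests a base award per completed review plus bonuses for accuracy and consistency. A sketch under those assumptions — every number and field name here is invented, not the live tuning:

```typescript
// Hedged sketch of a review XP formula: base award per completed review,
// a bonus when the reviewer's label later matches community consensus,
// and a capped daily-streak bonus. All values are illustrative.

function reviewXp(opts: {
  completed: boolean;
  matchedConsensus: boolean;
  streakDays: number;
}): number {
  if (!opts.completed) return 0;
  let xp = 10;                            // base award per review
  if (opts.matchedConsensus) xp += 15;    // accuracy bonus
  xp += Math.min(opts.streakDays, 7) * 2; // streak bonus, capped at 7 days
  return xp;
}

reviewXp({ completed: true, matchedConsensus: true, streakDays: 3 }); // → 31
```

Capping the streak bonus keeps long-tenured reviewers from out-earning accurate ones, which matches the stated goal of rewarding quality over volume.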
By the Numbers
The scale of the database, distilled into the metrics that define it.
The Changelog
A condensed timeline of every major feature, data import, and investigation milestone.
“The documents were always there. They just needed someone to build the index.”