Transparency
Progress Dashboard
This dashboard tracks our progress in ingesting, indexing, and cross-referencing every publicly released document from the Jeffrey Epstein case. All data is sourced from government releases and court records.
Data Ingestion Status
| Dataset | Description | Documents | Status |
|---|---|---|---|
| DS1 | Initial court filings and depositions | ~2,130 | Complete |
| DS2 | epstein-docs.github.io collection with AI summaries | ~8,186 | Complete |
| DS3 | DocumentCloud court filings | ~1,254 | Complete |
| DS4-8 | EFTA early releases with OCR | ~50K | Complete |
| DS9 | Largest single DOJ release | ~531K | Complete |
| DS10 | Second major DOJ release | ~1M | Complete |
| DS11 | Additional DOJ documents | ~332K | Complete |
| DS12 | Latest DOJ release | ~218K | Complete |
Processing Pipeline
OCR Text Extraction: 94%
2,013,995 documents processed
Person-Document Linking: 100%
2,443,851+ links established
Semantic Embeddings: 100%
2,669,382 chunks embedded (HNSW indexed)
SHA-256 Hash Verification: 64%
1,380,911 hashes verified
Full-Text Search Index: 100%
All documents indexed (tsvector/GIN)
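The SHA-256 hash verification step listed above can be illustrated with a minimal sketch. This page does not describe how the site's verification pipeline works, so everything here is an assumption made for illustration: the function names (`sha256_of_file`, `verify_documents`, `percent_verified`) are hypothetical, and the manifest is assumed to be a plain mapping from file path to expected hex digest.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_documents(expected):
    """Compare files on disk against a manifest of expected digests.

    `expected` is a hypothetical manifest: {file path: expected hex digest}.
    Returns three lists of paths: (verified, mismatched, missing).
    """
    verified, mismatched, missing = [], [], []
    for path, want in expected.items():
        if not Path(path).is_file():
            missing.append(path)
        elif sha256_of_file(path) == want.lower():
            verified.append(path)
        else:
            mismatched.append(path)
    return verified, mismatched, missing


def percent_verified(verified_count, total_count):
    """Whole-number percentage of verified documents, rounded down."""
    if total_count == 0:
        return 0
    return (100 * verified_count) // total_count
```

Reading in fixed-size chunks matters for large scanned PDFs, since the whole file never has to fit in memory. Whether the dashboard rounds its percentage down, as `percent_verified` does, is likewise an assumption.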
Milestones
2026-02-08
Site launched with initial documents and search
2026-02-09
EFTA OCR pipeline: 2,050 documents with text extraction
2026-02-10
8,186 epstein-docs documents ingested with AI summaries
2026-02-11
DS9 ingestion: 531K documents added
2026-02-12
DS10 ingestion: 1M documents, crossing 1.5M total
2026-02-13
DS11-12 ingestion: 550K documents, reaching 2.1M total
2026-02-14
Person enrichment: 2.4M person-document links established
2026-02-16
Document integrity system: 1.38M SHA-256 hashes verified
2026-02-17
AI agent system: 9 autonomous research agents deployed
2026-02-19
Semantic search: 2.67M embeddings + HNSW vector index
2026-02-21
Navigation revamp: mega menu, mobile tab bar, slide-out drawer
2026-02-22
Technical artifacts extraction + /news investigative journalism section
2026-02-23
Self-hosted Discourse forum at board.epsteinexposed.com
2026-02-24
Hybrid search live, Ask AI, review system enhanced
2026-02-25
Public REST API v2, report inaccuracies, review guide
2026-03-01
Follow the Money system ($6.4B traced, 621 entities, 12 analysis modes)
2026-03-01
Codename Decoder (63 pseudonyms from 2.77M pages)
2026-03-01
DOJ Audit tracker (114K+ documents monitored)
2026-03-01
Recovered Text Browser (38,705 hidden pages)
2026-03-02
Research Hub (176 forensic reports)
2026-03-03
Flight expansion (3,615 flights, 7,286 passenger links)
2026-03-05
Daily Schedule extraction (13,000+ entries, 2004-2019)
2026-03-06
Connection Lab (6-tab investigation workspace)
2026-03-06
iMessage Viewer (4,509 messages, 15 threads)
2026-03-06
Photo Evidence Gallery (18,308 photos, face detection)
2026-03-06
Email Archive (405,693 searchable records)
2026-03-06
Review System Overhaul (consensus pipeline, badges, investigations)
2026-03-06
OpenSanctions integration (PEP/sanctions screening)