Progress Dashboard

Tracking our progress ingesting, indexing, and cross-referencing every publicly released document from the Jeffrey Epstein case. All data is sourced from government releases and court records.

Data Ingestion Status

| Dataset | Description | Documents | Status |
|---------|-------------|-----------|--------|
| DS1 | Initial court filings and depositions | ~2,130 | Complete |
| DS2 | epstein-docs.github.io collection with AI summaries | ~8,186 | Complete |
| DS3 | DocumentCloud court filings | ~1,254 | Complete |
| DS4-8 | EFTA early releases with OCR | ~50K | Complete |
| DS9 | Largest single DOJ release | ~531K | Complete |
| DS10 | Second major DOJ release | ~1M | Complete |
| DS11 | Additional DOJ documents | ~332K | Complete |
| DS12 | Latest DOJ release | ~218K | Complete |
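
As a rough consistency check, the per-dataset counts above can be summed and compared with the totals quoted under Milestones ("crossing 1.5M" after DS10, "2.1M total" after DS11-12). The sketch below is illustrative only: the counts are the rounded figures from the table (so "~50K" is taken as 50,000, and so on), and the names `DATASET_COUNTS` and `running_totals` are hypothetical, not part of the site's code.

```python
from itertools import accumulate

# Rounded per-dataset counts copied from the table above.
# Assumption: "~50K" means 50,000, "~1M" means 1,000,000, etc.
DATASET_COUNTS = {
    "DS1": 2_130,
    "DS2": 8_186,
    "DS3": 1_254,
    "DS4-8": 50_000,
    "DS9": 531_000,
    "DS10": 1_000_000,
    "DS11": 332_000,
    "DS12": 218_000,
}


def running_totals(counts: dict[str, int]) -> dict[str, int]:
    """Return the cumulative document count after each dataset, in table order."""
    return dict(zip(counts, accumulate(counts.values())))


totals = running_totals(DATASET_COUNTS)
grand_total = totals["DS12"]

for name, cumulative in totals.items():
    print(f"{name:>6}: {cumulative:>9,}")
```

With these rounded inputs the cumulative count passes 1.5M at DS10 and ends near 2.14M, which agrees with the milestone figures to the precision the dashboard reports.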

Processing Pipeline

OCR Text Extraction: 94% (2,013,995 documents processed)
Person-Document Linking: 100% (2,443,851+ links established)
Semantic Embeddings: 100% (2,669,382 chunks embedded, HNSW indexed)
SHA-256 Hash Verification: 64% (1,380,911 hashes verified)
Full-Text Search Index: 100% (all documents indexed, tsvector/GIN)
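
For readers unfamiliar with hash verification, the general idea is to recompute a file's SHA-256 digest and compare it with a previously recorded value. The dashboard does not describe how the site implements this step, so the following is only a minimal sketch using Python's standard `hashlib`; the function names `sha256_of_file` and `verify_file` and the chunk size are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks so large documents never load fully into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: Path, expected_hex: str) -> bool:
    """Return True if the file's digest matches the recorded hash (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A document counts as verified when `verify_file` returns `True` for the hash recorded at ingestion time; any mismatch means the stored bytes differ from those originally hashed.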

Milestones

2026-02-08: Site launched with initial documents and search
2026-02-09: EFTA OCR pipeline: 2,050 documents with text extraction
2026-02-10: 8,186 epstein-docs documents ingested with AI summaries
2026-02-11: DS9 ingestion: 531K documents added
2026-02-12: DS10 ingestion: 1M documents, crossing 1.5M total
2026-02-13: DS11-12 ingestion: 550K documents, reaching 2.1M total
2026-02-14: Person enrichment: 2.4M person-document links established
2026-02-16: Document integrity system: 1.38M SHA-256 hashes verified
2026-02-17: AI agent system: 9 autonomous research agents deployed
2026-02-19: Semantic search: 2.67M embeddings + HNSW vector index
2026-02-21: Navigation revamp: mega menu, mobile tab bar, slide-out drawer
2026-02-22: Technical artifacts extraction + /news investigative journalism section
2026-02-23: Self-hosted Discourse forum at board.epsteinexposed.com
2026-02-24: Hybrid search live, Ask AI, review system enhanced
2026-02-25: Public REST API v2, report inaccuracies, review guide
2026-03-01: Follow the Money system ($6.4B traced, 621 entities, 12 analysis modes)
2026-03-01: Codename Decoder (63 pseudonyms from 2.77M pages)
2026-03-01: DOJ Audit tracker (114K+ documents monitored)
2026-03-01: Recovered Text Browser (38,705 hidden pages)
2026-03-02: Research Hub (176 forensic reports)
2026-03-03: Flight expansion (3,615 flights, 7,286 passenger links)
2026-03-05: Daily Schedule extraction (13,000+ entries, 2004-2019)
2026-03-06: Connection Lab (6-tab investigation workspace)
2026-03-06: iMessage Viewer (4,509 messages, 15 threads)
2026-03-06: Photo Evidence Gallery (18,308 photos, face detection)
2026-03-06: Email Archive (405,693 searchable records)
2026-03-06: Review System Overhaul (consensus pipeline, badges, investigations)
2026-03-06: OpenSanctions integration (PEP/sanctions screening)
