Transparency
Progress Dashboard
Tracking our progress ingesting, indexing, and cross-referencing every publicly released document from the Jeffrey Epstein case. All data is sourced from government releases and court records.
Data Ingestion Status
| Dataset | Description | Documents | Status |
|---|---|---|---|
| DS1 | Initial court filings and depositions | ~2,130 | Complete |
| DS2 | epstein-docs.github.io collection with AI summaries | ~8,186 | Complete |
| DS3 | DocumentCloud court filings | ~1,254 | Complete |
| DS4-8 | EFTA early releases with OCR | ~50K | Complete |
| DS9 | Largest single DOJ release | ~531K | Complete |
| DS10 | Second major DOJ release | ~1M | Complete |
| DS11 | Additional DOJ documents | ~332K | Complete |
| DS12 | Latest DOJ release | ~218K | Complete |
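As a rough cross-check, the per-dataset counts in the table can be summed to see how they relate to the "2.1M total" figure in the milestones. The sketch below is illustrative only: the dictionary name and helper function are hypothetical, the "~" values are taken at face value, and the DS4-8 range is treated as a single 50,000 entry, so the result is an estimate rather than an exact database count.

```python
# Approximate document counts per dataset, copied from the table above.
# The "~" values are rounded, so the total is an estimate, not an exact count.
DATASET_COUNTS = {
    "DS1": 2_130,
    "DS2": 8_186,
    "DS3": 1_254,
    "DS4-8": 50_000,
    "DS9": 531_000,
    "DS10": 1_000_000,
    "DS11": 332_000,
    "DS12": 218_000,
}


def approximate_total(counts: dict[str, int]) -> int:
    """Sum the per-dataset approximations into one corpus-wide estimate."""
    return sum(counts.values())


total = approximate_total(DATASET_COUNTS)
print(f"~{total:,} documents (~{total / 1_000_000:.1f}M)")
# prints "~2,142,570 documents (~2.1M)"
```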
Processing Pipeline
OCR Text Extraction: 94%
2,013,995 documents processed
Person-Document Linking: 100%
2,443,851+ links established
Semantic Embeddings: 100%
2,669,382 chunks embedded (HNSW indexed)
SHA-256 Hash Verification: 64%
1,380,911 hashes verified
Full-Text Search Index: 100%
All documents indexed (tsvector/GIN)
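For readers who want to check a downloaded file against a known SHA-256 digest themselves, the following is a minimal sketch of the general technique, not the site's actual pipeline code. The function names, the chunk size, and the sample file are illustrative assumptions; only Python's standard `hashlib`, `pathlib`, and `tempfile` modules are used.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large PDFs are never fully loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" at end of file.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: Path, expected_hex: str) -> bool:
    """Return True when the file's digest matches the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demonstration on a throwaway file containing the bytes b"abc",
# whose SHA-256 digest is a well-known test vector.
ABC_DIGEST = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"abc")
    print(verify_file(sample, ABC_DIGEST))  # prints "True"
```

The dashboard does not state the denominator behind the 64% figure. Measured against the approximate sum of the dataset table (about 2.14M documents), 1,380,911 verified hashes works out to roughly 64%, but that denominator is an assumption.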
Milestones
2026-02-08
Site launched with initial documents and search
2026-02-09
EFTA OCR pipeline: 2,050 documents with text extraction
2026-02-10
8,186 epstein-docs documents ingested with AI summaries
2026-02-11
DS9 ingestion: 531K documents added
2026-02-12
DS10 ingestion: 1M documents, crossing 1.5M total
2026-02-13
DS11-12 ingestion: 550K documents, reaching 2.1M total
2026-02-14
Person enrichment: 2.4M person-document links established
2026-02-16
Document integrity system: 1.38M SHA-256 hashes verified
2026-02-17
AI agent system: 9 autonomous research agents deployed
2026-02-19
Semantic search: 2.67M embeddings + HNSW vector index
2026-02-21
Navigation revamp: mega menu, mobile tab bar, slide-out drawer
2026-02-22
Technical artifacts extraction + /news investigative journalism section
2026-02-23
Self-hosted Discourse forum at board.epsteinexposed.com
2026-02-24
Hybrid search live, Ask AI, review system enhanced
2026-02-25
Public REST API v2, report inaccuracies, review guide
2026-03-01
Follow the Money system ($2.6B+ traced, 1,699 entities, 14 analysis tabs)
2026-03-01
Codename Decoder (63 pseudonyms from 2.77M pages)
2026-03-01
DOJ Audit tracker (document changes monitored)
2026-03-01
Recovered Text Browser (38,705 hidden pages)
2026-03-02
Research Hub (176 forensic reports)
2026-03-03
Flight expansion (3,615 flights, 7,286 passenger links)
2026-03-05
Daily Schedule extraction (13,000+ entries, 2004-2019)
2026-03-06
Connection Lab (6-tab investigation workspace)
2026-03-06
iMessage Viewer (4,509 messages, 15 threads)
2026-03-06
Photo Evidence Gallery (18,308 photos, face detection)
2026-03-06
Email Archive (405,693 searchable records)
2026-03-06
Review System Overhaul (consensus pipeline, badges, investigations)
2026-03-06
OpenSanctions integration (PEP/sanctions screening)
2026-03-16
Investigation enhancements: external link evidence, editable hypothesis/title
2026-03-16
FBI FOIA Vault: all 22 parts ingested (archive.org mirror)
2026-03-16
Giuffre v. Maxwell: 4,854 pages of unsealed civil depositions (8 batches)
2026-03-16
Tuition pipeline doc link audit: 2 hallucinated EFTA IDs replaced with verified docs
2026-03-16
Cross-check: DS10 verified 100% complete (503,154 PDFs vs DB)
2026-03-29
Prosecution Tracker: 1,054 persons scored across 7 evidence dimensions, SOL countdowns, accountability tracking
2026-03-29
Evidence Chains: collaborative case builder with graph visualization and strength scoring
2026-03-29
Event Reconstruction: auto-discover corroborating evidence across 5 source types with convergence timeline