INDEXIA BLOG

Here’s every name in the Epstein files — indexed.

Indexia Team

On November 12, the House Oversight Committee released a tranche of Epstein-related materials—thousands of pages of plain-text documents published at oversight.house.gov.

The release was large, unevenly formatted, and varied widely in relevance. To make this material meaningfully searchable, we set out to create a comprehensive index that stayed faithful to the source text while giving researchers, journalists, and the public a way to navigate it with clarity.

The index is linked here: https://www.indexia.tech/public/a176d3bf-c769-4fd9-a647-e58a5c76a46d

Identifying the Relevant Files

Our first step was triage. We built an AI system to sift through every .txt file in the November 12 release and identify the documents that carried evidentiary or contextual significance.

We kept files that fell into categories including:

  • Emails and private messages between individuals—including informal notes, letters, chats, and message exports—even when they didn't mention Epstein by name or appeared routine.

  • Primary-source documents shedding light on Epstein, his conduct, his associates, or his operations, such as:

    • Emails, correspondence, memos, notes
    • Flight logs, calendars, phone logs, call sheets
    • Bank statements, wire instructions, financial records
    • Legal filings, affidavits, depositions, police reports
    • Internal reports and investigative summaries tied directly to Epstein or his network

  • Files that appear to be raw evidence or contemporaneous documentation created during the events in question.

Using these criteria, we removed 570 files that were duplicative, blank, corrupted, or irrelevant—and kept 2,328 files that met the threshold for inclusion.
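
To make the mechanical part of this step concrete, here is a minimal Python sketch of how blank and exact-duplicate files could be set aside before any relevance judgment. It is an illustration only, not Indexia's pipeline: the function name `triage`, the whitespace normalization, and the SHA-256 comparison are all assumptions, and the AI relevance classification described above is not reproduced.

```python
import hashlib


def triage(files):
    """Split {filename: text} into (kept, removed) using blank/duplicate checks only.

    Hypothetical helper: relevance scoring is out of scope for this sketch.
    """
    kept, removed = {}, {}
    seen_hashes = set()
    for name in sorted(files):  # sorted so the first copy kept is deterministic
        text = files[name]
        # Collapse whitespace so trivially reformatted copies hash the same.
        normalized = " ".join(text.split())
        if not normalized:
            removed[name] = "blank"
            continue
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            removed[name] = "duplicate"
            continue
        seen_hashes.add(digest)
        kept[name] = text  # keep the original, un-normalized text
    return kept, removed


if __name__ == "__main__":
    sample = {"a.txt": "Hello  world", "b.txt": "Hello world", "c.txt": "  \n"}
    kept, removed = triage(sample)
    print(sorted(kept), removed)
```

Hashing normalized text only catches exact copies; near-duplicates, corrupted files, and irrelevant files would need separate checks.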

Feeding the Files Into Indexia

Those 2,328 documents totaled roughly 7,000 pages of usable material. We then fed the corpus into Indexia to generate a structured index.

Indexia extracted approximately 6,000 unique terms—names, organizations, locations, events, and thematic concepts appearing anywhere in the files. Beyond simple extraction, the system also identified the context in which each term appeared. That means the index doesn't just tell you that a name appears—it captures the surrounding passages, giving you immediate visibility into how that name is actually used in the material.
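
The idea of capturing surrounding passages can be sketched as a simple keyword-in-context lookup. The Python below is a hedged illustration, not a description of how Indexia works internally: the function name `contexts`, the `width` parameter, and the whole-word, case-insensitive matching are assumptions chosen for the example.

```python
import re


def contexts(text, term, width=40):
    """Return verbatim snippets of `text` around each occurrence of `term`.

    Hypothetical helper: `width` is the number of characters kept on each side.
    """
    # Whole-word, case-insensitive match; re.escape guards against
    # punctuation in the term being read as regex syntax.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append(text[start:end])  # sliced from the source, never rewritten
    return snippets


if __name__ == "__main__":
    sample = "Meeting with Jane Doe on Tuesday. Later, jane doe called back."
    for snippet in contexts(sample, "Jane Doe", width=10):
        print(repr(snippet))
```

Because each snippet is a slice of the original string, the context shown is always the source text itself rather than a summary of it.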

Indexia also mapped relationships between terms: it grouped variants of the same name, linked related concepts, and surfaced recurring patterns. In sprawling, unstructured document sets like these, that relational view is essential for understanding who appears where, and in connection with whom or what.
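
As a rough illustration of grouping name variants, the Python sketch below collapses differences in case, spacing, punctuation, and "Last, First" ordering into one key. The rules and the names `name_key` and `group_variants` are assumptions made for this example; real variant matching (nicknames, initials, misspellings) is considerably harder, and nothing here describes Indexia's actual method.

```python
import re
from collections import defaultdict


def name_key(name):
    """Reduce a name to a case- and punctuation-insensitive key (hypothetical rule set)."""
    # Reorder "Last, First" into "First Last" before tokenizing.
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    # Keep runs of letters only (Unicode-aware), dropping digits and punctuation.
    tokens = re.findall(r"[^\W\d_]+", name.lower())
    return " ".join(tokens)


def group_variants(names):
    """Map each normalized key to the list of raw spellings that share it."""
    groups = defaultdict(list)
    for raw in names:
        groups[name_key(raw)].append(raw)
    return dict(groups)


if __name__ == "__main__":
    print(group_variants(["Jane Doe", "DOE, Jane", "jane  doe", "John Roe"]))
```

Keeping the raw spellings alongside the key means the grouping never overwrites what the source documents actually say.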

Why This Index Matters

The result is a highly searchable index with direct links back to the original source text. When you click a term, you're taken straight to the passages in the November 12 documents where it appears. Nothing is paraphrased, summarized, or interpreted. The structure is machine-generated, but every entry remains anchored in the text itself.
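
One way to picture an index that links back to its sources is an inverted index: a mapping from each term to the file and line where it occurs, with the line stored verbatim. The Python below is a minimal sketch under that assumption. The function name `build_index`, the (filename, line number, line) tuple layout, and the case-insensitive substring matching are illustrative choices, not Indexia's data model.

```python
from collections import defaultdict


def build_index(files, terms):
    """Map each term to [(filename, line_number, verbatim_line), ...].

    Hypothetical helper: `files` is {filename: text}; line numbers start at 1.
    """
    index = defaultdict(list)
    for filename in sorted(files):
        for line_no, line in enumerate(files[filename].splitlines(), start=1):
            lowered = line.lower()
            for term in terms:
                # Plain case-insensitive substring match, kept simple on purpose.
                if term.lower() in lowered:
                    index[term].append((filename, line_no, line))
    return dict(index)


if __name__ == "__main__":
    sample = {
        "doc1.txt": "First line\nCall from Jane Doe\n",
        "doc2.txt": "Jane Doe wrote back",
    }
    for hit in build_index(sample, ["Jane Doe"])["Jane Doe"]:
        print(hit)
```

Each entry carries enough information to open the right file at the right line, which is what lets a reader check every hit against the source.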

This public index serves a simple purpose: to help anyone understand what lies within the November 12 release without altering, filtering, or editorializing its contents. It allows you to:

  • Identify individuals named in the documents
  • Jump back to the original .txt files
  • Read the contexts firsthand

It's a tool designed for transparency.


We built this index solely from the materials released on November 12. Stay tuned for a full index of the complete files once they are released.