Case Study

The Archive

Why do civilizations that never met tell the same stories? 83 research tools across 12 categories, bridging 22 sacred texts and 14 religious traditions with a 50,000-node knowledge graph, four modes of AI reasoning, and the intellectual honesty to distinguish signal from noise.

Status: In Development
Role: Solo Developer
Stack: React · Neo4j · WebGL · Claude AI

Patterns that shouldn't exist

The Hindu Churning of the Ocean and the Norse Ginnungagap. The Egyptian weighing of the heart and the Tibetan Bardo. Flood narratives in Mesopotamia, Genesis, and Mesoamerica. Dying-and-rising gods across cultures separated by oceans and millennia. These parallels have three possible explanations: transmission (stories traveling trade routes), cognitive universals (brains wired the same way produce the same myths), or coincidence.

Existing tools don't help you distinguish between these explanations. Concordances are keyword searches. Comparative religion textbooks are static arguments. AI chatbots give confident answers without showing their reasoning or citing their sources. There was no platform that let you explore these patterns with the rigor they demand — tracing transmission routes, testing cognitive universal hypotheses, measuring pattern integrity, and honestly flagging when the evidence is inconclusive.

83 Research Tools
22 Sacred Texts
14 Traditions
50K+ Knowledge Graph Nodes
211 Thompson Motifs Mapped
152 Etymology Entries
4 AI Reasoning Modes
12 Tool Categories

50,000 nodes. Everything connects.

At the center of The Archive is a knowledge graph with over 50,000 entities — deities, concepts, places, figures, motifs — connected by typed relationships and rendered at 60fps via WebGL (Sigma.js + Graphology). You can trace a path from Isis to the Virgin Mary through intermediate syncretism events. You can measure the conceptual distance between karma and divine judgment using Dijkstra's algorithm.
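As a rough illustration of the "conceptual distance" idea, the sketch below runs Dijkstra's algorithm over a tiny weighted graph. The `Edge` and `Graph` types, the node names other than Isis and the Virgin Mary, and all edge weights are assumptions made up for the example; the write-up does not describe how The Archive stores or weights its relationships.

```typescript
// Hypothetical edge shape: the real schema is not shown in this case study.
type Edge = { to: string; weight: number };
type Graph = Map<string, Edge[]>;

// Dijkstra's algorithm over non-negative edge weights.
// Returns the total distance and the path, or null if the target is unreachable.
function shortestPath(
  graph: Graph,
  source: string,
  target: string
): { distance: number; path: string[] } | null {
  const dist = new Map<string, number>([[source, 0]]);
  const prev = new Map<string, string>();
  const visited = new Set<string>();

  while (true) {
    // Pick the closest unvisited node (linear scan; a binary heap would scale better).
    let current: string | null = null;
    let best = Infinity;
    for (const [node, d] of dist) {
      if (!visited.has(node) && d < best) {
        best = d;
        current = node;
      }
    }
    if (current === null) return null; // nothing left to explore
    if (current === target) break;
    visited.add(current);

    // Relax every outgoing edge of the current node.
    for (const edge of graph.get(current) ?? []) {
      const candidate = best + edge.weight;
      if (candidate < (dist.get(edge.to) ?? Infinity)) {
        dist.set(edge.to, candidate);
        prev.set(edge.to, current);
      }
    }
  }

  // Walk predecessors back from the target to rebuild the path.
  const path = [target];
  while (path[0] !== source) path.unshift(prev.get(path[0])!);
  return { distance: dist.get(target)!, path };
}

// Illustrative data only: placeholder intermediate nodes and invented weights.
const demo: Graph = new Map<string, Edge[]>([
  ["Isis", [
    { to: "syncretism-event-A", weight: 1 },
    { to: "syncretism-event-B", weight: 4 },
  ]],
  ["syncretism-event-A", [{ to: "Virgin Mary", weight: 2 }]],
  ["syncretism-event-B", [{ to: "Virgin Mary", weight: 1 }]],
]);

const route = shortestPath(demo, "Isis", "Virgin Mary");
console.log(route);
```

With these invented weights the route through `syncretism-event-A` wins with a total distance of 3. How meaningful such a number is depends entirely on how edge weights are assigned, which this sketch does not attempt to model.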

Every entity is tagged at the verse level across all 22 texts. The graph isn't a visualization of pre-curated connections — it's a queryable database of relationships extracted from the texts themselves, enriched with 211 Thompson motif index entries, 152 etymology chains across 31 language families, and 100+ scholarly positions from 50+ academics ranked by epistemic weight.

Six parallel Neo4j queries power each AI analysis request, aggregating context from the graph, the texts, scholarly positions, and temporal data simultaneously. The system doesn't just know what Osiris is — it knows who said what about Osiris, when they said it, what evidence they cited, and where the scholarly consensus currently stands.
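The parallel fan-out described above might look something like the following sketch. Everything named here is an assumption: `QueryRunner` stands in for whatever wraps the Neo4j driver, and the six Cypher strings (including the labels and relationship types in them) are placeholders, since the actual queries and schema are not given.

```typescript
// Hypothetical query runner: in a real deployment this would wrap a Neo4j session.
type QueryRunner = (
  cypher: string,
  params: Record<string, unknown>
) => Promise<unknown[]>;

// Placeholder Cypher; the real six queries and the graph schema are not published.
const CONTEXT_QUERIES: Record<string, string> = {
  entity: "MATCH (e:Entity {id: $id}) RETURN e",
  neighbors: "MATCH (e:Entity {id: $id})-[r]-(n) RETURN type(r) AS rel, n LIMIT 50",
  verses: "MATCH (e:Entity {id: $id})-[:MENTIONED_IN]->(v:Verse) RETURN v LIMIT 50",
  positions: "MATCH (e:Entity {id: $id})<-[:ABOUT]-(p:Position) RETURN p",
  motifs: "MATCH (e:Entity {id: $id})-[:HAS_MOTIF]->(m:Motif) RETURN m",
  dating: "MATCH (e:Entity {id: $id})-[:ATTESTED_IN]->(w:Work) RETURN w",
};

async function gatherContext(
  run: QueryRunner,
  entityId: string
): Promise<Record<string, unknown[]>> {
  const names = Object.keys(CONTEXT_QUERIES);
  // Every query is started before any result is awaited, so total latency is
  // roughly that of the slowest query rather than the sum of all six.
  const results = await Promise.all(
    names.map((name) => run(CONTEXT_QUERIES[name], { id: entityId }))
  );
  const context: Record<string, unknown[]> = {};
  names.forEach((name, i) => {
    context[name] = results[i];
  });
  return context;
}
```

`Promise.all` rejects as soon as any one query fails; if partial context is acceptable, `Promise.allSettled` would be the more forgiving choice. With the official Neo4j JavaScript driver, each concurrent query would likely need its own session, since a single session runs one query at a time.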

Four ways to think, not one

Most AI tools offer one mode: confident summary. The Archive offers four, because different questions need different approaches.

Authoritative — expert analysis accompanied by counter-evidence. Every claim comes with the strongest argument against it. If the AI can't find a counter-argument, it says so, and that absence becomes evidence in itself.

Transparent — full reasoning chains visible, with a 6-dimension confidence radar (textual, temporal, linguistic, scholarly, archaeological, geographic). You see exactly why the AI is 74% confident, not just that it is.

Dialectical — thesis-antithesis-synthesis. The AI argues both sides of a question and attempts a resolution, showing you where the tension lives and whether synthesis is possible or forced.

Socratic — guided discovery through questions. The AI doesn't tell you what to think. It asks what you've noticed and why, pushing you toward conclusions you arrive at yourself.

Every AI output is reproducible. Input data, model version, prompt, and output are SHA-256 hashed together. Another researcher can verify they get identical results from identical inputs. A calibration dashboard tracks whether the AI's stated confidence actually matches its accuracy over time using Brier scoring.
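A minimal sketch of the hashing step, assuming Node.js and its built-in `crypto` module. The `AnalysisRecord` shape and the length-prefixed serialization are assumptions chosen for the example; the text only says that the four fields are hashed together.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape, based on the four fields named in the text.
interface AnalysisRecord {
  input: string;
  modelVersion: string;
  prompt: string;
  output: string;
}

function fingerprint(record: AnalysisRecord): string {
  // Hash the fields in a fixed order, each prefixed with its byte length, so
  // that shifting text between adjacent fields ("ab" + "c" vs "a" + "bc")
  // produces a different digest.
  const parts = [record.input, record.modelVersion, record.prompt, record.output];
  const hash = createHash("sha256");
  for (const part of parts) {
    const bytes = Buffer.from(part, "utf8");
    hash.update(`${bytes.length}:`);
    hash.update(bytes);
  }
  return hash.digest("hex"); // 64 hexadecimal characters
}
```

A digest like this shows that a stored record has not been altered since it was hashed. Whether re-running the model on the same inputs yields the same output is a separate question that depends on how deterministic the model is.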

If an AI tells you it's 74% confident and it's actually right 74% of the time, it's calibrated. If it's right 50% of the time, it's overconfident. The Archive tracks this and tells you.
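The Brier score is the mean squared difference between stated confidence and what actually happened: BS = (1/N) Σ (fᵢ − oᵢ)², where fᵢ is the stated confidence and oᵢ is 1 if the claim held up and 0 otherwise. Lower is better. The sketch below computes it alongside a simple calibration gap; the `Prediction` type and function names are assumptions for illustration, not the dashboard's actual code.

```typescript
// Hypothetical shape: one logged claim with its stated confidence in [0, 1].
interface Prediction {
  confidence: number;
  correct: boolean;
}

// Mean squared error between stated confidence and the 0/1 outcome.
function brierScore(preds: Prediction[]): number {
  if (preds.length === 0) throw new Error("need at least one prediction");
  const total = preds.reduce(
    (sum, p) => sum + (p.confidence - (p.correct ? 1 : 0)) ** 2,
    0
  );
  return total / preds.length;
}

// Mean stated confidence minus observed accuracy.
// Positive means overconfident, negative means underconfident.
function calibrationGap(preds: Prediction[]): number {
  if (preds.length === 0) throw new Error("need at least one prediction");
  const meanConfidence =
    preds.reduce((sum, p) => sum + p.confidence, 0) / preds.length;
  const accuracy = preds.filter((p) => p.correct).length / preds.length;
  return meanConfidence - accuracy;
}

// Build n predictions at a fixed confidence, of which `hits` turned out correct.
function sample(confidence: number, hits: number, n: number): Prediction[] {
  return Array.from({ length: n }, (_, i) => ({ confidence, correct: i < hits }));
}

const calibrated = sample(0.74, 74, 100);    // right 74% of the time
const overconfident = sample(0.74, 50, 100); // right only 50% of the time

console.log(calibrationGap(calibrated), brierScore(calibrated));
console.log(calibrationGap(overconfident), brierScore(overconfident));
```

For the two scenarios in the text, the calibrated case has a gap of about 0 and a Brier score of about 0.192, while the overconfident case has a gap of about 0.24 and a Brier score of about 0.308. In practice predictions would be bucketed by confidence level before comparing, which this sketch skips.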

What made this difficult

  • 50,000-node graph at 60fps: WebGL rendering via Sigma.js and Graphology, with force-directed layout, interactive zoom, filter, and click, all performing smoothly on consumer hardware. Every node is clickable, every edge is typed, every cluster is semantically meaningful.
  • Source-critical text layering: The Bible isn't one document. It's layers: J, E, P, D sources woven together over centuries. The Quran has Meccan and Medinan periods. The Book of Enoch has five sections from different eras. Implementing real-time toggling between these layers while maintaining verse-level entity tagging required treating each text as a palimpsest, not a flat file.
  • Bayesian uncertainty propagation: When you chain five pieces of evidence together, the uncertainty compounds. Weak evidence can't masquerade as strong proof just because you stacked enough of it. The Archive implements Bayesian confidence propagation that honestly composes uncertainty through multi-level inference chains.
  • Temporal anachronism detection: If someone claims Persian Zoroastrian influence on a pre-exilic Hebrew text, the dates don't work. 37 work-dating estimates are tracked, and the system automatically flags impossible chronological claims. You can't cite a source that didn't exist yet.
  • Concept phylogenetics: Building interactive SVG cladograms that show how ideas evolve, branch, merge (syncretism), and go extinct across cultures and centuries. Like biological phylogenetics, but for concepts, with the added complexity that ideas can merge (gods can be syncretized) in ways that genes typically can't.
  • Epistemic weighting: Not all scholars are equal. 23 academics ranked by credentials: language fluency, excavation experience, peer review count, primary source access. A grad student's blog post and a tenured archaeologist's peer-reviewed monograph don't get the same weight. The system knows the difference.
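Two of the items above, compounding uncertainty and anachronism detection, lend themselves to a short sketch. Both functions are illustrative assumptions. `chainConfidence` uses a plain product rule that treats every inference link as independent, which is only the simplest ingredient of a Bayesian treatment and not necessarily what The Archive implements. `isAnachronistic` assumes each work carries an estimated date range. The works and years in the example are invented placeholders, not real datings.

```typescript
// Confidence that an entire chain of inferences holds, ASSUMING each link is
// independent of the others (a strong simplification made for this sketch).
function chainConfidence(links: number[]): number {
  for (const p of links) {
    if (p < 0 || p > 1) throw new RangeError(`probability out of range: ${p}`);
  }
  return links.reduce((acc, p) => acc * p, 1);
}

// Hypothetical dating record. Years are signed: negative for BCE.
interface DatedWork {
  title: string;
  earliest: number; // earliest plausible composition year
  latest: number;   // latest plausible composition year
}

// A claim that `source` influenced `target` is chronologically impossible when
// even the earliest date for the source falls after the latest date for the target.
function isAnachronistic(source: DatedWork, target: DatedWork): boolean {
  return source.earliest > target.latest;
}

// Five individually plausible steps still leave a weak overall conclusion.
console.log(chainConfidence([0.8, 0.8, 0.8, 0.8, 0.8]));

// Placeholder works with invented date ranges.
const workA: DatedWork = { title: "Work A", earliest: -500, latest: -400 };
const workB: DatedWork = { title: "Work B", earliest: -800, latest: -700 };
console.log(isAnachronistic(workA, workB)); // A cannot have influenced B
console.log(isAnachronistic(workB, workA)); // B could have influenced A
```

Five links at 0.8 each multiply out to roughly 0.33, which is the compounding effect the list describes. Overlapping date ranges are deliberately not flagged here; only strictly impossible orderings are.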

Research tools that insist on honesty

The Archive has a falsifiability engine. It generates testable predictions from hypotheses and tracks whether those predictions hold up against the corpus. A hypothesis engine runs 7 parallel queries against the data and delivers a verdict with confidence scores. A debate simulator runs 3-round pro-vs-con arguments using real scholarly sources. Every claim is traced to its origin.

It also has 8 phenomenological practice guides (vipassana, hesychasm, dhikr, zazen, lectio divina, japa, Kabbalistic meditation, Taoist sitting), because understanding a tradition intellectually without understanding how it's practiced is like studying music without ever hearing a note. The Archive takes embodied knowledge seriously.

And it meets users where they are. 12 "Teach Me Like I'm Five" explainers break graduate-level concepts into accessible language. 5 difficulty levels per concept. Progressive disclosure that grows the interface with user engagement. The same platform serves the curious amateur and the tenured researcher without condescending to either.

The question — why do civilizations that never met tell the same stories? — deserves better than confident answers. It deserves 83 instruments precise enough to find out whether the patterns are real, and honest enough to say when they aren't.

How it's built

React frontend with WebGL graph rendering (Sigma.js + Graphology). SVG for specialized visualizations: cladograms, cycle overlays, timelines, narrative arcs. Neo4j for the knowledge graph with parallel query execution. Claude AI (Sonnet and Opus with extended thinking) for the four reasoning modes. Server-Sent Events for streaming AI responses. Text-to-speech via OpenAI voices and ElevenLabs HD synthesis.
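For the streaming piece, a Server-Sent Events response is a plain-text stream of small frames. The helper below is a sketch of how one such frame could be formatted on the server; the function name and the `token` event name in the example are assumptions, not details taken from the project.

```typescript
// Format one Server-Sent Events frame. Each line of the payload gets its own
// "data:" field, and a blank line terminates the event.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  for (const line of data.split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join("\n") + "\n\n";
}

// A single streamed chunk of AI output, tagged with a hypothetical event name.
console.log(JSON.stringify(formatSseEvent("The Churning of the Ocean", "token")));
```

The response carrying these frames needs the `Content-Type: text/event-stream` header, and on the browser side the standard `EventSource` API reassembles multi-line `data:` fields into a single string joined with newlines.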

Three-tier subscription: Free (browse, read, explore graph), Sonnet ($14.99/mo for AI chat), Opus ($39.99/mo for deep analysis with extended thinking, HD voice, API access, and reproducible analysis with SHA-256 hashing).

React · TypeScript · WebGL · Sigma.js · Neo4j · Claude AI · SVG · SSE · ElevenLabs · Zustand · Vite