Zereth Codes: unique IDs for every word, UD for every sentence.
Foundation: each source‑language token receives a unique, non‑repeating ID. Source datasets: Leningrad Codex (Hebrew) and Nestle 1904 (Greek). The same surface form in a different verse gets a different code — enabling precise role tracking in the sentence and across the entire Bible.
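The per-occurrence ID scheme described above can be sketched in a few lines. Everything here is an illustrative assumption: the `TokenID` class, the `book.chapter.verse.position` code format, and the sample tokens are not the project's actual scheme, only a minimal stand-in showing how the same surface form receives a distinct code in every verse.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TokenID:
    """Hypothetical ID scheme: book, chapter, verse, token position."""
    book: str
    chapter: int
    verse: int
    position: int

    def code(self) -> str:
        return f"{self.book}.{self.chapter}.{self.verse}.{self.position:03d}"

def assign_ids(verses):
    """Assign a unique, non-repeating ID to every token occurrence.

    `verses` maps (book, chapter, verse) -> list of surface tokens.
    The same surface form in different verses gets different codes.
    """
    index = {}
    for (book, chapter, verse), tokens in verses.items():
        for pos, surface in enumerate(tokens, start=1):
            tid = TokenID(book, chapter, verse, pos)
            index[tid.code()] = surface
    return index

# "elohim" appears in both verses but receives two distinct codes.
sample = {
    ("Gen", 1, 1): ["bereshit", "bara", "elohim"],
    ("Gen", 1, 3): ["vayomer", "elohim"],
}
ids = assign_ids(sample)
```

Because the ID carries the full occurrence context, looking up a code recovers exactly one token in one verse, which is what makes role tracking across the whole corpus unambiguous.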
Next steps: assign Universal Dependencies to every sentence, initially inferred automatically and later aligned to curated resources (e.g., UD‑PTNK for Greek and Hebrew). Then connect every word to every other word in the Bible as a weighted graph (~444,000 tokens), refining edge weights with techniques inspired by neural‑network training so the graph captures graded relatedness.
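As a rough illustration of the weighted word graph, the sketch below uses simple verse co-occurrence counts as edge weights. This is only a stand-in: the actual system trains its weights with neural-network-inspired techniques, and the function names and toy verses here are assumptions.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence_graph(verses):
    """Build a weighted, undirected word graph.

    Edge weight = number of verses in which two tokens co-occur;
    a simple proxy for the learned relatedness weights.
    """
    graph = defaultdict(float)
    for tokens in verses:
        # Canonical (sorted) pairs so each undirected edge has one key.
        for a, b in combinations(sorted(set(tokens)), 2):
            graph[(a, b)] += 1.0
    return graph

def relatedness(graph, a, b):
    """Look up the weight of the edge between two words (0.0 if absent)."""
    a, b = sorted((a, b))
    return graph.get((a, b), 0.0)

toy_verses = [
    ["elohim", "bara", "shamayim"],
    ["elohim", "or"],
    ["elohim", "shamayim"],
]
g = build_cooccurrence_graph(toy_verses)
```

At corpus scale (~444,000 tokens) the same idea applies, but with learned, graded weights rather than raw counts.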
For each target language we will add: dictionaries of Hebrew/Greek lemmas, per‑language UD profiles, and translation rules that preserve the strength of source dependencies, giving a measurable notion of translation fidelity.

Source Notice

The lexical codes and structural representations used in this project are an original system created by the author. The underlying biblical text is aligned with:

WLC – Westminster Leningrad Codex (as published by the J. Alan Groves Center / FRDB)
Textus Receptus (public domain editions)
Nestle 1904 Greek New Testament (public domain)
and selected interlinear resources for alignment purposes.

This website provides an independent analytical system based on public-domain manuscripts and traditional textual sources. The Zereth Codes and all associated indexing, structural markings, and analytical layers are original intellectual property and are not part of WLC, TR, or Nestle 1904.
Semantic Scripture Search
Search Scripture by meaning, not keywords. Built on the original Hebrew and Greek texts (WLC and Nestle 1904): every word is uniquely coded and enriched with lexical knowledge (including traditional resources such as Strong’s). Word meanings are encoded as high‑dimensional vectors, which are combined into verse‑level meaning embeddings and compressed for fast search, so related passages can be found even when the wording differs.
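The embedding-and-search pipeline above can be sketched with the simplest possible choices: averaging word vectors into a verse embedding and ranking by cosine similarity. These choices, and the toy two-dimensional vectors, are illustrative assumptions, not the project's actual representation or compression scheme.

```python
import math

def embed_verse(word_vectors, tokens):
    """Average per-word vectors into a verse-level embedding
    (a common, simple pooling choice)."""
    dim = len(next(iter(word_vectors.values())))
    acc = [0.0] * dim
    n = 0
    for t in tokens:
        vec = word_vectors.get(t)
        if vec:
            acc = [a + v for a, v in zip(acc, vec)]
            n += 1
    return [a / n for a in acc] if n else acc

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def search(query_vec, verse_vecs, top_k=3):
    """Rank verses by similarity to the query embedding."""
    ranked = sorted(verse_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [ref for ref, _ in ranked[:top_k]]

# Toy example: a query about "light" matches the verse that
# mentions a related word ("lamp"), not just an exact keyword.
word_vectors = {"light": [1.0, 0.0], "dark": [0.0, 1.0], "lamp": [0.8, 0.2]}
verse_vecs = {
    "Verse A": embed_verse(word_vectors, ["light", "lamp"]),
    "Verse B": embed_verse(word_vectors, ["dark"]),
}
top = search(embed_verse(word_vectors, ["light"]), verse_vecs, top_k=1)
```

In production such vectors would be much higher-dimensional and compressed (e.g., quantized) before indexing, which is what makes meaning-based search fast at corpus scale.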
Unique per‑token IDs
No reuse across occurrences; context is explicit.
UD for every sentence
Auto now, curated later with UD‑PTNK.
Graph of ~444k words
Relation weights refined with neural‑network‑style training.
Language pipelines
Dictionaries, UD, rules per language.
Fidelity metric
How well target texts preserve source dependencies.
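One plausible shape for this fidelity metric is a weighted edge-overlap score: of the source text's dependency edges (weighted by strength), what fraction survives in the target parse once target words are mapped back through the word alignment? The function name, the input format, and the toy IDs below are all assumptions, a minimal sketch rather than the project's actual formula.

```python
def dependency_fidelity(source_deps, target_deps):
    """Hypothetical fidelity score in [0, 1].

    `source_deps`: {(head_id, dependent_id): weight} over source tokens.
    `target_deps`: set of (head_id, dependent_id) pairs, i.e. target
    dependencies already mapped back through the word alignment.
    Returns the weighted fraction of source dependencies preserved.
    """
    total = sum(source_deps.values())
    if total == 0:
        return 0.0
    kept = sum(w for edge, w in source_deps.items() if edge in target_deps)
    return kept / total

# Toy example: 3 of 4 weight units of source dependencies survive.
source_deps = {("a1", "b2"): 2.0, ("a1", "c3"): 1.0, ("c3", "d4"): 1.0}
target_deps = {("a1", "b2"), ("c3", "d4")}
score = dependency_fidelity(source_deps, target_deps)
```

Weighting by dependency strength means that losing a strong, central relation costs more fidelity than losing a peripheral one, which matches the goal of preserving the strength of source dependencies.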
Continuous updates
New insights propagate across the whole graph.