A CLI toolkit for building Latin Anki flashcards. Inspect and restructure Anki exports, annotate grammar with CLTK, and generate corpus-based cloze-deletion cards from USFX, plain-text, or parallel CSV corpora — designed for learners studying Latin through spaced repetition.
- Inspect Anki deck structure, field names, and sample notes
- Split multi-form cards into one-row-per-form records
- Annotate grammar (lemma, POS, morphology) via CLTK with optional LLM disambiguation
- Generate cloze cards from Latin corpora (USFX XML, plain text, CSV)
- Parallel corpus support — include EN/DE translations alongside Latin clozes
- Difficulty filtering — control cloze complexity (easy / medium / hard)
- APKG rewrite — update Anki packages in place while preserving originals
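The generated cards use Anki's standard `{{c1::…}}` cloze-deletion syntax. As a minimal illustration of what a cloze card looks like, here is a sketch (a hypothetical helper, not this tool's actual implementation):

```python
import re

def make_cloze(sentence: str, form: str, index: int = 1) -> str:
    # Wrap the first whole-word occurrence of `form` in Anki cloze syntax.
    pattern = re.compile(rf"\b{re.escape(form)}\b")
    return pattern.sub(lambda m: f"{{{{c{index}::{m.group(0)}}}}}", sentence, count=1)

print(make_cloze("In principio creavit Deus caelum et terram.", "creavit"))
# → In principio {{c1::creavit}} Deus caelum et terram.
```

Anki hides the bracketed span on the card front and reveals it on the back, which is what makes corpus sentences usable as recall prompts.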
Prerequisites: Python 3.10–3.12 and Poetry.
```shell
git clone https://github.com/fmueller/latinitas-cards.git
cd latinitas-cards
poetry install
```

Verify the installation:

```shell
poetry run latinitas-cards --help
```

```shell
# 1. Inspect your Anki deck
poetry run latinitas-cards inspect --input data/latin_university.apkg --head 5
```
```shell
# 2. Split multi-form entries into individual rows
poetry run latinitas-cards split \
  --input data/latin_university.apkg \
  --output split.csv \
  --source-field Konstruktion_Hinweise \
  --split-mode auto
```
```shell
# 3. Annotate grammar and generate cloze cards
poetry run latinitas-cards annotate --input split.csv --output annotated.csv --form-column form
poetry run latinitas-cards cloze \
  --input annotated.csv \
  --output cloze.csv \
  --corpus data/lat-clementine.usfx.xml \
  --corpus-format auto \
  --difficulty medium
```

| Command | Description |
|---|---|
| `inspect` | Inspect deck schema and show a head-like sample preview |
| `split` | Split multi-form cards into one-row-per-form records |
| `annotate` | Annotate CSV forms with CLTK lemma/POS/morphology metadata |
| `cloze` | Generate corpus-based cloze cards for each form in a CSV input |
| `validate` | Validate USFX parsing integrity and required input columns |
| `preview` | Show a sample of generated clozes without writing output |
| `generate` | Update an Anki CSV or APKG file with cloze examples from a Latin USFX corpus |
```shell
poetry run latinitas-cards inspect --input data/latin_university.apkg --head 5
```

```shell
poetry run latinitas-cards split \
  --input input.apkg \
  --output split.csv \
  --source-field Konstruktion_Hinweise \
  --split-mode auto
```

Optional APKG rewrite (keeps originals and adds split cards):
```shell
poetry run latinitas-cards split \
  --input input.apkg \
  --output output.apkg \
  --source-field Konstruktion_Hinweise \
  --split-mode auto \
  --output-format apkg
```

```shell
poetry run latinitas-cards annotate \
  --input split.csv \
  --output annotated.csv \
  --form-column form
```

With optional Ollama LLM disambiguation:
```shell
poetry run latinitas-cards annotate \
  --input split.csv \
  --output annotated_llm.csv \
  --form-column form \
  --use-llm \
  --llm-provider ollama \
  --llm-model ministral-3:8b \
  --llm-endpoint http://localhost:11434
```
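Under the hood, an Ollama disambiguation call is a JSON POST to the server's `/api/generate` endpoint. The sketch below shows what such a request body looks like; the prompt wording and helper name are illustrative, not this tool's actual implementation:

```python
import json

def build_ollama_request(form: str, candidates: list[str],
                         model: str = "ministral-3:8b") -> bytes:
    # Illustrative prompt; the tool's real prompt may differ.
    prompt = (
        f"Latin form: {form}\n"
        f"Candidate analyses: {'; '.join(candidates)}\n"
        "Answer with the single most likely analysis."
    )
    # Ollama's /api/generate expects model, prompt, and stream fields.
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

# "canes" is genuinely ambiguous: noun (dogs) or 2nd sg. future of cano (you will sing)
body = build_ollama_request("canes", ["canis (noun, nom. pl.)", "cano (verb, 2nd sg. fut.)"])
```

Setting `"stream": False` makes the server return a single JSON response instead of a token stream, which is simpler for a one-shot disambiguation query.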
```shell
poetry run latinitas-cards cloze \
  --input annotated.csv \
  --output cloze.csv \
  --corpus data/lat-clementine.usfx.xml \
  --corpus-format auto \
  --difficulty medium
```

With a parallel corpus (including EN/DE translations):
```shell
poetry run latinitas-cards cloze \
  --input annotated.csv \
  --output cloze_parallel.csv \
  --corpus opus_subset.csv \
  --corpus-format csv \
  --latin-column la \
  --translation-lang en \
  --translation-lang de \
  --parallel-mode include
```

When parallel columns are detected and behavior is unspecified:

- Interactive terminal: `latinitas-cards` prompts you
- Non-interactive execution: translations are ignored with a warning
```shell
poetry run latinitas-cards validate \
  --input data/latin_university.apkg \
  --usfx data/lat-clementine.usfx.xml
```

```shell
poetry run latinitas-cards preview \
  --input data/latin_university.apkg \
  --usfx data/lat-clementine.usfx.xml
```

```shell
poetry run latinitas-cards generate \
  --input data/latin_university.apkg \
  --output updated.csv \
  --usfx data/lat-clementine.usfx.xml
```

The bundled corpus `data/lat-clementine.usfx.xml` is a Latin Vulgate (Clementine) Bible in USFX format.
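In USFX, verses are milestone elements (`<v/>` … `<ve/>`) and the verse text trails the marker rather than being nested inside it. A minimal sketch of pulling verse text out with the standard library, against a simplified sample document (not the full USFX schema):

```python
import xml.etree.ElementTree as ET

SAMPLE = """<usfx><book id="GEN">
<c id="1"/>
<v id="1"/>In principio creavit Deus caelum et terram.<ve/>
<v id="2"/>Terra autem erat inanis et vacua.<ve/>
</book></usfx>"""

def iter_verses(xml_text: str):
    # Yield (book, chapter, verse, text) tuples. Because <v/> is a
    # milestone marker, the verse text is the element's tail.
    root = ET.fromstring(xml_text)
    for book in root.iter("book"):
        chapter = None
        for el in book.iter():
            if el.tag == "c":
                chapter = el.get("id")
            elif el.tag == "v":
                yield book.get("id"), chapter, el.get("id"), (el.tail or "").strip()

for verse in iter_verses(SAMPLE):
    print(verse)
# → ('GEN', '1', '1', 'In principio creavit Deus caelum et terram.')
#   ('GEN', '1', '2', 'Terra autem erat inanis et vacua.')
```

Real USFX adds footnotes, paragraph markup, and verses that span elements, which is exactly why the `validate` command checks parsing integrity before card generation.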
Good public sources for additional Latin corpora (including EN/DE parallel data):
- OPUS (recommended starting point)
  - `bible-uedin` (strong verse-aligned biblical corpus)
  - `Tatoeba` (sentence-level data)
  - `WikiMatrix` / `CCMatrix` (broader but noisier)
- For direct corpus pair discovery via API:
  - https://opus.nlpl.eu/opusapi/?corpora=True&source=la&target=en
  - https://opus.nlpl.eu/opusapi/?corpora=True&source=la&target=de
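The two OPUS queries above differ only in the target language, so they can be built programmatically. A tiny sketch:

```python
from urllib.parse import urlencode

def opus_query_url(source: str, target: str) -> str:
    # Build an OPUS API query listing corpora for a language pair.
    params = urlencode({"corpora": "True", "source": source, "target": target})
    return f"https://opus.nlpl.eu/opusapi/?{params}"

print(opus_query_url("la", "en"))
# → https://opus.nlpl.eu/opusapi/?corpora=True&source=la&target=en
```

Fetching such a URL returns JSON describing the available corpora for the pair, which you can then download and trim into a CSV for `--corpus-format csv`.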
See AGENTS.md for coding style, project structure, and commit conventions.
Before submitting changes, run the validation chain:
```shell
poetry run ruff check
poetry run mypy
poetry run pytest -v
```