OmniBioAI Model Registry is a production-grade lifecycle management system for AI/ML models within the OmniBioAI ecosystem.
It provides:
- Immutable model versioning (write-once versions)
- Cryptographic integrity verification (SHA256)
- Provenance-friendly metadata capture
- Staged promotion workflows (latest → staging → production)
- Deterministic resolution by stable reference (`model@alias` or `model@version`)
- Local-first design with a clean path to future backends (S3/Azure/on-prem)
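As an illustration of the reference format, the sketch below splits a `model@alias` or `model@version` string into its parts. The names `ModelRef`, `parse_ref`, and `KNOWN_ALIASES` are hypothetical and are not taken from the library; the alias names come from the promotion stages listed in this README.

```python
from dataclasses import dataclass

# Alias names taken from the promotion stages described in this README.
KNOWN_ALIASES = {"latest", "staging", "production"}


@dataclass(frozen=True)
class ModelRef:
    model: str
    selector: str   # the part after "@"
    is_alias: bool  # True for latest/staging/production, False for a version


def parse_ref(ref: str) -> ModelRef:
    """Split 'model@selector' into its parts (hypothetical helper)."""
    model, sep, selector = ref.partition("@")
    if not sep or not model or not selector:
        raise ValueError(f"expected 'model@alias' or 'model@version', got {ref!r}")
    return ModelRef(model=model, selector=selector, is_alias=selector in KNOWN_ALIASES)
```

Treating anything that is not a known alias as a version string keeps the sketch simple; the real library may validate version names more strictly.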
The registry is implemented as a standalone Python library and includes:
- a CLI (`omr`)
- a minimal REST service (FastAPI)
Biomedical AI requires:
- Reproducibility
- Auditability
- Governance
- Offline / air-gapped deployment
- Cross-infrastructure execution parity
Traditional ML tooling often assumes:
- cloud-first infrastructure
- mutable artifacts
- weak provenance guarantees
OmniBioAI Model Registry is designed differently.
It treats AI models as scientific artifacts that must be immutable, verifiable, and reproducible across environments.
OmniBioAI follows a four-plane architecture:
| Plane | Responsibility |
|---|---|
| Control Plane | UI, registries, metadata, governance |
| Compute Plane | Workflow execution, HPC/cloud adapters |
| Data Plane | Artifacts, datasets, outputs |
| AI Plane | Reasoning, RAG, agents, interpretation |
The Model Registry belongs to the Control Plane and provides:
- AI artifact governance
- deterministic inference references
- promotion and audit workflows
- infrastructure-independent model resolution
Each model version is write-once:
- no overwrites
- no silent mutation
- full historical trace
This guarantees scientific reproducibility.
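One way to enforce write-once semantics on a local filesystem is to let directory creation fail when the version already exists. The helper below is a minimal sketch under that assumption; `create_version_dir` is a hypothetical name and not necessarily how the registry implements the rule.

```python
from pathlib import Path


def create_version_dir(model_dir: Path, version: str) -> Path:
    """Create versions/<version>/ and fail if it already exists."""
    version_dir = model_dir / "versions" / version
    version_dir.parent.mkdir(parents=True, exist_ok=True)
    # exist_ok=False makes mkdir raise FileExistsError on a second attempt,
    # so an existing version can never be silently replaced.
    version_dir.mkdir(exist_ok=False)
    return version_dir
```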
Every model package includes a SHA256 manifest:
- `sha256sums.txt` hashes the package contents (excluding itself)
This enables:
- bit-level reproducibility
- tamper detection
- trustworthy deployment in regulated environments
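The manifest idea can be sketched with the standard library alone. The example below assumes the manifest uses the `<hash>  <relative path>` line format produced by the `sha256sum` tool; the README does not specify the format, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = "sha256sums.txt"


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large model files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir: Path) -> dict[str, str]:
    """Map each relative file path to its SHA256, skipping the manifest itself."""
    manifest_path = package_dir / MANIFEST_NAME
    return {
        p.relative_to(package_dir).as_posix(): file_sha256(p)
        for p in sorted(package_dir.rglob("*"))
        if p.is_file() and p != manifest_path
    }


def write_manifest(package_dir: Path) -> None:
    """Write one '<hash>  <path>' line per file (assumed format)."""
    lines = [f"{h}  {name}" for name, h in build_manifest(package_dir).items()]
    (package_dir / MANIFEST_NAME).write_text("\n".join(lines) + "\n")


def verify_manifest(package_dir: Path) -> bool:
    """Return True only if the recorded hashes match the current files exactly."""
    recorded = {}
    for line in (package_dir / MANIFEST_NAME).read_text().splitlines():
        if line.strip():
            h, name = line.split("  ", 1)
            recorded[name] = h
    return recorded == build_manifest(package_dir)
```

Comparing the full mapping, rather than checking only the listed files, means that added or removed files are reported as a mismatch as well as modified ones.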
Each model stores structured metadata via model_meta.json, such as:
- training code version (git commit)
- dataset reference (e.g., DVC / object store ref)
- hyperparameters and preprocessing
- creator and timestamp
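A metadata file covering these fields might be assembled as follows. This is a sketch only: the `provenance` keys mirror the Python API example later in this README, while `hyperparameters`, `created_by`, and `created_at` are assumed key names, and both helper functions are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_model_meta(git_commit: str, training_data_ref: str,
                     hyperparameters: dict, creator: str) -> dict:
    """Collect provenance fields into one JSON-serialisable dict."""
    return {
        "provenance": {
            "git_commit": git_commit,
            "training_data_ref": training_data_ref,
        },
        "hyperparameters": hyperparameters,                     # assumed key name
        "created_by": creator,                                  # assumed key name
        "created_at": datetime.now(timezone.utc).isoformat(),   # assumed key name
    }


def write_model_meta(version_dir: Path, meta: dict) -> Path:
    """Write model_meta.json with sorted keys so diffs stay stable."""
    path = version_dir / "model_meta.json"
    path.write_text(json.dumps(meta, indent=2, sort_keys=True) + "\n")
    return path
```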
Models move through controlled stages:
latest → staging → production
All promotions are:
- explicit
- append-only
- audited (`audit/promotions.jsonl`)
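An append-only audit trail in JSON Lines form can be sketched as below. The `alias`, `version`, `actor`, and `reason` fields follow the arguments of `promote_model` shown later in this README; the `timestamp` field and the `append_promotion` helper are assumptions for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_promotion(audit_dir: Path, alias: str, version: str,
                     actor: str, reason: str) -> dict:
    """Append one promotion record as a single JSON line; never rewrite old lines."""
    record = {
        "alias": alias,
        "version": version,
        "actor": actor,
        "reason": reason,
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed field
    }
    audit_dir.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever adds to the end of the file, which keeps history intact.
    with (audit_dir / "promotions.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```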
v0.1.0 supports:
- local filesystem backend (`localfs`)
Planned:
- S3 / Azure Blob / enterprise on-prem backends
omnibioai-model-registry/
├── omnibioai_model_registry/
│ ├── api.py
│ ├── config.py
│ ├── refs.py
│ ├── storage/
│ ├── package/
│ ├── audit/
│ ├── cli/
│ └── service/
├── tests/
├── pyproject.toml
└── README.md
Registered models follow a strict, portable structure:

    <OMNIBIOAI_MODEL_REGISTRY_ROOT>/
      tasks/<task>/models/<model_name>/
        versions/<version>/
          model.pt
          model_genes.txt
          label_map.json
          model_meta.json
          metrics.json
          feature_schema.json
          sha256sums.txt
        aliases/
          latest.json
          staging.json
          production.json
        audit/
          promotions.jsonl
This guarantees:
- deterministic loading
- integrity validation
- cross-environment portability
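Given this layout, resolving a reference to a directory is a matter of path arithmetic. The sketch below assumes that each alias file stores its target under a `"version"` key; the README does not document the alias file contents, and `resolve_version_dir` is a hypothetical helper rather than the library's `resolve_model`.

```python
import json
from pathlib import Path


def resolve_version_dir(root: Path, task: str, model: str, selector: str) -> Path:
    """Return the versions/<version>/ directory for an alias or a version string."""
    model_dir = root / "tasks" / task / "models" / model
    alias_file = model_dir / "aliases" / f"{selector}.json"
    if alias_file.is_file():
        # Assumption: alias files look like {"version": "<version>"}.
        version = json.loads(alias_file.read_text())["version"]
    else:
        # No alias file with that name, so treat the selector as a version.
        version = selector
    version_dir = model_dir / "versions" / version
    if not version_dir.is_dir():
        raise FileNotFoundError(f"no such model version: {version_dir}")
    return version_dir
```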
The registry requires a root directory:

    export OMNIBIOAI_MODEL_REGISTRY_ROOT=~/Desktop/machine/local_registry/model_registry

From this repository root:

    pip install -e .

Verify:

    python -c "import omnibioai_model_registry as m; print('OK', m.__file__)"
    omr --help

Install build tooling:

    pip install build

Build:

    python -m build

Artifacts are written to dist/:

    dist/omnibioai_model_registry-0.1.0-py3-none-any.whl
    dist/omnibioai_model_registry-0.1.0.tar.gz

Install the wheel:

    pip install dist/*.whl

    omr register \
      --task celltype_sc \
      --model human_pbmc \
      --version 2026-02-14_001 \
      --artifacts /tmp/model_pkg \
      --set-alias latest

    omr resolve --task celltype_sc --ref human_pbmc@latest

    omr promote --task celltype_sc --model human_pbmc --version 2026-02-14_001 --alias production

    omr verify --task celltype_sc --ref human_pbmc@production

    omr show --task celltype_sc --ref human_pbmc@production --json

    from omnibioai_model_registry import register_model, resolve_model, promote_model
    register_model(
        task="celltype_sc",
        model_name="human_pbmc",
        version="2026-02-14_001",
        artifacts_dir="/tmp/model_pkg",
        metadata={
            "framework": "pytorch",
            "model_type": "classifier",
            "provenance": {
                "git_commit": "abc123",
                "training_data_ref": "s3://bucket/datasets/pbmc_v1",
                "trainer_version": "0.1.0",
            },
        },
        set_alias="latest",
        actor="manish",
        reason="initial training",
    )

    # Resolve by alias (or version)
    path = resolve_model("celltype_sc", "human_pbmc@latest", verify=True)
    print("Resolved model dir:", path)

    # Promote to production
    promote_model(
        task="celltype_sc",
        model_name="human_pbmc",
        alias="production",
        version="2026-02-14_001",
        actor="manish",
        reason="validated metrics",
    )

    pip install -r omnibioai_model_registry/service/requirements.txt
    uvicorn omnibioai_model_registry.service.app.main:app --host 0.0.0.0 --port 8095

Test:

    curl -s http://127.0.0.1:8095/health | python -m json.tool

Endpoints:

    POST /v1/register
    GET /v1/resolve
    POST /v1/promote
    POST /v1/verify
    GET /v1/show
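A client call to the resolve endpoint could be prepared as in the sketch below. The query parameter names `task` and `ref` are an assumption that mirrors the CLI flags; the README does not document the request schema, so check the service code before relying on them. `build_resolve_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_resolve_url(base_url: str, task: str, ref: str) -> str:
    """Build a GET /v1/resolve URL; parameter names are assumed, not documented."""
    query = urlencode({"task": task, "ref": ref})  # percent-encodes the "@" in the ref
    return f"{base_url.rstrip('/')}/v1/resolve?{query}"
```

The resulting URL can be passed to `curl` or `urllib.request.urlopen` once the service is running on the chosen host and port.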

    pip install -e ".[dev]"
    pytest -q

This registry is a control-plane component of OmniBioAI.
Companion repositories:
- omnibioai → AI-powered bioinformatics workbench
- omnibioai-tes → execution orchestration across local/HPC/cloud
- omnibioai-rag → reasoning and literature intelligence
- omnibioai-lims → laboratory data management
- omnibioai-workflow-bundles → reproducible pipelines
- omnibioai-sdk → Python client access
The Model Registry provides the AI artifact governance layer shared by all.
- additional storage backends (S3 / Azure)
- expanded metadata validation + schemas
- model listing and metadata search APIs
- RBAC and governance controls
- richer registry service APIs (auth, pagination, filtering)
- comparison and promotion policies
- enterprise biomedical AI governance platform
- regulatory-ready audit and lineage
- deeper integration with experiment tracking and clinical pipelines
- ✅ Immutable and verifiable model storage
- ✅ Audit-ready promotion workflow
- ✅ CLI + minimal REST service
- ✅ Local-first, cloud-ready design
OmniBioAI Model Registry establishes the foundation for trustworthy, reproducible biomedical AI deployment.