This document describes how to deploy the RadSysX BiomedParse backend on a cloud VM with NVIDIA GPU.
- Framework: FastAPI (Python)
- GPU: NVIDIA CUDA runtime (Docker-based)
- Inference: BiomedParse v2 (3D-enabled) via `backend/server.py` and `backend/biomedparse_api.py`
- API base path: `/api/biomedparse/v1`
- Static artifacts: `/files/*` -> `backend/tmp/biomedparse`
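Given the base path and static-artifact route above, a client can construct artifact URLs directly. A minimal sketch (the helper name `artifact_url` is illustrative, not part of the backend):

```python
from urllib.parse import quote

API_BASE = "/api/biomedparse/v1"   # API base path from the overview above
FILES_BASE = "/files"              # static artifacts route from the overview above

def artifact_url(host: str, name: str) -> str:
    """Build the public URL for an artifact stored under backend/tmp/biomedparse.

    `host` is e.g. "http://<VM_IP>:8000"; `name` is a file name returned by
    the API (e.g. a seg_*.npz artifact).
    """
    return f"{host}{FILES_BASE}/{quote(name)}"
```
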
- A cloud VM with an NVIDIA GPU (e.g., 12 GB VRAM or more recommended).
- SSH access to the VM.
- Docker installed.
- NVIDIA Container Toolkit installed for GPU inside Docker.
- The BiomedParse 3D checkpoint file available on the VM (path used by `BP3D_CKPT`).
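The prerequisites above can be spot-checked from the host before deploying. A minimal sketch (this only confirms the binaries are on PATH; it does not verify driver versions or the NVIDIA Container Toolkit configuration):

```python
import shutil

def preflight() -> dict:
    """Report which host prerequisites are discoverable on PATH."""
    return {
        "docker": shutil.which("docker") is not None,      # Docker installed
        "nvidia-smi": shutil.which("nvidia-smi") is not None,  # NVIDIA driver present
    }
```
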
References:
- NVIDIA Container Toolkit installation: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
- Docker installation: https://docs.docker.com/get-docker/
- SSH into your VM.
- Clone the repository:

  ```bash
  git clone https://github.com/<your-org-or-user>/RadSysX.git
  cd RadSysX
  ```

- Place your 3D checkpoint on the VM and note the absolute path. For example:

  ```bash
  mkdir -p /opt/weights
  cp /path/to/biomedparse_3D_AllData_MultiView_edge.ckpt /opt/weights/
  ```

- Build the GPU image:

  ```bash
  docker build -t radsysx-backend:gpu -f backend/Dockerfile .
  ```

- Start the container with GPU access (mapping weights into the container) and environment variables:
  ```bash
  docker run --gpus all -p 8000:8000 \
    -e BP3D_CKPT=/weights/biomedparse_3D_AllData_MultiView_edge.ckpt \
    -e BP_TMP_TTL=7200 -e BP_TMP_SWEEP=1800 -e BP_VALIDATE_HEATMAP=1 \
    -v /opt/weights:/weights \
    radsysx-backend:gpu
  ```

- Verify the service:

  ```bash
  curl http://<VM_IP>:8000/api/biomedparse/v1/health
  ```

  If you see `{ "status": "healthy", "gpu_available": true }`, the API is up with GPU.
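For scripted deployments, the health payload above can be checked programmatically. A minimal sketch (fetch the body with `curl` or `urllib.request` in practice; this parser is illustrative, not part of the backend):

```python
import json

def gpu_ready(health_body: str) -> bool:
    """Return True when the /health JSON reports a healthy API with a GPU.

    `health_body` is the raw JSON string returned by the health endpoint.
    """
    payload = json.loads(health_body)
    return payload.get("status") == "healthy" and bool(payload.get("gpu_available"))
```
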
- Open interactive docs in a browser:
http://<VM_IP>:8000/docs
- Health:

  ```bash
  curl -s http://<VM_IP>:8000/api/biomedparse/v1/health | jq .
  ```

- 2D predict example (PNG/JPG):

  ```bash
  curl -s -X POST \
    -F "file=@/path/to/example.png" \
    -F "prompts=liver" \
    "http://<VM_IP>:8000/api/biomedparse/v1/predict-2d?threshold=0.5&return_heatmap=true" | jq .
  ```

- 3D predict (NIfTI):

  ```bash
  curl -s -X POST \
    -F "file=@/path/to/volume.nii.gz" \
    -F "prompts=liver" \
    "http://<VM_IP>:8000/api/biomedparse/v1/predict-3d-nifti?return_heatmap=true" | jq .
  ```

- Fetch NPZ artifacts (for debugging):

  ```bash
  curl -s "http://<VM_IP>:8000/api/biomedparse/v1/fetch-npz?name=seg_XXXX.npz&key=seg" | jq .
  curl -s "http://<VM_IP>:8000/api/biomedparse/v1/fetch-npz?name=prob_YYYY.npz&key=prob" | jq .
  ```

Set these in your `.env` or pass with `-e` in `docker run`.
```bash
# Required: absolute path inside the CONTAINER to the 3D checkpoint
BP3D_CKPT=/weights/biomedparse_3D_AllData_MultiView_edge.ckpt
# Transient artifact TTL and sweep (seconds)
BP_TMP_TTL=7200
BP_TMP_SWEEP=1800
# Validate that heatmap NPZ contains key 'prob' as uint8 (1=on, 0=off)
BP_VALIDATE_HEATMAP=1
# Optional: force slice batch size; otherwise auto-tuned by available VRAM
#BP_SLICE_BATCH_SIZE=4
```

Notes:
- Temp files are saved under `backend/tmp/biomedparse` and served via `/files/*`.
- The cleanup daemon purges `.npz` artifacts older than `BP_TMP_TTL` every `BP_TMP_SWEEP` seconds.
- When using Docker, prefer a plain `.env` file (not `.env.local`). Load it with `--env-file .env` or map individual variables with `-e` flags.
- Ensure your cloud firewall/security groups open TCP port `8000` to authorized client IPs only (and any other ports you expose). On the VM itself, allow the same in the OS firewall if enabled.
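The cleanup behavior described in the notes (purge `.npz` artifacts older than `BP_TMP_TTL` every `BP_TMP_SWEEP` seconds) can be sketched as follows. This is an illustration of the policy only; the real daemon lives in the backend:

```python
import os
import time
from pathlib import Path

def sweep_npz(tmp_dir: str, ttl_seconds: int) -> int:
    """Delete .npz artifacts whose mtime is older than ttl_seconds.

    Mirrors the TTL policy from the notes above; call it every
    BP_TMP_SWEEP seconds from a background loop.
    """
    removed = 0
    now = time.time()
    for path in Path(tmp_dir).glob("*.npz"):
        if now - path.stat().st_mtime > ttl_seconds:
            path.unlink()
            removed += 1
    return removed
```
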
- Install Python 3.10+, CUDA drivers, and a CUDA-enabled PyTorch.
- Install dependencies:

  ```bash
  pip install fastapi "uvicorn[standard]" python-multipart pydantic numpy pillow nibabel pydicom hydra-core omegaconf python-dotenv
  pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
  ```

- Export environment variables (see the section above) and run the API:

  ```bash
  uvicorn backend.server:app --host 0.0.0.0 --port 8000
  ```

- Out-of-memory (OOM): lower `BP_SLICE_BATCH_SIZE` or pass `?slice_batch_size=...` on 3D endpoints.
- GPU not detected: verify `nvidia-smi` on the host and that the container runs with `--gpus all`.
- Missing checkpoint: ensure `BP3D_CKPT` points to a readable file inside the container (mount it with `-v`).
- Heatmap NPZ validation errors: set `BP_VALIDATE_HEATMAP=1` (default) and ensure the artifact contains `prob` (uint8).
- CORS: the server currently allows all origins; restrict in production (edit `backend/server.py`).
- Security: only expose port 8000 to authorized IPs; consider a reverse proxy with auth for internet-facing deployments.
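To debug heatmap NPZ validation errors locally, the check that `BP_VALIDATE_HEATMAP=1` enforces (key `prob` stored as uint8) can be reproduced with numpy, which is already in the dependency list above. A minimal sketch, not the backend's actual validator:

```python
import numpy as np

def validate_heatmap_npz(path: str) -> None:
    """Raise ValueError unless the NPZ contains key 'prob' as uint8."""
    with np.load(path) as data:
        if "prob" not in data:
            raise ValueError("heatmap NPZ missing key 'prob'")
        if data["prob"].dtype != np.uint8:
            raise ValueError(f"'prob' must be uint8, got {data['prob'].dtype}")
```

Run it against an artifact fetched from `/files/*` to see why the server-side validation rejected it.
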