Commit 07c74d6 (1 parent: 69ed4f4)
docs: expand paper with additional details and examples

1 file changed (paper.md): 18 additions & 8 deletions

The `FileManager` routes to any `DataSink`. Pipeline code doesn't know whether d…

**Streaming Internals Hidden**: Streaming backends handle substantial complexity internally—GPU tensor conversion, shared memory allocation, ZMQ socket management, ROI serialization—all behind the same `save_batch()` interface. The orchestrator remains backend-agnostic.
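
The idea can be sketched as follows; `DataSink` and `save_batch()` are taken from the text, but the two backends here are toy stand-ins, not PolyStore's implementations:

```python
from abc import ABC, abstractmethod

class DataSink(ABC):
    """Common backend interface; names assumed from the text."""

    @abstractmethod
    def save_batch(self, data, paths):
        """Persist each item of `data` under the matching entry of `paths`."""

class MemorySink(DataSink):
    """Trivial backend: a dict stands in for storage."""

    def __init__(self):
        self.store = {}

    def save_batch(self, data, paths):
        for item, path in zip(data, paths):
            self.store[path] = item

class StreamingSink(DataSink):
    """A streaming backend would hide GPU tensor conversion, shared-memory
    allocation, and ZMQ sockets in here; callers still only see save_batch()."""

    def __init__(self):
        self.sent = []  # stand-in for a socket / shared-memory channel

    def save_batch(self, data, paths):
        for item, path in zip(data, paths):
            self.sent.append((path, item))  # real backend: serialize + send

# The orchestrator holds a DataSink and never branches on the backend type.
sink: DataSink = StreamingSink()
sink.save_batch([b"frame0", b"frame1"], ["a.tif", "b.tif"])
```
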

**Atomic Operations**: Cross-platform file locking (`fcntl` on Unix, `portalocker` on Windows) with `atomic_update_json()` for concurrent metadata writes from multiple pipeline workers. This is critical for OpenHCS, where multiple worker processes write metadata simultaneously; without atomic operations, concurrent writes can corrupt the JSON files.
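
A minimal Unix-only sketch of what such a helper can look like; the function name comes from the text, but the locking and temp-file details are assumptions (a real cross-platform version would use `portalocker` on Windows):

```python
import fcntl
import json
import os
import tempfile

def atomic_update_json(path, update):
    """Read-modify-write a JSON file under an exclusive lock (Unix sketch)."""
    with open(path + ".lock", "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # serialize concurrent workers
        try:
            try:
                with open(path) as f:
                    data = json.load(f)
            except FileNotFoundError:
                data = {}
            update(data)  # caller mutates the metadata dict in place
            # Write to a temp file, then rename: os.replace is atomic on
            # POSIX, so readers never observe a half-written JSON file.
            fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
            with os.fdopen(fd, "w") as f:
                json.dump(data, f)
            os.replace(tmp, path)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Because each worker's update callback is applied to the freshest on-disk state, concurrent writers merge their changes instead of clobbering each other.
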

**Lazy Backend Instantiation**: Backends auto-register via `metaclass-registry` [@metaclassregistry] and are lazily instantiated, keeping optional dependencies unloaded until used. For example, the Napari streaming backend only imports `napari` when first used, avoiding dependency bloat for users who don't need visualization.
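
The mechanism can be sketched with a hand-rolled metaclass standing in for the `metaclass-registry` package; all names below are illustrative:

```python
class BackendMeta(type):
    """Every concrete subclass registers itself at class-creation time."""
    registry = {}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        key = namespace.get("backend_name")
        if key is not None:  # skip the abstract base class
            BackendMeta.registry[key] = cls

class Backend(metaclass=BackendMeta):
    backend_name = None

class NapariStreamBackend(Backend):
    backend_name = "napari_stream"

    def __init__(self):
        # The heavy optional dependency would be imported only here, on
        # first instantiation, e.g.:
        #   self.napari = importlib.import_module("napari")
        self.connected = True

_instances = {}

def get_backend(name):
    """Registration happens at import time, but instantiation (and any
    optional imports inside __init__) is deferred until first use."""
    if name not in _instances:
        _instances[name] = BackendMeta.registry[name]()
    return _instances[name]
```
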
**Batch Operations**: The `save_batch()` and `load_batch()` interfaces accept lists of paths and data, enabling backends to optimize I/O. The Zarr backend can write multiple arrays in a single transaction; the Napari backend can batch ROI updates into a single viewer refresh. Batching amortizes per-call overhead that separate per-file operations would pay on every item.
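
As an illustration of why the batch interface helps, a toy backend (not the real Zarr implementation) can commit one transaction per batch rather than one per file:

```python
class BatchingBackend:
    """Toy backend: commits a whole batch in one 'transaction'."""

    def __init__(self):
        self.transactions = 0
        self.store = {}

    def save_batch(self, data, paths):
        self.transactions += 1  # one commit for the entire batch
        for item, path in zip(data, paths):
            self.store[path] = item

backend = BatchingBackend()
backend.save_batch([b"a", b"b", b"c"], ["0.tif", "1.tif", "2.tif"])
# Three files written, but only one transaction paid.
```
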

# Research Application

PolyStore was developed for OpenHCS (Open High-Content Screening) where microscopy pipelines process thousands of images per experiment. A typical workflow:

1. **Load**: Read raw images from disk (TIFF, OME-TIFF) or virtual workspace (lazy-loaded)
2. **Process**: Apply filters, segmentation, feature extraction in memory
3. **Save**: Write results to Zarr (chunked, compressed for efficient storage)
4. **Stream**: Send intermediate results to Napari for live preview and quality control

All through one interface:

```python
fm.save_batch(processed, paths, backend="zarr")
fm.save_batch(processed, paths, backend="napari_stream")
```

**Concrete Example**: A user processes 10,000 images. Without PolyStore, the pipeline code would contain:
- `np.load()` for disk reads
- `zarr.open_array()` for Zarr writes
- `napari.Viewer.add_image()` for visualization
- Custom socket code for streaming to remote Fiji instances
With PolyStore, all I/O goes through `FileManager`, and the user can switch backends by changing a config parameter—no code changes needed.
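
The config-driven switch can be sketched as follows; the registry and config keys are invented for illustration:

```python
class MemoryBackend:
    def __init__(self):
        self.store = {}

    def save_batch(self, data, paths):
        for item, path in zip(data, paths):
            self.store[path] = item

class DiskBackend:
    def save_batch(self, data, paths):
        for item, path in zip(data, paths):
            with open(path, "wb") as f:
                f.write(item)

BACKENDS = {"memory": MemoryBackend, "disk": DiskBackend}

# Switching from in-memory runs to on-disk output is a one-line config
# change; the pipeline body below never changes.
config = {"output_backend": "memory"}

fm = BACKENDS[config["output_backend"]]()
fm.save_batch([b"result"], ["out.tif"])
```
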
**Bug Prevention**: The explicit backend model eliminated an entire class of bugs where code assumed disk storage but ran against memory or streaming backends. For example, a function that called `os.path.exists()` would silently return `False` against a memory backend; with explicit backends, such mismatches surface immediately.
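
This failure mode is easy to reproduce with a toy memory backend (names invented for illustration):

```python
import os

class MemoryBackend:
    def __init__(self):
        self.store = {}

    def save(self, data, path):
        self.store[path] = data

    def exists(self, path):
        return path in self.store

mem = MemoryBackend()
mem.save(b"pixels", "plate1/a01.tif")

# os.path.exists() quietly answers about the filesystem, not the backend:
print(os.path.exists("plate1/a01.tif"))  # False, though the data exists
# Routing the check through the explicit backend asks the right question:
print(mem.exists("plate1/a01.tif"))      # True
```
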

# AI Usage Disclosure