docs: Add plugin codecs guide and fix codec notation #122
Merged
+683
−5
Changed <hash> to <hash@> in the External dtype row to match the correct store-only notation used throughout the documentation.
Add comprehensive how-to guide for using plugin codecs - codec packages that extend DataJoint via entry point discovery. Uses dj-zarr-codecs as the primary example.

Key sections:
- Installation and automatic registration via entry points
- Complete Zarr codec usage example with storage structure
- Finding DataJoint-maintained and community codecs
- Comparison with built-in codecs (<npy@>, <blob@>)
- Best practices for dependency management
- Troubleshooting common issues

Terminology: Uses 'plugin codecs' instead of 'external/third-party' to accurately describe the architectural pattern (separate packages with entry point discovery) without implying ownership.
Update plugin codecs documentation to include dj-photon-codecs:
- Add to DataJoint-maintained codecs list
- Include in imaging domain examples
- Reference in See Also section

dj-photon-codecs provides Anscombe transformation + Zarr compression for photon-limited imaging data (calcium imaging, low-light microscopy).
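For readers unfamiliar with the Anscombe transformation named above, the following sketch shows its textbook form, `2 * sqrt(x + 3/8)`, which approximately stabilizes the variance of Poisson-distributed counts. This is an assumption-laden illustration, not the dj-photon-codecs implementation, which may add calibration parameters or quantization not described in this PR.

```python
import math


def anscombe(x: float) -> float:
    """Textbook forward Anscombe transform: 2 * sqrt(x + 3/8)."""
    return 2.0 * math.sqrt(x + 0.375)


def inverse_anscombe(y: float) -> float:
    """Simple algebraic inverse: (y / 2)**2 - 3/8."""
    return (y / 2.0) ** 2 - 0.375


# Illustrative photon counts (made-up values, not measured data).
counts = [0, 1, 10, 100, 1000]
transformed = [anscombe(c) for c in counts]
restored = [inverse_anscombe(t) for t in transformed]
```

The algebraic inverse round-trips exactly up to floating-point error; it is known to be biased at low counts when applied to denoised or quantized values, which is one reason real codecs may use a different inverse.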
Add 'Before Creating Your Own' section to custom-codecs.md that directs readers to check existing plugin codecs (dj-zarr-codecs, dj-photon-codecs, anscombe-transform) before implementing their own. Encourages reuse and ensures users are aware of existing solutions.
anscombe-transform is a Zarr/Numcodecs codec (not a DataJoint codec). It doesn't have a datajoint.codecs entry point - it's a dependency used by dj-photon-codecs, not a standalone DataJoint plugin codec.

Removed from:
- DataJoint-maintained codecs list in use-plugin-codecs.md
- Before Creating Your Own section in custom-codecs.md
Add detailed guidance on versioning plugin codecs for backward compatibility:
- Version strategy: package version vs data format version
- When to bump versions (breaking vs non-breaking changes)
- Implementation patterns for version dispatch
- Migration strategies (lazy, explicit, deprecation warnings)
- Real-world example with dj-photon-codecs evolution
- Testing version compatibility
- Semantic versioning guidelines for codec packages

Critical for maintaining data accessibility as codecs evolve.
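The "version dispatch" pattern listed above could look roughly like the sketch below. Everything here is hypothetical: the stored-record layout, the `decoder` registry, and the two format versions are invented for illustration. Only the `codec_version` field name comes from this PR's commit messages.

```python
from typing import Any, Callable

# Hypothetical registry: data format version -> decoder for that version.
_DECODERS: dict[int, Callable[[dict[str, Any]], list[float]]] = {}


def decoder(version: int):
    """Register a decode function for one data format version."""
    def register(func):
        _DECODERS[version] = func
        return func
    return register


@decoder(1)
def _decode_v1(stored: dict[str, Any]) -> list[float]:
    # v1 (hypothetical): values are stored as-is.
    return list(stored["values"])


@decoder(2)
def _decode_v2(stored: dict[str, Any]) -> list[float]:
    # v2 (hypothetical): values are stored as integers plus a scale factor.
    return [v * stored["scale"] for v in stored["values"]]


def decode(stored: dict[str, Any]) -> list[float]:
    """Dispatch on the stored codec_version; records without the field are treated as v1."""
    version = stored.get("codec_version", 1)
    handler = _DECODERS.get(version)
    if handler is None:
        raise ValueError(f"Unsupported codec_version: {version!r}")
    return handler(stored)
```

Keeping old decoders registered is what lets existing data stay readable after the encoder moves to a newer format, which is the backward-compatibility goal this commit describes.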
Add section explaining why built-in codecs don't need explicit versioning:
- Built-in codecs versioned with DataJoint releases
- Plugin codecs have independent lifecycles and need codec_version
- DataJoint's semantic versioning handles built-in codec evolution
- Plugin versioning protects against independent evolution

Key distinction: Built-in codecs are part of DataJoint's API surface (versioned by framework), while plugin codecs are independent packages (need self-versioning).
Add comprehensive documentation of DataJoint's custom blob serialization:

Explanation docs (type-system.md):
- Protocol headers (mYm for MATLAB compat, dj0 for Python-extended)
- Optional zlib compression for data > 1KB
- Type-specific encoding with serialization codes
- Version detection via embedded protocol headers
- Supported types list
- Storage modes (<blob> vs <blob@>)

Reference docs (type-system.md):
- Detailed type code mapping for all supported Python types
- Protocol header format (mYm\0, dj0\0)
- Version detection mechanism
- MD5 deduplication for <blob@>

Clarifies that <blob> does NOT use pickle - it uses DataJoint's custom binary format with intrinsic versioning via protocol headers.
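As a rough illustration of "version detection via embedded protocol headers", the sketch below checks a payload for the two header byte strings named above (`mYm\0` and `dj0\0`). It assumes the payload is already uncompressed; how the optional zlib compression is framed is not described in this PR, so the sketch does not handle it. The function name `detect_protocol` is hypothetical and is not DataJoint's API.

```python
# Protocol headers as given in the commit message; labels are descriptive only.
PROTOCOLS = {
    b"mYm\0": "mYm (MATLAB-compatible)",
    b"dj0\0": "dj0 (Python-extended)",
}


def detect_protocol(payload: bytes) -> str:
    """Return a label for the protocol header that prefixes an uncompressed payload."""
    for header, label in PROTOCOLS.items():
        if payload.startswith(header):
            return label
    raise ValueError("unrecognized blob protocol header")
```

Because the header travels with every stored value, a reader can choose the right decoding rules without any separate version column, which is the "intrinsic versioning" point made in this commit.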
Add references to mYm format documentation:
- MATLAB FileExchange: https://www.mathworks.com/matlabcentral/fileexchange/81208-mym
- GitHub repository: https://github.com/datajoint/mym

Add intrinsic versioning explanation to plugin codecs guide:
- How built-in codecs embed version in data format
- Protocol headers in <blob> (mYm\0, dj0\0)
- NumPy format version in <npy@> headers
- Self-describing structure in <object@>
- Why built-in codecs don't need explicit codec_version field

Clarifies the distinction between built-in codecs (intrinsic versioning) and plugin codecs (explicit codec_version field).
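The "NumPy format version in <npy@> headers" point can be seen directly in the .npy file layout: the file starts with the 6-byte magic string `\x93NUMPY`, followed by one byte each for the major and minor format version. The sketch below reads those two bytes with the standard library only. The helper name is hypothetical, and it assumes the stored object is a plain .npy payload.

```python
NPY_MAGIC = b"\x93NUMPY"  # 6-byte magic string that opens every .npy file


def npy_format_version(data: bytes) -> tuple[int, int]:
    """Return (major, minor) format version from the first 8 bytes of a .npy payload."""
    if len(data) < 8 or not data.startswith(NPY_MAGIC):
        raise ValueError("not a .npy payload")
    return data[6], data[7]


# Hand-built 8-byte prefix of a version 1.0 file, used here instead of real array data.
sample_prefix = NPY_MAGIC + bytes([1, 0])
version = npy_format_version(sample_prefix)
```

When NumPy is available, `numpy.lib.format.read_magic` performs the same check on an open file object.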
MilagrosMarin approved these changes on Jan 16, 2026
Summary
Note: This PR targets the main branch. The same changes were previously merged to pre/v2.0 via PR #121.

Changes
1. Fix Codec Notation
File: src/reference/specs/type-system.md (line 638)

Changed the "External dtype" row in the codec comparison table from <hash> to <hash@> to match the correct store-only notation.

Before:
After:
2. Add Plugin Codecs Documentation
New file: src/how-to/use-plugin-codecs.md (332 lines)

Comprehensive guide for using plugin codec packages that extend DataJoint via entry point discovery.
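Key sections:
- Installation and automatic registration via entry points
- Complete Zarr codec usage example with storage structure
- Finding DataJoint-maintained and community codecs
- Comparison with built-in codecs (<npy@>, <blob@>)
- Best practices for dependency management
- Troubleshooting common issues

Terminology: Uses "plugin codecs" instead of "external/third-party" to accurately describe the architectural pattern (separate packages with entry point discovery) without implying ownership.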
DataJoint Plugin Codecs:
Note: anscombe-transform is a Zarr/Numcodecs codec (dependency), not a DataJoint plugin codec.
Updated navigation:
- src/how-to/index.md - Added entry
- mkdocs.yaml - Added to Object Storage section

3. Reference Plugin Codecs in Explanations
File: src/explanation/custom-codecs.md

Added "Before Creating Your Own" section that directs readers to check existing plugin codecs before implementing custom solutions.
Context
The <hash@> codec is external-only and requires the @ modifier. The original table incorrectly showed <hash> (without @) in the "External dtype" row.

The plugin codecs guide establishes terminology and best practices for DataJoint plugin codecs - packages that register via datajoint.codecs entry points. Both dj-zarr-codecs and dj-photon-codecs follow this pattern.

Related