The NetApp DataOps Toolkit is a collection of Python-based client tools that simplify the management of data volumes and data science/engineering workspaces that are backed by high-performance, scale-out NetApp storage. Key capabilities include:
- Rapidly provision new data volumes (file shares) or JupyterLab workspaces backed by high-performance, scale-out NetApp storage.
- Near-instantaneously clone data volumes (file shares) or JupyterLab workspaces to enable experimentation and rapid iteration.
- Near-instantaneously save snapshots of data volumes (file shares) or JupyterLab workspaces for backup, traceability, or baselining.
- Replicate data volumes (file shares) across different environments.
The toolkit also includes MCP servers that expose many of these capabilities as "tools" that AI agents can invoke.
The Dataset Manager is a powerful module in the Traditional Environments toolkit that provides a simplified, intuitive interface for managing datasets backed by NetApp ONTAP storage. It abstracts away volume management complexity and exposes datasets as simple directories, with built-in support for instant cloning, snapshots, and space efficiency — all through a clean Python API.
➡️ See the Dataset Manager README to get started.
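To make the Dataset Manager's model concrete (datasets exposed as plain directories, with near-instant snapshots and clones handled by the storage underneath), here is a minimal in-memory sketch. The class and method names are hypothetical, chosen only for illustration; consult the Dataset Manager README for the actual API.

```python
# Purely illustrative, in-memory model of the Dataset Manager concepts.
# The names DatasetModel, write, snapshot, and clone are hypothetical --
# they are NOT the real Dataset Manager API.
class DatasetModel:
    def __init__(self, name):
        self.name = name
        self.files = {}      # path -> contents (stands in for a directory)
        self.snapshots = {}  # snapshot name -> point-in-time file map

    def write(self, path, data):
        self.files[path] = data

    def snapshot(self, snap_name):
        # A snapshot is a read-only point-in-time view. On ONTAP this is
        # near-instant because data blocks are shared, not copied.
        self.snapshots[snap_name] = dict(self.files)

    def clone(self, clone_name):
        # A clone starts as a space-efficient copy that shares the parent's
        # data, then diverges independently as it is modified.
        child = DatasetModel(clone_name)
        child.files = dict(self.files)
        return child

ds = DatasetModel("imagenet")
ds.write("labels.csv", "cat,dog")
ds.snapshot("baseline")                  # baseline for traceability
exp = ds.clone("imagenet-exp1")          # instant clone for experimentation
exp.write("labels.csv", "cat,dog,bird")  # clone diverges; parent is unchanged
print(ds.files["labels.csv"])            # -> cat,dog
```

The point of the model: experiments get their own writable copy of a dataset without duplicating the underlying data, and snapshots pin the exact state a result was produced from.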
The NetApp DataOps Toolkit includes the following client tools:
- The NetApp DataOps Toolkit for Kubernetes includes data volume management, JupyterLab management, and data movement capabilities for users who have access to a Kubernetes cluster.
- The NetApp DataOps Toolkit for Traditional Environments includes basic data volume management capabilities. It will run on most Linux and macOS clients, and does not require Kubernetes.
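In the Traditional Environments toolkit, the provision/snapshot/clone operations described above map to single Python calls. The sketch below is illustrative: the function and parameter names (`create_volume`, `create_snapshot`, `clone_volume`, `volume_name`, `volume_size`) follow the toolkit's documented API, but the calls are stubbed here so the example runs without an ONTAP system; verify names against the documentation for your installed version.

```python
# Sketch: provision, snapshot, and clone a data volume.
# In a real environment, with the toolkit installed
# (pip install netapp-dataops-traditional) and an ONTAP connection
# configured, you would import the functions directly:
#   from netapp_dataops.traditional import create_volume, create_snapshot, clone_volume
# Here we stub them so the sketch is runnable anywhere, for illustration only.
calls = []

def _stub(name):
    def f(**kwargs):
        calls.append((name, kwargs))
    return f

create_volume = _stub("create_volume")
create_snapshot = _stub("create_snapshot")
clone_volume = _stub("clone_volume")

# Provision a new 10 TB volume for a training dataset.
create_volume(volume_name="project1", volume_size="10TB")

# Save a near-instantaneous snapshot for baselining/traceability.
create_snapshot(volume_name="project1", snapshot_name="baseline")

# Clone the volume near-instantaneously to experiment without
# touching the original dataset.
clone_volume(new_volume_name="project1_exp1", source_volume_name="project1")

print([name for name, _ in calls])
```

Each operation completes in seconds regardless of dataset size, because ONTAP snapshots and clones share data blocks rather than copying them.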
Report any issues via GitHub: https://github.com/NetApp/netapp-dataops-toolkit/issues.