omerferhatt/wandbfs-rust

WandBFS - Weights & Biases FUSE Filesystem

A read-only FUSE filesystem for browsing Weights & Biases runs, artifacts, and files directly from your local filesystem.

Features

  • Browse W&B entities, projects, and runs as a directory tree
  • Access run files directly, with no manual download step
  • Navigate artifacts by collection and version
  • Artifact aliases (e.g., latest, best) as symlinks
  • Separate input/output artifacts for each run
  • Human-readable run names instead of IDs
  • Secure API key authentication
  • Lazy Loading: Files and metadata are fetched only when accessed
  • Read-Only: Ensures data integrity by preventing modifications

Quick Start

Prerequisites

  • Rust 1.70+ (install from rustup.rs)
  • FUSE library: sudo apt-get install fuse libfuse-dev (Ubuntu/Debian)
  • W&B API key (get from wandb.ai/authorize)

Installation

git clone <repository-url>
cd wandbfs-rust
cargo build --release

Usage

  1. Set your API key:
export WANDB_API_KEY=your_api_key_here
  2. Mount the filesystem:
# Mount to default directory (./wandbfs)
cargo run

# Or specify a custom mount point
cargo run -- /tmp/wandb
  3. Browse your data:
# List entities
ls /tmp/wandb/

# Navigate to a project
cd /tmp/wandb/your-entity/your-project/

# View runs
ls runs/

# Access run files
cat runs/run-name/file.txt

# Browse artifacts
ls artifacts/
ls runs/run-name/artifacts/input/
ls runs/run-name/artifacts/output/

# Use artifact aliases
cat runs/run-name/artifacts/output/model/latest/model.ckpt
  4. Unmount when done:
fusermount -u /tmp/wandb
# or just Ctrl+C in the terminal running wandbfs
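
Because the mount behaves like an ordinary directory, it can also be read from a program with standard file APIs, not only from the shell. The following is a minimal sketch that assumes the filesystem is already mounted; `list_names` is a hypothetical helper written for this example and is not part of wandbfs.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// List the entry names of a directory in sorted order.
/// Hypothetical helper: on a mounted WandBFS, reading a directory such as
/// /tmp/wandb/your-entity/your-project/runs is what triggers the lazy
/// metadata fetch for that directory.
fn list_names(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        // file_name() is an OsString; convert lossily for display purposes.
        names.push(entry?.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok(names)
}
```

Calling `list_names(Path::new("/tmp/wandb/your-entity/your-project/runs"))` would return the run display names, much like `ls runs/` does.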

Architecture

WandBFS is a FUSE (Filesystem in Userspace) implementation that leverages the W&B GraphQL API to fetch metadata and files on-demand.

Component Overview

┌─────────────────┐
│   User Space    │
│  (ls, cat, etc) │
└────────┬────────┘
         │
    ┌────▼─────┐
    │   FUSE   │
    │  Kernel  │
    └────┬─────┘
         │
  ┌──────▼───────┐
  │   WandBFS    │
  │  (src/fs.rs) │
  └──────┬───────┘
         │
  ┌──────▼────────┐
  │  WandB Client │
  │(src/client.rs)│
  └──────┬────────┘
         │
  ┌──────▼────────┐
  │  W&B GraphQL  │
  │      API      │
  └───────────────┘

Feature Details

Run-Specific Artifacts

Each run has an artifacts directory containing:

  • input/ - Artifacts used as input to the run
  • output/ - Artifacts produced by the run

These are fetched using the inputArtifacts and outputArtifacts fields from the GraphQL API.

Artifact Aliases

Artifact aliases like latest, best, etc. are created as symlinks:

  • Fetched from the aliases field in the GraphQL response
  • Represented as FileKind::ArtifactAliasLink with a target version
  • readlink() returns the target (e.g., "v10")
  • Version aliases (e.g., "v0", "v1") are skipped to avoid duplication
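
The alias rules above can be sketched roughly as follows. This is an illustration only: `is_version_alias` and `alias_links` are hypothetical names, and the check for "v" followed by digits is an assumption about how version aliases are recognized, not a copy of the code in src/fs.rs.

```rust
/// Returns true for names of the form "v<digits>", e.g. "v0" or "v12".
/// Assumption: this is what counts as a "version alias" to be skipped.
fn is_version_alias(alias: &str) -> bool {
    match alias.strip_prefix('v') {
        Some(rest) => !rest.is_empty() && rest.chars().all(|c| c.is_ascii_digit()),
        None => false,
    }
}

/// Build (link name, link target) pairs for one artifact version.
/// Named aliases such as "latest" become symlinks pointing at the version
/// directory; version aliases are dropped because that directory already exists.
fn alias_links(version: &str, aliases: &[&str]) -> Vec<(String, String)> {
    aliases
        .iter()
        .copied()
        .filter(|a| !is_version_alias(a))
        .map(|a| (a.to_string(), version.to_string()))
        .collect()
}
```

For a version "v10" with aliases latest, v10, and best, this yields two links, latest -> v10 and best -> v10, which is the target a readlink() call would report.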

Filename Sanitization

Files with / in their names are sanitized by replacing / with _:

  • Prevents invalid directory entries
  • Applied to both run files and artifact files

Directory Structure

/tmp/wandb/
├── entity-name/
│   └── project-name/
│       ├── runs/
│       │   └── run-display-name/
│       │       ├── file1.txt
│       │       ├── file2.csv
│       │       └── artifacts/
│       │           ├── input/
│       │           │   └── dataset/
│       │           │       ├── v0/
│       │           │       ├── v1/
│       │           │       └── latest -> v1
│       │           └── output/
│       │               └── model/
│       │                   ├── v0/
│       │                   ├── best -> v1
│       │                   └── latest -> v2
│       └── artifacts/
│           └── collection-name/
│               ├── v0/
│               ├── v1/
│               └── latest -> v1
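
To make the layout concrete, the sketch below maps a path relative to the mount point onto the top levels of this tree. The `Node` enum and `classify` function are hypothetical and exist only for this example; the real filesystem tracks entries with its own types (such as FileKind), and deeper paths are lumped into `Other` here.

```rust
/// Top-level node kinds in the tree above (illustrative, not the real types).
#[derive(Debug, PartialEq)]
enum Node<'a> {
    Root,
    Entity(&'a str),
    Project { entity: &'a str, project: &'a str },
    RunsDir,
    Run(&'a str),
    ArtifactsDir,
    Collection(&'a str),
    Other,
}

/// Classify a path given relative to the mount point, e.g. "/entity/project/runs".
fn classify(path: &str) -> Node<'_> {
    // Split on '/' and ignore empty components from leading or doubled slashes.
    let parts: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    match parts[..] {
        [] => Node::Root,
        [entity] => Node::Entity(entity),
        [entity, project] => Node::Project { entity, project },
        [_, _, "runs"] => Node::RunsDir,
        [_, _, "runs", run] => Node::Run(run),
        [_, _, "artifacts"] => Node::ArtifactsDir,
        [_, _, "artifacts", collection] => Node::Collection(collection),
        // Run files, per-run artifacts, versions, and aliases live deeper.
        _ => Node::Other,
    }
}
```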

Configuration

API Key

Provide your API key via:

  • Environment variable: export WANDB_API_KEY=...
  • Interactive prompt (if not set)
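
The lookup order can be sketched as below, assuming the environment variable wins whenever it is set and non-empty. `resolve_api_key` and `prompt_on_stdin` are hypothetical names for this example, and the prompt shown here echoes the typed key, which the actual prompt may avoid.

```rust
use std::io::{self, BufRead, Write};

/// Prefer a non-empty environment value; otherwise fall back to `prompt`.
/// Hypothetical helper; the real logic in main.rs may differ.
fn resolve_api_key<F>(env_value: Option<String>, prompt: F) -> io::Result<String>
where
    F: FnOnce() -> io::Result<String>,
{
    match env_value {
        Some(key) if !key.trim().is_empty() => Ok(key.trim().to_string()),
        _ => prompt(),
    }
}

/// Read one line from stdin after printing a prompt on stderr.
fn prompt_on_stdin() -> io::Result<String> {
    eprint!("W&B API key: ");
    io::stderr().flush()?;
    let mut line = String::new();
    io::stdin().lock().read_line(&mut line)?;
    Ok(line.trim().to_string())
}
```

A caller would combine the two as `resolve_api_key(std::env::var("WANDB_API_KEY").ok(), prompt_on_stdin)`. Passing the prompt as a closure keeps the selection logic testable without a terminal.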

Timeout

HTTP requests timeout after 30 seconds by default. This can be modified in src/client.rs.

Performance Considerations

Current Limitations

  1. No Caching: Every file read downloads the full file
  2. Pagination: Most queries return only the first 50 items
  3. Blocking I/O: File downloads block until complete
  4. No Prefetching: Metadata fetched on-demand

Troubleshooting

Mount fails with "File exists":

fusermount -u /tmp/wandb

Filesystem not visible in df: FUSE filesystems may report 0 blocks and be hidden by default. Use -a to see them:

df -ah | grep wandb

Ctrl+C doesn't work during file read:

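  • Wait up to 30 seconds for the HTTP timeout to expire
  • The filesystem is designed to respond to Ctrl+C, but a download already in progress can delay the exit until the request finishes or times out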

Empty artifact directories:

  • Check your API permissions
  • Verify that the run has artifacts with the debug_run tool

GraphQL Errors:

  • "Cannot query field X on type Y": The GraphQL schema may have changed. Verify field names using the W&B GraphQL playground.

FUSE Errors:

  • "Transport endpoint is not connected": Filesystem crashed or was forcefully unmounted. Unmount and restart.
  • "Input/output error": Filename contains invalid characters (should be sanitized).

Development

Build

# Debug build
cargo build

# Release build (optimized)
cargo build --release

# Run tests
cargo test

Project Structure

src/
├── lib.rs          # Library exports
├── main.rs         # CLI entry point
├── client.rs       # W&B API client
└── fs.rs           # FUSE filesystem implementation

License

MIT License. See LICENSE file for details.

Contributing

Contributions welcome! Please open an issue or PR.
