MLflow tracing with async Stop hook (opt-in) — resolves #9 #11

@dgokeeffe

Summary

Fixes #9 (MLflow tracing silently disabled). This migrates the change from datasciencemonkey PR #139.

  • Opt-in tracing via MLFLOW_CLAUDE_TRACING_ENABLED=true in app.yaml. Default behaviour stays unchanged for existing deployments, and users get a single env var to flip.
  • The Stop hook delegates to mlflow-trace-stop.sh, which backgrounds the handler via nohup timeout 30 … & disown. The script returns in <1s, so the rest of the Stop hook chain isn't blocked.
  • The hook-event JSON is captured synchronously into a temp file before backgrounding. Backgrounding the handler naively would leave its stdin pointing at /dev/null, and the handler would lose the transcript path.
  • A hard 30s ceiling on the backgrounded flush ensures a wedged handler is killed rather than left leaking memory/CPU.
  • Pins mlflow-skinny and mlflow-tracing to 3.11.1 to match the Apps runtime; version mismatches caused silent import failures.
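
The wrapper behaviour described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the contents of the real mlflow-trace-stop.sh: the function name `run_stop_hook` and its handler-path argument are hypothetical, and wrapping `timeout` in `sh -c` so the temp file can be cleaned up is an assumption of this sketch rather than something the PR states.

```shell
#!/usr/bin/env bash
# Sketch only. run_stop_hook and its handler argument are hypothetical names,
# not the contents of the real mlflow-trace-stop.sh.

run_stop_hook() {
  local handler="$1"   # path to an executable that reads the hook event on stdin

  # Opt-in gate: do nothing unless the env var is exactly "true".
  if [ "${MLFLOW_CLAUDE_TRACING_ENABLED:-false}" != "true" ]; then
    return 0
  fi

  # Capture the hook-event JSON synchronously, while stdin is still attached.
  local event_file
  event_file="$(mktemp)" || return 0
  cat >"$event_file"

  # Background the handler with a hard 30s ceiling, feeding it the saved event.
  # The temp file is removed once the handler exits or the timeout fires.
  nohup sh -c 'timeout 30 "$1" <"$2"; rm -f "$2"' sh "$handler" "$event_file" \
    >/dev/null 2>&1 &

  # Detach the job from the shell (bash/zsh builtin; skipped where unavailable).
  disown 2>/dev/null || true
}
```

A script built this way would end with a single call such as `run_stop_hook /path/to/handler`, with the hook runner piping the event JSON to the script's stdin. Because the only synchronous work is copying stdin to a temp file, the call returns almost immediately regardless of how long the handler takes.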

Why this resolves #9

#9 documents that MLFLOW_CLAUDE_TRACING_ENABLED="false" is hardcoded today, so the upstream Stop hook short-circuits and no traces are written, despite the README claiming auto-tracing. This PR introduces the env-var override so users can turn tracing on without modifying source, and adds the async wrapper so that enabling it doesn't slow session teardown.

Branch

feat/mlflow-tracing (about to be pushed).

Diff scope

+128 / -32, 3 files. Tests in tests/test_mlflow_tracing.py.
