Skip to content

feat(tracking): honour tracking.enabled and redact sensitive args#1987

Open
YOMXXX wants to merge 1 commit into
rtk-ai:developfrom
YOMXXX:feat/tracking-redaction-and-gating
Open

feat(tracking): honour tracking.enabled and redact sensitive args#1987
YOMXXX wants to merge 1 commit into
rtk-ai:developfrom
YOMXXX:feat/tracking-redaction-and-gating

Conversation

@YOMXXX
Copy link
Copy Markdown
Contributor

@YOMXXX YOMXXX commented May 20, 2026

Summary

This is the second of three security PRs landing the RTK security &
privacy remediation design
.
It closes Findings 2 + 3, which share the same call site
(Tracker::record in src/core/tracking.rs:402-437).

  • Finding 2 (DB stores secrets): original_cmd and project_path are
    now scrubbed before INSERT via a new src/core/redact.rs module
    (shared with the upcoming tee redaction PR). Project paths default to
    <basename>#<8-hex-sha256>, gated by a new
    tracking.hash_project_paths field (default true, forward-compat
    via #[serde(default)]).
  • Finding 3 (tracking.enabled is dead code):
    Tracker::record and Tracker::record_parse_failure now early-return
    Ok(()) when the flag is false. A new RTK_TRACK=0 env var mirrors
    the existing RTK_TEE=0 one-shot opt-out.

A one-shot migration runs on Tracker::new() to rewrite rows from the
last 90 days. It is gated by PRAGMA user_version so it runs once per
binary version.

Reproduction

Finding 2

# Before this PR:
rtk git push https://alice:s3cr3t@github.com/acme/repo.git --token=ghp_ZZZZZ...
sqlite3 ~/.local/share/rtk/history.db \
  'SELECT original_cmd FROM commands ORDER BY timestamp DESC LIMIT 1;'
# → git push https://alice:s3cr3t@github.com/acme/repo.git --token=ghp_ZZZZZ...
#   (secret stored verbatim for 90 days; visible to anyone with read-home)

Finding 3

# ~/.config/rtk/config.toml (or platform equivalent)
[tracking]
enabled = false
# Before this PR:
rtk git status
rtk gain --history | tail -5
# → row still appears — the flag was parsed but never read.

Root cause

Finding 2

Tracker::record bound original_cmd and current_project_path_string()
straight into a SQLite INSERT. The DB is reached from 200+ call sites
via TimedExecution::start(), so the fix had to live inside
Tracker::record itself, not at the call sites.

Finding 3

grep -rn tracking\.enabled src/ showed zero matches outside config.rs.
The field was parsed, defaulted to true, and never read. Both
Tracker::new() and Tracker::record(…) proceeded unconditionally.

Fix approach

  1. New src/core/redact.rs (pub(crate)) — single-pass, lazy_static
    regex bundle:
    • URL userinfo (https://user:pass@hosthttps://****:****@host)
    • Authorization: Bearer … / Basic … / Token …
    • --token=…, --password …, --api-key=…, --secret …,
      --auth=…, --access-token …, --refresh-token …
    • Inline env assignments from a 22-name allowlist
      (AWS_SECRET_ACCESS_KEY, GH_TOKEN, OPENAI_API_KEY, …)
    • Credential-prefix heuristic (ghp_, gho_, sk-, sk_live_,
      xoxb-, AKIA, ASIA, glpat-, …)
    • redact_project_path returns <basename>#<8-hex-sha256>;
      empty paths stay empty so the existing project_path != '' filter
      in projects_count() keeps working; already-hashed paths pass
      through unchanged (idempotent).
  2. Gate Tracker::record and Tracker::record_parse_failure on a
    shared tracking_enabled() helper. RTK_TRACK=0 short-circuits
    without touching disk; otherwise Config::load() decides; on
    missing/corrupt config we fall open (unwrap_or(true)).
  3. One-shot migration in Tracker::new() — reads rows newer than
    90 days, redacts in-Rust, UPDATEs by id, bumps
    PRAGMA user_version from 0/1 to 2. Errors are logged to stderr but
    never block tracker creation (rtk gain keeps working on
    half-broken DBs).
  4. TrackingConfig::hash_project_paths (default true,
    #[serde(default = "default_true")]) so configs predating this PR
    still parse.
  5. RTK_CONFIG_PATH override added to Config::get_config_path().
    This is a pure test/CI/container convenience that mirrors the
    existing RTK_DB_PATH pattern — required because dirs::config_dir()
    resolves via $HOME/Library/Application Support on macOS, and our
    gating tests cannot safely mutate HOME while other parallel tests
    are calling Tracker::new().

Behavior changes

  • DB rows are lossy from now on. --token=ghp_abc becomes
    --token=****. Ecosystem aggregation (top_commands,
    categorize_command, top_passthrough) only looks at the first
    whitespace-delimited token, so headline rtk gain output is
    unchanged.
  • --by-project filters by hash now. Same semantics (one row per
    project), shorter DB row.
  • tracking.enabled = false is now honoured. No-op INSERT; reads
    are still allowed so rtk gain keeps showing past stats.
  • RTK_TRACK=0 one-shot env opt-out.
  • RTK_CONFIG_PATH explicit config override.

No DB schema change, no public Rust API change, no CLI flag removal.

Test plan

  • cargo fmt --all — clean
  • cargo clippy --all-targets -- -D warnings — clean
  • cargo test --bin rtk -- --test-threads=81920 passed, 0 failed, 6 ignored
  • 19 new tests:
    • src/core/redact.rs: 6 (URL userinfo, Bearer, flag-value, inline env,
      credential heuristic, idempotent).
    • src/core/tracking.rs: 8 Finding 2 tests + 5 Finding 3 tests, all
      serialised via a per-module TRACKING_ENV_LOCK mutex.
  • Migration test seeds a pre-redaction row directly, resets
    PRAGMA user_version = 0, runs the migration, verifies token /
    user / password gone and second-run is a no-op.
  • Release-build perf sanity: target/release/rtk --version runs in
    <10ms on warm cache (rtk itself), rtk gh --version ~50ms (dominated
    by gh startup, not rtk overhead).

Refs

Two RTK security findings (rtk-ai#1875 Findings 2 + 3) share the same call
site in Tracker::record:

1. tracking.enabled = false was parsed but never enforced — the field
   was dead code. Now Tracker::record and Tracker::record_parse_failure
   return Ok(()) silently when the flag is false. A new RTK_TRACK=0
   env var provides an additional one-shot opt-out (matches existing
   RTK_TEE=0 / RTK_DB_PATH).

2. original_cmd and project_path were stored verbatim, capturing
   inline tokens, URL userinfo, and private filesystem layout. The
   new src/core/redact.rs module (shared with the upcoming tee
   redaction PR) scrubs known credential shapes via a lazy_static
   regex bundle, and project paths are stored as
   <basename>#<8-hex-sha256> by default (gated by
   tracking.hash_project_paths, default true).

A one-shot migration runs on Tracker::new() and rewrites rows from
the last 90 days. Existing rtk gain output stays semantically
identical because aggregation reads input_tokens/output_tokens, not
the command string.

Also exposes RTK_CONFIG_PATH for tests / containers that need to
point Config::load() at a deterministic file without mutating HOME,
mirroring the existing RTK_DB_PATH override.

Refs rtk-ai#1875 (Findings 2, 3)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant