Add 'z' redaction option to normalise timezone to UTC #51

Open
Melusion wants to merge 1 commit into
EMPRI-DEVOPS:master from
Melusion:feature/redact-timezone-utc

@Melusion

Add z redaction option to normalise timezone to UTC

Closes #39.

What

Adds z to the redaction pattern. When present, author and committer
timestamps are converted to UTC (offset +00:00), removing the timezone
offset as a location fingerprint.

git config privacy.pattern z      # only strip the timezone
git config privacy.pattern mhsz   # full hour + zeroed minutes/seconds, in UTC

Why

git-privacy already redacts when in the day someone commits, but the
timezone offset (+0200, -0700, …) survived untouched. Over a history it
reveals the committer's region — and, via shifts in the offset, relocations and
travel. Issue #39 requested exactly this; the tzcheck hook so far only warns
about timezone changes rather than removing them.

Behaviour

  • Applied before the M/d/h/m/s precision reductions, so those operate on
    the resulting UTC wall-clock time. (Example covered by a test: 01:30 +02:00
    → 2018-12-17 23:30 UTC → with h, 00:30 UTC; the date rolling back to the
    17th proves the conversion ran first.)
  • With a limit, the working-hours window is therefore interpreted in UTC.
  • Naive datetimes are treated as already-UTC and just tagged as such, so the
    result is deterministic regardless of the host's local timezone.
  • No change in behaviour when z is absent from the pattern.
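
The ordering and naive-input rules above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `redact` is hypothetical, only `z` and `h` are modelled, `h` is assumed to set the hour to 0 (as the example above implies), and the input date 2018-12-18 is chosen so that the conversion crosses midnight.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def redact(ts: datetime, pattern: str) -> datetime:
    """Hypothetical sketch: apply 'z' first, then the 'h' reduction."""
    if "z" in pattern:
        if ts.tzinfo is None:
            # Naive input: assume it is already UTC and only tag it.
            ts = ts.replace(tzinfo=UTC)
        else:
            # Aware input: same instant, expressed with offset +00:00.
            ts = ts.astimezone(UTC)
    if "h" in pattern:
        # Assumed semantics of 'h': set the hour to 0.
        ts = ts.replace(hour=0)
    return ts


plus_two = timezone(timedelta(hours=2))
local = datetime(2018, 12, 18, 1, 30, tzinfo=plus_two)
naive = datetime(2018, 12, 18, 1, 30)

print(redact(local, "z").isoformat())   # 2018-12-17T23:30:00+00:00
print(redact(local, "hz").isoformat())  # 2018-12-17T00:30:00+00:00
print(redact(local, "h").isoformat())   # 2018-12-18T00:30:00+02:00
print(redact(naive, "z").isoformat())   # 2018-12-18T01:30:00+00:00
```

With `hz` the result lands on the 17th, whereas `h` alone stays on the 18th; that difference is what shows the UTC conversion ran before the hour reduction.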

Implementation note (open question for maintainers)

z uses datetime.astimezone(UTC): it preserves the instant and changes
the offset to +00:00 (14:42 +02:00 → 12:42 +00:00). This pairs cleanly with
utils.dt2gitdate, which writes int(d.timestamp()) plus %z.

The alternative would be a relabel-only semantic (keep the wall-clock numbers,
just set the offset to +00:00), which shifts the instant. I went with
astimezone as the natural reading of "redact to UTC", but I'm happy to switch
if you'd prefer the relabel behaviour.
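
To make the trade-off concrete, here is a small comparison of the two semantics. `git_date` is a stand-in that only mirrors the format described above (epoch seconds plus `%z`); it is not the project's `utils.dt2gitdate`, and the date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def git_date(d: datetime) -> str:
    """Stand-in for the described format: epoch seconds plus %z offset."""
    return f"{int(d.timestamp())} {d.strftime('%z')}"


original = datetime(2018, 12, 18, 14, 42, tzinfo=timezone(timedelta(hours=2)))

converted = original.astimezone(UTC)       # same instant, new wall clock
relabelled = original.replace(tzinfo=UTC)  # same wall clock, new instant

print(converted.isoformat())   # 2018-12-18T12:42:00+00:00
print(relabelled.isoformat())  # 2018-12-18T14:42:00+00:00
print(converted == original)   # True
print(relabelled - original)   # 2:00:00

# With astimezone only the offset part of the serialised date changes;
# the epoch-seconds part is identical to the original.
print(git_date(original))
print(git_date(converted))
```

Under the relabel semantic the epoch seconds would move by the size of the original offset, so the stored instant would itself depend on the timezone being redacted.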

Tests

New TimezoneTestCase in tests/test_timestamp.py covers:

  • conversion to UTC,
  • instant preservation,
  • ordering before the precision reductions,
  • the naive-input case.

All existing tests pass unchanged (python -m unittest tests.test_timestamp).
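
For readers without the diff at hand, the four cases could look roughly like this. It is a sketch, not the actual TimezoneTestCase: `to_utc` is a hypothetical stand-in for the `z` step, and the `h` reduction is assumed to set the hour to 0.

```python
import unittest
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def to_utc(ts: datetime) -> datetime:
    """Hypothetical stand-in for the 'z' step described in this PR."""
    if ts.tzinfo is None:
        # Naive input is treated as already-UTC and only tagged.
        return ts.replace(tzinfo=UTC)
    return ts.astimezone(UTC)


class TimezoneSketchTest(unittest.TestCase):
    def setUp(self):
        plus_two = timezone(timedelta(hours=2))
        self.local = datetime(2018, 12, 18, 1, 30, tzinfo=plus_two)

    def test_conversion_to_utc(self):
        self.assertEqual(to_utc(self.local).utcoffset(), timedelta(0))

    def test_instant_preserved(self):
        self.assertEqual(to_utc(self.local).timestamp(), self.local.timestamp())

    def test_ordering_before_hour_reduction(self):
        # Assumed 'h' semantics: hour set to 0 after the UTC conversion.
        reduced = to_utc(self.local).replace(hour=0)
        self.assertEqual((reduced.day, reduced.hour, reduced.minute), (17, 0, 30))

    def test_naive_input_tagged_as_utc(self):
        naive = datetime(2018, 12, 18, 1, 30)
        self.assertEqual(to_utc(naive), datetime(2018, 12, 18, 1, 30, tzinfo=UTC))
```

Saved as a module, this runs with python -m unittest in the same way as the command above.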

Docs

Help text in gitprivacy.py and both pattern lists in README.md updated to
document the z identifier.

Adds 'z' to the redaction pattern: it converts author and committer
timestamps to UTC (offset +00:00), removing the timezone offset as a
location fingerprint. 'z' is applied before the existing M/d/h/m/s
precision reductions, so those operate on the resulting UTC wall-clock
time (and, with a limit, the working-hours window is interpreted in UTC).

Naive datetimes are treated as UTC for a deterministic result regardless
of the host's local timezone.

Existing behaviour is unchanged when 'z' is not present in the pattern.
Adds TimezoneTestCase covering conversion, instant preservation, ordering
before reductions, and the naive-input case. Help text and README updated.

Closes EMPRI-DEVOPS#39
