Add 'z' redaction option to normalise timezone to UTC #51
Open
Melusion wants to merge 1 commit into
Adds 'z' to the redaction pattern: it converts author and committer timestamps to UTC (offset +00:00), removing the timezone offset as a location fingerprint. 'z' is applied before the existing M/d/h/m/s precision reductions, so those operate on the resulting UTC wall-clock time (and, with a limit, the working-hours window is interpreted in UTC). Naive datetimes are treated as UTC for a deterministic result regardless of the host's local timezone. Existing behaviour is unchanged when 'z' is not present in the pattern. Adds TimezoneTestCase covering conversion, instant preservation, ordering before reductions, and the naive-input case. Help text and README updated. Closes EMPRI-DEVOPS#39
Add `z` redaction option to normalise timezone to UTC

Closes #39.
What
Adds `z` to the redaction pattern. When present, author and committer timestamps are converted to UTC (offset `+00:00`), removing the timezone offset as a location fingerprint.
Why
`git-privacy` already redacts when in the day someone commits, but the timezone offset (`+0200`, `-0700`, …) survived untouched. Over a history it reveals the committer's region and, via shifts in the offset, relocations and travel. Issue #39 requested exactly this; the `tzcheck` hook so far only warns about timezone changes rather than removing them.
Behaviour
- `z` is applied before the existing `M/d/h/m/s` precision reductions, so those operate on the resulting UTC wall-clock time. (Example covered by a test: `01:30 +02:00` → `2018-12-17 23:30 UTC` → with `h` → `00:30 UTC`; the date rolling back to the 17th proves the conversion ran first.)
- With a `limit`, the working-hours window is therefore interpreted in UTC.
- Naive datetimes are treated as UTC, so the result is deterministic regardless of the host's local timezone.
- Existing behaviour is unchanged when `z` is absent from the pattern.

Implementation note (open question for maintainers)
`z` uses `datetime.astimezone(UTC)`: it preserves the instant and changes the offset to `+00:00` (`14:42 +02:00` → `12:42 +00:00`). This pairs cleanly with `utils.dt2gitdate`, which writes `int(d.timestamp())` plus `%z`.

The alternative would be a relabel-only semantic (keep the wall-clock numbers, just set the offset to `+00:00`), which shifts the instant. I went with `astimezone` as the natural reading of "redact to UTC", but I'm happy to switch if you'd prefer the relabel behaviour.
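
For reviewers weighing the two semantics, here is a minimal standalone sketch of the difference, using only the standard library. It is not the code in this PR: `to_utc`, `relabel_utc` and `reduce_hour` are hypothetical names, the 2018-12-18 date is chosen to match the test example above, and `reduce_hour` assumes that `h` resets the hour to 0, as that example suggests.

```python
from datetime import datetime, timedelta, timezone


def to_utc(d: datetime) -> datetime:
    """Instant-preserving conversion (hypothetical helper, not the PR's code)."""
    if d.tzinfo is None:
        # Naive input: label it as UTC rather than letting astimezone()
        # fall back to the host's local timezone.
        return d.replace(tzinfo=timezone.utc)
    return d.astimezone(timezone.utc)


def relabel_utc(d: datetime) -> datetime:
    """Relabel-only alternative: keep the wall-clock numbers, set the offset to +00:00."""
    return d.replace(tzinfo=timezone.utc)


def reduce_hour(d: datetime) -> datetime:
    """Stand-in for the `h` reduction, assumed here to reset the hour to 0."""
    return d.replace(hour=0)


plus_two = timezone(timedelta(hours=2))

# astimezone vs relabel on 14:42 +02:00
original = datetime(2018, 12, 18, 14, 42, tzinfo=plus_two)
converted = to_utc(original)
relabelled = relabel_utc(original)
print(converted.isoformat())   # 2018-12-18T12:42:00+00:00
print(relabelled.isoformat())  # 2018-12-18T14:42:00+00:00
print(converted == original)   # True: same instant, new offset
print(relabelled == original)  # False: the instant moved by two hours

# Ordering: convert first, then reduce, so the date rolls back to the 17th
early = datetime(2018, 12, 18, 1, 30, tzinfo=plus_two)
print(reduce_hour(to_utc(early)).isoformat())  # 2018-12-17T00:30:00+00:00
```

Under the relabel variant, the second example would stay on the 18th, so the choice also changes which day a reduced timestamp lands on.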
Tests
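New `TimezoneTestCase` in `tests/test_timestamp.py` covers:

- conversion to UTC
- instant preservation
- ordering before the precision reductions
- the naive-input case

All existing tests pass unchanged (`python -m unittest tests.test_timestamp`).

Docs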
Help text in `gitprivacy.py` and both pattern lists in `README.md` updated to document the `z` identifier.