feat: mask sensitive data inside objects and URLs in code variables #688
Open
ablaszkiewicz wants to merge 3 commits into
Code variable masking previously only inspected dicts/lists/tuples/strings, and fell back to a raw repr() on serialization failure. As a result, secrets held as attributes of custom objects (e.g. a PostgresSourceConfig with a `password` field) were emitted verbatim via the unmasked repr() path.

This hardens masking to be fail-closed:

- Traverse custom objects (dataclasses / objects with a populated __dict__) so sensitive fields are redacted by their real attribute name. This is both safer (a custom __repr__ can't relabel a field out of the mask) and higher-fidelity (only the sensitive field is redacted, surrounding context is kept).
- Replace the leaky repr() fallback with a fail-closed _safe_repr() that redacts the whole value when any masking rule matches, redacts when the repr is too long to scan, and emits a type-name placeholder when __repr__ raises. json.dumps gets a default= net so no raw object can slip through.
- Scrub credentials embedded in URLs/DSNs (postgresql://user:pass@host) from string values regardless of the surrounding key name. Add `connection_string` to the default mask patterns.

Adds a `code_variables_mask_url_credentials` config option (default True), wired through the constructor, module-level global, and per-context override, mirroring code_variables_mask_patterns.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
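The object traversal and fail-closed repr described above can be sketched roughly as follows. This is an illustrative sketch, not the PR's code: `mask`, `safe_repr`, `SENSITIVE`, `REDACTED`, and the `max_depth` default are hypothetical names and values, and the sketch leaves out URL scrubbing, the length cap, and circular-reference handling.

```python
import dataclasses
import re

REDACTED = "‹redacted›"  # assumed placeholder text
# Hypothetical name patterns; the PR's real defaults are not reproduced here.
SENSITIVE = re.compile(r"password|secret|token|connection_string", re.IGNORECASE)


def safe_repr(value):
    """Fail-closed repr: redact on any match, type-name placeholder if __repr__ raises."""
    try:
        text = repr(value)
    except Exception:
        return f"‹{type(value).__name__}›"
    return REDACTED if SENSITIVE.search(text) else text


def mask(value, depth=0, max_depth=5):
    """Recursively redact sensitive keys/attributes while keeping surrounding context."""
    if depth > max_depth:
        return safe_repr(value)
    if isinstance(value, (str, int, float, bool, type(None))):
        return value
    if isinstance(value, dict):
        return {
            str(k): REDACTED if SENSITIVE.search(str(k)) else mask(v, depth + 1, max_depth)
            for k, v in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [mask(v, depth + 1, max_depth) for v in value]
    # Custom objects: read real attribute names so a custom __repr__ cannot hide them.
    if dataclasses.is_dataclass(value) and not isinstance(value, type):
        attrs = {f.name: getattr(value, f.name) for f in dataclasses.fields(value)}
    else:
        attrs = getattr(value, "__dict__", None)
    if attrs:
        masked = mask(dict(attrs), depth + 1, max_depth)
        masked["__class__"] = type(value).__name__
        return masked
    # Nothing to decompose: fall back to the fail-closed repr.
    return safe_repr(value)


@dataclasses.dataclass
class PostgresSourceConfig:  # example type mirroring the one named in the description
    host: str
    password: str


class Broken:
    __slots__ = ()

    def __repr__(self):
        raise RuntimeError("repr failed")


masked_config = mask(PostgresSourceConfig(host="db", password="hunter2"))
masked_broken = mask(Broken())
```

Under these assumptions only the `password` field of the config is replaced and `host` survives, while the object whose `__repr__` raises is reduced to its type name.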
Contributor
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
posthog/exception_utils.py:1083-1096
**`mask_url_credentials` silently inert when `mask_patterns` is empty**
The early return `if not compiled_mask: return value` means URL credential scrubbing is bypassed entirely whenever `compiled_mask` is `None` — which happens when `mask_patterns=[]`. The same guard appears in `_serialize_variable_value` (`elif compiled_mask and mask_url_credentials:`), so a user who explicitly disables name-based masks but still expects URL credentials to be scrubbed gets no protection. The two features are advertised as independent toggles but share a single gate.
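One way to decouple the two toggles is to gate each feature on its own setting, as in the sketch below. The names `compile_mask`, `mask_string`, and `URL_CREDENTIALS`, and the regular expression itself, are assumptions made for illustration; they are not taken from `posthog/exception_utils.py`.

```python
import re

REDACTED = "‹redacted›"  # assumed placeholder text
# Hypothetical pattern: scheme://user:password@ (password may itself contain '@').
URL_CREDENTIALS = re.compile(
    r"(?P<scheme>[a-zA-Z][a-zA-Z0-9+.\-]*://)[^\s/@:]*:[^\s/]*@"
)


def compile_mask(patterns):
    """Return a compiled name regex, or None when the pattern list is empty."""
    if not patterns:
        return None
    return re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)


def mask_string(name, value, compiled_mask, mask_url_credentials=True):
    """Mask a string variable; name masking and URL scrubbing are gated separately."""
    # Name-based masking applies only when patterns are configured.
    if compiled_mask is not None and compiled_mask.search(name):
        return REDACTED
    # URL scrubbing depends on its own flag, not on compiled_mask.
    if mask_url_credentials:
        return URL_CREDENTIALS.sub(
            lambda m: m.group("scheme") + REDACTED + "@", value
        )
    return value


dsn = "postgresql://user:hunter2@db:5432/app"
scrubbed = mask_string("dsn_value", dsn, compile_mask([]), mask_url_credentials=True)
```

With `mask_patterns=[]` the name check is skipped, but the DSN is still scrubbed because the URL branch no longer sits behind the `compiled_mask` guard.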
### Issue 2 of 2
posthog/test/test_exception_capture.py:984-1002
**Prefer `@pytest.mark.parametrize` for multi-case unit tests**
`test_redact_url_credentials` bundles four distinct input/output assertions in a single test body. Per the team convention, these cases should be expressed as separate parametrize entries so each case gets its own pass/fail signal and name. The same applies to `test_mask_url_credentials_can_be_toggled` (two cases: enabled vs disabled) and the inline assertions inside `test_compile_patterns_fast_path_and_regex_fallback`.
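A parametrized version of such a test might look like the sketch below. The four cases and the stand-in `redact_url_credentials` implementation are illustrative assumptions, not the inputs or code from `posthog/test/test_exception_capture.py`.

```python
import re

import pytest

REDACTED = "‹redacted›"  # assumed placeholder text
_URL_CREDENTIALS = re.compile(r"([a-zA-Z][a-zA-Z0-9+.\-]*://)[^\s/@:]*:[^\s/]*@")


def redact_url_credentials(text):
    """Stand-in for the function under test (hypothetical implementation)."""
    return _URL_CREDENTIALS.sub(lambda m: m.group(1) + REDACTED + "@", text)


@pytest.mark.parametrize(
    "raw, expected",
    [
        pytest.param(
            "postgresql://user:hunter2@db:5432/app",
            f"postgresql://{REDACTED}@db:5432/app",
            id="password-in-dsn",
        ),
        pytest.param(
            "ssh://git@github.com/repo",
            "ssh://git@github.com/repo",
            id="username-only",
        ),
        pytest.param(
            "redis://u:p@ss@cache:6379",
            f"redis://{REDACTED}@cache:6379",
            id="at-sign-in-password",
        ),
        pytest.param("no url here", "no url here", id="plain-text"),
    ],
)
def test_redact_url_credentials(raw, expected):
    # Each parametrize entry is collected and reported as its own test case.
    assert redact_url_credentials(raw) == expected
```

Each `id` shows up in the pytest report, so a failure points at a single named input rather than at one large test body.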
Reviews (1): Last reviewed commit: "feat: mask sensitive data inside objects..." | Re-trigger Greptile |
Contributor
posthog-python Compliance Report
Date: 2026-06-19 21:10:25 UTC
✅ All Tests Passed! 45/45 tests passed
Capture Tests: ✅ 29/29 tests passed
Feature_Flags Tests: ✅ 16/16 tests passed
Contributor
Reviews (2): Last reviewed commit: "fix: comments" | Re-trigger Greptile |
### What changed

Hardens exception `code_variables` masking so secrets can no longer leak through untraversed objects, the `repr()` fallback, or URLs/DSNs.

- … `repr()`; anything we can't safely decompose becomes a placeholder.
- Adds `connection_string`/`dsn` to the default patterns.
- Adds `code_variables_mask_url_credentials` (default `True`), wired through the constructor, module global, and per-context override (mirrors `code_variables_mask_patterns`).

### Before / after

- `PostgresSourceConfig(host="db", password="hunter2")` (custom object): before, `…password='hunter2'…` (whole object dumped via `repr()`); after, `{"host": "db", "password": "‹redacted›", "__class__": "PostgresSourceConfig"}`
- `"postgresql://user:hunter2@db:5432/app"`: after, `"postgresql://‹redacted›@db:5432/app"`
- `"ssh://git@github.com/repo"` (username, no password)
- `__repr__` raises or isn't JSON-serializable: before, `repr()` (only length-truncated); after, `"‹TypeName›"` placeholder
- `"‹value too long›"` (capped at 100, same as dict/list)
- `"‹circular ref›"`
- `mask_patterns=[]` and URL scrubbing on

### Limitation

Blocklist masking can't catch a secret stored under an unrecognised name with no detectable shape (e.g. a bare password in a local named `pw`). Source context lines are intentionally left untouched.

### Tests

Object traversal + nested context preservation, `_safe_repr` (redact-on-match / clean passthrough / broken `__repr__` / too-long), URL scrubbing (multi-URL, `@`-in-password, IPv6, bare username, other schemes), the size/depth caps, the independent-toggle behaviour, the per-context override, and an end-to-end test mirroring the original leaked event.

🤖 Generated with Claude Code