Add PII correlation tool #2049

Open
ArtOfCode- wants to merge 12 commits into develop from art/2000/pii-correlation

@ArtOfCode-
Member

Closes #2000.

Adds a tool for moderators to correlate user PII, as described in the linked issue.

@codecov

codecov Bot commented May 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.06%. Comparing base (9c10754) to head (5cc919e).

Additional details and impacted files
| Components | Coverage Δ |
|---|---|
| controllers | 76.15% <100.00%> (+0.07%) ⬆️ |
| helpers | 85.43% <100.00%> (+0.12%) ⬆️ |
| jobs | 68.32% <100.00%> (+1.43%) ⬆️ |
| models | 93.04% <ø> (ø) |
| tasks | 61.11% <ø> (ø) |

ArtOfCode- closed this May 15, 2026
ArtOfCode- deleted the art/2000/pii-correlation branch May 15, 2026 12:37
ArtOfCode- restored the art/2000/pii-correlation branch May 15, 2026 13:09
ArtOfCode- reopened this May 15, 2026
ArtOfCode- marked this pull request as ready for review May 18, 2026 19:48
ArtOfCode- requested review from Oaphi and cellio May 18, 2026 19:48
Member

cellio left a comment

Tested on my dev server. The tool looks good: a user with a different email domain shows as different (good), but all the IPs look the same. I tried using a VPN to log in as one of my users, but it still showed up all red. I'm not sure if localhost overrides VPN or the weekly job needs to run or something else. We should test this on the dev server before deploying to make sure we aren't unintentionally implicating a lot of people who aren't really on the same IPs.

For email addresses, I only saw "unknown number of users", even when I made up fake domains. What's an example of a domain that should produce some other description there?

Most of the code here is above my head; someone else should definitely review.

@ArtOfCode-
Member Author

ArtOfCode- commented May 19, 2026

> For email addresses, I only saw "unknown number of users", even when I made up fake domains. What's an example of a domain that should produce some other description there?

These stats are recalculated every week and cached in between. To generate the initial cache, run UpdateUserStatsJob.perform_now in a Rails console.
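
As a rough illustration of why an empty cache yields "unknown number of users", here is a minimal plain-Ruby sketch. It is not the code from this PR: `STATS_CACHE`, `refresh_domain_stats`, and `describe_domain` are hypothetical names, a Hash stands in for the real cache, and the "N user(s)" wording is invented for the example.

```ruby
# Hypothetical stand-in for the application's cache (assumption: the real
# tool uses the Rails cache; a plain Hash keeps this sketch self-contained).
STATS_CACHE = {}

# Hypothetical equivalent of the weekly job: count users per email domain
# and store the result until the next run.
def refresh_domain_stats(emails)
  counts = emails.map { |email| email.split('@').last.downcase }.tally
  STATS_CACHE[:email_domains] = counts
end

# Describe how widely a domain is used. Until the stats have been
# generated at least once, there is nothing to look up, so fall back.
def describe_domain(domain)
  counts = STATS_CACHE[:email_domains]
  return 'unknown number of users' if counts.nil?

  "#{counts.fetch(domain.downcase, 0)} user(s)"
end

puts describe_domain('example.com') # before the first run: fallback text
refresh_domain_stats(%w[a@example.com b@example.com c@other.test])
puts describe_domain('example.com') # after the first run: a real count
```

Under these assumptions, any domain, real or made up, gets the fallback description until the job has populated the cache once.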

> I'm not sure if localhost overrides VPN or the weekly job needs to run or something else.

Localhost probably overrides VPN. You can tell if it's a localhost address. For IPv4 that's 127.0.0.1; although it's hashed, you'll be able to see that the middle two octets are the same and the first and last are different. For IPv6 it's ::1, which expands to a lot of zeroes and a 1; again, the hash for every hexadecet (favourite new word) except the last will be the same.
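
The pattern described above can be reproduced with a small sketch that hashes each segment of an address separately. This is only an illustration under assumptions: `hash_ip_segments`, the salt, and the choice of SHA-256 are hypothetical and are not taken from this PR. It uses only Ruby's standard `ipaddr` and `digest` libraries.

```ruby
require 'digest'
require 'ipaddr'

# Hypothetical helper: hash every octet (IPv4) or 16-bit group (IPv6)
# on its own, so equal segments produce equal hashes.
# Assumption: the salt and SHA-256 are placeholders, not the PR's scheme.
def hash_ip_segments(address, salt: 'example-salt')
  ip = IPAddr.new(address)
  separator = ip.ipv6? ? ':' : '.'
  # IPAddr#to_string expands IPv6 shorthand such as ::1 into all eight groups.
  segments = ip.to_string.split(separator)
  segments.map { |segment| Digest::SHA256.hexdigest("#{salt}:#{segment}") }
end

# Print shortened hashes for readability; the helper returns full digests.
v4 = hash_ip_segments('127.0.0.1')
puts v4.map { |h| h[0, 6] }.join('.')

v6 = hash_ip_segments('::1')
puts v6.map { |h| h[0, 6] }.join(':')
```

For 127.0.0.1 the second and third hashes match while the first and last differ; for ::1 the first seven hashes match and only the last differs, which is how a localhost address stays recognisable after hashing.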

@cellio
Member

cellio commented May 19, 2026

These two users have the same email domain; should that have been highlighted? A domain isn't as strong an indicator as an IP is (particularly with popular providers), but we say we highlight matches, so it seems like we should either move that notice to before the IP block or do it for email too.

[Screenshot: matching domains with no red, matching IPs with red]

@ArtOfCode-
Member Author

> These two users have the same email domain; should that have been highlighted?

Yep, apparently I completely missed that

Member

cellio left a comment

Tested and LGTM.

Development

Successfully merging this pull request may close these issues.

Add a PII-correlation tool for moderators

2 participants