Add redis backed download count validator#1644
janbro wants to merge 6 commits into eclipse:master
|
This PR needs to be discussed before it is integrated. My understanding was that we would build a download count validator for analysing the download logs, rather than for the metrics, which are only a fraction of the actual downloads anyway. Furthermore, the metrics are not taken into account for anything and serve only an informational purpose, while the download logs directly affect the download counts. |
|
I was looking in the wrong place for the download count services used in production. I am updating this PR with changes to both the AWS and Azure download count handlers. This will need some changes to the log records to support the deduplication efforts. |
ef9ee7a to e3fe741
e3fe741 to f9ce3a3
|
Updated the handlers to use the count validator. I had to update the log parsers to read the information relevant to preventing download inflation. Example configuration for download count validation: Important: old configuration, see the new configuration below. Since the count handlers are scheduled to run every hour, it is important to set the late arrival hours to at least that window, because that setting adds time to the TTL of the Redis keys so that they are still alive during the next run. The keys themselves are aggregated into time buckets based on the dedup window, so they do not otherwise depend on when the count handlers run. |
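To make the bucketing and TTL reasoning above concrete, here is a minimal sketch. It is not the PR's code: the class name `DedupBucketSketch`, the key layout, and the assumption that the base TTL equals the dedup window are all illustrative. It only shows why a key's bucket depends on the download timestamp rather than on when the handler runs, and how a late-arrival allowance extends the key's lifetime.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: names and key layout are illustrative, not taken from the PR.
final class DedupBucketSketch {

    // Start (epoch seconds) of the fixed bucket that contains the download timestamp.
    static long bucketStart(Instant downloadTime, Duration dedupWindow) {
        long windowSeconds = dedupWindow.getSeconds();
        return Math.floorDiv(downloadTime.getEpochSecond(), windowSeconds) * windowSeconds;
    }

    // Assumed TTL: the dedup window plus the late-arrival allowance, so a key
    // written during one hourly run is still alive when the next run sees late records.
    static Duration keyTtl(Duration dedupWindow, Duration lateArrival) {
        return dedupWindow.plus(lateArrival);
    }

    // Assumed key layout: prefix, hashed client identifier, bucket start.
    static String bucketKey(String hashedIp, Instant downloadTime, Duration dedupWindow) {
        return "download:dedup:" + hashedIp + ":" + bucketStart(downloadTime, dedupWindow);
    }

    public static void main(String[] args) {
        Duration window = Duration.ofMinutes(30);
        Duration lateArrival = Duration.ofHours(1);
        Instant download = Instant.parse("2024-01-01T10:17:00Z");
        System.out.println(bucketKey("abc123", download, window));
        System.out.println(keyTtl(window, lateArrival));
    }
}
```

Because the bucket is derived from the timestamp in the log record, two records for the same client at 10:17 and 10:29 map to the same key regardless of which hourly run processes them.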
|
I will be updating this so that the configuration references a download count per hour rather than a time window, per feedback. |
|
Updated the logic to use a maximum number of downloads per hour over a deduplication window, which simplifies the configuration and allows higher download counts per hour. New configuration example is: |
Adds DownloadCountValidator that gates increaseDownloadCount in StorageUtilService. When Redis is enabled, duplicate downloads from the same IP within 30 minutes are deduplicated and bot user-agents are ignored. Downloads are still served normally; only the counter is affected.
IPs are SHA-256 hashed before storage. Reuses the existing ip-address-function SpEL config for consistent IP resolution with rate limiting.
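As a rough illustration of the behaviour described above, the following sketch hashes the IP with SHA-256, skips bot user-agents, and counts at most one download per 30-minute bucket. It is not the PR's implementation: the class `DownloadDedupSketch`, the bot keyword list, the per-extension key, and the in-memory set standing in for Redis are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: not the DownloadCountValidator from the PR.
final class DownloadDedupSketch {

    // In-memory stand-in for Redis; a real setup would store keys with a TTL.
    private final Set<String> seen = new HashSet<>();

    // Hex-encoded SHA-256 digest, so the raw IP is never stored.
    static String sha256Hex(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every JVM", e);
        }
    }

    // Assumed heuristic: a few common substrings; the real list may differ.
    static boolean looksLikeBot(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        return ua.contains("bot") || ua.contains("crawler") || ua.contains("spider");
    }

    // Returns true only for the first non-bot download per IP, extension, and 30-minute bucket.
    boolean shouldCount(String ip, String userAgent, String extensionId, long epochSeconds) {
        if (looksLikeBot(userAgent)) {
            return false;
        }
        long bucket = Math.floorDiv(epochSeconds, 1800L);
        String key = sha256Hex(ip) + ":" + extensionId + ":" + bucket;
        return seen.add(key);
    }
}
```

The download itself would still be served in every case; only the return value of `shouldCount` would decide whether the counter is incremented.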
Before merging this PR, the 30-minute window should ideally be reviewed and agreed upon based on existing usage and data.