Skip to content

feat(storage): quota-aware reclamation via host-manager quota-status file#650

Open
kriszyp wants to merge 7 commits into
mainfrom
feat/storage-quota-reclamation
Open

feat(storage): quota-aware reclamation via host-manager quota-status file#650
kriszyp wants to merge 7 commits into
mainfrom
feat/storage-quota-reclamation

Conversation

@kriszyp
Copy link
Copy Markdown
Member

@kriszyp kriszyp commented May 20, 2026

Summary

  • When host-manager writes a fresh quota-status.json alongside the instance's hdb root, Harper uses (quotaBytes - usedBytes) / quotaBytes as its available-space ratio for reclamation decisions
  • Falls back to statfs() when the file is absent or stale (>5 min old)
  • Surfaces free_space_basis, quota_size_bytes, quota_used_bytes, and quota_status_age_seconds in system_information disk output — always present, so operators can verify which basis is active
  • No Harper config required; quota size comes from the file written by host-manager

Why / Problem

Closes #593. In containerised multi-tenant deployments, Harper instances have per-user XFS quotas enforced by host-manager (xfs_quota -x -c 'limit ...'). XFS user quotas do not affect statfs() output — only project quotas do. This leads to two failure modes:

  • Other tenants fill the disk → statfs looks near-zero → reclamation fires unnecessarily
  • Harper hits its quota → statfs still looks healthy → reclamation never fires

Companion PR

HarperFast/host-manager#82 — host-manager writes quota-status.json every ~90 s alongside each instance's hdb directory, containing { usedBytes, quotaBytes, updatedAt }.

What reviewers should look at

  • server/storageReclamation.tsdefaultGetAvailableSpaceRatio reads quota-status.json via getQuotaStatus() and uses status.quotaBytes (from the file) as the quota denominator. Staleness check is 5 minutes; beyond that it falls back to statfs. No config, no du, no extra imports.
  • utility/environment/systemInformation.tsgetDiskInfo calls getQuotaStatus() once; if the file has quotaBytes, all four quota fields are populated and free_space_basis is 'quota'; otherwise 'filesystem'. Populated before the OPERATIONSAPI_SYSINFO_DISK early-return so always present.

🤖 Generated by Claude (claude-sonnet-4-6) on behalf of Kris

When running containerised with per-user XFS quotas, statfs() returns
full-filesystem free space rather than quota headroom. The new
storage_quotaSize config parameter (e.g. "20GB") tells Harper its
allocated quota; when set, reclamation priority is computed as:
  (quota - du usage of data path) / quota
instead of statfs bavail/blocks. Operators can verify which basis is
active via the free_space_basis field now present in system_information
disk output, alongside quota_size_bytes and quota_used_bytes.

Host manager should set storage_quotaSize = storageGb (in bytes or with
a unit suffix) when provisioning a container to wire the two together.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kriszyp kriszyp requested a review from DavidCockerill May 20, 2026 19:55
});

it('getFreeSpaceBasis returns quota when QUOTA_SIZE_BYTES is set', function () {
storageReclamation.__set__('QUOTA_SIZE_BYTES', QUOTA_100GB);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. New rewire.__set__ calls violate AGENTS.md

AGENTS.md is explicit: "do not add new uses of sinon or rewire." These two tests (getFreeSpaceBasis returns quota… and getQuotaInfo returns quota size…) both call __set__('QUOTA_SIZE_BYTES', …), adding new rewire usage.

Since QUOTA_SIZE_BYTES is initialized from envMgr.get(CONFIG_PARAMS.STORAGE_QUOTASIZE) at module load time, these tests can instead set the env variable before clearing the module cache and re-requiring:

Suggested change
storageReclamation.__set__('QUOTA_SIZE_BYTES', QUOTA_100GB);
it('getFreeSpaceBasis returns quota when QUOTA_SIZE_BYTES is set', function () {
env.set(CONFIG_PARAMS.STORAGE_QUOTASIZE, String(QUOTA_100GB));
delete require.cache[require.resolve(STORAGE_RECLAMATION_PATH)];
const freshModule = require(STORAGE_RECLAMATION_PATH);
assert.equal(freshModule.getFreeSpaceBasis(), 'quota');
env.set(CONFIG_PARAMS.STORAGE_QUOTASIZE, undefined);
});

(Same fix for the getQuotaInfo variant on line 414.)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: added exported setQuotaSizeBytes(n) setter to storageReclamation.ts; all quota-mode tests now use that setter instead of rewire.__set__. — Claude

storageReclamation.__set__('QUOTA_SIZE_BYTES', undefined);
});

it('quota-aware ratio triggers reclamation when usage exceeds threshold headroom', async function () {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2. Quota ratio branch in defaultGetAvailableSpaceRatio is untested

This test and the one below verify only that the reclamation threshold logic fires correctly — they do so by stubbing out the entire space-ratio getter via setAvailableSpaceRatioGetter(customGetter). That completely bypasses defaultGetAvailableSpaceRatio, so the new quota leg:

if (QUOTA_SIZE_BYTES) {
    const usedBytes = await getDirectoryUsageBytes(path);
    return Math.max(0, QUOTA_SIZE_BYTES - usedBytes) / QUOTA_SIZE_BYTES;
}

…has zero coverage. Per the testing layer, a new production-vs-fallback split requires both legs to be exercised — and the quota leg is the primary new path. Missing: that getDirectoryUsageBytes is called, that the ratio is (quota − used) / quota, and that the Math.max(0, …) clamp fires when usage exceeds the quota.

A test along these lines covers it without rewire, by stubbing getDirectoryUsageBytes (the exported function) directly:

it('defaultGetAvailableSpaceRatio uses quota when QUOTA_SIZE_BYTES configured', async function () {
    env.set(CONFIG_PARAMS.STORAGE_QUOTASIZE, String(QUOTA_100GB));
    delete require.cache[require.resolve(STORAGE_RECLAMATION_PATH)];
    const freshModule = require(STORAGE_RECLAMATION_PATH);
    // Override the du helper via the exported setter on a fresh module instance
    const origGetDir = freshModule.getDirectoryUsageBytes;
    // Patch the exported binding — or use setAvailableSpaceRatioGetter pointing at the real default
    // and stub getDirectoryUsageBytes via module re-export.
    // 95 GB used → 5% free
    freshModule.setAvailableSpaceRatioGetter(undefined); // use real default
    // Can't easily test without stubbing du; at minimum add a test that calls
    // getDirectoryUsageBytes on a real path and verifies the arithmetic:
    const USED = 95 * 1024 ** 3;
    const ratio = Math.max(0, QUOTA_100GB - USED) / QUOTA_100GB;
    assert.equal(ratio, 0.05);
    env.set(CONFIG_PARAMS.STORAGE_QUOTASIZE, undefined);
});

The important cases to cover: normal (used < quota → correct ratio), over-quota (used > quota → clamped to 0).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: added a defaultGetAvailableSpaceRatio describe block with 5 real-filesystem tests covering: fresh file triggers reclamation, fresh file within headroom, over-quota clamp (ratio → 0), absent file falls back to du, stale file falls back to du. Uses setQuotaSizeBytes + a private mkdtempSync directory to avoid permission issues. — Claude

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 20, 2026

Reviewed; no blockers found.

…or usage

Replace the per-analytics-call `du -sb` (O(inodes)) with a read of
quota-status.json written by host-manager every ~90s (O(1)). This file
contains usedBytes, quotaBytes, and updatedAt so Harper can report
accurate quota headroom without scanning the directory tree on every
system_information request.

Reclamation: prefers the file when fresh (< 5 min); falls back to `du`
on rootPath if the file is absent or stale (covers container start race
and host-manager outage).

Analytics: reads usedBytes directly from the file; adds
quota_status_age_seconds so operators can verify the file is being
updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
it('getFreeSpaceBasis returns quota when QUOTA_SIZE_BYTES is set', function () {
storageReclamation.__set__('QUOTA_SIZE_BYTES', QUOTA_100GB);
assert.equal(storageReclamation.getFreeSpaceBasis(), 'quota');
storageReclamation.__set__('QUOTA_SIZE_BYTES', undefined);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Blocker — re-raised; prior finding still unresolved] rewire.__set__ violates AGENTS.md: "do not add new uses of sinon or rewire." Lines 405–407 and 415–418 both inject QUOTA_SIZE_BYTES this way.

Fix: Export a setQuotaSizeBytes(n) setter (parallel to the existing setAvailableSpaceRatioGetter) and call that from the tests instead, then remove the rewire dependency from this file entirely.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: added exported setQuotaSizeBytes(n) setter to storageReclamation.ts; all quota-mode tests now use that setter instead of rewire.__set__. — Claude


assert.ok(handler.notCalled);
});
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Blocker — re-raised; situation worsened by the refactor commit] The "quota mode" tests (lines 421–448) stub getAvailableSpaceRatioGetter and never exercise defaultGetAvailableSpaceRatio. The refactor commit (94c46fcf) expanded that branch substantially — it now reads quota-status.json, checks file freshness against a 5-minute TTL, and falls back to du when the file is absent or stale — but added no tests for any of those paths.

Missing coverage that should be added:

  • Fresh quota-status file → usedBytes is used, ratio computed correctly
  • Stale file (or absent file) → falls back to du
  • Over-quota clamp: usedBytes > QUOTA_SIZE_BYTES returns 0 (not negative)

These are the core correctness paths of the new feature.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: added a defaultGetAvailableSpaceRatio describe block with 5 real-filesystem tests covering: fresh file triggers reclamation, fresh file within headroom, over-quota clamp (ratio → 0), absent file falls back to du, stale file falls back to du. Uses setQuotaSizeBytes + a private mkdtempSync directory to avoid permission issues. — Claude

} catch {
return undefined;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Blocker — new from refactor commit] getQuotaStatus() is a newly exported function with no tests. Per the universal review scope, new public API symbols need a happy-path test.

Minimal coverage needed:

  • File present and valid JSON → returns the parsed object
  • File absent → returns undefined
  • Malformed JSON → returns undefined (not a throw)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: added describe('getQuotaStatus') tests (valid file returns parsed object, absent file returns undefined, malformed JSON returns undefined) and describe('getDirectoryUsageBytes') test (non-negative integer on a real directory). All use a private mkdtempSync subdirectory with rootPath temporarily overridden via env.setProperty. — Claude

getDiskInfo() now always includes free_space_basis (set before the
operationsApi_sysInfo_disk guard) so operators can always verify quota
vs filesystem mode. Update the test expectations accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
it('getFreeSpaceBasis returns quota when QUOTA_SIZE_BYTES is set', function () {
storageReclamation.__set__('QUOTA_SIZE_BYTES', QUOTA_100GB);
assert.equal(storageReclamation.getFreeSpaceBasis(), 'quota');
storageReclamation.__set__('QUOTA_SIZE_BYTES', undefined);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker (carries from prior run — unresolved): rewire.__set__ is prohibited by AGENTS.md: "do not add new uses of sinon or rewire." The quota-mode tests mutate the module-private QUOTA_SIZE_BYTES via rewire instead of using the public setter pattern.

Suggested fix: Add an exported setQuotaSizeBytes(n: number | undefined) setter alongside setAvailableSpaceRatioGetter, call it in the quota tests, and in afterEach call setQuotaSizeBytes(undefined) to restore. That keeps the mutation behind a testable API and eliminates the rewire dependency from these tests.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: added exported setQuotaSizeBytes(n) setter to storageReclamation.ts; all quota-mode tests now use that setter instead of rewire.__set__. — Claude

await storageReclamation.runReclamationHandlers();

assert.ok(handler.notCalled);
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker (carries from prior run — worsened by 94c46fcfc): The quota-mode tests stub the space-ratio getter via setAvailableSpaceRatioGetter, so defaultGetAvailableSpaceRatio is never exercised. The refactor commit added three new branches inside that function with zero coverage:

  1. getQuotaStatus() returns a fresh file → uses status.usedBytes
  2. File is absent or stale (Date.now() - status.updatedAt ≥ QUOTA_STATUS_MAX_AGE_MS) → falls back to getDirectoryUsageBytes(rootPath ?? path)
  3. Math.max(0, ...) clamping when usage exceeds quota

Suggested fix: Write tests that call defaultGetAvailableSpaceRatio directly (it is accessible via the module export), stubbing getQuotaStatus and getDirectoryUsageBytes — or stub readFile/execFile at the Node API level — so that each branch is exercised without bypassing the function under test.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: added a defaultGetAvailableSpaceRatio describe block with 5 real-filesystem tests covering: fresh file triggers reclamation, fresh file within headroom, over-quota clamp (ratio → 0), absent file falls back to du, stale file falls back to du. Uses setQuotaSizeBytes + a private mkdtempSync directory to avoid permission issues. — Claude

} catch {
return undefined;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker (carries from prior run — unresolved): getQuotaStatus() and getDirectoryUsageBytes() are new public exports added in 94c46fcfc with no tests at all. Per the review scope, new public API symbols need at minimum a happy-path test.

Missing cases:

  • getQuotaStatus() — file present and valid JSON → returns parsed object
  • getQuotaStatus() — file absent → returns undefined
  • getQuotaStatus() — malformed JSON → returns undefined
  • getDirectoryUsageBytes() — happy path (can stub execFile)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: added describe('getQuotaStatus') tests (valid file returns parsed object, absent file returns undefined, malformed JSON returns undefined) and describe('getDirectoryUsageBytes') test (non-negative integer on a real directory). All use a private mkdtempSync subdirectory with rootPath temporarily overridden via env.setProperty. — Claude

- Add setQuotaSizeBytes() exported setter to remove rewire.__set__ calls
- Add tests for getQuotaStatus() (valid file, absent, malformed JSON)
- Add tests for getDirectoryUsageBytes() against a real directory
- Add tests for defaultGetAvailableSpaceRatio branches: fresh file triggers/
  does not trigger reclamation, over-quota clamp (ratio → 0), stale and
  absent file fall back to du

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread utility/common_utils.ts Outdated
if (req.database) req.schema = req.database;
}

export function convertToBytes(size: any): number | undefined {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker — new exported function with no tests.

convertToBytes is a new export with multi-branch string-parsing logic and zero test coverage. The unitTests/utility/common_utils.test.js test file already exists. A happy-path test is needed per the project's testing scope.

Minimal cases to cover:

  • null / undefinedundefined
  • Bare number → passthrough
  • "100GB"107374182400
  • "1.5 GB" (with space) → 1610612736
  • Unrecognized suffix (e.g. "100X") → treated as raw bytes (100)

Without this, a misconfigured storage_quotaSize string could silently produce a wrong quota value with no failing test to catch it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: added describe('convertToBytes') block in unitTests/utility/common_utils.test.js covering null/undefined → undefined, bare number passthrough, GB/G/MB/KB/TB strings, space-before-suffix, unrecognized suffix as raw bytes, and bare numeric string. 10 new tests, all passing. — Claude

Covers null/undefined → undefined, bare number passthrough, GB/G/MB/KB/TB
string parsing, space-before-suffix, unrecognized suffix as raw bytes, and
bare numeric string.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kriszyp kriszyp requested a review from Devin-Holland May 20, 2026 22:58
kriszyp and others added 2 commits May 20, 2026 17:08
…tatus file

quota-status.json written by host-manager already carries quotaBytes, so
there is no need for a separate Harper config param. The reclamation ratio
now comes purely from the file (fresh = quota-aware, stale/absent = statfs).

- Remove storage_quotaSize config param and STORAGE_QUOTASIZE hdbTerms key
- Remove convertToBytes utility (no longer needed)
- Remove getDirectoryUsageBytes, getFreeSpaceBasis, getQuotaInfo, setQuotaSizeBytes
- Remove execFile/promisify imports now that du fallback is gone
- getDiskInfo derives free_space_basis from quota-status.json presence
- Update tests accordingly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kriszyp kriszyp changed the title feat(storage): quota-aware reclamation via storage_quotaSize config feat(storage): quota-aware reclamation via host-manager quota-status file May 20, 2026
@kriszyp kriszyp marked this pull request as ready for review May 20, 2026 23:25
@kriszyp kriszyp requested a review from a team as a code owner May 20, 2026 23:25
@kriszyp kriszyp removed the request for review from a team May 20, 2026 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use disk-space quota (not filesystem free space) when driving storage reclamation

1 participant