fix: api memory improvement#2762

Merged
baktun14 merged 6 commits into main from fix/api-memory-leaks
Feb 24, 2026

Conversation

@baktun14
Contributor

@baktun14 baktun14 commented Feb 17, 2026

Why

The in-memory cache (memory-cache) doesn't enforce any size limit, so it can grow without bound.

What

Replace it with a bounded LRUCache.

Summary by CodeRabbit

  • Improvements

    • Switched to bounded LRU caching across the API for better memory use and predictable eviction.
    • Adjusted health-check/readiness logic for clearer success tracking and stricter initial-failure behavior.
  • Chores

    • Removed legacy in-memory cache dependency and migrated caching helpers and usages to the new implementation.
  • Tests

    • Added comprehensive cache unit tests and updated test cleanup to use the new caching helpers.
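The core of the fix is swapping an unbounded store for a size-capped LRU. A minimal sketch of the eviction behavior, using a plain Map as a stand-in for the lru-cache package (the BoundedLru class name is hypothetical; Map's insertion-order iteration supplies the least-recently-used ordering):

```typescript
// Minimal bounded LRU sketch; a simplified stand-in for `lru-cache`.
// `max` caps the number of entries, so memory stays bounded under load.
class BoundedLru<V> {
  private map = new Map<string, V>();
  constructor(private max: number) {}

  get(key: string): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      // Refresh recency: re-insert so the key moves to the "newest" end.
      this.map.delete(key);
      this.map.set(key, value);
    }
    return value;
  }

  set(key: string, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.max) {
      // Map preserves insertion order, so the first key is least recently used.
      const oldest = this.map.keys().next().value as string;
      this.map.delete(oldest);
    }
  }

  get size(): number {
    return this.map.size;
  }
}

const cache = new BoundedLru<number>(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a"); // touch "a" so "b" becomes least recently used
cache.set("c", 3); // exceeds max, evicting "b"
```

The real lru-cache package adds per-entry TTLs, size calculation, and dispose hooks on top of this same eviction principle.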

@baktun14 baktun14 requested a review from a team as a code owner February 17, 2026 18:13
@baktun14 baktun14 changed the title from "Fix/api memory leaks" to "fix: api memory improvement" Feb 17, 2026
@coderabbitai
Contributor

coderabbitai Bot commented Feb 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e635033 and 5da1d52.

📒 Files selected for processing (1)
  • apps/api/src/healthz/services/healthz/healthz.service.spec.ts

📝 Walkthrough

Walkthrough

Replaces the in-repo memory-cache usage with lru-cache across the API app: dependency removal, MemoryCacheEngine and helpers switched to LRUCache (with signature changes and a new clearByKey), service caches migrated to LRU, functional tests updated to use cache helper, and healthcheck logic updated to track prior success with a new hasSucceeded flag. (47 words)

Changes

Cohort / File(s) Summary
Dependency Update
apps/api/package.json
Removed memory-cache and @types/memory-cache from dependencies/devDependencies.
Cache Engine & Helpers
apps/api/src/caching/memoryCacheEngine.ts, apps/api/src/caching/helpers.ts
Replaced memory-cache with lru-cache; introduced CacheValue type and global LRU instances; getFromCache<T> now returns `T | undefined` instead of `T | false`, and a new clearByKey helper was added.
Service Cache Migration
apps/api/src/deployment/services/cached-balance/cached-balance.service.ts
Replaced internal Map<string, CachedBalance> with LRUCache<string, CachedBalance>({ max: 10000 }) (implementation change only).
Healthcheck Logic
apps/api/src/healthz/services/healthz/healthz.service.ts
Added private hasSucceeded flag and revised readiness logic: first-failure tolerance now depends on prior hasSucceeded instead of nullable isFailed; isFailed behavior adjusted accordingly.
Auth Cache Typing
apps/api/src/auth/services/user-auth-token/user-auth-token.service.ts
Updated cache typing to use CacheValue for kvStore.put calls (type refinement only).
Tests — Cache Cleanup
apps/api/test/functional/provider-deployments.spec.ts, apps/api/test/functional/provider-earnings.spec.ts, apps/api/test/functional/providers.spec.ts
Replaced direct memory-cache usage with cacheEngine helper and switched teardown mcache.clear() to cacheEngine.clearAllKeyInCache().
New Tests
apps/api/src/caching/memoryCacheEngine.spec.ts
Added comprehensive unit tests covering get/store/TTL/clear/clearByPrefix/getKeys behaviors for the MemoryCacheEngine backed by LRU.
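The helper behaviors covered by the new unit tests (getKeys, clearByPrefix) can be sketched over any key-iterable store; here a Map stands in for the LRU-backed engine, and the function names follow the walkthrough above:

```typescript
// Simplified sketch of prefix-based clearing over a key-iterable cache.
// A Map stands in for the LRU-backed MemoryCacheEngine store.
const store = new Map<string, unknown>();

function getKeys(): string[] {
  return [...store.keys()];
}

function clearByPrefix(prefix: string): void {
  // Snapshot keys first so deletion doesn't disturb iteration.
  for (const key of getKeys()) {
    if (key.startsWith(prefix)) store.delete(key);
  }
}

store.set("provider:1", { uptime: 0.99 });
store.set("provider:2", { uptime: 0.97 });
store.set("balance:1", { amount: 10 });
clearByPrefix("provider:"); // removes both provider entries
```

lru-cache exposes its keys via an iterator as well, so the same snapshot-then-delete pattern applies to the real engine.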

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰
I swapped the crumbs for tidy bins,
LRU hums where hoarding ends.
Keys pruned short, old maps set free,
A new flag watches first victory.
Hop, cache — neat memory, please and thanks. 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Title check — ❓ Inconclusive: the title "fix: api memory improvement" is vague and doesn't clearly describe the specific change (replacing memory-cache with LRUCache). Resolution: make the title more specific, e.g. "fix: replace unbounded memory-cache with LRUCache".

✅ Passed checks (1 passed)

  • Description check — ✅ Passed: the description addresses both "Why" and "What" with relevant information about the memory limitation issue and the solution.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Feb 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 53.96%. Comparing base (fd91ace) to head (5da1d52).
⚠️ Report is 21 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2762      +/-   ##
==========================================
+ Coverage   53.64%   53.96%   +0.32%     
==========================================
  Files        1019     1020       +1     
  Lines       23591    23608      +17     
  Branches     5754     5771      +17     
==========================================
+ Hits        12655    12741      +86     
+ Misses       9531     9476      -55     
+ Partials     1405     1391      -14     
Flag Coverage Δ
api 76.92% <100.00%> (-0.01%) ⬇️
deploy-web 36.38% <ø> (+0.45%) ⬆️
log-collector 75.35% <ø> (ø)
notifications 85.56% <ø> (ø)
provider-console 81.48% <ø> (ø)
provider-proxy 85.93% <ø> (+3.52%) ⬆️
tx-signer 79.29% <ø> (ø)
Files with missing lines Coverage Δ
...ervices/user-auth-token/user-auth-token.service.ts 82.60% <100.00%> (ø)
apps/api/src/caching/helpers.ts 100.00% <100.00%> (ø)
apps/api/src/caching/memoryCacheEngine.ts 100.00% <100.00%> (+45.00%) ⬆️
.../services/cached-balance/cached-balance.service.ts 100.00% <100.00%> (ø)
...pi/src/healthz/services/healthz/healthz.service.ts 100.00% <100.00%> (ø)

... and 34 files with indirect coverage changes


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/api/src/caching/memoryCacheEngine.ts`:
- Around line 3-4: Replace the explicit any used in the LRUCache instantiation
with unknown and remove the eslint-disable comment: change the cache declaration
from LRUCache<string, any> to LRUCache<string, unknown> so the MemoryCacheEngine
wrapper’s generic get/set methods maintain type safety without suppressing the
`@typescript-eslint/no-explicit-any` rule (look for the cache variable and
MemoryCacheEngine usages to update).
- Around line 25-26: The TTL unit is mismatched: make storeInCache treat its
duration parameter as seconds (to match cacheResponse callers) and convert to
milliseconds before passing to lru-cache; inside storeInCache(key: string, data:
T, duration?: number) compute const ttl = duration ? duration * 1000 : undefined
and call cache.set(key, data, ttl ? { ttl } : undefined). Update any references
to storeInCache (and leave cacheResponse(seconds) calls unchanged) so callers
can continue passing seconds without breaking behavior.
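The suggested conversion can be sketched as a small helper (toTtlMs is a hypothetical name; the key fact is that lru-cache's ttl option is expressed in milliseconds while cacheResponse callers pass seconds):

```typescript
// Sketch of the suggested fix: callers keep passing seconds, and the
// engine converts to milliseconds, the unit lru-cache expects for `ttl`.
function toTtlMs(durationSeconds?: number): number | undefined {
  return durationSeconds ? durationSeconds * 1000 : undefined;
}

// Inside storeInCache<T>(key, data, duration), the set call would become:
//   const ttl = toTtlMs(duration);
//   cache.set(key, data, ttl ? { ttl } : undefined);
```

Keeping the seconds-based signature means no call site needs to change; only the boundary with lru-cache performs the conversion.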

@ygrishajev
Contributor

Although this is not introducing radical changes, I think the cache engine should be reworked as an interface utilising various storages. E.g. in local dev LRUCache is totally fine; in prod we should use a distributed storage like Redis.

@stalniy wdyt?

@stalniy
Contributor

stalniy commented Feb 18, 2026

Although this is not introducing radical changes, I think the cache engine should be reworked as an interface utilising various storages. E.g. in local dev LRUCache is totally fine; in prod we should use a distributed storage like Redis.

@stalniy wdyt?

I actually like this one because it's minus 1 dependency. And this is the thing which I wanted to do by myself for a long time :)

I think that we should make a discussion for this and consider other options like https://bentocache.dev/docs/introduction

baktun14 and others added 2 commits February 19, 2026 11:35
The healthz service was returning unhealthy immediately on the first
failed check after startup (when isFailed was null), bypassing the
failure tolerance logic. This could cause liveness probe failures
during pod startup when DB connections aren't yet established.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@baktun14 baktun14 force-pushed the fix/api-memory-leaks branch from f9d164b to 238392c Compare February 19, 2026 16:36
Replace `any` with `NonNullable<unknown>` cache value type, change
`getFromCache` sentinel from `false` to `undefined`, and convert
TTL duration from seconds to milliseconds for lru-cache compatibility.
@github-actions github-actions Bot added size: S and removed size: XS labels Feb 19, 2026
@github-actions github-actions Bot added size: M and removed size: S labels Feb 19, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/api/src/caching/memoryCacheEngine.spec.ts`:
- Around line 169-173: The setup function currently has no parameters; change
its signature to accept a single destructured empty object with an inline type
to satisfy the test guideline (e.g., function setup({}: {}) { ... }), keep the
body as-is (new MemoryCacheEngine(), engine.clearAllKeyInCache(), return {
engine }) and ensure this setup remains at the bottom of the root describe
block; reference: setup, MemoryCacheEngine, engine.clearAllKeyInCache.

Track whether a healthcheck has succeeded at least once before
tolerating failures, preventing false-positive liveness on cold start.
Contributor

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/api/src/healthz/services/healthz/healthz.service.ts (1)

62-99: ⚠️ Potential issue | 🟠 Major

Clear checkedAt on first-time failure to prevent false-positive liveness.

The check() method always updates checkedAt in its finally block (line 105), even when the ping fails. This means a failed first call still caches a timestamp; subsequent isHealthy() calls within cacheTTL skip the check and return true, yielding false-positive liveness during cold start. The hasSucceeded guard on line 95 correctly returns false on initial failure, but the cached timestamp bypasses this guard on subsequent calls within the TTL window.

🛠️ Minimal fix to avoid caching failures before first success
      const prevIsFailed = this.isFailed;
+      if (!this.hasSucceeded) {
+        this.checkedAt = null;
+        return false;
+      }
-      if (prevIsFailed || !this.hasSucceeded || options?.ignoreCache) return false;
+      if (prevIsFailed || options?.ignoreCache) return false;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/healthz/services/healthz/healthz.service.ts` around lines 62 -
99, The Healthcheck's checkedAt is being set even when check() fails, causing a
failed first ping to be cached and producing false-positive liveness; update the
logic so checkedAt is only recorded after a successful ping (or explicitly clear
checkedAt on first-time failure). Specifically, modify
Healthcheck.check()/isHealthy() behavior so that checkedAt is not updated in the
finally block when the ping throws, and if a ping fails while hasSucceeded is
false then set checkedAt = null (or move the checkedAt update into the success
path) so subsequent isHealthy() calls within cacheTTL will re-run check()
instead of returning a cached true.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@apps/api/src/healthz/services/healthz/healthz.service.ts`:
- Around line 62-99: The Healthcheck's checkedAt is being set even when check()
fails, causing a failed first ping to be cached and producing false-positive
liveness; update the logic so checkedAt is only recorded after a successful ping
(or explicitly clear checkedAt on first-time failure). Specifically, modify
Healthcheck.check()/isHealthy() behavior so that checkedAt is not updated in the
finally block when the ping throws, and if a ping fails while hasSucceeded is
false then set checkedAt = null (or move the checkedAt update into the success
path) so subsequent isHealthy() calls within cacheTTL will re-run check()
instead of returning a cached true.
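The interplay the bot describes — tolerate a single failure only after a prior success, and never cache a cold-start failure — can be sketched as follows (the HealthcheckSketch class and report method are hypothetical names; the real service wraps an async DB ping and a checkedAt TTL):

```typescript
// Sketch of the readiness logic: one failure is tolerated only once a
// check has succeeded before; failures prior to the first success report
// unhealthy immediately and clear the cached timestamp.
class HealthcheckSketch {
  private hasSucceeded = false;
  private isFailed = false;
  private checkedAt: number | null = null;

  report(pingOk: boolean): boolean {
    if (pingOk) {
      this.hasSucceeded = true;
      this.isFailed = false;
      this.checkedAt = Date.now();
      return true;
    }
    const prevIsFailed = this.isFailed;
    this.isFailed = true;
    if (!this.hasSucceeded) {
      // Never cache a failure before the first success (cold start).
      this.checkedAt = null;
      return false;
    }
    // Tolerate exactly one failure after a prior success.
    return !prevIsFailed;
  }
}
```

Under this sketch a pod that has never pinged its DB successfully reports unhealthy at once, while an established pod survives a single transient failure before flipping to unhealthy.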

@baktun14 baktun14 enabled auto-merge February 24, 2026 17:03
@baktun14 baktun14 added this pull request to the merge queue Feb 24, 2026
Merged via the queue into main with commit 7792ab8 Feb 24, 2026
54 of 55 checks passed
@baktun14 baktun14 deleted the fix/api-memory-leaks branch February 24, 2026 17:12

3 participants