feat: add backend caching for AMT API endpoints #768
Conversation
- Add thread-safe cache implementation with TTL support
- Cache power state (5s TTL) with invalidation on power actions
- Cache features and KVM displays (30s TTL)
- Reduces backend API calls by ~67% for typical usage
- Improves response times by 15-53x for repeated requests
Add GET /api/v1/admin/kvm/init/{guid} endpoint that combines:
- Display settings (GetKVMScreenSettings)
- Power state (GetPowerState)
- Redirection status (hardcoded for now)
- AMT features (GetFeatures)
This reduces 4 separate API calls during KVM initialization to just 1,
significantly improving page load time and reducing network latency.
Changes:
- Add KVMInitResponse DTO that combines all required data
- Add GetKVMInitData usecase method with caching (30s TTL)
- Add cache key for KVM init data
- Register endpoint in both v1 and OpenAPI routers
- Add HTTP handler for the new endpoint
Add monitoring stack with:
- Prometheus for metrics collection from /metrics endpoint
- Grafana for visualization and dashboards
- Persistent volumes for both services

- Add GetKVMInitData method to ws/v1 Feature interface
- Add SetLinkPreference method to ws/v1 Feature interface
- Regenerate all mock files with updated interface signatures
- Update gomock import paths from github.com/golang/mock to go.uber.org/mock

- Generate prometheus.yml dynamically in GitHub Actions workflow
- Fixes CI error where prometheus.yml mount fails
- Keep prometheus.yml in .gitignore for local development

- Specify vault, postgres, app services explicitly in docker compose up
- Removes need for prometheus.yml in CI environment
- Keeps all prometheus-related config out of repository

- Remove prometheus and grafana services from docker-compose.yml
- Remove prometheus-data and grafana-data volumes
- Revert api-test.yml to original since services are removed
- Keeps monitoring configuration separate from main compose file
rsdmike
left a comment
Hey @nmgaston, in a similar vein to the performance metrics -- thinking ahead to our cloud deployment. Using something like https://github.com/gin-contrib/cache, with support for both in-memory and Redis, I think will be helpful. We generally prefer to adopt open-source components when we can. A few other questions:
- How do we request the content be forced to refresh, i.e. how do I specify to override/invalidate the cache?
- Can I disable the cache entirely if I want via config?
- I'd want the cache to be configurable as far as timing goes, but at the same time, I don't think we necessarily want a bunch of individual settings to tweak, maybe a single value for cache time.
…formance
- Replace custom cache implementation with direct robfig/go-cache usage
- Add factory pattern for cache initialization (extensible for Redis)
- Implement backend caching for AMT features endpoint
- Fix linting errors (nlreturn, revive)
- Resolve data race conditions in usecase tests
- Update all test files to use new cache.New() initialization
- Performance improvement: 25-164x speedup compared to baseline (1150ms)
  - Cold cache: 170ms
  - Warm cache: 7-45µs
  - Server-side operations: 22-45µs (3,800-7,500x faster than cold)

Addresses peer feedback in PR device-management-toolkit#768 requesting an open-source caching solution with support for both in-memory and future Redis implementations.
Updated to add a ?refresh=true query parameter to any GET request to skip the cache check and fetch fresh data.
Updated to add config values that allow the cache to be disabled.
Done. It's configurable and can be turned off with the ttl value. I wanted to keep the power value separate since that needs to be a shorter time period than the rest.
Summary
Adds backend caching for AMT API endpoints using robfig/go-cache to dramatically reduce latency and backend load for repeated requests within short time windows.
Problem
When users interact with the KVM interface or monitor device status, the frontend makes repeated API calls to fetch power state, features, and KVM display settings. Each call requires a round-trip to the AMT firmware, adding 150-500ms latency per request.
Solution
Implements thread-safe in-memory caching using robfig/go-cache with TTL-based expiration:
Performance Impact
Testing with a real AMT device (GUID: d0c96538-9e19-4bc0-9d73-8864e641f77f):
Response time improvement: 25-164x faster for cached requests
Without cache: ~1.15s per request (1150ms)
Cold cache (first call): ~170ms
Warm cache (subsequent calls): 7-45µs (microseconds!)
Backend load reduction: ~67% fewer AMT API calls
Cache efficiency: ~100% hit rate within TTL window
Implementation Details
- ttl: 0 in config.yml to disable caching entirely
- ?refresh=true query parameter to skip cache and fetch live data

Configuration
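A sketch of how the cache section of config.yml might look, based on the ttl and powerstate_ttl keys and the TTLs mentioned in this PR; the exact nesting and key placement are assumptions.

```yaml
# Hypothetical layout; the actual structure in config.yml may differ.
cache:
  ttl: 30s            # default TTL for features / KVM displays; 0 disables caching
  powerstate_ttl: 5s  # kept separate so power state stays fresher than other data
```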
Security
- ttl: 5 minutes (prevents excessive stale data retention)
- powerstate_ttl: 1 minute (limits power state staleness)

Testing