feat: add backend caching for AMT API endpoints #768
Conversation
- Add thread-safe cache implementation with TTL support
- Cache power state (5s TTL) with invalidation on power actions
- Cache features and KVM displays (30s TTL)
- Reduces backend API calls by ~67% for typical usage
- Improves response times by 15-53x for repeated requests
Add GET /api/v1/admin/kvm/init/{guid} endpoint that combines:
- Display settings (GetKVMScreenSettings)
- Power state (GetPowerState)
- Redirection status (hardcoded for now)
- AMT features (GetFeatures)
This reduces 4 separate API calls during KVM initialization to just 1,
significantly improving page load time and reducing network latency.
Changes:
- Add KVMInitResponse DTO that combines all required data
- Add GetKVMInitData usecase method with caching (30s TTL)
- Add cache key for KVM init data
- Register endpoint in both v1 and OpenAPI routers
- Add HTTP handler for the new endpoint
Add monitoring stack with:
- Prometheus for metrics collection from /metrics endpoint
- Grafana for visualization and dashboards
- Persistent volumes for both services

- Add GetKVMInitData method to ws/v1 Feature interface
- Add SetLinkPreference method to ws/v1 Feature interface
- Regenerate all mock files with updated interface signatures
- Update gomock import paths from github.com/golang/mock to go.uber.org/mock

- Generate prometheus.yml dynamically in GitHub Actions workflow
- Fixes CI error where prometheus.yml mount fails
- Keep prometheus.yml in .gitignore for local development

- Specify vault, postgres, app services explicitly in docker compose up
- Removes need for prometheus.yml in CI environment
- Keeps all prometheus-related config out of repository

- Remove prometheus and grafana services from docker-compose.yml
- Remove prometheus-data and grafana-data volumes
- Revert api-test.yml to original since services are removed
- Keeps monitoring configuration separate from main compose file
rsdmike
left a comment
Hey @nmgaston, in a similar vein to the performance metrics -- thinking ahead to our cloud deployment. Using something like https://github.com/gin-contrib/cache, with support for both in-memory and Redis, I think will be helpful. We generally prefer to adopt open-source components when we can. A few other questions:
- How do we request the content be forced to refresh, i.e. how do I specify to override/invalidate the cache?
- Can I disable the cache entirely if I want via config?
- I'd want the cache to be configurable as far as timing goes, but at the same time, I don't think we necessarily want a bunch of individual settings to tweak, maybe a single value for cache time.
…formance
- Replace custom cache implementation with direct robfig/go-cache usage
- Add factory pattern for cache initialization (extensible for Redis)
- Implement backend caching for AMT features endpoint
- Fix linting errors (nlreturn, revive)
- Resolve data race conditions in usecase tests
- Update all test files to use new cache.New() initialization
- Performance improvement: 25-164x speedup compared to baseline (1150ms)
  - Cold cache: 170ms
  - Warm cache: 7-45µs
  - Server-side operations: 22-45µs (3,800-7,500x faster than cold)

Addresses peer feedback in PR device-management-toolkit#768 requesting an open-source caching solution with support for both in-memory and future Redis implementations.
Updated to add a ?refresh=true query parameter to any GET request to skip the cache check and fetch fresh data.
Updated to add config values that allow the cache to be disabled.
Done. It's configurable and can be turned off with the ttl value. I wanted to keep the power value separate since that needs to be a shorter time period than the rest.
Summary
Adds backend caching for AMT API endpoints using robfig/go-cache to dramatically reduce latency and backend load for repeated requests within short time windows.
Problem
When users interact with the KVM interface or monitor device status, the frontend makes repeated API calls to fetch power state, features, and KVM display settings. Each call requires a round-trip to the AMT firmware, adding 150-500ms latency per request.
Solution
Implements thread-safe in-memory caching using robfig/go-cache with TTL-based expiration:
Performance Impact
Testing with a real AMT device (GUID: d0c96538-9e19-4bc0-9d73-8864e641f77f):
Response time improvement: 25-164x faster for cached requests
Without cache: ~1.15s per request (1150ms)
Cold cache (first call): ~170ms
Warm cache (subsequent calls): 7-45µs (microseconds!)
Backend load reduction: ~67% fewer AMT API calls
Cache efficiency: ~100% hit rate within TTL window
Implementation Details
- ttl: 0 in config.yml to disable caching entirely
- ?refresh=true query parameter to skip cache and fetch live data

Configuration
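A sketch of how the cache section of config.yml might look, based on the ttl and powerstate_ttl keys and the TTLs mentioned in this PR; the exact nesting and key placement are assumptions.

```yaml
# Hypothetical layout; the actual structure in config.yml may differ.
cache:
  ttl: 30s            # default TTL for features / KVM displays; 0 disables caching
  powerstate_ttl: 5s  # kept separate so power state stays fresher than other data
```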
Security
- ttl: 5 minutes (prevents excessive stale data retention)
- powerstate_ttl: 1 minute (limits power state staleness)

Testing