Implement safe MLX KV reuse with scoped GPU cache eviction #147
Pull request overview
This PR extends MLXLanguageModel with session-scoped KV cache reuse and introduces a scoped GPU buffer-cache limit/eviction mechanism to improve multi-turn stability and reduce redundant prefill work in MLX-backed sessions.
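To illustrate what "reduce redundant prefill work" means in practice, here is a minimal sketch of prefix-based reuse. It is not the PR's implementation: `reusablePrefixLength` and `tokensToPrefill` are hypothetical helpers, and tokens are modeled as plain `Int` arrays. It assumes the session remembers which tokens its KV cache already covers, so that only the non-shared tail of the new prompt needs to be prefilled.

```swift
/// Hypothetical helper (not from the PR): counts how many leading tokens
/// the cached prompt and the incoming prompt have in common.
func reusablePrefixLength(cached: [Int], incoming: [Int]) -> Int {
    var count = 0
    for (a, b) in zip(cached, incoming) {
        guard a == b else { break }
        count += 1
    }
    return count
}

/// Hypothetical helper: the part of the incoming prompt that is not
/// covered by the shared prefix and would still need prefill.
func tokensToPrefill(cached: [Int], incoming: [Int]) -> ArraySlice<Int> {
    let shared = reusablePrefixLength(cached: cached, incoming: incoming)
    return incoming.dropFirst(shared)
}
```

A real implementation would also have to trim or discard cache entries past the shared prefix when the prompts diverge; that step is omitted here.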
Changes:
- Add MLX-specific `GenerationOptions` customizations (KV cache sizing + quantization knobs) and map them into MLX generate parameters.
- Implement per-`LanguageModelSession` KV cache storage/reuse plus a global “one active generation per session” gate and GPU cache-limit scoping.
- Add tests for multi-turn behavior and cache-clearing safety; document new MLX configuration in the README.
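The gate and the cache-limit scoping described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `CacheLimitStore` is a hypothetical stand-in for the process-wide GPU buffer-cache limit, and `SessionGate`, `withScopedCacheLimit`, and the string session IDs are invented for the example.

```swift
import Foundation

/// Hypothetical stand-in for a process-wide GPU buffer-cache limit.
final class CacheLimitStore {
    private(set) var limit: Int
    init(limit: Int) { self.limit = limit }
    func set(_ newLimit: Int) { limit = newLimit }
}

/// Applies `limit` for the duration of `body`, then restores the previous
/// value, even if `body` throws.
func withScopedCacheLimit<T>(
    _ store: CacheLimitStore,
    limit: Int,
    _ body: () throws -> T
) rethrows -> T {
    let previous = store.limit
    store.set(limit)
    defer { store.set(previous) }
    return try body()
}

/// Hypothetical gate that allows at most one active generation per session ID.
final class SessionGate {
    private let lock = NSLock()
    private var active = Set<String>()

    /// Returns false if a generation is already running for this session.
    func tryBegin(_ sessionID: String) -> Bool {
        lock.lock()
        defer { lock.unlock() }
        return active.insert(sessionID).inserted
    }

    /// Marks the session as idle again.
    func end(_ sessionID: String) {
        lock.lock()
        defer { lock.unlock() }
        active.remove(sessionID)
    }
}
```

Restoring the limit in a `defer` keeps the change scoped to one generation, so an early return or a thrown error cannot leave the temporary limit in place.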
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `Sources/AnyLanguageModel/Models/MLXLanguageModel.swift` | Adds MLX custom options, session KV reuse, concurrency gate, and GPU cache-limit scoping/eviction. |
| `Tests/AnyLanguageModelTests/MLXLanguageModelTests.swift` | Adds multi-turn same-session test and cache-clear-then-respond test. |
| `Tests/AnyLanguageModelTests/CustomGenerationOptionsTests.swift` | Adds tests for `MLXLanguageModel.CustomGenerationOptions` initialization, integration, and `Codable`. |
| `README.md` | Documents MLX KV-cache tuning via `GenerationOptions` and GPU cache configuration via `MLXLanguageModel` init. |
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
Alternative to #139