Pull requests: ml-explore/mlx-lm
- Auto-discover tool-call markers from tokenizer config fields (#1163) · opened Apr 18, 2026 by michaelstingl · 6 tasks done
- feat(nemotron_h): add Multi-Token Prediction (MTP) module (#1161) · opened Apr 16, 2026 by Thump604
- Fix Gemma 4 KV-shared layers creating unused projections (#1158) · opened Apr 15, 2026 by glyphVault · 5 tasks done
- Add TurboQuantKVCache: data-oblivious 2-4 bit KV cache compression (#1144) · opened Apr 12, 2026 by Smilefounder · Draft · 3 tasks done
- fix(gemma4): return [] instead of raising on empty tool-call match (#1142) · opened Apr 10, 2026 by gofastercloud
- Add pipeline parallel support for Qwen3 MoE and MiniMax models (#1138) · opened Apr 9, 2026 by qubitcontracting
- Pipeline parallel: memory-proportional splitting and inference sync (#1137) · opened Apr 9, 2026 by qubitcontracting
- Add RAG example using mlx-lm hidden state embeddings (#1130) · opened Apr 8, 2026 by ManjushaMotamarry
- feat(tuner): support loading PEFT/Unsloth LoRA adapters in load_adapters() (#1120) · opened Apr 7, 2026 by YUGOROU
- fix: honor --prompt-cache-bytes in sequential serve mode (#1118) · opened Apr 7, 2026 by Jw983cam
- fix: BatchRotatingKVCache.merge() shape mismatch with different fill levels (#1116) · opened Apr 7, 2026 by Jw983cam
- fix: enable speculative decoding for hybrid models (Qwen3.5, fixes #846) (#1111) · opened Apr 5, 2026 by alexlee2046
- perf: reduce peak memory during model quantization (#1102) · opened Apr 3, 2026 by matteocelani (Contributor) · 5 tasks done
- feat: memory-aware auto-config + BatchQuantizedKVCache for batched quantized KV (#1101) · opened Apr 3, 2026 by deceptech-packet-ninja · 8 tasks done
- perf: reduce peak memory when loading AutoAWQ/GPTQ models (#1098) · opened Apr 2, 2026 by matteocelani (Contributor) · 4 tasks done
- perf: reduce GPU sync frequency in GPTQ quantization (#1094) · opened Apr 2, 2026 by matteocelani (Contributor) · 3 tasks done
- fix(gpt_oss): handle BF16 expert weights in sanitize (#1080) · opened Mar 31, 2026 by FaisalFehad
- feat: QuantizedRotatingKVCache + KVSplit (K/V different bits) (#1074) · opened Mar 30, 2026 by deceptech-packet-ninja · 13 tasks done
- feat: add KV cache quantization args to server (#1073) · opened Mar 30, 2026 by deceptech-packet-ninja · 4 tasks done