MLLM-V2 V2.0.0 Release

@chenghuaWang chenghuaWang released this 16 Feb 14:26
c67485a

New Features

  1. Pythonic eager execution – Rapid model development
  2. Unified hardware support – Arm CPU, OpenCL GPU, QNN NPU
  3. Advanced optimizations – Quantization, pruning, speculative execution
  4. NPU-ready IR – Seamless integration with NPU frameworks
  5. Deployment toolkit – SDK + CLI inference tool
  6. mllm JIT Kernel

News

[2026 Feb 03] 🔥🔥🔥 MLLM QNN AOT support for full-graph execution on NPU! See: Quick Start, Technical Report
[2025 Nov 27] Android Demo Update: Enabled stable Qwen3 and DeepSeek-OCR streaming on Android via a novel In-App Go Server Architecture.
[2025 Nov 23] MLLM v2 released!

What's Changed

New Contributors

Full Changelog: 1.0.0...2.0.0