I rkllm: rkllm-toolkit version: 1.2.1b1, max_context_limit: 16384, npu_core_num: 3, target_platform: RK3588, model_dtype: W8A8_G256
I rkllm: Enabled cpus: [0, 1, 2, 3, 4, 5, 6, 7]
I rkllm: Enabled cpus num: 8
2025-10-26 06:53:48,920 - rkllama.worker - INFO - Worker for model qwen3-4b-16k:g256-o1 created and running...
2025-10-26 06:53:50,002 - rkllama.worker - INFO - Running inference for model qwen3-4b-16k:g256-o1...
I'm Terry, your tech assistant. Let me help you update Lobe Chat.
Would you like me to proceed with the update? If so, I'll call the appropriate function to update your Lobe Chat instance.
2025-10-26 06:54:06,903 - werkzeug - INFO - 172.18.0.3 - - [26/Oct/2025 06:54:06] "POST /api/chat HTTP/1.1" 200 -
2025-10-26 06:54:46,015 - werkzeug - INFO - 172.18.0.3 - - [26/Oct/2025 06:54:46] "GET /api/tags HTTP/1.1" 200 -
FROM: Qwen3-8B-rk3588-w8a8_g512-opt-1-hybrid-ratio-1.0.rkllm
HuggingFace Path: dulimov/Qwen3-8B-rk3588-1.2.1-unsloth-16k
I rkllm: rkllm-runtime version: 1.2.2, rknpu driver version: 0.9.8, platform: RK3588
I rkllm: loading rkllm model from /opt/rkllama/models/qwen3-8b-16k/Qwen3-8B-rk3588-w8a8_g512-opt-1-hybrid-ratio-1.0.rkllm
I rkllm: rkllm-toolkit version: 1.2.1b1, max_context_limit: 16384, npu_core_num: 3, target_platform: RK3588, model_dtype: W8A8_G512
E RKNN: [06:55:12.197] failed to allocate handle, ret: -1, errno: 14, errstr: Bad address
E RKNN: [06:55:12.197] failed to malloc npu memory, size: 3900702720, flags: 0x2
E RKNN: [06:55:12.227] load model file error!
E rkllm: rkllm_init failed
2025-10-26 06:55:12,321 - rkllama.worker - ERROR - Failed creating the worker for model 'qwen3-8b-16k': Failed to initialize RKLLM model: -1
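To put the failed allocation in perspective, here is a minimal back-of-the-envelope in Python. The byte count comes from the RKNN error line above; the assumption that the 4B model's weights stay resident while the 8B model loads is mine, not confirmed by the log:

```python
# Sizes taken from the RKNN error line above; this is only a GiB conversion.
failed_alloc_bytes = 3_900_702_720   # "failed to malloc npu memory, size: 3900702720"
total_ram_bytes = 16 * 2**30         # Orange Pi 5 Max: 16 GB (assumes the full 16 GiB is usable)

print(f"requested buffer: {failed_alloc_bytes / 2**30:.2f} GiB")   # -> 3.63 GiB
print(f"share of total RAM: {failed_alloc_bytes / total_ram_bytes:.0%}")  # -> 23%
```

A single contiguous ~3.6 GiB NPU buffer on top of a resident 4B model plus the OS can plausibly exhaust what the driver can map on a 16 GB board.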
Use case:
Loaded and used qwen3-4b-16k, then sent another request for Qwen3-8B.
Device:
Orange Pi 5 Max, 16 GB RAM, NVMe drive
Error:
rkllm_init fails with -1: the RKNN runtime cannot allocate ~3.9 GB of NPU memory (errno 14, "Bad address"); see the log above.
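As a quick sanity check when reproducing, it may help to see how much RAM the kernel reports as available right before the 8B load attempt. This is a sketch: `MemAvailable` is the kernel's estimate for ordinary allocations, and whether it reflects what the RK3588 NPU driver can actually map is an assumption worth verifying.

```shell
# Print the kernel's MemAvailable estimate in GiB (from /proc/meminfo).
# Run this after the 4B model is loaded, before requesting the 8B model.
awk '/^MemAvailable/ {printf "%.2f GiB available\n", $2 / 1048576}' /proc/meminfo
```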