Skip to content

Memory error - won't unload old model #86

@hajajmaor

Description

@hajajmaor

Use case:
loaded and used qwen3-4b-16k
Sending another request with: Qwen3-8B

Device:
Orangepi 5 max - 16gb - nvme drive

Error:

I rkllm: rkllm-toolkit version: 1.2.1b1, max_context_limit: 16384, npu_core_num: 3, target_platform: RK3588, model_dtype: W8A8_G256
I rkllm: Enabled cpus: [0, 1, 2, 3, 4, 5, 6, 7]
I rkllm: Enabled cpus num: 8
2025-10-26 06:53:48,920 - rkllama.worker - INFO - Worker for model qwen3-4b-16k:g256-o1 created and running...
2025-10-26 06:53:50,002 - rkllama.worker - INFO - Running inference for model qwen3-4b-16k:g256-o1...
 I'm Terry, your tech assistant. Let me help you update Lobe Chat.

Would you like me to proceed with the update? If so, I'll call the appropriate function to update your Lobe Chat instance.

2025-10-26 06:54:06,903 - werkzeug - INFO - 172.18.0.3 - - [26/Oct/2025 06:54:06] "POST /api/chat HTTP/1.1" 200 -
2025-10-26 06:54:46,015 - werkzeug - INFO - 172.18.0.3 - - [26/Oct/2025 06:54:46] "GET /api/tags HTTP/1.1" 200 -
FROM: Qwen3-8B-rk3588-w8a8_g512-opt-1-hybrid-ratio-1.0.rkllm
HuggingFace Path: dulimov/Qwen3-8B-rk3588-1.2.1-unsloth-16k
I rkllm: rkllm-runtime version: 1.2.2, rknpu driver version: 0.9.8, platform: RK3588
I rkllm: loading rkllm model from /opt/rkllama/models/qwen3-8b-16k/Qwen3-8B-rk3588-w8a8_g512-opt-1-hybrid-ratio-1.0.rkllm
I rkllm: rkllm-toolkit version: 1.2.1b1, max_context_limit: 16384, npu_core_num: 3, target_platform: RK3588, model_dtype: W8A8_G512
E RKNN: [06:55:12.197] failed to allocate handle, ret: -1, errno: 14, errstr: Bad address
E RKNN: [06:55:12.197] failed to malloc npu memory, size: 3900702720, flags: 0x2
E RKNN: [06:55:12.227] load model file error!

E rkllm: rkllm_init failed2025-10-26 06:55:12,321 - rkllama.worker - ERROR - Failed creating the worker for model 'qwen3-8b-16k': Failed to initialize RKLLM model: -1

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions