Memory Issue #255

@Pengy-X

Checks

  • I have searched existing issues and discussions
  • I can reproduce this with the latest main or release

Describe the bug

Whenever I use a model, regardless of its size, each request (through either Osaurus's built-in chat window or an external application such as BoltAI) causes the model to load an additional time. I tested with Qwen3 4B 2507 (around 2 GB), and after each question Osaurus's memory usage would multiply. With larger models such as Qwen3 30B, Osaurus consumed 40 GB+ of RAM after just two messages and macOS started using 6 GB+ of swap. Hitting "unload model" or "clear all" did not unload the model; the only way to unload models was to quit Osaurus.
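
One way to quantify the growth described above would be to sample the resident memory of the Osaurus process after each request. The following is only a minimal sketch, not something taken from Osaurus itself: it assumes you pass the Osaurus PID by hand (for example from Activity Monitor), it relies on the standard `ps -o rss= -p <pid>` invocation, and the helper names `rss_mb` and `growth_between_samples` are hypothetical. RSS may not match the "Memory" column in Activity Monitor exactly, so treat the numbers as a rough trend.

```python
import subprocess
import sys


def rss_mb(pid: int) -> float:
    """Return the resident set size of `pid` in MB, as reported by `ps`."""
    out = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip()) / 1024  # ps reports RSS in 1024-byte units


def growth_between_samples(samples_mb: list[float]) -> list[float]:
    """Return the difference between each pair of consecutive samples."""
    return [later - earlier for earlier, later in zip(samples_mb, samples_mb[1:])]


if __name__ == "__main__":
    # Assumption: the PID of the running Osaurus process is given as argv[1].
    pid = int(sys.argv[1])
    samples = [rss_mb(pid)]  # baseline before any request
    print(f"Baseline RSS: {samples[0]:.0f} MB")
    while True:
        try:
            input("Send one request, wait for the reply, press Enter (Ctrl-D to stop): ")
        except EOFError:
            break
        samples.append(rss_mb(pid))
        print(f"RSS: {samples[-1]:.0f} MB")
    print("Growth per request (MB):",
          [round(g) for g in growth_between_samples(samples)])
```

If the model really is loaded again on every request, the per-request growth should be roughly the size of the model each time instead of leveling off after the first load.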

Steps to reproduce

No response

Osaurus version / commit

v0.5.54

macOS version

26.2

Apple Silicon chip

M4 Pro, 48 GB

Xcode version

26.2

Logs

Screenshots

No response

Labels

bug (Something isn't working)
