
OVMS stuck in busy loop at "initializing model" (formerly titled "Enforce AVX2 requirement in code (and perhaps clarity in docs)") #4059

@Kalthorn

Description


EDIT: I changed the issue title to describe the observed problem rather than my guess about the underlying cause.

System: Arch Linux on an x86-64 Intel processor (Core i5-3570K) with an Arc A770 16 GB GPU, on which I want to run inference.

I download any of the 2026.0 Docker images and run it with the correct parameters to pass through access to /dev/dri (see a later comment for the exact command). Then I `--pull` a small OpenVINO model (Qwen3-4B most recently) and `--add_to_config` with the correct parameters for LLMs on a GPU. When I then start OVMS with the model and debug output enabled, it just hangs, spinning at 100% CPU with almost no memory allocated, and never progresses or changes despite waiting hours.
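For reference, a hypothetical sketch of the kind of invocation I mean. The image tag, model id, mount paths, and group-add trick are placeholders/assumptions, not my exact command (that's in a later comment); only `--pull` and `--add_to_config` are the OVMS options I actually used:

```shell
# Hypothetical sketch -- paths, tag, and model id are placeholders.
# --device /dev/dri exposes the GPU; --group-add grants the render group.
docker run --rm -it \
  --device /dev/dri \
  --group-add "$(stat -c %g /dev/dri/renderD128)" \
  -v "$PWD/models:/models" \
  openvino/model_server:latest \
  --pull --source_model "OpenVINO/Qwen3-4B-int4-ov" --model_repository_path /models
```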

Running `clinfo` in the container gives the same results as on the host. I think I have all the requirements installed on the host machine, but even after banging my head against this for days I'm still not sure. The Python stack (at least `benchmark_app.py`) seems to have no trouble utilizing the GPU, so that's at least a hint that I do. I've also installed the `openvino-gpu-plugin` package from the AUR.
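As a sanity check from Python, the OpenVINO runtime itself can report which devices it sees. A small sketch (assumes the `openvino` pip package; degrades to an empty list when it isn't installed):

```python
def available_ov_devices():
    """Return the device names OpenVINO's runtime can see, e.g. ['CPU', 'GPU'].

    Returns an empty list when the openvino package isn't installed,
    so the check degrades gracefully.
    """
    try:
        from openvino import Core  # OpenVINO runtime API
    except ImportError:
        return []
    return Core().available_devices


if __name__ == "__main__":
    print(available_ov_devices())
```

If `'GPU'` shows up here on the host but the containerized OVMS still spins, that points away from a plain driver problem.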

** ORIGINAL TEXT BELOW **

Hi, thanks for the amazing project here!

I'm trying to run inference on my A770 with an old (AVX1) CPU and running into this:

Symptom: running OVMS just hangs in a busy loop at `[servable_initializer.cpp:420] Initializing Language Model...`

A lot of debugging later leads me to believe this is caused by my lack of AVX2. I would have expected an error message or an outright crash rather than a silent busy loop.

It wasn't clear from the documentation whether the AVX2 requirement applies only to running inference on a CPU, as I'd hoped, or to any use of OVMS at all. I realize mine is a corner case, to say the least, but it seems like good code practice to make the problem/failure explicit with a log error message.
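For anyone hitting the same wall, a quick way to confirm what the CPU actually advertises without trusting spec sheets (Linux only; parses `/proc/cpuinfo`, the same information `lscpu` shows):

```python
def cpu_flags(*wanted):
    """Report which of the requested instruction-set flags the CPU advertises.

    Parses the first 'flags' line of /proc/cpuinfo (Linux only).
    """
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                present = set(line.split(":", 1)[1].split())
                return {flag: flag in present for flag in wanted}
    return {flag: False for flag in wanted}


if __name__ == "__main__":
    # On an i5-3570K (Ivy Bridge) this shows avx True but avx2 False.
    print(cpu_flags("avx", "avx2"))
```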

That's all. If I'm way off base, I apologise for the noise. If anyone can give me a pointer or two to work around the problem, please do!

My options:

  1. I tried rebuilding OVMS and got the same result. The `.bazelrc` hints that this is possible and suggests the build targets the native system's capabilities, which is what I expected, but apparently that isn't happening, and I need some specific settings or flags.
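     What I would try next for option 1 is forcing the ISA baseline explicitly at build time. This is an untested guess, not a verified fix: `--copt`/`--cxxopt` are standard Bazel options and `-march=ivybridge` matches the 3570K, but whether OVMS's build honors them end to end is exactly what I don't know.

     ```
     # Hypothetical .bazelrc additions -- an untested guess, not a confirmed fix.
     # -march=ivybridge targets the i5-3570K (AVX, no AVX2).
     build --copt=-march=ivybridge
     build --cxxopt=-march=ivybridge
     ```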

  2. There exists on GitHub and the AUR a project called "intel-graphics-compiler-legacy", forked from IGC, but I can't find any readme/docs/comments that actually say how it differs (other than the AUR description). It still contains AVX512 instructions, and I'm not entirely sure where it would go in the build chain. Just swap it in for IGC on the host and leave it out of the build? No idea.

  3. It seems like my best bet is running on the Python stack. Running `benchmark_app.py` on the host machine was able to load the model into VRAM and complete the test. Assuming the "Python" aspect doesn't scare me off, would this path significantly hurt inference performance for a single local user?
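For completeness, the command shape that does work on the host (`-m` and `-d` are benchmark_app's standard options; the model path is a placeholder):

```shell
# benchmark_app ships with the OpenVINO Python packages.
# The model path below is a placeholder, not my actual layout.
benchmark_app -m ./models/qwen3-4b/openvino_model.xml -d GPU
```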
