Description
Qwen3.5-4B (DeltaNet hybrid) loads successfully and produces coherent output via quant_generate, but quant_ask returns empty/garbage tokens. This breaks the server API since quant-server uses quant_ask for non-streaming requests.
Evidence
quant_generate (CLI) — WORKS
$ ./qwen35_test Qwen3.5-4B-Q4_K_M.gguf "Hello, who are you?"
--- response ---
I am Qwen3.5, a large language model developed by Alibaba Cloud.
I can help you with various tasks such as answering questions,
solving problems, and generating content.
quant_ask (server) — BROKEN
$ curl localhost:8080/v1/chat/completions \
-d '{"messages":[{"role":"user","content":"What is gravity?"}]}'
{"content": " -\n.\n- \n1. - "} # empty/whitespace tokens
Streaming via server — ALSO BROKEN
data: {"delta":{"content":"!"}}
data: {"delta":{"content":"!"}}
data: {"delta":{"content":"!"}}
# Repeating "!" tokens
Root Cause Hypothesis
quant_ask and quant_generate may handle the prompt/tokenization differently. Possible causes:
- quant_ask may apply its own chat template that conflicts with the ChatML template already in the prompt
- quant_ask may not properly initialize DeltaNet state (conv buffer, delta state) between calls
- BOS token handling may differ between the two paths
Steps to Reproduce
// This works:
quant_generate(ctx, "<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n", cb, NULL);
// This produces garbage:
char* result = quant_ask(ctx, "<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n");
Environment
- quant.cpp: latest main (1e1ea2c)
- Model: unsloth/Qwen3.5-4B-GGUF (Q4_K_M, 2.6GB)
- Architecture: qwen35 (DeltaNet hybrid, 8 attn + 24 DeltaNet layers)
- OS: macOS 15 (Apple M3, 16GB)
Reported by ClawTeam Claw-4 (Optimizer) + Claw-5 (Researcher)