OllamaModelBackend.generate_from_raw silently swallows batch exceptions
Description
generate_from_raw uses asyncio.gather(..., return_exceptions=True) to run concurrent requests, but silently converts any exception into ModelOutputThunk(value=""), storing the error only in result._generate_log.extra["error"] — invisible to callers.
ollama.py:465-474
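A minimal sketch of the pattern described above, for illustration only. FakeThunk, fake_request, and generate_batch are hypothetical stand-ins, not the real ModelOutputThunk or the code in ollama.py; only the asyncio.gather(..., return_exceptions=True) behavior is standard library.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeThunk:
    """Hypothetical stand-in for ModelOutputThunk; not the real class."""
    value: str
    extra: dict = field(default_factory=dict)


async def fake_request(prompt: str) -> str:
    # Hypothetical request: fails for one prompt to simulate a backend error.
    if prompt == "bad":
        raise ConnectionError("simulated backend failure")
    return f"echo: {prompt}"


async def generate_batch(prompts):
    # With return_exceptions=True, gather returns exception objects in place
    # of results instead of raising them.
    responses = await asyncio.gather(
        *(fake_request(p) for p in prompts), return_exceptions=True
    )
    results = []
    for response in responses:
        if isinstance(response, BaseException):
            # The reported pattern: the failure becomes an empty string and
            # the error is kept only in side-channel metadata.
            results.append(FakeThunk(value="", extra={"error": response}))
        else:
            results.append(FakeThunk(value=response))
    return results


results = asyncio.run(generate_batch(["good", "bad"]))
```

In this sketch the second result has value "" and nothing is raised, so a caller that only reads value cannot tell a failed request from an empty completion.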
Impact
Callers have no way to detect failures through the returned value. Tests asserting result.value is not None pass silently even when requests are failing, because an empty string is not None.
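The snippet below illustrates why the weak assertion passes, and sketches one possible caller-side check. The result object is a hypothetical stand-in built with SimpleNamespace that mirrors the result._generate_log.extra["error"] path named in the description; raise_on_batch_errors is an assumed helper name, not an existing API.

```python
from types import SimpleNamespace

# Hypothetical stand-ins mirroring result._generate_log.extra["error"].
failed = SimpleNamespace(
    value="",
    _generate_log=SimpleNamespace(extra={"error": RuntimeError("simulated failure")}),
)
succeeded = SimpleNamespace(
    value="some text",
    _generate_log=SimpleNamespace(extra={}),
)

# The weak test assertion passes for the failed result: "" is not None.
assert failed.value is not None


def raise_on_batch_errors(results):
    """Hypothetical helper: raise if any result carries a recorded error."""
    failures = []
    for index, result in enumerate(results):
        log = getattr(result, "_generate_log", None)
        extra = getattr(log, "extra", None) or {}
        if "error" in extra:
            failures.append((index, extra["error"]))
    if failures:
        raise RuntimeError(
            f"{len(failures)} of {len(results)} requests failed: {failures}"
        )
    return results
```

A check like this relies on a private attribute, so it is a workaround at best; surfacing the exception from generate_from_raw itself would avoid that dependency.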
Related