clamp angular weight span before narrowing to int#649
Conversation
|
This is the wrong fix for this - it slows down the codec for an unlikely corner case. Inf and NaN should instead be stripped as part of the LNS encoding on entry. Can you share a reproducer that triggers this one? |
|
Makes sense, and you know the codec internals better than I do here. I came at it from the narrowing site because that's where UBSan trapped, but scrubbing inf/NaN during the LNS conversion on entry is clearly the better place: it fixes it once at the boundary and keeps the hot angular loop untouched, so the per-step clamp I added becomes redundant. Happy to drop this in favour of that. I'll leave the PR open for now in case you'd rather I retarget it, but no objection to you handling it on the entry path. On the reproducer: I hit it fuzzing // repro.cpp
#include "astcenc.h"
#include <cstdint>
#include <limits>
#include <vector>
int main()
{
const unsigned int dim = 4;
astcenc_config config{};
astcenc_config_init(ASTCENC_PRF_HDR, dim, dim, 1, ASTCENC_PRE_MEDIUM, 0, &config);
astcenc_context* ctx = nullptr;
astcenc_context_alloc(&config, 1, &ctx, nullptr);
std::vector<float> texels(dim * dim * 4, 1.0f);
texels[0] = std::numeric_limits<float>::infinity(); // R
texels[5] = -std::numeric_limits<float>::infinity(); // G
texels[10] = std::numeric_limits<float>::quiet_NaN(); // B
void* slices[1] = { texels.data() };
astcenc_image image{};
image.dim_x = dim; image.dim_y = dim; image.dim_z = 1;
image.data_type = ASTCENC_TYPE_F32;
image.data = slices;
astcenc_swizzle swz { ASTCENC_SWZ_R, ASTCENC_SWZ_G, ASTCENC_SWZ_B, ASTCENC_SWZ_A };
uint8_t out[16];
astcenc_compress_image(ctx, &image, &swz, out, sizeof(out), 0);
astcenc_context_free(ctx);
return 0;
}Built with the default scalar |
compute_lowest_and_highest_weight() in the angular weight search narrows the per-step span with float_to_int(maxidx - minidx + 1). The span is derived from the block's ideal decimated weights, which in turn come from the input texels, so an HDR source carrying +/-inf or NaN (both legal in fp16/fp32 and routinely present in EXR content) makes the span non-finite and the conversion runs outside the representable int range. UBSan reports "inf is outside the range of representable values of type int" at the vecmathlib float_to_int, reached directly from astcenc_compress_image on a normal HDR compress; the value is silently garbage on the platforms where it does not trap.
The span is already clamped into [2, max_quant_steps + 3] on the next two lines, so I clamp the float into that same range before the conversion rather than altering the shared float_to_int primitive. That keeps the change at the one spot that can observe non-finite input and stays bit-identical for every finite in-range value, leaving the existing integer min/max untouched.