fix: return error instead of panicking on zero-dimension fixed-size-list columns #7247

Open
DanielMao1 wants to merge 1 commit into lance-format:main from DanielMao1:fix/zero-dim-fsl-panic

DanielMao1 commented Jun 12, 2026


Closes #5102

Problem

A fixed-size-list column with dimension 0 panics with "attempt to divide by zero"
(rust/lance-encoding/src/data.rs, FixedSizeListBlock::num_values). As of pylance 7.0.0,
the panic fires on write for every storage version (stable/2.1/2.2). Reading
datasets persisted by older writers, which accepted such columns, panics as well.
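
The panic is ordinary integer division by zero. Below is a minimal sketch of that failure shape, assuming the value count is derived by dividing the child length by the dimension. The `FslBlock` type and both method names are hypothetical and are not the code in data.rs.

```rust
// Hypothetical stand-in for a fixed-size-list block; not Lance's actual type.
struct FslBlock {
    child_len: u64,
    dimension: u64,
}

impl FslBlock {
    // Assumed failing shape: integer division panics in Rust when
    // `dimension` is zero ("attempt to divide by zero").
    fn num_values_unchecked(&self) -> u64 {
        self.child_len / self.dimension
    }

    // `checked_div` returns None for a zero divisor, which is mapped
    // to an error here instead of panicking.
    fn num_values_checked(&self) -> Result<u64, String> {
        self.child_len
            .checked_div(self.dimension)
            .ok_or_else(|| "fixed-size-list dimension must be positive".to_string())
    }
}
```

The `Result`-returning variant is shown only for contrast: it changes the method signature for every caller.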

Reproduction details are in the issue comment:
#5102 (comment)

Approach

Following the maintainer guidance in #5102 (error, not panic), this adds two small guards at
boundaries that already return Result, instead of changing DataBlock::num_values() to
return Result (the approach that made #5159 balloon across the whole encoding crate):

  1. Write side: Schema::validate() rejects zero-dimension fixed-size-list fields
    (including nested ones). validate() runs inside Schema::try_from(&ArrowSchema),
    so every write entry point surfaces a clean schema error instead of a panic. Writes
    currently panic on every storage version, so no working flow changes behavior.
  2. Read side (defensive): the structural and legacy field-scheduler builders reject
    zero-dimension fixed-size lists with an invalid-input error, so datasets persisted by
    old writers fail cleanly at scheduling time instead of crashing the process.
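
A rough sketch of what a recursive write-side check of this kind can look like. It uses a simplified, hypothetical schema model (`DataType`, `Field`, `validate_field`, `validate_schema` are illustrative names, and the error type is a plain `String`), not Lance's real `Schema` or error types.

```rust
// Simplified, hypothetical schema model; Lance's real types differ.
#[derive(Debug)]
enum DataType {
    Int32,
    Float32,
    Struct(Vec<Field>),
    // Child field plus the list dimension (Arrow stores it as i32).
    FixedSizeList(Box<Field>, i32),
}

#[derive(Debug)]
struct Field {
    name: String,
    data_type: DataType,
}

// Walks one field and its children, rejecting any zero-dimension
// fixed-size list. `path` is the dotted name of the parent field.
fn validate_field(field: &Field, path: &str) -> Result<(), String> {
    let full = if path.is_empty() {
        field.name.clone()
    } else {
        format!("{}.{}", path, field.name)
    };
    match &field.data_type {
        DataType::FixedSizeList(child, dim) => {
            if *dim == 0 {
                return Err(format!(
                    "field '{}': fixed-size-list dimension must not be 0",
                    full
                ));
            }
            // Recurse so a list nested inside a list is checked as well.
            validate_field(child, &full)
        }
        DataType::Struct(children) => children
            .iter()
            .try_for_each(|child| validate_field(child, &full)),
        DataType::Int32 | DataType::Float32 => Ok(()),
    }
}

// Validates every top-level field; stops at the first error.
fn validate_schema(fields: &[Field]) -> Result<(), String> {
    fields.iter().try_for_each(|f| validate_field(f, ""))
}
```

Including the dotted field path in the message is a design choice of this sketch; it makes a nested offender such as `s.v` easy to locate.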

How the guards sit in the data flow

[Diagram: guards]

Two facts that shape the design:

  • Schema::try_from(&ArrowSchema) calls validate() internally, and every write path performs
    this conversion, so placing guard 1 in validate() covers all write entry points.
  • Guard 2 exists because writers up to ~2026-04 could still persist zero-dimension columns
    under the stable (2.0) storage version; reading those files must not crash the process.
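
The first point relies on a common Rust pattern: validating inside a `TryFrom` conversion so that every caller of the conversion inherits the check. Here is a minimal sketch under stated assumptions. `ArrowLikeSchema`, the reduced `Schema`, and the two entry points `write_dataset` and `append` are hypothetical stand-ins that model only the dimensions of fixed-size-list fields.

```rust
use std::convert::TryFrom;

// Hypothetical external schema: only fixed-size-list dimensions are modeled.
struct ArrowLikeSchema {
    fsl_dimensions: Vec<i32>,
}

// Hypothetical internal schema produced by the conversion.
#[derive(Debug)]
struct Schema {
    fsl_dimensions: Vec<i32>,
}

impl Schema {
    // Rejects the first zero-dimension fixed-size list it finds.
    fn validate(&self) -> Result<(), String> {
        match self.fsl_dimensions.iter().position(|d| *d == 0) {
            Some(idx) => Err(format!("fixed-size-list field #{} has dimension 0", idx)),
            None => Ok(()),
        }
    }
}

impl TryFrom<&ArrowLikeSchema> for Schema {
    type Error = String;

    // The conversion validates before returning, so no caller can obtain
    // an unvalidated Schema through this path.
    fn try_from(src: &ArrowLikeSchema) -> Result<Self, Self::Error> {
        let schema = Schema {
            fsl_dimensions: src.fsl_dimensions.clone(),
        };
        schema.validate()?;
        Ok(schema)
    }
}

// Two hypothetical write entry points. Neither checks dimensions itself;
// both get the guard through the shared conversion.
fn write_dataset(src: &ArrowLikeSchema) -> Result<usize, String> {
    let schema = Schema::try_from(src)?;
    Ok(schema.fsl_dimensions.len())
}

fn append(src: &ArrowLikeSchema) -> Result<usize, String> {
    let schema = Schema::try_from(src)?;
    Ok(schema.fsl_dimensions.len())
}
```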

Tests

  • lance-core: Schema::try_from rejects zero-dim FSL at top level and nested in a struct;
    positive dimensions still validate.
  • lance-encoding: the scheduler guard rejects zero-dim FSL, including FSL-nested-in-FSL,
    and accepts positive dimensions.
  • Python: a test parametrized over the legacy, stable, and 2.1 storage versions checks that
    write_dataset now raises a clean OSError (same mapping as other schema validation errors)
    instead of PanicException.

github-actions Bot added the A-python (Python bindings), A-encoding (Encoding, IO, file reader/writer), and bug (Something isn't working) labels Jun 12, 2026
@DanielMao1 DanielMao1 force-pushed the fix/zero-dim-fsl-panic branch from b47648a to 06de9c9 Compare June 12, 2026 09:15
…ist columns

A fixed-size-list column with dimension 0 used to panic with 'attempt to
divide by zero' (rust/lance-encoding/src/data.rs) on every write path and
when reading datasets persisted by older writers.

Two guards, following the maintainer guidance in lance-format#5102 (error, not panic):

- Schema::validate() rejects zero-dimension fixed-size-list fields, turning
  every write into a clean schema error. validate() is invoked from
  Schema::try_from(&ArrowSchema), so all write entry points are covered.
- The decoder field-scheduler builders reject zero-dimension fixed-size
  lists with an invalid-input error, so datasets persisted by old writers
  fail cleanly instead of panicking at scheduling time.

Closes lance-format#5102
@DanielMao1 DanielMao1 force-pushed the fix/zero-dim-fsl-panic branch from 143f1e1 to 240a45a Compare June 12, 2026 09:26
codecov Bot commented Jun 12, 2026


Codecov Report

❌ Patch coverage is 97.01493% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| rust/lance-encoding/src/decoder.rs | 94.28% | 0 Missing and 2 partials ⚠️ |




Development

Successfully merging this pull request may close these issues.

Zero dimension vectors cause Lance to panic
