fix: tableFromJSON and tableFromArrays crash on arrays of strings#405
fix: tableFromJSON and tableFromArrays crash on arrays of strings#405kentkwu wants to merge 1 commit intoapache:mainfrom
Conversation
|
@matheuzinoficial @hemkumarrao-ni @knight556 Could you try this fix? |
There was a problem hiding this comment.
Pull request overview
Fixes tableFromJSON() / tableFromArrays() type inference crashes when encountering arrays of strings by comparing inferred types structurally (ignoring Dictionary IDs that auto-increment per construction), including nested List and Struct cases.
Changes:
- Added an internal
compareTypesForInference()helper to compareDictionarytypes without requiring matching IDs, with recursion forListandStruct. - Updated list homogeneity checking in
inferType()to use the new inference comparator. - Added unit tests covering arrays of strings and nested objects containing strings for both
tableFromArrays()andtableFromJSON().
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/factories.ts |
Introduces ID-agnostic type comparison for inference and applies it to list inference checks to prevent false mismatches for string dictionaries. |
test/unit/table/table-test.ts |
Adds regression tests demonstrating correct behavior for string arrays and nested string-containing structures. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| const array = value as Array<unknown>[]; | ||
| const childType = inferType(array[array.findIndex((ary) => ary != null)]); | ||
| if (array.every((ary) => ary == null || compareTypes(childType, inferType(ary)))) { | ||
| if (array.every((ary) => ary == null || compareTypesForInference(childType, inferType(ary)))) { |
There was a problem hiding this comment.
What if we just passed the dictionaryId down through inferType? e.g.
| if (array.every((ary) => ary == null || compareTypesForInference(childType, inferType(ary)))) { | |
| if (array.every((ary) => ary == null || compareTypes(childType, inferType(ary, dtypes.DataType.isDictionary(childType) ? childType.id : null)))) { |
Then we could pass the dictionary id into the Dictionary constructor on L142?
There was a problem hiding this comment.
Would probably need to maintain a Map of key -> dictionary id in order to do the same thing for dictionary-encoded struct children.
trxcllnt
left a comment
There was a problem hiding this comment.
Left a comment with a suggestion, curious to hear your thoughts.
What's Changed
Applies a fix for type inference crashing on encountering an array of strings.
Added a
compareTypesForInferencehelper that compares Dictionary types structurally, ignoring IDs. It recurses through List and Struct for nested cases. The public compareTypes is unchanged. ID equality still matters for IPC comparisons.Root cause:
Strings infer to Dictionary<Utf8, Int32>, and every Dictionary construction auto-increments a global ID. The array homogeneity check in inferType calls compareTypes on separately inferred types, which requires matching IDs — so it always fails for string arrays.
Closes #86, #50