feat(rust): add configurable size guardrails (max_string_bytes, max_collection_size, max_map_size)#3421
Open
Zakir032002 wants to merge 3 commits intoapache:mainfrom
Open
feat(rust): add configurable size guardrails (max_string_bytes, max_collection_size, max_map_size)#3421Zakir032002 wants to merge 3 commits intoapache:mainfrom
Zakir032002 wants to merge 3 commits intoapache:mainfrom
Conversation
…ollection_size, max_map_size) Adds three Fory builder methods that let callers cap the byte length of deserialized strings, element count of collections, and entry count of maps. When a limit is exceeded an informative Error is returned instead of a blind allocation, preventing OOM from crafted payloads. - config.rs: three Option<usize> fields - context.rs: check_string_bytes / check_collection_size / check_map_size helpers - fory.rs: builder methods for all three limits - buffer.rs: read_varuint36small() helper used by string check - serializer/string.rs: check before string allocation - serializer/collection.rs: check in generic Vec / collection read - serializer/primitive_list.rs: check for Vec<primitive> fast path - serializer/map.rs: check before HashMap / BTreeMap allocation - tests/tests/test_size_guardrails.rs: 6 integration tests - tests/tests/mod.rs: register new test module Fixes apache#3409
- Bug 1: move check_collection_size before polymorphic dispatch in read_collection_data so HashSet/LinkedList/BTreeSet with polymorphic elements no longer bypass the limit - Bug 2: adjust string byte budget for UTF-16 (len is code-units; multiply by 2 before checking) so an adversary cannot double the allocation limit by forcing UTF-16 encoding - Bug 3: remove duplicate check_bound in read_latin1_string that regressed every latin1/ASCII string on the hot deserialization path - Bug 4: move check_map_size before BTreeMap::new() for a consistent 'check before allocate' pattern matching the HashMap path
Author
|
@chaokunyang , let me know if you need any changes |
Collaborator
|
Please set default value to max value of int32. And also check length is less or equal than buffer remaining size |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #3409
Adds three opt-in
Forybuilder methods that let callers enforce upper bounds on the size of data allocated during deserialization. Without these guards a crafted payload can contain an absurdly large length prefix, causingVec::with_capacity/ string allocation to exhaust heap memory before a single byte of real data is read.All three limits default to
None(no limit), so this is 100 % backwards-compatible.Testing