feat(rust): add configurable size guardrails (max_string_bytes, max_collection_size, max_map_size) #3421

Open
Zakir032002 wants to merge 3 commits into apache:main from Zakir032002:issue-3409-size-guardrails

Conversation

Zakir032002 commented Feb 25, 2026

Summary

Fixes #3409

Adds three opt-in Fory builder methods that let callers enforce upper bounds on the size of data allocated during deserialization. Without these guards, a crafted payload can declare an absurdly large length prefix, causing Vec::with_capacity or string allocation to exhaust heap memory before a single byte of real data is read.

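let fory = Fory::default()
    .max_string_bytes(1024 * 1024)    // cap each deserialized string at 1 MiB
    .max_collection_size(100_000)     // at most 100,000 elements per collection
    .max_map_size(100_000);           // at most 100,000 entries per map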

All three limits default to None (no limit), so this is 100% backwards-compatible.
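
To illustrate the "check before allocate" idea described above, here is a minimal sketch. It is not the code in this PR: `GuardError` and `read_u8_elements` are hypothetical names invented for the example, and the payload is simplified to a plain byte slice.

```rust
/// Hypothetical error type for this sketch; the PR's real Error type is not shown here.
#[derive(Debug, PartialEq)]
pub enum GuardError {
    CollectionTooLarge { len: usize, max: usize },
}

/// Reads up to `declared_len` bytes from `payload`, refusing to pre-allocate
/// more than `max_len` elements. `None` means "no limit", mirroring the
/// opt-in defaults described in the summary.
pub fn read_u8_elements(
    payload: &[u8],
    declared_len: usize,
    max_len: Option<usize>,
) -> Result<Vec<u8>, GuardError> {
    // Validate the untrusted length prefix before touching the allocator.
    if let Some(max) = max_len {
        if declared_len > max {
            return Err(GuardError::CollectionTooLarge { len: declared_len, max });
        }
    }
    // Only now is it reasonable to reserve `declared_len` slots.
    let mut out = Vec::with_capacity(declared_len);
    out.extend(payload.iter().copied().take(declared_len));
    Ok(out)
}
```

With a limit of `Some(10)`, a declared length of `usize::MAX` is rejected with an error before any allocation is attempted.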

Testing

cargo fmt --check   ✓
cargo clippy --all-targets --all-features -- -D warnings   ✓
cargo test test_size_guardrails   → 6 passed, 0 failed
cargo test                        → all existing tests pass

…ollection_size, max_map_size)

Adds three Fory builder methods that let callers cap the byte length of
deserialized strings, element count of collections, and entry count of
maps. When a limit is exceeded an informative Error is returned instead
of a blind allocation, preventing OOM from crafted payloads.

- config.rs: three Option<usize> fields
- context.rs: check_string_bytes / check_collection_size / check_map_size helpers
- fory.rs: builder methods for all three limits
- buffer.rs: read_varuint36small() helper used by string check
- serializer/string.rs: check before string allocation
- serializer/collection.rs: check in generic Vec / collection read
- serializer/primitive_list.rs: check for Vec<primitive> fast path
- serializer/map.rs: check before HashMap / BTreeMap allocation
- tests/tests/test_size_guardrails.rs: 6 integration tests
- tests/tests/mod.rs: register new test module
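
The three helpers listed for context.rs could look roughly like the sketch below. Only the helper names and the `Option<usize>` field type come from the commit message; the `Limits` struct, the shared `check` function, and the `String` error are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the three `Option<usize>` fields in config.rs.
#[derive(Debug, Default, Clone, Copy)]
pub struct Limits {
    pub max_string_bytes: Option<usize>,
    pub max_collection_size: Option<usize>,
    pub max_map_size: Option<usize>,
}

/// Shared comparison used by all three helpers; `None` disables the check.
fn check(kind: &str, actual: usize, limit: Option<usize>) -> Result<(), String> {
    match limit {
        Some(max) if actual > max => Err(format!(
            "{} size {} exceeds configured limit {}",
            kind, actual, max
        )),
        _ => Ok(()),
    }
}

impl Limits {
    /// Call before allocating a string of `n` bytes.
    pub fn check_string_bytes(&self, n: usize) -> Result<(), String> {
        check("string", n, self.max_string_bytes)
    }

    /// Call before allocating a collection of `n` elements.
    pub fn check_collection_size(&self, n: usize) -> Result<(), String> {
        check("collection", n, self.max_collection_size)
    }

    /// Call before allocating a map of `n` entries.
    pub fn check_map_size(&self, n: usize) -> Result<(), String> {
        check("map", n, self.max_map_size)
    }
}
```

Routing all three checks through one comparison keeps the error message format consistent and makes the "no limit" case a single match arm.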

Fixes apache#3409

- Bug 1: move check_collection_size before polymorphic dispatch in
  read_collection_data so HashSet/LinkedList/BTreeSet with polymorphic
  elements no longer bypass the limit
- Bug 2: adjust string byte budget for UTF-16 (len is code-units;
  multiply by 2 before checking) so an adversary cannot double the
  allocation limit by forcing UTF-16 encoding
- Bug 3: remove duplicate check_bound in read_latin1_string that
  regressed every latin1/ASCII string on the hot deserialization path
- Bug 4: move check_map_size before BTreeMap::new() for a consistent
  'check before allocate' pattern matching the HashMap path
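
The UTF-16 adjustment in Bug 2 can be sketched as follows. The `Encoding` enum and both function names are hypothetical, and the sketch assumes that for Latin-1 and UTF-8 the declared length is already a byte count.

```rust
/// Hypothetical encoding tag for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Encoding {
    Latin1,
    Utf16,
    Utf8,
}

/// Converts a declared length into the number of bytes it would occupy.
/// For UTF-16 the length counts 2-byte code units, so it is doubled.
/// Returns `None` if the multiplication overflows.
pub fn declared_bytes(len: usize, enc: Encoding) -> Option<usize> {
    match enc {
        Encoding::Utf16 => len.checked_mul(2),
        Encoding::Latin1 | Encoding::Utf8 => Some(len),
    }
}

/// Returns true when the declared string fits the byte budget.
/// `max_bytes == None` means "no limit", but an overflowing length
/// is rejected regardless.
pub fn within_string_budget(len: usize, enc: Encoding, max_bytes: Option<usize>) -> bool {
    match (declared_bytes(len, enc), max_bytes) {
        (None, _) => false,
        (Some(_), None) => true,
        (Some(bytes), Some(max)) => bytes <= max,
    }
}
```

With a 1024-byte budget, a Latin-1 length of 1024 passes, while a UTF-16 length of 1024 (2048 bytes) is rejected. Using `checked_mul` avoids the doubled length wrapping around on overflow.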

Zakir032002 marked this pull request as ready for review February 25, 2026 19:35

@Zakir032002 (Author)

@chaokunyang, let me know if you need any changes.

chaokunyang (Collaborator) commented Feb 26, 2026

Please set the default value to the max value of int32. Also check that the length is less than or equal to the buffer's remaining size.
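
One possible reading of the two requests above, as a rough sketch only: `DEFAULT_MAX`, `Reader`, and `read_bytes` are hypothetical names and do not reflect Fory's actual buffer API.

```rust
/// Default limit suggested in review: the max value of int32.
pub const DEFAULT_MAX: usize = i32::MAX as usize;

/// Hypothetical cursor over an input buffer, for illustration only.
pub struct Reader<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Reader { data, pos: 0 }
    }

    /// Bytes not yet consumed.
    pub fn remaining(&self) -> usize {
        self.data.len() - self.pos
    }

    /// Returns the next `len` bytes after checking two bounds:
    /// the configured limit and the bytes actually left in the buffer.
    pub fn read_bytes(&mut self, len: usize, max: usize) -> Result<&'a [u8], String> {
        if len > max {
            return Err(format!("length {} exceeds limit {}", len, max));
        }
        // A declared length can never be valid if it exceeds the remaining input.
        if len > self.remaining() {
            return Err(format!(
                "length {} exceeds remaining buffer size {}",
                len,
                self.remaining()
            ));
        }
        let data: &'a [u8] = self.data;
        let start = self.pos;
        self.pos += len;
        Ok(&data[start..self.pos])
    }
}
```

The remaining-size check rejects oversized length prefixes even when the configured limit is very large, since a well-formed payload cannot declare more data than it contains.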


Development

Successfully merging this pull request may close these issues.

[Rust] configurable size guardrails for untrusted payloads

2 participants