Skip to content

Verify memory safety of String functions (Challenge 10)#541

Open
jrey8343 wants to merge 2 commits intomodel-checking:mainfrom
jrey8343:challenge-10-string
Open

Verify memory safety of String functions (Challenge 10)#541
jrey8343 wants to merge 2 commits intomodel-checking:mainfrom
jrey8343:challenge-10-string

Conversation

@jrey8343
Copy link

@jrey8343 jrey8343 commented Feb 7, 2026

Summary

Adds 15 Kani verification harnesses proving the memory safety of all functions listed in Challenge 10, covering all unsafe operations in library/alloc/src/string.rs.

Functions verified

Function Harness Unsafe operation
from_utf16le check_from_utf16le align_to::<u16>()
from_utf16le_lossy check_from_utf16le_lossy align_to::<u16>()
from_utf16be check_from_utf16be align_to::<u16>()
from_utf16be_lossy check_from_utf16be_lossy align_to::<u16>()
pop check_pop set_len()
remove check_str_remove ptr::copy + set_len
remove_matches check_remove_matches ptr::copy + set_len in pattern search loop
retain check_retain ptr::copy_nonoverlapping + set_len + Drop guard
insert check_insert ptr::copy + encode_utf8_raw_unchecked + set_len
insert_str check_insert_str ptr::copy + ptr::copy_nonoverlapping + set_len
split_off check_split_off from_utf8_unchecked + set_len
drain check_drain get_unchecked + raw pointer arithmetic
replace_range check_replace_range as_mut_vec() + Vec::splice
into_boxed_str check_into_boxed_str from_boxed_utf8_unchecked
leak check_leak from_utf8_unchecked_mut

Verification approach

  • Symbolic ASCII strings: Helper any_ascii_string<N>() generates symbolic String values of 0..N bytes where each byte is constrained to be valid ASCII (<=127), ensuring UTF-8 validity. Used for pop, insert, insert_str, split_off, drain, replace_range, into_boxed_str, and leak.
  • Concrete strings for char-iteration functions: remove, remove_matches, and retain use concrete strings (e.g., String::from("abcd")) because Chars iterator raw pointer internals cause CBMC memory explosion with symbolic strings.
  • UTF-16 functions: 8-byte symbolic byte arrays as input, exercising align_to::<u16>() alignment cast and decode_utf16 iteration.
  • All harnesses use #[kani::unwind(6)] for bounded loop unwinding.

All 15 harnesses verified locally.

Test plan

  • All 15 harnesses pass with kani verify-std using -Z loop-contracts --cbmc-args --object-bits 12
  • Code formatted with rustfmt

Add proof harnesses for all 15 public String functions that are safe
abstractions over unsafe code:
- UTF-16 decoding: from_utf16le, from_utf16le_lossy, from_utf16be, from_utf16be_lossy
- Element operations: pop, remove, remove_matches, retain
- Insertion: insert, insert_str
- Splitting/draining: split_off, drain, replace_range
- Conversion: into_boxed_str, leak

All 15 harnesses verified with Kani 0.65.0.
@jrey8343 jrey8343 requested a review from a team as a code owner February 7, 2026 10:22
@jrey8343
Copy link
Author

CI is passing — ready for review.

@feliperodri feliperodri added the Challenge Used to tag a challenge label Mar 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Challenge Used to tag a challenge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants