Problem
The current implementation of extract for bitvectors in Smtml uses byte offsets instead of bit offsets. This is non-standard and contradicts SMT-LIB's (_ extract high low) operator, where high and low are inclusive bit indices.
Currently:
Expr.Extract(e, high, low) stores byte offsets.
Expr.ty calculates bitwidth as (high - low) * 8.
Expr.extract converts bytes to bits internally: high_bit = (high * 8) - 1, low_bit = low * 8.
- Solver mappings in
mappings.ml perform the same conversion.
Smtml.Typed.Bitv modules documentation and implementation use byte-based indices.
This leads to confusion when parsing standard SMT2 files and makes the API inconsistent with other SMT solvers.
Proposed Changes
-
Core AST (src/smtml/expr.ml)
- Redefine
Extract node to store bit indices directly.
- Update
Expr.ty to return Ty_bitv (high - low + 1).
- Update
Expr.extract and Expr.raw_extract to accept bit indices. Remove internal byte-to-bit conversion.
-
Typed API (src/smtml/typed.ml, src/smtml/typed.mli)
- Update
Bitv.extract signature and documentation to use bit indices.
- Update all internal usages of
extract:
Bitv32.to_bytes, Bitv64.to_bytes
Bitv128.to_i32x4, Bitv128.to_i64x2
Bitv*.extend_s
- Update example in
typed.mli documentation.
-
Parser (src/smtml/smtlib.ml)
- Simplify the
extract case in the SMT-LIB parser to pass the indices directly from the SMT2 file without dividing by 8.
-
Mappings (src/smtml/mappings.ml)
- Remove the
(h * 8) - 1 and l * 8 conversions in mappings.ml. Pass the indices directly to the solver mapping functions.
Implementation Plan
Test Plan for Bit-level Extraction
1. Unit Tests (test/unit/test_extract.ml)
Create a new unit test suite to verify the correctness of the new bit-level extraction.
Test Cases:
- Basic bit extraction:
extract (i32 0xF) ~high:0 ~low:0 should result in 1.
extract (i32 0xF) ~high:3 ~low:0 should result in 0xF.
- Non-byte-aligned extraction:
extract (i32 0xAA) ~high:5 ~low:2 (binary 10101010, bits 5-2 are 1010) -> 0xA.
- Width verification:
Expr.ty (extract e ~high:7 ~low:0) must be Ty_bitv 8.
Expr.ty (extract e ~high:15 ~low:0) must be Ty_bitv 16.
- Boundary conditions:
extract e ~high:31 ~low:0 for 32-bit value.
extract e ~high:0 ~low:0.
2. Integration Tests
- SMT-LIB Compatibility:
- Add a new test case in
test/integration/test_solver.ml that uses a hand-written SMT-LIB string with bit-level extraction: (assert (= ((_ extract 3 0) #xAF) #xF)).
- Verify it passes with all enabled solvers (Z3, Bitwuzla, etc.).
- Typed API consistency:
- Verify
Bitv32.to_bytes still produces the correct 4 bytes after being updated to use bit offsets.
3. Regression Tests
- Run the full suite:
dune runtest.
- Pay special attention to
test/cli/smt2/ regression tests. If any SMT2 files fail, they might have been relying on the previous (incorrect) behavior or the parser change might have fixed them.
Problem
The current implementation of
extractfor bitvectors inSmtmluses byte offsets instead of bit offsets. This is non-standard and contradicts SMT-LIB's(_ extract high low)operator, wherehighandloware inclusive bit indices.Currently:
Expr.Extract(e, high, low)stores byte offsets.Expr.tycalculates bitwidth as(high - low) * 8.Expr.extractconverts bytes to bits internally:high_bit = (high * 8) - 1,low_bit = low * 8.mappings.mlperform the same conversion.Smtml.Typed.Bitvmodules documentation and implementation use byte-based indices.This leads to confusion when parsing standard SMT2 files and makes the API inconsistent with other SMT solvers.
Proposed Changes
Core AST (
src/smtml/expr.ml)Extractnode to store bit indices directly.Expr.tyto returnTy_bitv (high - low + 1).Expr.extractandExpr.raw_extractto accept bit indices. Remove internal byte-to-bit conversion.Typed API (
src/smtml/typed.ml,src/smtml/typed.mli)Bitv.extractsignature and documentation to use bit indices.extract:Bitv32.to_bytes,Bitv64.to_bytesBitv128.to_i32x4,Bitv128.to_i64x2Bitv*.extend_styped.mlidocumentation.Parser (
src/smtml/smtlib.ml)extractcase in the SMT-LIB parser to pass the indices directly from the SMT2 file without dividing by 8.Mappings (
src/smtml/mappings.ml)(h * 8) - 1andl * 8conversions inmappings.ml. Pass the indices directly to the solver mapping functions.Implementation Plan
src/smtml/expr.mlandsrc/smtml/expr.mli.src/smtml/typed.mlandsrc/smtml/typed.mli.src/smtml/smtlib.ml.src/smtml/mappings.ml.extractcalls in the codebase.Test Plan for Bit-level Extraction
1. Unit Tests (
test/unit/test_extract.ml)Create a new unit test suite to verify the correctness of the new bit-level extraction.
Test Cases:
extract (i32 0xF) ~high:0 ~low:0should result in1.extract (i32 0xF) ~high:3 ~low:0should result in0xF.extract (i32 0xAA) ~high:5 ~low:2(binary10101010, bits 5-2 are1010) ->0xA.Expr.ty (extract e ~high:7 ~low:0)must beTy_bitv 8.Expr.ty (extract e ~high:15 ~low:0)must beTy_bitv 16.extract e ~high:31 ~low:0for 32-bit value.extract e ~high:0 ~low:0.2. Integration Tests
test/integration/test_solver.mlthat uses a hand-written SMT-LIB string with bit-level extraction:(assert (= ((_ extract 3 0) #xAF) #xF)).Bitv32.to_bytesstill produces the correct 4 bytes after being updated to use bit offsets.3. Regression Tests
dune runtest.test/cli/smt2/regression tests. If any SMT2 files fail, they might have been relying on the previous (incorrect) behavior or the parser change might have fixed them.