Skip to content

Commit e9fef58

Browse files
aakoshhAztecBot
authored andcommitted
feat: msgpack encoding for Program and WitnessStack (#12841)
Adds `msgpack` serialisation to the generated Acir and Witness C++ code. I moved the alterations described in `dsl/README.md` into the code generation itself, so no manual work is required. The PR is running against a feature branch with the same name in Noir, here's the upstream PR: noir-lang/noir#7716 With this PR is merged, `bb` should be able to handle `msgpack` or `bincode`. Once that's released we can switch to using `msgpack` in Noir in both native and wasm by merging noir-lang/noir#7810. And then we can remove the `msgpack` format detection and the fallback to `bincode`. **TODO**: - [x] Get it to compile - [x] Change `nargo` to allow compiling contracts with `msgpack` format: added `NOIR_SERIALIZATION_FORMAT` env var - [x] Add a first byte to the data to say which serialization format it is. There is a chance that it would clash with existing bincode data though, so a fallback would anyway be necessary. (Or we should ascertain that bincode never starts with some specific bit sequence). - [x] ~Change the `bb` code so it tries `bincode`, then falls back to `msgpack` - this way the currently used format stays fast, but we can feed it new data.~ _This looks problematic, as exceptions in the wasm build is disabled in `arch.cmake` and `throw_or_abort` aborts in wasm. Instead we run [msgpack::parse](https://c.msgpack.org/cpp/namespacemsgpack.html#ad844d148ad1ff6c9193b02529fe32968) first to check if the data looks like msgpack; if not, we use bincode._ - [x] Run integration tests with `msgpack` on both sides in AztecProtocol/aztec-packages#13021 - [x] Ignore the Brillig opcodes in Barretenberg - [x] Change the serialization of `WitnessStack` and `WitnessMap` to use the env var, add fallbacks in `bb` for them - [x] Revert the change to `noir-repo-ref` before merging ### Use of `MSGPACK_FIELDS` The generated code is using `MSGPACK_FIELDS` for structs, to keep it more terse. At some point during debugging the memory issue below I changed it so that I can have more direct control by generating code for individual fields. That needed some helper functions which I looted from the `msgpack-c` library and injected into the namespaces as a `Helpers` struct. This approach might be useful if we wanted to have extra checks, for example rejecting the data if there are extra fields, indicating a type has been extended with things we don't recognise, or if we wanted handle renamed fields. I left it out so there is less code to maintain, but if we need it we can recover it from the [commit history](noir-lang/noir@b0a612d). ### Compile `nargo` with the `msgpack` feature ```bash echo af/msgpack-codegen > noir/noir-repo-ref noir/bootstrap.sh ``` ### Generate and compile C++ code ```bash cd noir/noir-repo && NOIR_CODEGEN_OVERWRITE=1 cargo test -p acir cpp_codegen && cd - cp noir/noir-repo/acvm-repo/acir/codegen/acir.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/acir.hpp cp noir/noir-repo/acvm-repo/acir/codegen/witness.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/witness_stack.hpp cd barretenberg/cpp && ./format.sh changed && cd - barretenberg/cpp/bootstrap.sh ``` ### Test `nargo` with `bb` One example of an integration test that uses `bb` and noir in the Noir repo is https://github.com/noir-lang/noir/actions/runs/13631231158/job/38099477964 We can call it like this: ```bash cd noir/noir-repo && cargo install --path tooling/nargo_cli && cd - ./barretenberg/cpp/bootstrap.sh export BACKEND=$(pwd)/barretenberg/cpp/build/bin/bb export NOIR_SERIALIZATION_FORMAT=msgpack noir/noir-repo/examples/prove_and_verify/test.sh ``` If it works, it should print this: ```console % unset NOIR_SERIALIZATION_FORMAT % noir/noir-repo/examples/prove_and_verify/test.sh [hello_world] Circuit witness successfully solved [hello_world] Witness saved to /mnt/user-data/akosh/aztec-packages/noir/noir-repo/examples/prove_and_verify/target/witness.gz Finalized circuit size: 18 Proof saved to "./proofs/proof" Finalized circuit size: 18 VK saved to "./target/vk" Proof verified successfully ``` Whereas if it doesn't: ```console % export NOIR_SERIALIZATION_FORMAT=msgpack % noir/noir-repo/examples/prove_and_verify/test.sh [hello_world] Circuit witness successfully solved [hello_world] Witness saved to /mnt/user-data/akosh/aztec-packages/noir/noir-repo/examples/prove_and_verify/target/witness.gz Length is too large ``` I attached the final artefacts to the PR so it's easier to test with just `bb`. [hello_world.json](https://github.com/user-attachments/files/19391072/hello_world.json) [witness.gz](https://github.com/user-attachments/files/19391074/witness.gz) ### Further testing With the `noir-repo-ref` pointing at the feature `af/msgpack-codegen` feature branch, we can run all the contract compilations and tests with `msgpack` as follows: ```shell export NOIR_SERIALIZATION_FORMAT=msgpack ./bootstrap.sh ci ``` This is tested in AztecProtocol/aztec-packages#13021 ### Peek into artefacts We can inspect the file in JSON format using [this](https://crates.io/crates/msgpack-cli) msgpack CLI tool. ```shell jq -r '.bytecode' ./target/program.json | base64 --decode | gunzip | tail -c +2 | mpk --to-json | jq ``` Thanks Tom for the [spell](AztecProtocol/msgpack-c#5 (comment)) 🙏 ### False bug At some point I thought had to make some fixes in `msgpack-c` itself to make this work: AztecProtocol/msgpack-c#5 A similar [blocking bug](AztecProtocol/aztec-packages#12841 (comment)) was encountered when running the entire `ci` build with msgpack format. It turned out it was a [dangling pointer](msgpack/msgpack-c#695 (comment)) issue, fixed in AztecProtocol/aztec-packages@5810e3b Much of the comments below are related to my struggles that came from this mistake; you can ignore them.
1 parent d017763 commit e9fef58

7 files changed

Lines changed: 4136 additions & 1183 deletions

File tree

cpp/cmake/msgpack.cmake

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,4 +13,4 @@ ExternalProject_Add(
1313
BUILD_COMMAND "" # No build step
1414
INSTALL_COMMAND "" # No install step
1515
UPDATE_COMMAND "" # No update step
16-
)
16+
)

cpp/src/barretenberg/dsl/README.md

Lines changed: 13 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -8,14 +8,19 @@ There are two types of breaking serialization changes. One that alters the inter
88

99
1. Internal Structure Change
1010

11-
Go to the ACVM `acir` crate and re-run the [serde reflection test](https://github.com/noir-lang/noir/blob/master/acvm-repo/acir/src/lib.rs#L51). Remember to comment out the hash check to write the updated C++ serde file. Copy that file into the `dsl` package's [serde folder](./acir_format/serde/) where you will see an `acir.hpp` file.
12-
13-
You will have to update a couple things in the new `acir.hpp`:
14-
15-
- Replace all `throw serde::deserialization_error` with `throw_or_abort`
16-
- The top-level struct (such as `Program`) will still use its own namespace for its internal fields. This extra `Program::` can be removed from the top-level `struct Program { .. }` object.
17-
18-
The same can then be done for any breaking changes introduced to `witness_stack.hpp`.
11+
The C++ bindings are generated by the ACVM `acir` crate and while running the [serde reflection test](https://github.com/noir-lang/noir/blob/master/acvm-repo/acir/src/lib.rs#L51).
12+
By default it just ascertains that the format has not changed, but it can be told to
13+
write the updated mappings to `acir.cpp` and `witness_stack.cpp`,
14+
which become `acir.hpp` and `witness_stack.hpp` in `aztec-packages`.
15+
16+
The code can be regenerated and copied over with the following commands:
17+
18+
```shell
19+
cd noir/noir-repo && NOIR_CODEGEN_OVERWRITE=1 cargo test -p acir cpp_codegen && cd -
20+
cp noir/noir-repo/acvm-repo/acir/codegen/acir.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/acir.hpp
21+
cp noir/noir-repo/acvm-repo/acir/codegen/witness.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/witness_stack.hpp
22+
cd barretenberg/cpp && ./format.sh changed && cd -
23+
```
1924

2025
2. Full Breaking Change
2126

cpp/src/barretenberg/dsl/acir_format/acir_to_constraint_buf.cpp

Lines changed: 151 additions & 57 deletions
Large diffs are not rendered by default.

cpp/src/barretenberg/dsl/acir_format/acir_to_constraint_buf.hpp

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -4,18 +4,18 @@
44

55
namespace acir_format {
66

7-
AcirFormat circuit_buf_to_acir_format(std::vector<uint8_t> const& buf, uint32_t honk_recursion);
8-
97
/**
10-
* @brief Converts from the ACIR-native `WitnessMap` format to Barretenberg's internal `WitnessVector` format.
8+
* @brief Converts from the ACIR-native `WitnessStack` format to Barretenberg's internal `WitnessVector` format.
119
*
12-
* @param buf Serialized representation of a `WitnessMap`.
13-
* @return A `WitnessVector` equivalent to the passed `WitnessMap`.
10+
* @param buf Serialized representation of a `WitnessStack`.
11+
* @return A `WitnessVector` equivalent to the last `WitnessMap` in the stack.
1412
* @note This transformation results in all unassigned witnesses within the `WitnessMap` being assigned the value 0.
1513
* Converting the `WitnessVector` back to a `WitnessMap` is unlikely to return the exact same `WitnessMap`.
1614
*/
1715
WitnessVector witness_buf_to_witness_data(std::vector<uint8_t> const& buf);
1816

17+
AcirFormat circuit_buf_to_acir_format(std::vector<uint8_t> const& buf, uint32_t honk_recursion);
18+
1919
std::vector<AcirFormat> program_buf_to_acir_format(std::vector<uint8_t> const& buf, uint32_t honk_recursion);
2020

2121
WitnessVectorStack witness_buf_to_witness_stack(std::vector<uint8_t> const& buf);
@@ -25,4 +25,4 @@ AcirProgramStack get_acir_program_stack(std::string const& bytecode_path,
2525
std::string const& witness_path,
2626
uint32_t honk_recursion);
2727
#endif
28-
} // namespace acir_format
28+
} // namespace acir_format

0 commit comments

Comments
 (0)