fix(json): base64-decode bytes values on deserialization #646
JsonParseNode._get_bytes_value() was encoding the raw base64 string as UTF-8 bytes instead of decoding it. Edm.Binary values in OData JSON are transmitted as base64 strings, so the correct behaviour is to call base64.b64decode() to recover the original bytes.

The non-string path (json.dumps fallback) is also removed: the OData JSON spec only represents binary values as base64 strings, so passing a non-string to get_bytes_value() is not a valid call and now returns None.

Fixes microsoft#636
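A minimal standalone sketch of the difference, assuming the old string path amounted to a plain UTF-8 encode as described above. The names `old_get_bytes_value` and `new_get_bytes_value` are hypothetical stand-ins, not the library's actual code.

```python
import base64
from typing import Any, Optional


def old_get_bytes_value(value: str) -> bytes:
    # Hypothetical stand-in for the previous string path: the base64 text
    # itself is turned into bytes, so the payload is never decoded.
    return value.encode("utf-8")


def new_get_bytes_value(value: Any) -> Optional[bytes]:
    # Hypothetical stand-in for the fixed behaviour: decode base64 strings
    # and return None for anything that is not a string.
    if isinstance(value, str):
        return base64.b64decode(value)
    return None


# What an Edm.Binary value looks like on the wire: base64 text.
wire_value = base64.b64encode(b"hello").decode("utf-8")

print(old_get_bytes_value(wire_value))  # the base64 text, as bytes
print(new_get_bytes_value(wire_value))  # the original payload
print(new_get_bytes_value(123))         # non-string input
```

With the old behaviour the caller receives the base64 characters themselves, which only look like data; the decoded form is what the sender originally wrote.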
Pull request overview
Fixes asymmetric bytes handling in the JSON serialization layer by ensuring JsonParseNode.get_bytes_value() base64-decodes string values (so Edm.Binary fields round-trip correctly with JsonSerializationWriter.write_bytes_value()).
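The round trip mentioned above can be sketched with the standard library alone. This is an illustration of the symmetry, not the Kiota writer or parse node; the property name `contentBytes` is used only as an example key.

```python
import base64
import json

# Arbitrary binary data that is not valid UTF-8 text.
original = bytes([0x00, 0xFF, 0x10, 0x80])

# Writer side: binary data is emitted as a base64 string inside the JSON document.
payload = json.dumps({"contentBytes": base64.b64encode(original).decode("utf-8")})

# Reader side: the string must be base64-decoded to get the original bytes back.
node = json.loads(payload)
round_tripped = base64.b64decode(node["contentBytes"])

# Encoding the string as UTF-8 instead would not reproduce the original data.
not_round_tripped = node["contentBytes"].encode("utf-8")
```

Only the decode on the reader side makes the two halves inverse operations of each other.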
Changes:
- Base64-decode string values in JsonParseNode._get_bytes_value() (return None for non-string inputs).
- Update JSON parse node unit tests to validate decoded bytes behavior, including empty-string and non-string inputs.
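The three cases listed above (a normal value, an empty string, a non-string input) could be exercised roughly as follows. This is a sketch against a hypothetical helper, `decode_bytes_value`, that follows the described behaviour; it is not the PR's actual test code, and the test function names are made up for the example.

```python
import base64
from typing import Any, Optional


def decode_bytes_value(value: Any) -> Optional[bytes]:
    # Hypothetical helper standing in for the parse node's bytes getter.
    if isinstance(value, str):
        return base64.b64decode(value)
    return None


def test_decodes_base64_string() -> None:
    # "U2FtcGxl" is the base64 encoding of b"Sample".
    assert decode_bytes_value("U2FtcGxl") == b"Sample"


def test_empty_string_gives_empty_bytes() -> None:
    # An empty base64 string decodes to empty bytes, not None.
    assert decode_bytes_value("") == b""


def test_non_string_returns_none() -> None:
    assert decode_bytes_value(123) is None
    assert decode_bytes_value({"a": 1}) is None
```

The functions follow pytest's naming convention, so a file containing them can be collected by pytest without further setup.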
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/serialization/json/kiota_serialization_json/json_parse_node.py | Switch bytes deserialization to base64.b64decode for string nodes, matching writer behavior. |
| packages/serialization/json/tests/unit/test_json_parse_node.py | Adjust/add tests to assert decoded bytes output (and expected edge cases). |
@@ -1,3 +1,4 @@
import base64
baywet left a comment
Thanks for the contribution!
Would you mind addressing Copilot's comment, please? (If I do it, I won't be able to approve/merge.) Besides that, we should be good to go.



Overview
JsonParseNode.get_bytes_value() was returning the raw base64 string re-encoded as UTF-8 bytes instead of decoding it. So any Edm.Binary property from Microsoft Graph (like fileAttachment.contentBytes) would come back as the base64 text in bytes form rather than the actual file data. The serialization writer already uses base64.b64encode correctly, so the reader wasn't matching it.

The fix replaces the broken body with base64.b64decode(value) for string inputs and None for non-strings. This mirrors how the text-format parse node handles the same field.
Related Issue
Fixes #636
Testing Instructions
- cd packages/serialization/json
- pip install -e ".[dev]" (or poetry install) if not already set up
- pytest tests/unit/test_json_parse_node.py -k bytes -v (test_get_bytes_value, test_get_bytes_value_empty_string, and test_get_bytes_value_non_string_returns_none)
- pytest (89 tests, all pass)
Notes
Thanks to @RockyMM for the detailed bug report, precise code location, and suggested fix in #636. This PR implements exactly that approach.