Skip to content

fix(json): base64-decode bytes values on deserialization#646

Open
Zelys-DFKH wants to merge 1 commit into
microsoft:mainfrom
Zelys-DFKH:fix/json-bytes-base64-decode
Open

fix(json): base64-decode bytes values on deserialization#646
Zelys-DFKH wants to merge 1 commit into
microsoft:mainfrom
Zelys-DFKH:fix/json-bytes-base64-decode

Conversation

@Zelys-DFKH

Copy link
Copy Markdown

Overview

JsonParseNode.get_bytes_value() was returning the raw base64 string re-encoded as UTF-8 bytes instead of decoding it. So any Edm.Binary property from Microsoft Graph (like fileAttachment.contentBytes) would come back as the base64 text in bytes form rather than the actual file data. The serialization writer already uses base64.b64encode correctly, so the reader wasn't matching it.

The fix replaces the broken body with base64.b64decode(value) for string inputs and None for non-strings. This mirrors how the text-format parse node handles the same field.

Related Issue

Fixes #636

Testing Instructions

  • cd packages/serialization/json
  • pip install -e ".[dev]" (or poetry install) if not already set up
  • Run the bytes-specific tests: pytest tests/unit/test_json_parse_node.py -k bytes -v
  • All three tests should pass: test_get_bytes_value, test_get_bytes_value_empty_string, and test_get_bytes_value_non_string_returns_none
  • Full suite: pytest (89 tests, all pass)

Notes

Thanks to @RockyMM for the detailed bug report, precise code location, and suggested fix in #636. This PR implements exactly that approach.

JsonParseNode._get_bytes_value() was encoding the raw base64 string as
UTF-8 bytes instead of decoding it. Edm.Binary values in OData JSON are
transmitted as base64 strings, so the correct behaviour is to call
base64.b64decode() to recover the original bytes.

The non-string path (json.dumps fallback) is also removed: the OData
JSON spec only represents binary values as base64 strings, so passing a
non-string to get_bytes_value() is not a valid call and now returns None.

Fixes microsoft#636
@Zelys-DFKH Zelys-DFKH requested a review from a team as a code owner June 29, 2026 19:39
@sonarqubecloud

Copy link
Copy Markdown

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes asymmetric bytes handling in the JSON serialization layer by ensuring JsonParseNode.get_bytes_value() base64-decodes string values (so Edm.Binary fields round-trip correctly with JsonSerializationWriter.write_bytes_value()).

Changes:

  • Base64-decode string values in JsonParseNode._get_bytes_value() (return None for non-string inputs).
  • Update JSON parse node unit tests to validate decoded bytes behavior, including empty-string and non-string inputs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
packages/serialization/json/kiota_serialization_json/json_parse_node.py Switch bytes deserialization to base64.b64decode for string nodes, matching writer behavior.
packages/serialization/json/tests/unit/test_json_parse_node.py Adjust/add tests to assert decoded bytes output (and expected edge cases).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -1,3 +1,4 @@
import base64

@baywet baywet left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

Would you mind addressing copilot's comment please? (If I do it, I won't be able to approve/merge). Besides that, we should be good to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

JsonParseNode.get_bytes_value() does not base64-decode Edm.Binary (asymmetric with write_bytes_value)

3 participants