
Add JSON support for Erlang and Elixir #2247

Merged
bettio merged 4 commits into atomvm:release-0.7 from bettio:json
Apr 3, 2026

Conversation

Collaborator

@bettio bettio commented Mar 29, 2026

Add JSON encoding and decoding to both estdlib (Erlang) and
exavmlib (Elixir), fully conforming to RFC 8259.

Erlang json module:

  • Pure Erlang encoder/decoder, API-compatible with OTP's json module
  • Tests from JSONTestSuite (valid, invalid, implementation-defined)
    plus Erlang-specific API tests covering custom decoders, streaming,
    round-trips, and error handling
  • format/1,2,3 and related pretty-printing functions are not
    implemented

Elixir JSON module:

  • Adapted from upstream Elixir (commit 03b9fde6), built on top of
    the Erlang json module
  • JSON.Encoder protocol with implementations for Atom, BitString,
    Integer, Float, List, and Map
  • Removed protocol deriving macro and Calendar-based encoders
    (not available on AtomVM)

Also adds Keyword.put_new/3 to exavmlib, needed by the Elixir
JSON decoder.

Add a compile_multi macro to BuildElixir.cmake and an EXTRA_BEAMS
keyword to pack_archive for compiling multi-module .ex sources.

This PR also removes the ancient json_encoder module.

These changes are made under both the "Apache 2.0" and the
"GNU Lesser General Public License 2.1 or later" license terms
(dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

@bettio bettio changed the title from Json to Add JSON support for Erlang and Elixir on Mar 30, 2026
Contributor

petermm commented Mar 30, 2026

PR #2247 Review — Add JSON support for Erlang and Elixir

Commits reviewed: a6dda3a (estdlib: Add json module), 3bd5f29 (Keyword: Add put_new/3), 4bec6b5 (exavmlib: Add JSON module)

Overall: Well-structured PR with comprehensive test coverage (JSONTestSuite corpus + API tests + round-trips). The implementation is clean and the compile_multi cmake macro is a good solution for multi-module .ex files. Several issues need addressing before the OTP-compatibility claim holds.


🔴 Must-Fix

1. Streaming decoder finalizes numbers prematurely

decode_start/3 calls parse_value which returns immediately when a number consumes the entire buffer (no trailing bytes). OTP treats a number at buffer end as ambiguous until a delimiter or end_of_input arrives.

Problem: If a stream splits "12" as "1" + "2", the decoder commits 1 on the first chunk instead of continuing.

Also, decode_continue(end_of_input, ...) always raises unexpected_end instead of finalizing a pending number.

File: libs/estdlib/src/json.erl lines 132–155

%% Current code — numbers at buffer end are finalized immediately,
%% and end_of_input always errors:

decode_start(Bin, Acc, Decoders) ->
    Cbs = make_callbacks(Decoders),
    try
        Bin1 = skip_whitespace(Bin),
        parse_value(Bin1, Acc, Cbs)
    catch
        error:unexpected_end ->
            {continue, {Bin, Acc, Cbs}}
    end.

decode_continue(end_of_input, {_Buf, _Acc, _Cbs}) ->
    error(unexpected_end);
decode_continue(NewData, {Buf, Acc, Cbs}) ->
    Combined = <<Buf/binary, NewData/binary>>,
    try
        Bin1 = skip_whitespace(Combined),
        parse_value(Bin1, Acc, Cbs)
    catch
        error:unexpected_end ->
            {continue, {Combined, Acc, Cbs}}
    end.

Expected OTP behavior:

%% OTP: number at buffer end is ambiguous
{continue, S} = json:decode_start(<<"1">>, ok, #{}),
%% More digits arrive — number becomes 12
{12, ok, <<>>} = json:decode_continue(<<"2">>, S),

%% OTP: end_of_input finalizes pending numbers
{continue, S2} = json:decode_start(<<"42">>, ok, #{}),
{42, ok, <<>>} = json:decode_continue(end_of_input, S2).

Suggested approach: The number parser needs to raise unexpected_end when it reaches end-of-buffer with no trailing delimiter (the number could have more digits). Then decode_continue(end_of_input, ...) should attempt a final parse where reaching buffer end is a valid terminator. This requires either a flag/mode for the parser or a separate finalization step for numbers.
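One possible shape for that finalization path (a sketch only, not the PR's code; parse_value_final/3 is a hypothetical variant of parse_value/3 in which reaching end-of-buffer after a complete number literal is a valid terminator):

```erlang
%% Sketch: end_of_input attempts a final parse instead of always
%% erroring. parse_value_final/3 is hypothetical - like parse_value/3,
%% but end-of-buffer finalizes a pending number rather than raising
%% unexpected_end.
decode_continue(end_of_input, {Buf, Acc, Cbs}) ->
    case skip_whitespace(Buf) of
        <<>> -> error(unexpected_end);
        Bin1 -> parse_value_final(Bin1, Acc, Cbs)
    end;
```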


2. Malformed \uXXXX escapes raise wrong error tag

hex/1 raises {invalid_byte, D} for bad hex digits inside unicode escapes. OTP classifies these as {unexpected_sequence, Bytes}.

File: libs/estdlib/src/json.erl lines 343–349

%% Current: bad hex digit raises {invalid_byte, D}
hex4(H1, H2, H3, H4) ->
    (hex(H1) bsl 12) bor (hex(H2) bsl 8) bor (hex(H3) bsl 4) bor hex(H4).

hex(D) when D >= $0, D =< $9 -> D - $0;
hex(D) when D >= $a, D =< $f -> D - $a + 10;
hex(D) when D >= $A, D =< $F -> D - $A + 10;
hex(D) -> error({invalid_byte, D}).

Suggested fix: Catch the bad-hex error in parse_unicode_escape and reclassify:

-%% Unicode escape: \uXXXX with surrogate pair support
 parse_unicode_escape(<<H1, H2, H3, H4, Rest/binary>>, Parts, Acc, Cbs) ->
-    CP = hex4(H1, H2, H3, H4),
-    if
-        CP >= 16#D800, CP =< 16#DBFF ->
-            %% High surrogate - must be followed by low surrogate \uDC00-\uDFFF
-            parse_surrogate_low(Rest, CP, H1, H2, H3, H4, Parts, Acc, Cbs);
-        CP >= 16#DC00, CP =< 16#DFFF ->
-            %% Lone low surrogate
-            error({unexpected_sequence, <<$\\, $u, H1, H2, H3, H4>>});
-        true ->
-            Utf8 = <<CP/utf8>>,
-            string_slow(Rest, [Utf8 | Parts], Acc, Cbs)
-    end;
+    case safe_hex4(H1, H2, H3, H4) of
+        error ->
+            error({unexpected_sequence, <<$\\, $u, H1, H2, H3, H4>>});
+        CP when CP >= 16#D800, CP =< 16#DBFF ->
+            parse_surrogate_low(Rest, CP, H1, H2, H3, H4, Parts, Acc, Cbs);
+        CP when CP >= 16#DC00, CP =< 16#DFFF ->
+            error({unexpected_sequence, <<$\\, $u, H1, H2, H3, H4>>});
+        CP ->
+            Utf8 = <<CP/utf8>>,
+            string_slow(Rest, [Utf8 | Parts], Acc, Cbs)
+    end;
 parse_unicode_escape(_, _, _, _) ->
     error(unexpected_end).

+safe_hex4(H1, H2, H3, H4) ->
+    case {safe_hex(H1), safe_hex(H2), safe_hex(H3), safe_hex(H4)} of
+        {error, _, _, _} -> error;
+        {_, error, _, _} -> error;
+        {_, _, error, _} -> error;
+        {_, _, _, error} -> error;
+        {V1, V2, V3, V4} -> (V1 bsl 12) bor (V2 bsl 8) bor (V3 bsl 4) bor V4
+    end.
+
+safe_hex(D) when D >= $0, D =< $9 -> D - $0;
+safe_hex(D) when D >= $a, D =< $f -> D - $a + 10;
+safe_hex(D) when D >= $A, D =< $F -> D - $A + 10;
+safe_hex(_) -> error.

Similarly, parse_surrogate_low calls hex4 for the low surrogate — bad hex digits there should also become {unexpected_sequence, ...}, with the error reporting the full \uHHHH\uLLLL escape pair:

 parse_surrogate_low(
     <<$\\, $u, L1, L2, L3, L4, Rest/binary>>, High, H1, H2, H3, H4, Parts, Acc, Cbs
 ) ->
-    Low = hex4(L1, L2, L3, L4),
-    case Low >= 16#DC00 andalso Low =< 16#DFFF of
-        true ->
+    case safe_hex4(L1, L2, L3, L4) of
+        error ->
+            error({unexpected_sequence,
+                <<$\\, $u, H1, H2, H3, H4, $\\, $u, L1, L2, L3, L4>>});
+        Low when Low >= 16#DC00, Low =< 16#DFFF ->
             Combined = 16#10000 + ((High - 16#D800) bsl 10) + (Low - 16#DC00),
             Utf8 = <<Combined/utf8>>,
             string_slow(Rest, [Utf8 | Parts], Acc, Cbs);
-        false ->
+        _Low ->
             error(
                 {unexpected_sequence, <<$\\, $u, H1, H2, H3, H4, $\\, $u, L1, L2, L3, L4>>})
     end;

3. encode_map*/2 ignores custom encoder for object keys

The encoder always uses encode_binary/1 for keys, bypassing the user-supplied Encoder function. OTP applies the encoder to binary/atom keys.

File: libs/estdlib/src/json.erl lines 710–736, 772–775

%% Current: encode_key always uses encode_binary directly
encode_key(Key) when is_binary(Key) -> encode_binary(Key);
encode_key(Key) when is_atom(Key) -> encode_binary(atom_to_binary(Key, utf8));
encode_key(Key) when is_integer(Key) -> encode_binary(integer_to_binary(Key));
encode_key(Key) when is_float(Key) -> encode_binary(float_to_binary(Key, [short])).

Impact: A custom encoder using encode_binary_escape_all would encode values with \uXXXX escapes but keys would get raw UTF-8:

%% User expects all non-ASCII escaped in both keys and values:
Enc = fun(B, _E) when is_binary(B) -> json:encode_binary_escape_all(B);
         (Other, E) -> json:encode_value(Other, E)
      end,
json:encode_map(#{<<"é">> => <<"é">>}, Enc).
%% Expected: {"\u00E9":"\u00E9"}
%% Actual:   {"é":"\u00E9"}     ← key not passed through Encoder

Suggested fix:

-encode_object(IterFun, State, Encoder) ->
+encode_object(IterFun, State, Encoder) ->
     case IterFun(State) of
         none ->
             <<"{}">>;
         {Key, Value, Next} ->
             [
                 ${,
-                encode_key(Key),
+                encode_key(Key, Encoder),
                 $:,
                 Encoder(Value, Encoder)
                 | encode_object_rest(IterFun, Next, Encoder)
             ]
     end.

-encode_object_rest(IterFun, State, Encoder) ->
+encode_object_rest(IterFun, State, Encoder) ->
     case IterFun(State) of
         none ->
             [$}];
         {Key, Value, Next} ->
             [
                 $,,
-                encode_key(Key),
+                encode_key(Key, Encoder),
                 $:,
                 Encoder(Value, Encoder)
                 | encode_object_rest(IterFun, Next, Encoder)
             ]
     end.

-encode_key(Key) when is_binary(Key) -> encode_binary(Key);
-encode_key(Key) when is_atom(Key) -> encode_binary(atom_to_binary(Key, utf8));
-encode_key(Key) when is_integer(Key) -> encode_binary(integer_to_binary(Key));
-encode_key(Key) when is_float(Key) -> encode_binary(float_to_binary(Key, [short])).
+encode_key(Key, Encoder) when is_binary(Key) -> Encoder(Key, Encoder);
+encode_key(Key, Encoder) when is_atom(Key) -> Encoder(atom_to_binary(Key, utf8), Encoder);
+encode_key(Key, _Encoder) when is_integer(Key) -> encode_binary(integer_to_binary(Key));
+encode_key(Key, _Encoder) when is_float(Key) -> encode_binary(float_to_binary(Key, [short])).

The same change applies to encode_object_checked / encode_object_rest_checked.


4. Elixir decode/3 always reports byte offset 0

File: libs/exavmlib/lib/json.ex lines 191–202

    try do
      :json.decode(binary, acc, Map.new(decoders))
    catch
      :error, :unexpected_end ->
        {:error, {:unexpected_end, byte_size(binary)}}

      :error, {:invalid_byte, byte} ->
        {:error, {:invalid_byte, 0, byte}}          # ← always 0

      :error, {:unexpected_sequence, bytes} ->
        {:error, {:unexpected_sequence, 0, bytes}}   # ← always 0
    end

The underlying Erlang parser does not track byte offsets in error tuples. Upstream Elixir reports accurate offsets.

Options:

  • (a) Thread byte offsets through the Erlang parser errors (significant refactor).
  • (b) Document as an AtomVM limitation — update the moduledoc to note that error offsets are not available (report 0).
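A lightweight variant of (a) would have the Erlang errors carry the unconsumed tail; both the extended tuple shape and the helper below are hypothetical, not part of the PR:

```erlang
%% If the parser raised e.g. {invalid_byte, Byte, Rest}, with Rest the
%% unconsumed input starting at the offending byte, the wrapper could
%% recover the absolute offset without threading a counter through
%% the whole parser:
offset(Input, Rest) -> byte_size(Input) - byte_size(Rest).
```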

At minimum, if going with (b):

   @moduledoc ~S"""
   JSON encoding and decoding.

   Both encoder and decoder fully conform to
   [RFC 8259](https://tools.ietf.org/html/rfc8259) and
   [ECMA 404](https://ecma-international.org/publications-and-standards/standards/ecma-404/)
   standards.
+
+  > **AtomVM limitation:** Decode error offsets are not tracked.
+  > `{:error, {:invalid_byte, offset, byte}}` and similar tuples
+  > always report offset `0`.

🟡 Important — Should Document or Address

5. Recursive descent — no depth protection

parse_value → parse_array → parse_value → ... is recursive with no depth limit. On AtomVM's constrained process stacks, deeply nested JSON from untrusted input can crash the VM.

OTP uses an iterative state-machine parser. This PR's approach is fine for trusted input but should be documented:

 %% @doc Decodes a JSON binary.
+%% Note: the parser uses recursive descent. Deeply nested JSON
+%% (hundreds of levels) may exhaust the process stack. Do not use
+%% with untrusted input of arbitrary nesting depth.
 -spec decode(binary()) -> decode_value().
 decode(Bin) ->
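
Until the parser gains its own guard, a caller-side check is one workaround. The sketch below (all names invented here, not part of the PR) crudely bounds bracket depth before decoding:

```erlang
%% Illustrative pre-check. Counts raw bracket depth without string
%% awareness, so brackets inside strings may overestimate the depth -
%% which only rejects, never accepts, too eagerly.
safe_decode(Bin, MaxDepth) ->
    case max_nesting(Bin, 0, 0) =< MaxDepth of
        true -> {ok, json:decode(Bin)};
        false -> {error, too_deep}
    end.

max_nesting(<<C, Rest/binary>>, D, Max) when C =:= $[; C =:= ${ ->
    max_nesting(Rest, D + 1, max(D + 1, Max));
max_nesting(<<C, Rest/binary>>, D, Max) when C =:= $]; C =:= $} ->
    max_nesting(Rest, D - 1, Max);
max_nesting(<<_, Rest/binary>>, D, Max) ->
    max_nesting(Rest, D, Max);
max_nesting(<<>>, _D, Max) ->
    Max.
```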

6. Streaming reparse is O(n²)

decode_continue/2 concatenates <<Buf/binary, NewData/binary>> and reparses from the start. For many small chunks this is quadratic. Acceptable for the typical AtomVM use case (small payloads) but worth a doc note for users considering large streaming JSON:

%% Note: each call reparses the accumulated buffer from the start.
%% For large payloads split across many small chunks, consider
%% accumulating the full binary before calling decode/1.
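The note above in caller form (sketch; recv_chunk/1 is a stand-in for whatever transport delivers the chunks):

```erlang
%% Accumulate the whole payload first, then parse once - O(n) total
%% instead of reparsing a growing buffer on every chunk.
collect_and_decode(Acc) ->
    case recv_chunk(5000) of
        {ok, Chunk} -> collect_and_decode(<<Acc/binary, Chunk/binary>>);
        done -> json:decode(Acc)
    end.
```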

7. Sub-binary memory retention

The string parser returns sub-binaries via binary_part/3 — this pins the entire original JSON binary in memory. For large JSON where only a few small values are retained long-term, this can be a significant memory issue on microcontrollers.

Not a code bug, but worth documenting for AtomVM users.
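One caller-side mitigation is an explicit binary:copy/1 on any value that outlives the decode (sketch; LargeJson and the kept variable are illustrative):

```erlang
%% Name is a sub-binary into LargeJson; copying it lets the large
%% source binary be garbage-collected instead of staying pinned.
#{<<"name">> := Name} = json:decode(LargeJson),
Kept = binary:copy(Name),
```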


✅ Looks Good

Build system (BuildElixir.cmake)

The compile_multi macro uses mkdir -p, sh -c "mv ...", rmdir — this matches the existing pack_archive macro pattern at lines 90–96 of the same file, so it's consistent with project conventions. While cmake -E would be more portable, changing it just for compile_multi would be inconsistent.

Keyword.put_new/3

Implementation is correct and matches upstream Elixir:

def put_new(keywords, key, value) when is_list(keywords) and is_atom(key) do
  case :lists.keyfind(key, 1, keywords) do
    {^key, _} -> keywords
    false -> [{key, value} | keywords]
  end
end

Number parser

  • RFC 8259 grammar is correctly implemented (no leading zeros, fraction/exponent require digits).
  • -0 → integer 0 is correct (matches binary_to_integer(<<"-0">>) and OTP).
  • float_exp normalization correctly handles negatives: -1e5 → -1.0e5 (the scan finds e at position 2, inserts .0 before it).
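
These can be spot-checked directly; the assertions below just restate the observations above:

```erlang
0 = json:decode(<<"-0">>),        %% -0 decodes to integer 0
-1.0e5 = json:decode(<<"-1e5">>), %% bare exponent still yields a float
100 = json:decode(<<"100">>).     %% plain integers stay integers
```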

String parser

  • Fast path (no escapes) / slow path (with escapes) split is clean.
  • UTF-8 validation via /utf8 pattern matching.
  • Surrogate pair handling is correct.
  • Encoder escape sequences including \uXXXX for non-BMP via surrogate pairs.

Test coverage

Very comprehensive:

  • 566+ JSONTestSuite corpus entries (y/n/i/t categories)
  • Round-trip tests for both valid and implementation-defined cases
  • Custom decoder callbacks (array, object, string, integer, float, null)
  • Streaming across multiple calls
  • Realistic weather JSON with nested objects, arrays, UTF-8
  • Encoder: strings, arrays, maps, key-value lists, checked duplicates, errors
  • Elixir: decode/1, decode/3, decode!/1, encode!/1, encode_to_iodata!/1, round-trips

Elixir JSON module

Clean adaptation from upstream. Correct removal of unsupported features (protocol deriving, Calendar encoders). protocol_encode fast-path dispatch is a good optimization.


📋 Suggested Additional Tests

%% 1. Malformed unicode escape classification
?ASSERT_ERROR_MATCH({unexpected_sequence, _}, json:decode(<<"\"\\uqqqq\"">>)),
?ASSERT_ERROR_MATCH({unexpected_sequence, _}, json:decode(<<"\"\\u00Aq\"">>)),
?ASSERT_ERROR_MATCH({unexpected_sequence, _}, json:decode(<<"\"\\uD800\\u12x4\"">>)),

%% 2. Custom encoder applied to map keys
EscAllEnc = fun(B, _E) when is_binary(B) -> json:encode_binary_escape_all(B);
               (Other, E) -> json:encode_value(Other, E)
            end,
%% Key with non-ASCII should be escaped via custom encoder
Encoded = to_bin(json:encode_map(#{<<16#C3, 16#A9>> => 1}, EscAllEnc)),
true = binary:match(Encoded, <<"\\u00E9">>) =/= nomatch,

%% 3. Streaming number edge cases (once streaming is fixed)
{continue, S1} = json:decode_start(<<"1">>, ok, #{}),
{12, ok, <<>>} = json:decode_continue(<<"2">>, S1),
{continue, S2} = json:decode_start(<<"42">>, ok, #{}),
{42, ok, <<>>} = json:decode_continue(end_of_input, S2),
%% Split in exponent
{continue, S3} = json:decode_start(<<"1e">>, ok, #{}),
{1.0e5, ok, <<>>} = json:decode_continue(<<"5">>, S3).

# 4. Elixir: error offset is always 0 (document/test this behavior)
{:error, {:invalid_byte, 0, _}} = JSON.decode("x", :ok, [])

@bettio bettio force-pushed the json branch 3 times, most recently from 6112795 to 2dea4f1 on April 1, 2026 at 11:27
Contributor

petermm commented Apr 1, 2026

usual caveats, but looking good!

PR Review: JSON Module (3 commits)

Commits reviewed:

  • a3298d70 — estdlib: Add json module (Erlang)
  • 0427a139 — Keyword: Add put_new/3
  • 2dea4f12 — exavmlib: Add JSON module (Elixir)

Verdict: Strong addition. Close but not fully OTP-compatible yet — 3 issues should be fixed before claiming OTP compatibility.


🔴 High Priority

1. decode/3 / decode_start/3 don't trim trailing whitespace in Rest

OTP trims top-level whitespace before returning leftover bytes. decode_continue/2 already does this (line 195), but decode/3 and decode_start/3 do not — making behavior inconsistent across entrypoints.

This also affects the Elixir wrapper: JSON.decode("42 x") will blame the space, not x.

Fix in json.erl — change reduce_value at top-level to trim:

 %% After parsing a value, attach it to the parent container or return at top level.
 reduce_value(Value, Rest, Acc, _Callbacks, []) ->
-    {Value, Acc, Rest};
+    {Value, Acc, skip_whitespace(Rest)};

2. Checked duplicate detection uses raw keys, not encoded keys

encode_object_checked uses key_to_binary/1 for duplicate tracking, but a custom encoder that normalizes keys (e.g. uppercases) can silently produce duplicate JSON keys. OTP uses the actual encoded key bytes.

Fix in json.erl — use encoded key for Seen:

 encode_object_checked(IterFun, State, Encoder) ->
     case IterFun(State) of
         none ->
             <<"{}">>;
         {Key, Value, Next} ->
-            KeyBin = key_to_binary(Key),
+            KeyBin = iolist_to_binary(encode_key(Key, Encoder)),
             [
                 ${,
-                encode_key(Key, Encoder),
+                KeyBin,
                 $:,
                 Encoder(Value, Encoder)
                 | encode_object_rest_checked(IterFun, Next, Encoder, #{KeyBin => Key})
             ]
     end.

 encode_object_rest_checked(IterFun, State, Encoder, Seen) ->
     case IterFun(State) of
         none ->
             [$}];
         {Key, Value, Next} ->
-            KeyBin = key_to_binary(Key),
+            KeyBin = iolist_to_binary(encode_key(Key, Encoder)),
             case Seen of
                 #{KeyBin := _} -> error({duplicate_key, Key});
                 _ -> ok
             end,
             [
                 $,,
-                encode_key(Key, Encoder),
+                KeyBin,
                 $:,
                 Encoder(Value, Encoder)
                 | encode_object_rest_checked(IterFun, Next, Encoder, Seen#{KeyBin => Key})
             ]
     end.

3. encode_binary* misclassifies incomplete UTF-8 as {invalid_byte, _}

The escape/4 function (line 709) falls through to error({invalid_byte, B}) for truncated UTF-8 sequences. OTP differentiates: incomplete UTF-8 → unexpected_end, invalid UTF-8 → {invalid_byte, Byte}.

string_fast/4 already has this logic (lines 390–396) but escape/4 and string_slow_run/5 do not.

Fix — add truncated-UTF-8 handling to escape/4 before the catch-all:

 escape(<<_/utf8, Rest/binary>>, Orig, Acc, EA) ->
     escape(Rest, Orig, Acc, EA);
+escape(<<B, Rest/binary>>, _, _, _) when
+    byte_size(Rest) < 3,
+    (B >= 16#C0 andalso B < 16#E0 andalso byte_size(Rest) < 1) orelse
+        (B >= 16#E0 andalso B < 16#F0 andalso byte_size(Rest) < 2) orelse
+        (B >= 16#F0 andalso B < 16#F8 andalso byte_size(Rest) < 3)
+->
+    error(unexpected_end);
 escape(<<B, _/binary>>, _, _, _) ->
     error({invalid_byte, B}).

And similarly for string_slow_run/5:

 string_slow_run(<<_/utf8, Rest/binary>>, RunOrig, Parts, Acc, Callbacks) ->
     string_slow_run(Rest, RunOrig, Parts, Acc, Callbacks);
+string_slow_run(<<B, Rest/binary>>, _, _, _, _) when
+    byte_size(Rest) < 3,
+    (B >= 16#C0 andalso B < 16#E0 andalso byte_size(Rest) < 1) orelse
+        (B >= 16#E0 andalso B < 16#F0 andalso byte_size(Rest) < 2) orelse
+        (B >= 16#F0 andalso B < 16#F8 andalso byte_size(Rest) < 3)
+->
+    error(unexpected_end);
 string_slow_run(<<B, _/binary>>, _, _, _, _) ->
     error({invalid_byte, B});

🟡 Medium Priority

4. Literal error byte reporting is imprecise

maybe_partial_literal/3 (line 363–368) reports the first byte (t/f/n) on mismatches like <<"truX">>, instead of the actual offending byte. Doesn't affect pass/fail, but weakens error diagnostics.

5. formatter/0 type exported but format/* not implemented

The module exports a formatter/0 type (line 49–63) but does not implement OTP's format/1,2,3 or format_value/3. Either remove the type or note the intentional omission.

6. Compatibility claim is too broad

The commit message says "API-compatible with OTP's json module" but format/* is missing. Consider narrowing to "encode/decode API-compatible with OTP 27's json module".


✅ Strengths

  • Iterative stack-based decoder — excellent fit for embedded/AtomVM, avoids VM stack overflow on deep nesting
  • binary:copy/1 on decoded strings — prevents sub-binary references from pinning large input buffers in memory
  • Encoder returns iodata throughout — efficient for IO, avoids unnecessary copies
  • Test coverage is excellent — JSONTestSuite corpus (y/n/i/t), sliding-split streaming tests, deep nesting (1000 levels), round-trip tests
  • Keyword.put_new/3 is correct, simple, low-risk
  • BuildElixir.cmake compile_multi is a pragmatic solution for single-file multi-module Elixir sources
  • Streaming reparsing design is documented with appropriate caveats (O(n²) for many small chunks)

📝 Suggested Test Additions

After applying the fixes above, add these targeted tests:

%% Rest whitespace trimming
{42, ok, <<"rest">>} = json:decode(<<"42  rest">>, ok, #{}),
{true, ok, <<"x">>} = json:decode(<<"true  x">>, ok, #{}),

%% Incomplete UTF-8 in encoder → unexpected_end
ok = try json:encode_binary(<<16#C3>>) of
    _ -> error(should_have_failed)
catch
    error:unexpected_end -> ok
end,

%% Checked encoder with normalizing custom key encoder
%% (should detect duplicates after encoding)

bettio added 4 commits on April 2, 2026 at 23:23
Pure Erlang JSON encoder/decoder, API-compatible with OTP's json
module.  Supports decode/1, decode/3 with custom callbacks, and
streaming via decode_start/decode_continue.  Encoder handles maps,
key-value lists, and checked variants for duplicate key detection.

Tests combine the JSONTestSuite corpus (valid, invalid, and
implementation-defined cases) with Erlang-specific API tests covering
custom decoders, streaming, round-trips, key-value list encoding,
and error handling.

Two JSONTestSuite entries (n_structure_100000_opening_arrays and
n_structure_open_array_object) are excluded because their ~634 KB of
embedded test data would exhaust AtomVM memory.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Needed by the Elixir JSON module which uses Keyword.put_new to set
the default null decoder.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Elixir JSON encoder/decoder adapted from upstream Elixir (commit
03b9fde6), built on top of the Erlang json module.

Removed from upstream:
- Protocol deriving macro (not supported on AtomVM)
- Duration, Date, Time, NaiveDateTime, DateTime encoders
  (Calendar modules not available)
- Error offset extraction from __STACKTRACE__ error_info

Upstream Elixir defines the protocol, all implementations, and the
main module in a single json.ex.  The existing build system compiles
each .ex file independently, so protocol implementations would race
their protocol definition in parallel builds.

Add a compile_multi macro to BuildElixir.cmake that compiles one .ex
source and declares all its .beam outputs.  Add EXTRA_BEAMS keyword
to pack_archive so the extra beams are included in the archive.

Signed-off-by: Davide Bettio <davide@uninstall.it>
json_encoder was ancient code from 2018 that didn't even support maps.
Now that AtomVM supports the official json API, it was time to remove
it.

Signed-off-by: Davide Bettio <davide@uninstall.it>
@bettio bettio merged commit cfe20d9 into atomvm:release-0.7 Apr 3, 2026
208 of 215 checks passed