A fast, spec-compliant CBOR (RFC 8949) implementation for mruby
| Feature | Details |
|---|---|
| Core Types | Integers, floats, strings, byte strings, arrays, maps, booleans, nil |
| BigInt Support | Tags 2/3 (RFC 8949) when compiled with MRB_USE_BIGINT |
| Float Precision | Float16/32/64 with subnormals, Inf, and NaN |
| Shared References | Tags 28/29 for deduplication, including cyclic structures |
| Zero-Copy Decoding | Both eager and lazy decoding operate directly on the input buffer without copying |
| Lazy Decoding | CBOR::Lazy for on-demand nested access with key and result caching |
| Streaming | CBOR.stream for CBOR sequence reading |
| Performance | ~30% faster than msgpack; 1.3× to 3× faster than simdjson for selective access |
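To make the Float16 row concrete, here is a minimal plain-Ruby sketch of how a 16-bit half-precision pattern maps to a value, covering subnormals, Inf, and NaN. It follows the arithmetic of the C sample in RFC 8949, Appendix D. The helper name `decode_half` is hypothetical and this is an illustration only, not this gem's code.

```ruby
# Decode an IEEE 754 half-precision float from its 16-bit pattern.
def decode_half(bits)
  sign = (bits >> 15) & 0x1
  exp  = (bits >> 10) & 0x1f
  mant = bits & 0x3ff

  value =
    if exp.zero?
      # Zero or subnormal: there is no implicit leading 1 bit.
      Math.ldexp(mant.to_f, -24)
    elsif exp == 31
      # All-ones exponent: infinity if the mantissa is zero, NaN otherwise.
      mant.zero? ? Float::INFINITY : Float::NAN
    else
      # Normal number: restore the implicit leading 1 (1024 = 1 << 10).
      Math.ldexp((mant + 1024).to_f, exp - 25)
    end

  sign == 1 ? -value : value
end

decode_half(0x3c00) # 1.0
decode_half(0x0001) # smallest positive subnormal, 2**-24
decode_half(0x7c00) # Infinity
```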
| Limitation | Reason |
|---|---|
| No indefinite-length items | Use CBOR.stream mode instead. |
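A CBOR sequence (RFC 8742) is just the concatenation of complete data items, so a reader can pull one definite-length item at a time without any framing. The sketch below shows that idea in plain Ruby against any object with a `read(n)` method. The helpers `read_exact`, `read_item`, and `each_item` are hypothetical names, and the sketch yields raw item bytes rather than the lazy documents that CBOR.stream yields.

```ruby
require "stringio"

# Read exactly n bytes; a short read means the item was cut off.
def read_exact(io, n)
  data = io.read(n)
  raise RuntimeError, "truncated CBOR item" if data.nil? || data.bytesize < n
  data
end

# Read one complete definite-length item and return its raw bytes.
def read_item(io, first = read_exact(io, 1))
  byte  = first.getbyte(0)
  major = byte >> 5     # top 3 bits: major type
  info  = byte & 0x1f   # low 5 bits: additional information
  out   = first.dup
  arg =
    case info
    when 0..23 then info
    when 24..27
      ext = read_exact(io, 1 << (info - 24)) # 1, 2, 4, or 8 argument bytes
      out << ext
      ext.bytes.reduce(0) { |acc, b| (acc << 8) | b }
    else
      raise RuntimeError, "indefinite-length or reserved head not supported"
    end
  case major
  when 2, 3 then out << read_exact(io, arg)            # string payload
  when 4    then arg.times { out << read_item(io) }     # array elements
  when 5    then (arg * 2).times { out << read_item(io) } # map keys and values
  when 6    then out << read_item(io)                   # tagged item
  end
  out
end

# Yield each item of a sequence until the input is exhausted.
def each_item(io)
  while (first = io.read(1))
    yield read_item(io, first)
  end
end

io = StringIO.new(["0183010203a1616101"].pack("H*"))
each_item(io) { |raw| puts raw.unpack1("H*") }
```

The three items in the usage example are the integer 1, the array [1, 2, 3], and the map {"a" => 1}, written back to back with nothing in between.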
Determinism Guarantees:
- Encoding is deterministic within a single mruby build
- Hash field order follows insertion order (per mruby hash impl)
- Float width (16/32/64) is compile-time fixed via MRB_USE_FLOAT32
- Symbol encoding strategy is global; don't mix no_symbols/symbols_as_string/symbols_as_uint32 in the same program
- Not deterministic across builds if you rebuild mruby with different CFLAGS or config
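One consequence of insertion-ordered maps is that two hashes that compare equal can still encode to different bytes. The toy encoder below (a hypothetical `tiny_encode` covering only integers, strings, arrays, and hashes, not the gem's encoder) demonstrates this in plain Ruby.

```ruby
# Build a CBOR head: 3-bit major type plus an unsigned argument,
# using the shortest form that fits.
def cbor_head(major, n)
  mt = major << 5
  if n < 24
    [mt | n].pack("C")
  elsif n < 0x100
    [mt | 24, n].pack("CC")
  elsif n < 0x10000
    [mt | 25, n].pack("Cn")
  elsif n < 0x100000000
    [mt | 26, n].pack("CN")
  else
    [mt | 27, n].pack("CQ>")
  end
end

# Toy encoder: map entries are written in the hash's iteration order.
def tiny_encode(obj)
  case obj
  when Integer
    obj >= 0 ? cbor_head(0, obj) : cbor_head(1, -1 - obj)
  when String
    cbor_head(3, obj.bytesize) + obj.b
  when Array
    obj.reduce(cbor_head(4, obj.size)) { |out, e| out + tiny_encode(e) }
  when Hash
    obj.reduce(cbor_head(5, obj.size)) do |out, (k, v)|
      out + tiny_encode(k) + tiny_encode(v)
    end
  else
    raise TypeError, "unsupported type: #{obj.class}"
  end
end

a = { "a" => 1, "b" => 2 }
b = { "b" => 2, "a" => 1 }
puts a == b                          # equal as Ruby hashes
puts tiny_encode(a).unpack1("H*")    # a2616101616202
puts tiny_encode(b).unpack1("H*")    # a2616202616101
```

If byte-identical output matters (for hashing or signing, say), build hashes in a fixed key order before encoding.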
Recursion Depth Limits:
Default CBOR_MAX_DEPTH depends on mruby profile:
- MRB_PROFILE_MAIN / MRB_PROFILE_HIGH: 512
- MRB_PROFILE_BASELINE: 64
- Constrained / other: 32
Exceeding this raises RuntimeError: "CBOR nesting depth exceeded". Override by setting CBOR_MAX_DEPTH at compile time.
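A depth guard of this kind can be sketched in plain Ruby as a counter threaded through the recursion. The constant `SKETCH_MAX_DEPTH`, the helpers `check_depth` and `nest`, and the exact way levels are counted are all assumptions made for illustration; only the error class and message come from the text above.

```ruby
SKETCH_MAX_DEPTH = 4 # tiny on purpose; the real defaults are listed above

# Walk a structure, failing once nesting goes deeper than the limit.
def check_depth(obj, depth = 0)
  raise RuntimeError, "CBOR nesting depth exceeded" if depth > SKETCH_MAX_DEPTH
  case obj
  when Array
    obj.each { |e| check_depth(e, depth + 1) }
  when Hash
    obj.each do |k, v|
      check_depth(k, depth + 1)
      check_depth(v, depth + 1)
    end
  end
  true
end

# Wrap an empty array in n further levels of array nesting.
def nest(n)
  n.times.reduce([]) { |inner, _| [inner] }
end

check_depth(nest(4)) # passes
begin
  check_depth(nest(5))
rescue RuntimeError => e
  puts e.message
end
```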
- Encoding: ~30% faster than msgpack (SBO + incremental writes)
- Lazy decoding: 1.3× to 3× faster than simdjson for selective access
- Shared refs: Tags 28/29 deduplication is O(1) amortized
- Float encoding: No overhead; width selection happens once per value at encode time
When to use lazy decoding:
- Decoding large payloads where you only access a subset of fields
- Streaming/telemetry where you care about specific fields
- Network protocols where you validate before deserializing
When to use eager decoding:
- Small payloads
- You need the full object in memory instantly
- Simplicity over optimization
# Encode
buf = CBOR.encode({ "hello" => [1, 2, 3], "ok" => true })
# Decode
obj = CBOR.decode(buf)
# => {"hello" => [1, 2, 3], "ok" => true}
# Lazy decode: only parses what you access
lazy = CBOR.decode_lazy(buf)
lazy["hello"][1].value # => 2 (constant-time after first access)

decode_lazy returns a CBOR::Lazy object wrapping the raw buffer without decoding the value. Navigate with [] or dig, then call .value when you need the actual value.
lazy = CBOR.decode_lazy(big_payload)
# Navigate nested structures
status = lazy["statuses"][0]["text"].value
# Equivalent: use `dig` for safety
status = lazy.dig("statuses", 0, "text").value
# Error handling
lazy["missing"] # => KeyError (raises)
lazy.dig("missing", "text") # => nil (safe)

Performance: Access cost is linear in the number of elements skipped to reach the target, not in the size of the full document. Keys and results are cached for O(1) repeated access.
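Why skipping is cheap can be shown with a plain-Ruby sketch: stepping over an item needs only its head (and, for containers, the heads of its children), and no Ruby objects are built for the skipped data. The helpers `read_head`, `skip_item`, and `find_in_map` are hypothetical and are not the gem's internals. The sketch handles definite-length items only and trusts the input to be well formed.

```ruby
# Parse the head at pos; return [major_type, argument, offset_after_head].
def read_head(buf, pos)
  ib = buf.getbyte(pos)
  raise RuntimeError, "truncated CBOR input" if ib.nil?
  major = ib >> 5
  info  = ib & 0x1f
  case info
  when 0..23 then [major, info, pos + 1]
  when 24 then [major, buf.getbyte(pos + 1), pos + 2]
  when 25 then [major, buf.byteslice(pos + 1, 2).unpack1("n"), pos + 3]
  when 26 then [major, buf.byteslice(pos + 1, 4).unpack1("N"), pos + 5]
  when 27 then [major, buf.byteslice(pos + 1, 8).unpack1("Q>"), pos + 9]
  else raise ArgumentError, "indefinite-length or reserved head not supported"
  end
end

# Return the offset just past the item that starts at pos.
def skip_item(buf, pos)
  major, arg, pos = read_head(buf, pos)
  case major
  when 2, 3 # byte or text string: jump over the payload
    pos + arg
  when 4    # array: skip each element
    arg.times { pos = skip_item(buf, pos) }
    pos
  when 5    # map: skip each key and each value
    (arg * 2).times { pos = skip_item(buf, pos) }
    pos
  when 6    # tag: skip the tagged item
    skip_item(buf, pos)
  else      # integers, simple values, floats: the head is the whole item
    pos
  end
end

# Return the offset of the value stored under a text key, or nil.
def find_in_map(buf, pos, key)
  major, count, pos = read_head(buf, pos)
  raise TypeError, "not a map" unless major == 5
  count.times do
    kmajor, klen, kpos = read_head(buf, pos)
    return kpos + klen if kmajor == 3 && buf.byteslice(kpos, klen).b == key.b
    pos = skip_item(buf, skip_item(buf, pos)) # skip the key, then its value
  end
  nil
end

# {"hello" => [1, 2, 3], "ok" => true}
buf = ["a26568656c6c6f83010203626f6bf5"].pack("H*")
offset = find_in_map(buf, 0, "ok")
puts buf.getbyte(offset).to_s(16) # f5, the CBOR encoding of true
```

Reaching "ok" here only steps over the "hello" entry; a cache of key offsets would make the second lookup a single hash access.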
# Repeated access uses cache
inner = lazy["outer"]["inner"].value
inner2 = lazy["outer"]["inner"].value
assert_same inner, inner2 # Same object (cached)

Eliminate duplication and represent cyclic structures.
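Sharing depends on object identity, not equality: two separate arrays with the same contents are not shared, while one array referenced twice is. The plain-Ruby sketch below shows one way an encoder might find the containers that are reachable more than once, which also stops the walk from looping on cycles. The helper `find_shared` is hypothetical and says nothing about how the gem implements tags 28/29.

```ruby
# Return the arrays and hashes that are reachable more than once.
def find_shared(obj, seen = {}, shared = [])
  return shared unless obj.is_a?(Array) || obj.is_a?(Hash)

  id = obj.object_id
  if seen[id]
    # Second visit: record it once and do not descend again.
    shared << obj unless shared.any? { |s| s.equal?(obj) }
    return shared
  end
  seen[id] = true

  if obj.is_a?(Hash)
    obj.each do |k, v|
      find_shared(k, seen, shared)
      find_shared(v, seen, shared)
    end
  else
    obj.each { |e| find_shared(e, seen, shared) }
  end
  shared
end

list = [1, 2, 3]
puts find_shared({ "x" => list, "y" => list }).size           # one shared container
puts find_shared({ "x" => [1, 2, 3], "y" => [1, 2, 3] }).size # equal but distinct: none

cyclic = []
cyclic << cyclic
puts find_shared(cyclic).first.equal?(cyclic)
```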
# Encode with deduplication
shared_array = [1, 2, 3]
buf = CBOR.encode(
{ "x" => shared_array, "y" => shared_array },
sharedrefs: true
)
# Decode: identity is preserved
decoded = CBOR.decode(buf)
decoded["x"].equal?(decoded["y"]) # => true
# Works with lazy decoding too
lazy = CBOR.decode_lazy(buf)
lazy["x"].value.equal?(lazy["y"].value) # => true

Cyclic Structures:
cyclic = []
cyclic << cyclic
buf = CBOR.encode(cyclic, sharedrefs: true)
result = CBOR.decode(buf)
result.equal?(result[0]) # => true (self-referential)

Register your own classes for CBOR encoding/decoding.
class Person
attr_accessor :name, :age
# Declare which fields to encode/decode and their expected CBOR types
native_ext_type :@name, CBOR::Type::String
native_ext_type :@age, CBOR::Type::Integer
# Called after decoding (optional)
def _after_decode
puts "Person #{@name} loaded"
self
end
# Called before encoding (optional)
# Must return self or a modified object
def _before_encode
@age += 1 if @age < 18 # Example transformation
self
end
end
# Register with a tag number (user-defined: 1000+)
CBOR.register_tag(1000, Person)
person = Person.new
person.name = "Alice"
person.age = 30
encoded = CBOR.encode(person)
decoded = CBOR.decode(encoded) # => Person object, _after_decode called

Available Types:
- CBOR::Type::UnsignedInt
- CBOR::Type::NegativeInt
- CBOR::Type::String (UTF-8 text)
- CBOR::Type::Bytes (raw bytes)
- CBOR::Type::Array, CBOR::Type::Map
- CBOR::Type::Tagged (for Bigint, your own registered tags)
- CBOR::Type::Simple (nil, false, true, floats)

Convenience Types:
- CBOR::Type::BytesOrString
- CBOR::Type::Integer
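The eight basic type names appear to line up with the eight CBOR major types, which RFC 8949 stores in the top three bits of an item's first byte. The mapping below is inferred from the list above rather than taken from the gem's source, and `major_type_name` is a hypothetical helper written in plain Ruby.

```ruby
# Major types 0..7 in RFC 8949 order, labelled with the names used above.
MAJOR_TYPE_NAMES = %w[
  UnsignedInt NegativeInt Bytes String Array Map Tagged Simple
].freeze

# Classify a CBOR item by the top three bits of its initial byte.
def major_type_name(initial_byte)
  MAJOR_TYPE_NAMES[(initial_byte >> 5) & 0x7]
end

puts major_type_name(0x65) # String (a 5-byte text string head)
puts major_type_name(0xf5) # Simple (the encoding of true)
```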
Three strategies for encoding Ruby symbols:
# 1. Default: strip symbols (no tag, no round-trip)
CBOR.no_symbols
sym = :hello
encoded = CBOR.encode(sym) # Encodes as plain string "hello"
decoded = CBOR.decode(encoded) # => "hello" (not a symbol!)
# 2. Use tag 39 + string (RFC 8949, interoperable)
CBOR.symbols_as_string
sym = :hello
encoded = CBOR.encode(sym)
decoded = CBOR.decode(encoded) # => :hello (symbol preserved)
# 3. Use tag 39 + uint32 (mruby presym only, fastest)
CBOR.symbols_as_uint32
sym = :hello
encoded = CBOR.encode(sym)
decoded = CBOR.decode(encoded) # => :hello (symbol preserved, same mruby only)

Mode Comparison:
| Mode | Encoding | Interop | Round-trip | Speed |
|---|---|---|---|---|
| no_symbols | Plain string | All | No | Fast |
| symbols_as_string | Tag 39 + string | All | Yes | Good |
| symbols_as_uint32 | Tag 39 + uint32 | mruby only | Yes | Fastest |
⚠️ Warning: symbols_as_uint32 requires:
- Same mruby build: encoder and decoder must use the same mruby executable (same libmruby.a)
- Compile-time symbols: all symbols must be interned at build time (see presym docs)
- No runtime symbol creation: decoding fails if the symbol ID doesn't exist in the decoder's presym table

Use only for internal mruby-to-mruby IPC on the same build. For external/user data, use symbols_as_string.
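To show what the first two modes mean on the wire, the plain-Ruby sketch below builds the bytes by hand: a bare text string versus the same string wrapped in tag 39. It assumes shortest-form heads and names under 24 bytes. The helpers `text_item`, `symbol_plain`, and `symbol_tag39` are hypothetical, and the gem's actual output is not verified here.

```ruby
# Encode a short text string: head byte 0x60 | length, then the bytes.
def text_item(str)
  bytes = str.b
  raise ArgumentError, "sketch only handles strings under 24 bytes" if bytes.bytesize >= 24
  [0x60 | bytes.bytesize].pack("C") + bytes
end

# no_symbols style: the symbol name as a bare text string.
def symbol_plain(sym)
  text_item(sym.to_s)
end

# symbols_as_string style: tag 39 (head bytes 0xd8 0x27) around the text string.
def symbol_tag39(sym)
  [0xd8, 39].pack("CC") + text_item(sym.to_s)
end

puts symbol_plain(:hello).unpack1("H*") # 6568656c6c6f
puts symbol_tag39(:hello).unpack1("H*") # d8276568656c6c6f
```

The tagged form costs two extra bytes per symbol, which is what lets a decoder tell a symbol apart from an ordinary string.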
Read a sequence of CBOR documents from any File-like object:
File.open("data.cbor", "rb") do |f|
CBOR.stream(f) do |doc|
puts doc.value.inspect
end
end
# Or as an enumerator
docs = CBOR.stream(f).map(&:value)

Add to your build_config.rb:
conf.gem github: "Asmod4n/mruby-cbor"

Then build:
rake compile
rake test

| Error | When | Example |
|---|---|---|
| ArgumentError | Invalid encode options | CBOR.encode(obj, bad_option: true) |
| RangeError | Integer out of bounds | Encoding a Bigint larger than uint64 |
| RuntimeError | Nesting depth exceeded | Deeply nested structures beyond CBOR_MAX_DEPTH |
| RuntimeError | Truncated/invalid CBOR | CBOR.decode(incomplete_buffer) |
| TypeError | Type mismatch in registered tags | Field marked as String gets an Array |
| KeyError | Lazy access to missing key | lazy["nonexistent"] (use .dig to get nil instead) |
| NotImplementedError | Presym on non-presym mruby | CBOR.symbols_as_uint32 on build without presym |
- CBOR (RFC 8949): https://tools.ietf.org/html/rfc8949
- Test Vectors: Official test suite in /test-vectors
Apache License 2.0