feat: k_mem_slab fixed-size block allocator (#46a) #51

Closed
swoisz wants to merge 132 commits into main from feature/k-mem-slab

Conversation

swoisz commented Jun 10, 2026

Collaborator

Summary

Ports Zephyr's k_mem_slab fixed-size block allocator — the low-risk half of #46 (split from k_fifo/k_lifo, which carries the object-lifetime risk). Independent reimplementation over the Boreas substrate: the free-block count and blocking ride the owned, notification-backed k_sem, with a portMUX guarding the intrusive free list and usage counters.

Design

  • Free list: upstream's intrusive singly-linked scheme — the next-pointer lives in each free block's first word (*(char**)block), threaded back-to-front. block_size must be >= sizeof(void*) and word-aligned. alloc returns uninitialized memory (never zeroed), matching upstream.
  • Blocking + count via k_sem: alloc = k_sem_take(timeout) then pop under the lock; free = push then k_sem_give. Reusing k_sem inherits its hardened, targeted wake — a give with a waiter present wakes that waiter without bumping the count, so a freed block is reserved for the woken allocator and a racing K_NO_WAIT alloc correctly gets -ENOMEM. This reproduces upstream's direct hand-off with no new wait-queue code.
  • Return codes match upstream: 0 / -ENOMEM (K_NO_WAIT, none free) / -EAGAIN (timeout) / -EINVAL (bad params).
  • ISR-safe alloc(K_NO_WAIT) and free (IRAM/K_ISR_SAFE, verified in the ELF).
  • K_MEM_SLAB_DEFINE rounds block_size/buffer-align up to a pointer word (upstream WB_UP), so the same DEFINE compiles on 32- and 64-bit targets (the linux test host). The embedded sem is compile-time-initialized (like K_TIMER_DEFINE's), so the only lazy first-use step is free-list threading — pure IRAM-safe pointer work, safe from any context including an ISR. (Upstream threads the list from a PRE_KERNEL SYS_INIT, which doesn't run on the linux target.)
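
The free-list bullet above can be sketched as a small standalone model. This is an illustration under assumptions, not the code in this PR: `struct demo_slab` and the `demo_slab_*` helpers are hypothetical names, and the locking and `k_sem` accounting described above are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a slab; not the Boreas struct layout. */
struct demo_slab {
    char *buffer;       /* caller-owned, pointer-aligned backing storage */
    size_t block_size;  /* must be >= sizeof(void *) and word-aligned */
    size_t num_blocks;
    char *free_list;    /* head of the intrusive list, NULL when empty */
};

/* Thread the list back-to-front so the head ends up at block 0. */
static void demo_slab_thread(struct demo_slab *s)
{
    assert(s->block_size >= sizeof(void *));
    assert(s->block_size % sizeof(void *) == 0);

    s->free_list = NULL;
    for (size_t i = s->num_blocks; i-- > 0;) {
        char *block = s->buffer + i * s->block_size;
        *(char **)block = s->free_list; /* next-pointer in the first word */
        s->free_list = block;
    }
}

/* Pop the head block; the rest of the block is left uninitialized. */
static void *demo_slab_pop(struct demo_slab *s)
{
    char *block = s->free_list;
    if (block != NULL) {
        s->free_list = *(char **)block;
    }
    return block;
}

/* Push a block back; only its first word is overwritten. */
static void demo_slab_push(struct demo_slab *s, void *mem)
{
    *(char **)mem = s->free_list;
    s->free_list = mem;
}
```

With four 16-byte blocks, pops return the blocks in address order until the list is empty, and a pushed block is the next one popped.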

API

k_mem_slab_init, k_mem_slab_alloc(timeout), k_mem_slab_free, k_mem_slab_num_used_get / num_free_get / max_used_get, K_MEM_SLAB_DEFINE[_STATIC].
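
A rough single-threaded model of how the alloc and free paths compose with the semaphore, including the `-EBUSY` to `-ENOMEM` mapping. Everything named `demo_*` is hypothetical; the real calls block, take a lock, and use the intrusive free list, none of which is modeled here. A timed wait is assumed to expire immediately because nothing else runs in this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BLOCKS 2

/* Hypothetical pool; free_count stands in for the k_sem count. */
struct demo_pool {
    unsigned free_count;
    void *free_stack[DEMO_BLOCKS]; /* stands in for the free list */
};

/* Stand-in for a semaphore take: 0 on success, -EBUSY for a failed
 * no-wait take, -EAGAIN when a timed wait expires. */
static int demo_sem_take(struct demo_pool *p, bool no_wait)
{
    if (p->free_count > 0) {
        p->free_count--;
        return 0;
    }
    return no_wait ? -EBUSY : -EAGAIN;
}

/* alloc = take, then pop. */
static int demo_alloc(struct demo_pool *p, void **mem, bool no_wait)
{
    int rc = demo_sem_take(p, no_wait);
    if (rc == -EBUSY) {
        return -ENOMEM; /* callers see -ENOMEM, not the sem's -EBUSY */
    }
    if (rc != 0) {
        return rc;      /* -EAGAIN: the timeout expired */
    }
    *mem = p->free_stack[p->free_count]; /* count was already decremented */
    return 0;
}

/* free = push, then give. */
static void demo_free(struct demo_pool *p, void *mem)
{
    assert(p->free_count < DEMO_BLOCKS);
    p->free_stack[p->free_count] = mem;
    p->free_count++;
}
```

Once both blocks are out, a no-wait alloc yields `-ENOMEM` and a timed alloc yields `-EAGAIN`; after one free, the next alloc succeeds and returns the freed block.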

Review

Boreas-conformance + adversarial-trace fan-out. The adversarial trace found no correctness bugs: all 7 interleavings (count/list desync, the hand-off steal, lazy-init publication barriers, free-before-init, dual-core ISR-free) verified safe, including that free-before-init is a benign no-op (not a crash) because sys_dlist_peek_head on a zeroed list returns NULL rather than dereferencing. Two robustness blockers folded: the embedded sem is now compile-time-initialized (removing flash-resident k_sem_init from the lazy/ISR path) and free also threads the list defensively.

Test plan

  • linux: 231/0 ×3 (224 + 7; ISR test compiles out on linux)
  • clang-format 21.1.8 clean; alloc/free confirmed IRAM-resident, init in flash
  • ESP32-S3 hardware flash (pending)

Tests: init/accounting, -EINVAL on bad params, alloc distinctness/in-bounds/no-overlap + exhaustion -ENOMEM + max_used high-water, blocking-alloc timeout -EAGAIN, K_MEM_SLAB_DEFINE lazy-init usable + not-zeroed-on-realloc, blocking alloc woken by free (MT), N>blocks multi-waiter conservation (MT), FromISR free wakes a blocked allocator (HW-gated).

Refs #46 (the k_fifo/k_lifo half lands separately)

🤖 Generated with Claude Code

swoisz and others added 30 commits April 1, 2026 21:31
fix: linker fragment collecting SYS_INITs so IDF doesn't drop them
fix: many resolved and validated iterable sections for logging, shell…
fix: deferred logging bringup, redundant static and unused variables …
swoisz and others added 28 commits June 6, 2026 20:23
k_sem_take initialized the waiter struct (xTaskGetCurrentTaskHandle +
uxTaskPriorityGet) inside the critical section, contradicting the
design intent documented on z_sem_waiter.prio: uxTaskPriorityGet
enters FreeRTOS's own critical section, nesting spinlocks under the
sem lock on SMP. Sample before locking; the fast path pays two cheap
getters, and no FreeRTOS call happens under the sem lock anywhere.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The previous fix moved xTaskGetCurrentTaskHandle/uxTaskPriorityGet to
the top of k_sem_take, putting them on every path -- including the
pre-scheduler constructor path that K_SEM_DEFINE's static initializer
exists to support, where pxCurrentTCB is NULL and uxTaskPriorityGet
dereferences it. CI's Linux-proper runner caught it as an instant
pre-main SIGSEGV (macOS tolerated the access, so local runs passed --
exactly why the CI matrix runs Linux-proper).

Restore the invariant both ways: fast paths (count hit, K_NO_WAIT)
never sample and stay pre-scheduler-safe; the must-block path samples
OUTSIDE the lock via unlock -> sample -> relock -> re-check (the
re-check loop), so no FreeRTOS call ever happens under the sem lock.
Blocking before the scheduler starts is invalid regardless.
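
A minimal sketch of the unlock -> sample -> relock -> re-check shape, assuming a spinlock modeled with C11 `atomic_flag`. All `demo_*` names are hypothetical and `demo_sample_priority` merely stands in for the FreeRTOS getters; the point is that the fast paths never sample and the sample never happens under the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_take_result { DEMO_TAKEN, DEMO_BUSY, DEMO_MUST_BLOCK };

struct demo_sem {
    atomic_flag lock; /* stands in for the sem's spinlock */
    unsigned count;
};

static int demo_sample_calls; /* instrumentation for the illustration */

/* Stand-in for the task-handle/priority getters. */
static int demo_sample_priority(void)
{
    demo_sample_calls++;
    return 5;
}

static void demo_lock(struct demo_sem *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire)) {
    }
}

static void demo_unlock(struct demo_sem *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

static enum demo_take_result demo_take(struct demo_sem *s, bool no_wait, int *prio)
{
    bool sampled = false;

    assert(s != NULL && prio != NULL);
    for (;;) {
        demo_lock(s);
        if (s->count > 0) {          /* fast path: no sampling */
            s->count--;
            demo_unlock(s);
            return DEMO_TAKEN;
        }
        if (no_wait) {               /* fast path: no sampling */
            demo_unlock(s);
            return DEMO_BUSY;
        }
        if (sampled) {
            /* Real code would enqueue the waiter here, under the lock. */
            demo_unlock(s);
            return DEMO_MUST_BLOCK;
        }
        demo_unlock(s);
        *prio = demo_sample_priority(); /* sampled OUTSIDE the lock */
        sampled = true;                 /* loop: relock and re-check */
    }
}
```

The re-check matters because a give can land in the window while the lock is dropped; the second pass then takes the count instead of parking.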

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Raw CONFIG_FREERTOS_TASK_NOTIFICATION_ARRAY_ENTRIES=2 lines in every
consumer's sdkconfig.defaults were correct but unreadable -- setting an
unfamiliar FreeRTOS int gives the user no idea why. Kconfig cannot make
it automatic (select is bool-only; a component-supplied default for an
int it does not own loses the parse-order race to the owning component
-- verified empirically), so the requirement moves one level up:

- sdkconfig.boreas at the repo root carries all Boreas-required
  settings with the rationale inline; future requirements centralize
  there instead of accreting raw lines downstream.
- Consumers reference it by name in SDKCONFIG_DEFAULTS:
      set(SDKCONFIG_DEFAULTS "sdkconfig.defaults;<boreas>/sdkconfig.boreas")
  The test app and all examples are wired this way and their raw
  config lines removed.
- The compile-time #error remains as the backstop, and the README
  configuration section documents the pattern.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The sdkconfig.boreas requirement lived only in the zkernel component
README -- the top-level Usage section (the first thing an integrator
reads) now wires it into the submodule instructions, including the
one-time config regeneration step.

CHANGELOG.md captures the 2026-06 hardening series as a downstream
migration checklist: the defaults-list addition (#41), the k_work
return-code audit (#37), the k_work_cancel polarity inversion (#38),
the cancel_delayable_sync collapse of triple-cancel workarounds, the
reserved notification index, and the k_sem_take abort restriction.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat!: notification-backed k_sem (own state, no FreeRTOS control block)
Second primitive converted under #40, on the architecture proven by
the k_sem conversion: events word and waiter list in the caller-owned
struct under a portMUX; blocking on the shared reserved notification
index; stack-resident waiter nodes severed under the lock before any
return; the consume-the-in-flight-give protocol for the timeout race.
A post wakes ALL waiters whose conditions become met: satisfied nodes
are unlinked into a local chain in one lock pass and notified after
unlock (safe -- woken waiters block in the consume path until the
notification lands).

This retires the FreeRTOS event-group backend and with it three
parity gaps plus a hard ceiling:

- Full 32-bit events (EventBits_t reserved the top byte: 24 usable).
- k_event_set now REPLACES the tracked set (upstream: setting differs
  from posting); previously it merged, aliasing k_event_post.
- wait/wait_all reset=true zeroes the ENTIRE tracked set BEFORE
  waiting (previously: cleared only the matched bits, after).
- wait_all on timeout returns 0 (previously could return a truthy
  partial match).

New upstream surface: k_event_set_masked, k_event_test,
k_event_wait_safe / k_event_wait_all_safe (atomically consume matched
bits), all mutators return the previous value of the affected bits,
K_EVENT_DEFINE is a true compile-time initializer, and k_event_init
returns void (upstream signature; it can no longer fail).
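
The set-vs-post distinction and the previous-value returns can be modeled on a bare 32-bit word. This is a sketch with hypothetical `demo_*` names, assuming both operations reduce to one masked update (post masks with the posted bits, set masks with all bits); waiters and locking are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace the bits selected by mask; return their previous value. */
static uint32_t demo_event_set_masked(uint32_t *tracked, uint32_t bits, uint32_t mask)
{
    assert(tracked != NULL);
    uint32_t prev = *tracked & mask;
    *tracked = (*tracked & ~mask) | (bits & mask);
    return prev;
}

/* Post merges: only the posted bits are affected. */
static uint32_t demo_event_post(uint32_t *tracked, uint32_t bits)
{
    return demo_event_set_masked(tracked, bits, bits);
}

/* Set replaces: every bit is affected. */
static uint32_t demo_event_set(uint32_t *tracked, uint32_t bits)
{
    return demo_event_set_masked(tracked, bits, UINT32_MAX);
}
```

Posting `0x1` then `0x4` accumulates to `0x5`; a subsequent set of `0x2` leaves exactly `0x2` and returns the previous `0x5`. Bit 31 is usable like any other.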

The notification index and its config requirement move to
zkernel_internal.h (Z_KERNEL_NOTIFY_INDEX), shared by k_sem and
k_event -- safe because a task blocks on at most one primitive at a
time and every blocking call drains in-flight notifications before
returning.

BREAKING: k_event_set merge->replace (use k_event_post to accumulate);
reset=true semantics as above; k_event_init signature. In-tree tests
updated intentionally; the MT multi-setter now uses k_event_post and
the consume-on-wake test uses k_event_wait_safe.

Tests: 190 -> 197 (set-replaces vs post-merges, previous-value
returns, set_masked, wait_safe consumption, wait_all timeout-zero,
full-32-bit bits 24/31, K_EVENT_DEFINE static init). Linux suite
green x3.

Part of #40

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- The post-unlock chain walk's safety premise is now stated precisely
  at the source: within the protocol exactly one in-flight
  notification can exist per blocked waiter (targeting is exclusive
  under the lock; every return path drains what woke it), so a chained
  waiter cannot be released before our notify -- the premise IS the
  notification-index reservation, the same one k_sem_give's
  post-unlock handle use rests on.
- ISR-context posts accumulate the higher-priority-woken flag across
  the wake chain and yield once after the loop instead of per waiter.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat!: notification-backed k_event (full 32 bits, upstream semantics)
Closes the coverage gaps from the #41/#42 design review (issue #43):

- timeout-vs-give stress (k_sem) and timeout-vs-post stress (k_event):
  sweep the giver/poster firing time across the taker's 20 ms timeout
  (early / same-tick / late) for 100 iterations, checking conservation
  invariants on every outcome and probing the reserved notification
  index for stranded notifications after each cycle. The giver runs as
  a raw FreeRTOS task pinned to the other core on multicore targets
  (k_thread_create pins to core 0).
- give-racing-the-park: 1000 zero-delay handshake iterations hammer
  the unlock-sample-relock recheck window and the enqueue->park window
  inside k_sem_take.
- multi-waiter beyond two: 4-waiter conservation, FIFO order among
  equal-priority waiters (pins the strict '>' in z_sem_pop_waiter),
  k_sem_reset waking 3 waiters, single k_event_post waking 3 waiters.
- FromISR coverage (HW-gated on CONFIG_K_TIMER_DISPATCH_ISR):
  k_sem_give from real ISR context with a prompt-wake latency bound
  proving the portYIELD_FROM_ISR path (mid-tick expiry via K_USEC);
  k_sem_take(K_NO_WAIT) from ISR (upstream isr-ok contract);
  k_event_post from ISR waking 3 waiters (accumulate-yield-once path).

Doc notes on the declarations (inline-divergence convention):
- k_sem_take: ISR-legal only with K_NO_WAIT (upstream contract);
  waiter priority is cached at enqueue (k_thread_priority_set on a
  blocked waiter does not re-sort the wake order -- upstream re-sorts).
- k_sem_give: documented wake order (highest priority, FIFO among
  equals).
- k_sem_reset: task context only. Upstream verification showed
  upstream's k_sem_reset is NOT isr-ok either (no @isr_ok; calls
  z_reschedule), so this is parity, not a divergence -- the #43 item
  proposing an ISR-safe reset was based on a wrong premise.

The stale-priority note is NOT added to k_event_wait: k_event wakes
ALL satisfied waiters with no priority ordering, so there is no stale
value to document.

linux suite: 204/0 (197 + 7; the 3 ISR tests compile out on linux).

Refs #43

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review fan-out findings folded:

- k_sem_take, z_event_wait_internal, and the four k_event_wait*
  wrappers are now K_ISR_SAFE (IRAM-resident). The documented
  isr-ok-with-K_NO_WAIT contract was unsound for flash-resident code:
  the esp_timer ISR is allocated ESP_INTR_FLAG_IRAM, so the take/wait
  fast paths would fault if the ISR fired during a concurrent flash
  operation. New HW-gated tests pin the IRAM residency (prior art:
  the iram_attr tests in test_k_timer.c).
- Prompt-wake test retuned: expiry at 10.1 ms (was 10.5) makes a
  missing portYIELD_FROM_ISR cost ~900 us instead of ~500, so the
  700 us bound has wide margin against both false pass (deferred
  wake) and false fail (cache-cold first run, tick coincidence).
- The both-outcomes sweep asserts are now HW-only: on the linux
  target a loaded CI host can stall the tick clock and collapse the
  sweep onto one outcome. The per-iteration conservation invariants
  (which hold under every interleaving) still run everywhere.
- BUILD_ASSERT(CONFIG_FREERTOS_HZ >= 1000): the 18..22 ms sweep
  separation quantizes away at coarser ticks.
- Multi-waiter state hygiene: mw_result/mw_order sentinel-reset per
  test (stale values from a prior test can no longer satisfy
  assertions), mw_next is a Zephyr atomic_t.
- kernel.h: precise wording on the K_NO_WAIT-from-ISR note (spinlock
  only, no task-notify/blocking calls, IRAM-resident).
- test_k_event_mt.c: pre-existing sizeof(stack) call sites migrated
  to K_THREAD_STACK_SIZEOF; FreeRTOS-priority-convention reminder on
  the priority-wake test.
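
The prompt-wake retune above is tick arithmetic. A small sketch, assuming a wake without an immediate yield is deferred to the next tick boundary and that the tick is 1000 us (the `CONFIG_FREERTOS_HZ >= 1000` floor); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Time from an expiry to the next tick boundary, in microseconds. */
static uint32_t demo_deferred_wake_us(uint32_t expiry_us, uint32_t tick_us)
{
    assert(tick_us > 0);
    uint32_t into_tick = expiry_us % tick_us;
    return (into_tick == 0) ? 0 : tick_us - into_tick;
}

/* True when a latency bound would catch a deferred wake. */
static bool demo_bound_catches_deferral(uint32_t expiry_us, uint32_t tick_us,
                                        uint32_t bound_us)
{
    return demo_deferred_wake_us(expiry_us, tick_us) > bound_us;
}
```

An expiry at 10.1 ms defers by 900 us, which a 700 us bound catches; at 10.5 ms the deferral is only 500 us and would slip under the same bound.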

linux suite: 204/0 x3.

Refs #43

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ning

test: stress coverage for k_sem/k_event race windows (#43)
Rewrites k_timer_status_sync (both backends) from the k_msleep(1)
busy-poll to a true blocking wait on a binary k_sem embedded in
struct k_timer -- the original PR-4 design, now safe: the April
crash shapes that killed it were root-caused to the pre-#18 k_thread
zombies (#21) and are green regression tests, and the sem itself is
notification-backed (#41) with synchronous severance.

Upstream-verified semantics (kernel/timer.c):
- single-waiter model (upstream's wait_q holds "the (single) thread
  waiting on this timer"; expiry/stop wake exactly one) -> a binary
  sem latch matches exactly
- woken by expiry OR stop; status re-read after wake; status reset
  to 0 on every path; returns 0 immediately when stopped
- the wake fires AFTER the expiry callback / stop_fn (upstream order)

The sem is a wake LATCH, not a counter -- the expiry count stays in
timer->status. A give latched while nobody waited would satisfy a
later take early, so status_sync re-checks and re-blocks in a loop;
pinned by the new test_timer_status_sync_after_status_get_blocks
regression. Wakes are now immediate instead of quantized to the
FreeRTOS tick; the busy-poll divergence note on the declaration is
replaced.

k_sem_give is ISR-safe (IRAM), so the expiry-path give works under
CONFIG_K_TIMER_DISPATCH_ISR=y, the ESP_TIMER_TASK fallback, and the
linux dispatcher. K_TIMER_DEFINE gains the compile-time sem
initializer; k_work_init_delayable inherits the sem init via
k_timer_init. struct k_timer grows by sizeof(struct k_sem) (~24 B),
including inside k_work_delayable.

k_timer_remaining_get needed no change: it already returns uint32_t
milliseconds like upstream's k_ticks_to_ms_floor32 shape.

linux suite: 206/0 x3 (204 + 2).

Closes #28

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review fan-out findings folded:

- One-shot exchange-vs-running window closed: between status_sync's
  status exchange and its running load, a one-shot expiry could
  complete entirely (status=1, running=false, give latched) and the
  waiter returned 0 where upstream's spinlocked read returns 1 --
  terminally lost for a one-shot in a while(status_sync()) drain
  loop. The !running path now re-reads status instead of returning
  0; the callback orders status++ (RELEASE) before running=false
  (RELEASE), so an ACQUIRE load of running==false is guaranteed to
  observe the increment.
- test_timer_status_sync_after_status_get_blocks made deterministic:
  one-shot expiries only and t0 captured BEFORE the restart, so the
  elapsed lower bound holds under arbitrary host stall (the periodic
  re-arm racing the entry was CI-flaky both directions).
- @note on k_timer_start: restart does not wake a blocked
  status_sync waiter (upstream parity).
- @note on k_timer_init: re-init while a status_sync waiter is
  blocked is caller error (clobbers the embedded sem's waiter list).
- Z_SEM_INITIALIZER single-sources the compile-time k_sem
  initializer body for K_SEM_DEFINE and K_TIMER_DEFINE.
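
The one-shot exchange-vs-running window can be replayed step by step with C11 atomics. This is a sketch with hypothetical `demo_*` names: the waiter is split into two calls so a caller can interleave the expiry between them, and the blocking sem is reduced to a `-1` "would block" return.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_timer {
    atomic_uint status;  /* expiries since the last read */
    atomic_bool running;
};

/* Expiry of a one-shot: publish the count, then clear running. */
static void demo_one_shot_expire(struct demo_timer *t)
{
    atomic_fetch_add_explicit(&t->status, 1u, memory_order_release);
    atomic_store_explicit(&t->running, false, memory_order_release);
}

/* Waiter step 1: consume whatever status has accumulated. */
static unsigned demo_sync_exchange(struct demo_timer *t)
{
    return atomic_exchange_explicit(&t->status, 0u, memory_order_acq_rel);
}

/* Waiter step 2: returns the count, or -1 where real code would block. */
static int demo_sync_finish(struct demo_timer *t, unsigned seen)
{
    assert(t != NULL);
    if (seen != 0u) {
        return (int)seen;
    }
    if (!atomic_load_explicit(&t->running, memory_order_acquire)) {
        /* Not running: re-read status instead of returning 0, because an
         * expiry may have completed entirely between the two steps. */
        return (int)atomic_exchange_explicit(&t->status, 0u,
                                             memory_order_acq_rel);
    }
    return -1;
}
```

Running step 1, then the expiry, then step 2 yields 1 rather than 0. A second pass on the now-stopped timer yields 0.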

Adversarial review refuted (no change needed): zeroed-sem gives are
structurally unreachable before k_timer_init (running-guard +
handle-guard); stale latches cost at most one extra loop iteration
(binary sem caps at 1, no busy-spin); SMP ordering is sound
(RELEASE-before-give, portMUX barriers, aligned 32-bit status).

linux suite: 206/0 x3.

Refs #28

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Upstream unpends aborted threads from the timer wait queue; Boreas
cannot, so this restriction is a divergence, not parity -- matching
the wording already on k_sem_take.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
fix: k_timer_status_sync blocks on an embedded sem, not a poll loop (#28)
Near-verbatim port of upstream's ring_buffer.h + lib/utils/ring_buffer.c
(both Apache-2.0), the canonical companion to the UART IRQ pattern and
the byte transport the direct-UART shell (#31) and a modbus serial
backend want. Pure arithmetic + memcpy; no kernel dependencies, nothing
blocks.

Byte-mode API: ring_buf_init, RING_BUF_DECLARE, reset, size/space/
capacity_get, is_empty, put/get/peek, and the zero-copy claim/finish
pairs. Indices use upstream's base-offset scheme (head/tail/base of
ring_buf_idx_t, byte offset = head-base with one wrap correction,
RING_BUFFER_MAX_SIZE caps size to half the index range so unsigned
subtraction disambiguates full vs empty). CONFIG_RING_BUFFER_LARGE
widens indices to 32-bit, matching upstream.
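
How unsigned index subtraction keeps full and empty distinct across the index wrap can be shown with a simplified model. Note the simplification: instead of upstream's head/tail/base triple, this hypothetical `demo_ring` keeps free-running 16-bit counters for accounting and separate physical offsets for addressing, so it illustrates the subtraction only, not the ported layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_ring {
    uint8_t *buf;
    uint16_t size;    /* assumed within the half-range cap */
    uint16_t put_idx; /* free-running, wraps mod 65536 */
    uint16_t get_idx;
    uint16_t put_off; /* physical offsets in [0, size) */
    uint16_t get_off;
};

/* Bytes stored: a modular difference, valid across the index wrap. */
static uint16_t demo_ring_used(const struct demo_ring *r)
{
    return (uint16_t)(r->put_idx - r->get_idx);
}

static uint16_t demo_ring_space(const struct demo_ring *r)
{
    return (uint16_t)(r->size - demo_ring_used(r));
}

/* Copy in up to len bytes; returns the number actually stored. */
static uint16_t demo_ring_put(struct demo_ring *r, const uint8_t *data, uint16_t len)
{
    assert(r != NULL && data != NULL);
    uint16_t space = demo_ring_space(r);
    uint16_t n = (len < space) ? len : space;
    for (uint16_t i = 0; i < n; i++) {
        r->buf[r->put_off] = data[i];
        r->put_off = (uint16_t)((r->put_off + 1u == r->size) ? 0u : r->put_off + 1u);
    }
    r->put_idx = (uint16_t)(r->put_idx + n);
    return n;
}

/* Copy out up to len bytes; returns the number actually read. */
static uint16_t demo_ring_get(struct demo_ring *r, uint8_t *out, uint16_t len)
{
    assert(r != NULL && out != NULL);
    uint16_t used = demo_ring_used(r);
    uint16_t n = (len < used) ? len : used;
    for (uint16_t i = 0; i < n; i++) {
        out[i] = r->buf[r->get_off];
        r->get_off = (uint16_t)((r->get_off + 1u == r->size) ? 0u : r->get_off + 1u);
    }
    r->get_idx = (uint16_t)(r->get_idx + n);
    return n;
}
```

Seeding both counters at 65530 and filling a 7-byte buffer wraps `put_idx` to 1, yet the used count is still 7 and further puts are refused until data is drained.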

Adaptations for Boreas (documented in the files):
- include layout (zephyr/sys/util.h) and lowercase upstream min() -> MIN.
- toolchain shims added to sys/util.h: __ASSERT_NO_MSG, likely/unlikely,
  __noinit (mapped to zero-init storage -- correct here, and avoids
  __NOINIT_ATTR wrongly surviving deep sleep), __deprecated. Also made
  __ASSERT self-contained (#include <stdlib.h> for abort()).

Deliberate divergence: the upstream item-mode API (ring_buf_item_put/
_get, RING_BUF_ITEM_DECLARE*) is NOT ported -- it is @deprecated
upstream ("use <zephyr/sys/ringq.h>") and unused by the byte-transport
consumers this targets.

Concurrency is unchanged: lock-free for single-producer/single-consumer
in separate contexts; multiple producers/consumers must serialize
externally.

Tests (15): accounting, round-trip, capacity saturation, partial get,
NULL-discard, peek-no-consume, reset, claim/finish (put+get), surplus
return, over-claim -EINVAL, and the wrap-sensitive cases -- claim splits
at the physical end, put/get spanning the wrap, and a 20000-cycle stream
through a 7-byte (non-power-of-two) buffer exercising base advancement
and the index arithmetic across many wraps.

linux suite: 221/0 x3 (206 + 15).

Closes #45

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tests

Review fan-out (fidelity + conformance + adversarial) findings folded:

- Restore the public RING_BUF_INIT macro (upstream API completeness;
  RING_BUF_DECLARE is now defined in terms of it, matching upstream
  structure). A downstream `struct ring_buf rb = RING_BUF_INIT(...)`
  now compiles.
- Retain upstream's "Copyright (c) 2015 Intel Corporation" line on the
  two verbatim files alongside the Intercreate port line (Apache-2.0
  4(c) preservation for a substantial verbatim derivative; the repo is
  going public).
- Document the __noinit limitation: Boreas wires no .noinit section, so
  no-init/retained-RAM semantics are not available (storage is plain
  zero-init BSS).
- Tests: assert ring_buf_put_claim returns 0 on a full buffer (the
  zero-copy path UART IRQ uses), and assert ring_buf_space_get across a
  wrap (the put.head - get.tail subtraction path).

Adversarial index-arithmetic probe found zero constructible failures:
the UINT16_MAX/2 cap is exact and inclusive (size=32768 would alias
head-base to 0 mod 65536 and is rejected at build/init), and the
non-power-of-two-size x uint16-index-wrap case is safe because only
modular differences are used and tail-base is held in [0,size) by
one base bump per wrap. The 20000-cycle test crosses the uint16
boundary, so it covers that class.

linux suite: 221/0 x3.

Refs #45

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lic)

A pre-public licensing audit compared every Zephyr-API file against the
upstream original (function bodies and macro expansions, not just
names). Apache-2.0 4(c) requires retaining the upstream copyright only
where actual copyrightable expression was copied.

Retain upstream copyright (genuine verbatim/near-verbatim content):
- sys/util.h: IS_ENABLED's _XXXX##/_YYYY token-paste trick is upstream
  verbatim -> Copyright (c) 2011-2014 Wind River Systems, Inc.
- sys/byteorder.h: the 32-bit be/le composition mirrors upstream's
  canonical chaining -> Copyright (c) 2015-2016 Intel Corporation.
- (sys/ring_buffer.h + ring_buf.c already carry Copyright (c) 2015
  Intel Corporation from the port commit.)

Deliberately NOT attributed (independent reimplementations -- adding an
upstream notice would be false attribution): sys/dlist.h, sys/slist.h
(upstream is macro-generated; ours is hand-written over a different
struct layout), sys/atomic.h (inline __atomic wrappers vs upstream's
extern decls), sys/time_units.h (us-stored k_timeout_t + FreeRTOS ticks
vs upstream's Hz-based z_tmcvt), and all of zshell/zsys/zdevice (the
shell, logging, and device model are independent implementations over
ESP-IDF/FreeRTOS; only the public API surface matches Zephyr).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Folded findings from the code-review fan-out (no correctness bugs were
found; these are robustness/coverage/doc improvements):

- ring_buffer.h + ring_buf.c now #include "sdkconfig.h" explicitly.
  struct ring_buf's index width is selected by CONFIG_RING_BUFFER_LARGE,
  so the config must be visible wherever the struct is defined. It was
  safe before only by a transitive chain (util.h -> esp_attr.h ->
  sdkconfig.h); making it explicit removes a silent cross-TU ABI hazard
  if that chain ever changes, and matches the sibling .c convention.
- util.h: corrected the __noinit comment. The prior rationale was wrong
  (ESP-IDF DOES ship __NOINIT_ATTR + a .noinit section, and that
  attribute survives warm reset, not deep sleep). The real reasons to
  map __noinit to plain BSS: ring_buf never reads unwritten bytes (so
  no-init vs zero-init is invisible), and plain BSS avoids a section
  attribute that does not port to the Mach-O host toolchain. The
  no-init limitation is documented.
- util.h: dropped the unused __deprecated shim (no users; add toolchain
  shims when a real user lands). likely/unlikely kept (unlikely is used
  by ring_buf.c; both are #ifndef-guarded and identical in value to
  ESP-IDF's, so the include-order race is codegen-only).
- test: added test_ring_buf_full_empty_across_wrap -- fills to full /
  drains to empty for 40 cycles (163 KB through a 4 KB buffer, crossing
  the uint16 index wrap ~2.5x), asserting the full (space==0, put
  refused) and empty discriminations AT high `allocated` across the
  wrap. The existing many-wrap test drains within a 7-byte span, so it
  never exercised full-vs-empty near the index wrap.

linux suite: 222/0 x3.

Refs #45

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
util.h was accreting symbols whose upstream homes are
<zephyr/toolchain.h> (likely/unlikely, __weak, ALWAYS_INLINE, __noinit)
and <zephyr/sys/__assert.h> (__ASSERT, __ASSERT_NO_MSG), while the repo
otherwise mirrors upstream include paths exactly -- the devlog TODO
already records a downstream smf port hand-editing upstream #includes
because of this. Thin headers at the upstream paths fix that now, while
they have 1-2 consumers; util.h re-includes both so existing ports keep
working.

Also two correctness fixes folded in:
- __noinit now maps to __NOINIT_ATTR (real .noinit on hardware, no-op
  on the linux target via esp_attr.h's CONFIG_IDF_TARGET_LINUX gate in
  both IDF v5.4 and v5.5). The old empty shim silently gave zero-init
  BSS semantics to any future warm-reset-retention user, and its
  stated Mach-O blocker was false -- the macOS host build of the linux
  target compiles clean.
- likely/unlikely now document the divergence from esp_compiler.h
  (ESP-IDF's are CONFIG_COMPILER_OPTIMIZATION_PERF-gated; Boreas
  matches upstream Zephyr's unconditional __builtin_expect; both are
  #ifndef-guarded so the include-order race is codegen-only).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The header advertises the byte API for thread + ISR use (the UART IRQ
pattern), and the repo convention is that ISR-callable primitives are
K_ISR_SAFE (k_sem, k_msgq, k_event, k_work): ESP-IDF ISRs registered
with ESP_INTR_FLAG_IRAM fire during flash-cache-disabled windows (e.g.
NVS commits), where calling flash-resident code panics with "Cache
disabled but cached memory region accessed". All five functions are now
IRAM-resident (verified via nm: 0x4037xxxx alongside k_sem_give).

The internal __ASSERT_NO_MSGs are replaced with k_panic() checks: the
upstream asserts compile out under its default CONFIG_ASSERT=n, but
Boreas's __ASSERT is always on and its ESP_LOGE+abort failure path is
itself flash-resident (util.h documents it as not IRAM-safe and
prescribes k_panic() -- k_work.c already set the precedent of replacing
upstream asserts on K_ISR_SAFE paths). memcpy is ROM-resident on
ESP32-S3, so the copy loops remain safe.

ring_buf_peek's NULL check is also hoisted above the copy loop so the
documented "Cannot be NULL" contract is enforced on an empty buffer
too, instead of silently returning 0 until data arrives.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…retvals

Doc-only fixes from review (code unchanged, upstream-verbatim):

- Concurrency note rewritten: it overclaimed "lock-free SPSC" without
  stating that no field is volatile and no barriers are issued. Now
  documents that the query inlines read one index from each side
  (tear-free aligned loads, conservatively stale), forbids busy-waiting
  on them (the compiler may hoist the load out of a call-free loop),
  and requires per-handoff k_sem pairing on SMP. Also notes the
  implementation is K_ISR_SAFE.
- RING_BUF_INIT: @warning that size8 > RING_BUFFER_MAX_SIZE is NOT
  checked on this path (unlike RING_BUF_DECLARE/ring_buf_init) and
  silently corrupts; macro body stays upstream-verbatim.
- put_claim/get_claim/peek: @notes that finishing rewinds head to tail,
  so an outstanding claim must be finished before any other same-side
  call (including the copy-mode functions); restored upstream's
  dropped "with a non-zero `size`" qualifier on the peek doc.
- put_finish/get_finish: -EINVAL condition corrected -- the code (here
  and upstream) rejects size > outstanding claimed bytes, not "exceeds
  free space"/"valid bytes" as upstream's doc has always misstated.
- size_get: adopted current upstream main's "available data" wording
  (v3.7's "used space" miscounts outstanding claims); three
  @retval-with-prose fixed to @return (also fixed upstream post-v3.7).
- Removed seven @warnings referencing the unported ring_buf_item_ API;
  RING_BUF_DECLARE gains a file-scope-only @note.
- Kconfig: RING_BUFFER_LARGE costs 12 extra bytes per struct ring_buf
  on the 32-bit target (6 index fields x 2 bytes; measured 20 -> 32),
  not 6 as the help text claimed.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Two coverage gaps closed:

- test_ring_buf_index_wrap_seeded: the traffic-volume stress tests can
  only reach the default uint16_t 65536 wrap; under
  CONFIG_RING_BUFFER_LARGE the 2^32 wrap was silently untested in
  exactly the config the option exists for. Seeding the indices just
  below the wrap via ring_buf_internal_reset (the technique upstream's
  ringbuffer test suite uses) makes the full/empty discrimination
  coverage index-width-independent. Verified locally: full suite passes
  on the linux target with CONFIG_RING_BUFFER_LARGE=y.
- test_ring_buf_peek_across_wrap_multi_claim_get: peek was only tested
  contiguous-at-offset-0; now covers peek walking both physical
  segments of wrapped data and rewinding via its internal
  get_finish(0), plus multiple get-side claims completed by a single
  finish.

Cleanups: per-byte TEST_ASSERT_EQUAL_UINT8 loops (~300k Unity calls
across the two stress tests) replaced with per-chunk
TEST_ASSERT_EQUAL_UINT8_ARRAY (better failure diagnostics: reports the
element index); put_claim_wraps now drains and verifies its patterned
writes landed at the correct physical offsets instead of leaving them
unchecked; NULL-discard priming replaces drain-into-out + memset; dead
sequence fill dropped from the saturates test.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…e (Copilot)

The file-level comment claimed all definitions are #ifndef-guarded, but
ALWAYS_INLINE is intentionally #undef'd and redefined unconditionally
(the RISC-V -Werror=attributes fix). Call out the exception so the
include-order contract isn't misleading.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat: port ring_buf (sys/ring_buffer.h) + pre-public copyright audit (#45)
Independent reimplementation of upstream Zephyr's k_mem_slab API over
the Boreas substrate. The low-risk half of #46 (split from k_fifo).

Design: the free-block count and blocking ride the owned,
notification-backed k_sem (count = free blocks); a portMUX guards the
intrusive free list and the usage counters. Reusing k_sem inherits its
hardened wake protocol -- a give targets the highest-priority waiter
without bumping the count, so a freed block is reserved for the woken
allocator and a racing K_NO_WAIT alloc correctly gets -ENOMEM. This
matches upstream's direct hand-off semantics with no new wait-queue
code.

- Free list: upstream's intrusive singly-linked scheme (next-pointer in
  each free block's first word, threaded back-to-front). block_size
  must be >= sizeof(void*) and word-aligned; alloc returns uninitialized
  memory (never zeroed).
- API: k_mem_slab_init, k_mem_slab_alloc(timeout), k_mem_slab_free,
  num_used/num_free/max_used_get, K_MEM_SLAB_DEFINE[_STATIC].
- Return codes match upstream: 0 / -ENOMEM (K_NO_WAIT, none free) /
  -EAGAIN (timeout) / -EINVAL (bad params); k_sem_take's -EBUSY is
  mapped to -ENOMEM.
- ISR-safe alloc(K_NO_WAIT) and free (IRAM/K_ISR_SAFE), like the other
  primitives.
- K_MEM_SLAB_DEFINE rounds block_size and buffer alignment up to a
  pointer word (upstream WB_UP), so the same DEFINE compiles on 32- and
  64-bit targets (the linux test host). The free list is threaded
  lazily on first use, since upstream's PRE_KERNEL SYS_INIT threading
  does not run on the linux target -- DEFINE'd slabs must therefore be
  first touched from thread context (documented).
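
The difference between the DEFINE-time rounding and the strict init-time check can be sketched as follows. `DEMO_WB_UP` and `demo_check_block_size` are hypothetical stand-ins written for this note; they assume pointer-word rounding is a power-of-two round-up and do not reproduce the actual macro bodies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Round x up to a multiple of the pointer size (a power of two). */
#define DEMO_WB_UP(x) \
    (((size_t)(x) + sizeof(void *) - 1u) & ~(sizeof(void *) - 1u))

/* Strict contract: reject rather than round. */
static int demo_check_block_size(size_t block_size)
{
    if (block_size < sizeof(void *) || (block_size % sizeof(void *)) != 0u) {
        return -EINVAL;
    }
    return 0;
}
```

A size one byte past a word rounds up to two words under the macro, while the strict check rejects the unrounded value on both 32- and 64-bit hosts.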

Divergence from upstream noted on the declarations; k_mem_slab_init
keeps upstream's strict word-alignment -EINVAL contract (no rounding).

Tests (7): init/accounting, -EINVAL on bad params, alloc distinctness/
in-bounds/no-overlap + exhaustion -ENOMEM + max_used high-water,
blocking-alloc timeout -EAGAIN, K_MEM_SLAB_DEFINE lazy-init usable,
blocking alloc woken by free (MT), N>blocks multi-waiter conservation
(MT), and FromISR free wakes a blocked allocator (HW-gated).

linux suite: 231/0 x3 (224 + 7; ISR test compiles out on linux).

Refs #46

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s list

Folded both review blockers (no correctness bugs were found by the
adversarial trace -- all 7 interleavings verified safe; these are
robustness/quality fixes):

- The embedded `avail` sem is now compile-time-initialized via
  Z_SEM_INITIALIZER in the DEFINE macro (like K_TIMER_DEFINE's embedded
  sem), so k_sem_init no longer runs on the lazy first-use path. That
  path is now pure free-list threading (IRAM-safe pointer work), which
  makes the K_ISR_SAFE annotations honest and lets a DEFINE'd slab be
  first-touched from an ISR -- the "first use from thread context"
  caveat is dropped.
- k_mem_slab_free now calls z_mem_slab_ensure_threaded too, so a
  DEFINE'd slab freed before any alloc (caller error) can't push onto
  an unthreaded list.
- BUILD_ASSERT num_blocks >= 1 in the DEFINE macros.
- ensure_threaded's under-lock re-check uses __atomic_load_n (TSAN
  cleanliness; it was already correct under the portMUX barrier).
- Comment the intentional num_used transient-skew during the free->
  woken-alloc hand-off.
- Test: assert a freed-then-realloc'd block keeps its bytes (alloc
  returns uninitialized memory, never zeroed).

linux suite: 231/0 x3.

Refs #46

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>