feat(17): speak SLA — caches, JSON-safe fetch, social fast lane by moyunzero · Pull Request #5 · moyunzero/AetherLife

moyunzero · 2026-06-11T16:24:20Z

Summary

Phase 17 speak SLA on top of merged Phase 16 (main).

game-server: worker-state / memory-context / lore-chunk caches; embed guard; internal latency; concurrency gate; hostile ambient speak_end gate
worker: social fast lane; JSON-safe fetch + empty tool-args fallbacks for parallel speak
verify scripts: Phase 16 world coords, phase8 NL crowd-clear, phase11 dual-tab spawn clear, phase12 hostile drift tolerance
docs: CONTRACTS + LLM-E2E-FLOW-AND-LATENCY

Test plan

pnpm --filter @aetherlife/game-server test — 175 passed
cd workers/agent-worker && LLM_MOCK=1 uv run pytest -q — 201 passed
pnpm agent:verify --e2e --base — GF-01/03/04/05/06/07 + phase8 cross-layer (~7 min, real LLM, post-commit)
pre-push pnpm agent:verify — OK

Notes

assets/pet-cat/ intentionally not included (untracked art assets).

…etch Add layered caches for worker-state, memory-context, and lore chunks; harden embed and HTTP JSON parsing to avoid empty-body decode errors. Introduce social edge fast lane and speak intent improvements on the worker path; update CONTRACTS and verify scripts accordingly. Co-authored-by: Cursor <cursoragent@cursor.com>

chatgpt-codex-connector · 2026-06-11T16:24:26Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

coderabbitai · 2026-06-11T16:24:33Z

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

This PR internalizes apply-actions, adds concurrency gates and caches (worker-state, memory, chunk lore) with per-player invalidation and stale fallbacks, instruments internal latency, and introduces a deterministic social-edge fast-lane plus related tests and scripts.

Changes

Cache Infrastructure and Concurrency Gating

Layer / File(s)	Summary
Concurrency gate and wrappers `apps/game-server/src/util/concurrencyGate.ts`	Priority queue gate caps concurrent async work (MAX_CONCURRENT=2) and exposes lore/embed wrappers and test reset.
Worker state cache (TTL modes) & invalidation `apps/game-server/src/colyseus/workerStateCache.ts`, `apps/game-server/src/colyseus/workerStateCache.test.ts`	Adds TTL variants (skipNearbyLore vs with lore), `setCachedWorkerState(..., skipNearbyLore)` and `invalidateWorkerStateForPlayer(roomId, playerId)`; tests verify per-player invalidation.
Memory-context cache `apps/game-server/src/memory/memoryContextCache.ts`	In-memory 5s TTL cache keyed by room/player/npc/message+skipEmbed; helpers to get/set/clear.
Chunk lore cache & concurrency `apps/game-server/src/world/lore-chunk-cache.ts`, `apps/game-server/src/world/lore-chunk-cache.test.ts`	10-minute cache for chunk lore, uses runWithLoreConcurrencyLimit to prevent duplicate fetches; test asserts single repo hit.
Embedding via concurrency gate `apps/game-server/src/memory/embed.ts`	`embedText(text, options?)` now runs under runWithEmbedConcurrencyLimit and supports `options.priority` for hot-path priority.

Player-Scoped Operations and State Invalidation

Layer / File(s)	Summary
Per-player collective cleanup `apps/game-server/src/collective/service.ts`, `packages/npc-memory/src/collective/repository.ts`, `packages/npc-memory/src/collective/repository.test.ts`	Adds `deleteForPlayer(roomId, playerId)` in repo and service; tests validate attitudes cleared for target player while room events remain.
Invalidate worker state on move `apps/game-server/src/colyseus/GameRoom.ts`	On successful move derive effective playerId and call `invalidateWorkerStateForPlayer` to drop cached worker-state for that player.
Scoped reset `apps/game-server/src/colyseus/bridge.ts`	`resetColyseusFromMap(..., requestingPlayerId?)` can scope reset to a single player anchored to HOME_DEFAULT_PLAYER_SPAWN; otherwise resets all players from map.player.
Reset endpoint wiring `apps/game-server/src/routes/rooms.ts`	Reset handler passes `playerId` into resetColyseusFromMap and calls `deleteForPlayer(roomId, playerId)` rather than room-wide deletion.

Internal Endpoint Security and Routing

Layer / File(s)	Summary
Worker auth middleware `apps/game-server/src/routes/internal.ts`	`requireWorkerAuth` allows open internal in test/ALLOW_OPEN_INTERNAL mode, otherwise 503 when token missing, and validates Bearer token when present.
Internal apply-actions routing `apps/game-server/src/routes/rooms.ts`	Removes public `POST /:roomId/apply-actions`; `createInternalRoomsRouter` wires the internal route with `requireWorkerAuth` and normalizes initiatorPlayerId from header/body.
Worker-state route instrumentation `apps/game-server/src/routes/rooms.ts`	Internal `/worker-state` logs latency via `logInternalLatency` and populates cache via `setCachedWorkerState(..., skipNearbyLore)`.

Memory Context and Latency Instrumentation

Layer / File(s)	Summary
buildMemoryContext priority `apps/game-server/src/memory/service.ts`	`buildMemoryContext(..., options?)` accepts `embedPriority?: boolean` and forwards to `embedText`.
Internal latency logging `apps/game-server/src/observability/internalLatency.ts`	Adds `InternalLatencyLog` type and `logInternalLatency(entry)` which prints structured stderr latency lines.
Memory-context route caching `apps/game-server/src/routes/internal-memories.ts`	GET memory-context reads `x-speak-hot-path`, checks memoryContext cache, logs hit/miss, calls buildMemoryContext with `embedPriority` from header, and caches result.

Speak Job Control Flow

Layer / File(s)	Summary
clearSpeakInFlight options `apps/game-server/src/colyseus/GameRoom.ts`	`clearSpeakInFlight(jobId, options?)` can suppress ambient `"speak_end"` enqueue via `options.enqueueAmbient`.
SSE terminal handling `apps/game-server/src/sse/hub.ts`, `apps/game-server/src/sse/hub.colyseus.test.ts`	Terminal events compute `gateRejected` and call `clearSpeakInFlight(jobId, { enqueueAmbient })` where `enqueueAmbient` is true only for `done` without gateRejected; tests added for gateRejected behavior.
Chat job lifecycle & events auth `apps/game-server/src/routes/chat.ts`	Tracks `speakAcquired`/`jobId`; on error only clears/unregisters when job was acquired; GET events validates job existence and requester playerId authorisation.

Worker Resilience, Lazy Lore, and Collective Gate

Layer / File(s)	Summary
Stale worker snapshot cache `workers/agent-worker/src/graph/npc_loop.py`	In-memory per-(room,player) stale snapshot cache; `fetch_state` stores snapshots, uses `safe_response_json`, and falls back to stale snapshot on timeout annotated with `_stale`/_stale_age_ms.
Collective gate fields `workers/agent-worker/src/graph/npc_loop.py`	`_load_collective_gate_fields` loads memory-context with `skip_embed=true` and extracts attitude/allowed_tools for PHYSICAL intent flows.
Lazy nearby-lore `workers/agent-worker/src/graph/npc_loop.py`	`fetch_nearby_lore_into_snapshot` enables on-demand nearbyLore population for NARRATIVE when `message_needs_nearby_lore` is true.
fetch_state_and_memory orchestration updates `workers/agent-worker/src/graph/npc_loop.py`	Coordinates skip/full-memory branches, records timing metrics, and integrates collective-gate and lazy nearby-lore fetches.

Social-Edge Fast-Lane and Intent Routing

Layer / File(s)	Summary
Speak intent helpers `workers/agent-worker/src/graph/speak_intent.py`	Adds `message_needs_nearby_lore` and expands `should_skip_memory_embed` to skip for CASUAL, SOCIAL_EDGE, NARRATIVE.
Social-edge router `workers/agent-worker/src/graph/speak_intent.py`	`can_use_social_edge_fast_lane` returns deterministic turn when intent is SOCIAL_EDGE and turn exists (not ignored).
Social-edge fast-lane implementation `workers/agent-worker/src/graph/social_edge_fast_lane.py`	`run_social_edge_fast_lane` builds state, fetches state/memory, reconciles social, applies event, runs tools, composes reply, and returns GraphState with timings.
Job gating & fast-lane invocation `workers/agent-worker/src/main.py`	Introduces `_speak_in_progress` guard to serialize speak jobs; `process_job` evaluates social-edge fast-lane and invokes `run_social_edge_fast_lane` when selected.

Tool parsing and invoke robustness

Layer / File(s)	Summary
Tool arg parsing `workers/agent-worker/src/graph/tools.py`	`parse_tool_calls` now trims args, treats empty as `{}`, and guards JSON parsing to avoid exceptions.
Invoke tool-bound LLM fallback `workers/agent-worker/src/llm/invoke_tools.py`	`invoke_tool_bound_llm` retries on empty-tool-args JSONDecodeError by unwrapping `.bound` LLM; tests added.
apply_tools hostile-gate handling `workers/agent-worker/src/graph/npc_loop.py`	On 403 hostile_gate responses, `apply_tools` returns state with `gate_rejected=true`, `gate_kind`, and clears `tool_calls` instead of raising; tests added.

Test suite & scripts

Layer / File(s)	Summary
Game-server tests `apps/game-server/src/index.test.ts`	Refactors tests to use `internalApplyActions()` helper, asserts public apply route removed, adds internal worker-state 503 behavior tests, and updates collective reset assertions to per-player.
Worker tests `workers/agent-worker/tests/*`	Adds/updates tests for safe_response_json, invoke_tool_bound_llm, parse_tool_calls, fetch_state_and_memory behaviors, social intent fast-lane, and hostile-gate handling.
E2E scripts & UAT `scripts/verify-*.mjs`, `scripts/uat-phase7-reset-snap.mjs`	Multiple verify scripts and UAT updated to call internal apply-actions endpoints, include optional Authorization when INTERNAL_WORKER_TOKEN is set, add preflight worker-state hot-path probes, and adjust spawn/home constants.

Docs & Contracts

Layer / File(s)	Summary
Contracts `docs/CONTRACTS.md`	Updates C-01/C-03: speak hot-path defaults, skipNearbyLore rules, stale-snapshot fallback, internal apply-actions write contract, and header usage.
Benchmark docs `docs/LLM-E2E-FLOW-AND-LATENCY.md`	Adds finer timing fields and a Phase 17 pre-LLM target with p95 thresholds and related notes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

moyunzero/AetherLife#2: Related homestead/reset changes impacting scoped reset behavior and spawn anchoring.
moyunzero/AetherLife#3: Related ambient/speak lifecycle and speak→ambient-intent flow changes.

Poem

🐰 I hopped through caches, gates, and queues,

per-player wipes and worker-views,
Fast-lanes hum and lore sleeps light,
Auth moved in, tests pass at night—
A rabbit cheers the runtime tune, swift and true.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/phase17-speak-sla

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

apps/game-server/src/routes/internal-memories.ts (1)
67-92: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Memory-context cache needs invalidation on memory mutations to avoid stale read-after-write responses.

This new cache is now authoritative for Line 67 onward, but write routes in the same router (POST /:roomId/memories, POST /:roomId/reflect, POST /:roomId/summarize-bulk, DELETE /:roomId/memories) do not invalidate affected keys. Repeated prompts can return stale context until TTL expiry.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/game-server/src/routes/internal-memories.ts` around lines 67 - 92, The
memory-context cache (used via memoryContextCacheKey, getCachedMemoryContext,
setCachedMemoryContext) is not being invalidated on writes, causing stale
responses; add cache invalidation after any mutation by removing relevant keys
(e.g., by roomId/playerId/npcId scope) — implement or expose a helper like
invalidateMemoryContext(roomId, playerId?, npcId?) that deletes matching cache
entries and call it at the end of each write handler: POST /:roomId/memories,
POST /:roomId/reflect, POST /:roomId/summarize-bulk, and DELETE
/:roomId/memories (use the same identifiers used to build keys so the fresh
buildMemoryContext returns up-to-date data).

🧹 Nitpick comments (2)

workers/agent-worker/src/graph/social_edge_fast_lane.py (1)

49-50: 💤 Low value

Consider reusing a single httpx.Client context.

The function creates two separate httpx.Client() contexts for sequential synchronous calls. Since there's no threading here, a single client context would reduce overhead.

♻️ Suggested refactor

-    with httpx.Client() as client:
-        state = fetch_state_and_memory(state, settings=cfg, client=client)
-
-    player_msg = player_message.strip()
-    reconciled = reconcile_social_perception(player_msg, preview.social)
-    turn = preview.model_copy(update={"social": reconciled})
-
-    state = {
-        **state,
-        "social_perception": turn.social.model_dump(),
-        "reply_draft": turn.reply,
-        "tool_calls": [],
-        "social_applied": False,
-        "collective_updated": False,
-    }
-
-    state = apply_social_event(state)
-    state = refresh_collective_in_state(state)
-
-    with httpx.Client() as client:
-        state = apply_tools(state, settings=cfg, client=client)
+    with httpx.Client() as client:
+        state = fetch_state_and_memory(state, settings=cfg, client=client)
+
+        player_msg = player_message.strip()
+        reconciled = reconcile_social_perception(player_msg, preview.social)
+        turn = preview.model_copy(update={"social": reconciled})
+
+        state = {
+            **state,
+            "social_perception": turn.social.model_dump(),
+            "reply_draft": turn.reply,
+            "tool_calls": [],
+            "social_applied": False,
+            "collective_updated": False,
+        }
+
+        state = apply_social_event(state)
+        state = refresh_collective_in_state(state)
+        state = apply_tools(state, settings=cfg, client=client)

Also applies to: 68-69

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@workers/agent-worker/src/graph/social_edge_fast_lane.py` around lines 49 -
50, The current code creates two separate httpx.Client() contexts for sequential
calls; open a single httpx.Client() as client once and reuse it for both calls
(wrap the block that calls fetch_state_and_memory(...) and
fetch_social_edges(...) in one `with httpx.Client() as client:`), remove the
second client context, and pass the same `client` instance into both functions
(they already accept a client param: fetch_state_and_memory and
fetch_social_edges) so you reduce overhead and reuse connections.

apps/game-server/src/world/lore-chunk-cache.test.ts (1)

11-21: ⚡ Quick win

Add a concurrent same-key test to lock in stampede prevention behavior.

Current coverage verifies sequential hits, but not concurrent callers on an empty cache. Add a Promise.all([...]) case and assert repository fetch count stays 1.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/game-server/src/world/lore-chunk-cache.test.ts` around lines 11 - 21,
Add a concurrent-request test to ensure stampede prevention: update the test
that uses getChunkLoreCached and loreRepo.getChunkLore by adding a
Promise.all([...]) call that invokes getChunkLoreCached("world-1", 1, 2)
multiple times concurrently when the cache is empty, await the Promise.all
result, assert all results equal the mocked row, and assert the spy on
loreRepo.getChunkLore was called exactly once; keep the existing spy setup
(vi.spyOn(loreRepo, "getChunkLore").mockResolvedValue(row)) and reuse the same
arguments so the test verifies concurrent callers share a single repository
fetch.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/game-server/src/colyseus/bridge.ts`:
- Around line 200-204: The reset currently uses the singleton map.player as the
anchor which can desync multiplayer; instead, for each entry in state.players
use that player's own position or home/spawn coordinates when finding a reset
location. Replace the anchor = map.player usage in the loop with the per-player
anchor (e.g., use player.x/player.y or player.homeX/player.homeY if a spawn
field exists) before calling buildMoveGrid/findNearestWalkableCell so each
player's spawn is computed from their own state rather than the global
map.player.

In `@apps/game-server/src/index.test.ts`:
- Around line 307-311: The test currently asserts that "p-a" attitude is removed
but doesn't assert that "p-b" remains; update the reset test around
stateRes.body.attitudes to explicitly find the attitude for playerId "p-b"
(similar to how attitudeA is found) and assert it's present (not undefined)
and/or has expected properties, e.g., check that attitudeB !== undefined or
matches expected values; locate this logic near the existing expectations using
stateRes and attitudes in index.test.ts.

In `@apps/game-server/src/routes/internal-memories.ts`:
- Around line 65-68: The cache key is being computed using playerId that may not
consider the X-Player-Id header; before calling memoryContextCacheKey, read the
primary identity from req.header("x-player-id") and use that as playerId
(falling back to playerIdFromInternal(...) only if the header is missing) so
cache partitioning always prefers the X-Player-Id value; update the code around
the speakHotPath/cacheKey lines to derive playerId from
req.header("x-player-id") first and then pass that into memoryContextCacheKey.

In `@apps/game-server/src/world/lore-chunk-cache.ts`:
- Around line 24-31: Concurrent callers all call getChunkLore for the same cache
key causing stampedes; fix by adding an in-flight promise deduplication map
keyed by cacheKey(worldId, cx, cy): before awaiting
runWithLoreConcurrencyLimit(() => getChunkLore(...)) check and return an
existing in-flight promise result, otherwise create and store the promise (the
result of runWithLoreConcurrencyLimit(...)), await it, then cache.set(key, {
expiresAt: Date.now() + TTL_MS, row }) and remove the entry from the in-flight
map in both success and failure paths so later callers either reuse the
in-flight promise or trigger a fresh fetch.

In `@docs/CONTRACTS.md`:
- Line 20: Update the contract entry for POST /internal/rooms/:id/apply-actions
in docs/CONTRACTS.md to mention the optional body field jobId (used by
recordSuccessfulMutation for audit tracking); explicitly mark it optional and
show where it fits alongside actingNpcId and actions[] so integrators know it
can be supplied but is not required.

---

Outside diff comments:
In `@apps/game-server/src/routes/internal-memories.ts`:
- Around line 67-92: The memory-context cache (used via memoryContextCacheKey,
getCachedMemoryContext, setCachedMemoryContext) is not being invalidated on
writes, causing stale responses; add cache invalidation after any mutation by
removing relevant keys (e.g., by roomId/playerId/npcId scope) — implement or
expose a helper like invalidateMemoryContext(roomId, playerId?, npcId?) that
deletes matching cache entries and call it at the end of each write handler:
POST /:roomId/memories, POST /:roomId/reflect, POST /:roomId/summarize-bulk, and
DELETE /:roomId/memories (use the same identifiers used to build keys so the
fresh buildMemoryContext returns up-to-date data).

---

Nitpick comments:
In `@apps/game-server/src/world/lore-chunk-cache.test.ts`:
- Around line 11-21: Add a concurrent-request test to ensure stampede
prevention: update the test that uses getChunkLoreCached and
loreRepo.getChunkLore by adding a Promise.all([...]) call that invokes
getChunkLoreCached("world-1", 1, 2) multiple times concurrently when the cache
is empty, await the Promise.all result, assert all results equal the mocked row,
and assert the spy on loreRepo.getChunkLore was called exactly once; keep the
existing spy setup (vi.spyOn(loreRepo, "getChunkLore").mockResolvedValue(row))
and reuse the same arguments so the test verifies concurrent callers share a
single repository fetch.

In `@workers/agent-worker/src/graph/social_edge_fast_lane.py`:
- Around line 49-50: The current code creates two separate httpx.Client()
contexts for sequential calls; open a single httpx.Client() as client once and
reuse it for both calls (wrap the block that calls fetch_state_and_memory(...)
and fetch_social_edges(...) in one `with httpx.Client() as client:`), remove the
second client context, and pass the same `client` instance into both functions
(they already accept a client param: fetch_state_and_memory and
fetch_social_edges) so you reduce overhead and reuse connections.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 3522da14-d770-4d59-82cc-a1fce28187cc

📥 Commits

Reviewing files that changed from the base of the PR and between 85164da and d88cd6b.

📒 Files selected for processing (40)

apps/game-server/src/collective/service.ts
apps/game-server/src/colyseus/GameRoom.ts
apps/game-server/src/colyseus/bridge.ts
apps/game-server/src/colyseus/workerStateCache.test.ts
apps/game-server/src/colyseus/workerStateCache.ts
apps/game-server/src/index.test.ts
apps/game-server/src/memory/embed.ts
apps/game-server/src/memory/memoryContextCache.ts
apps/game-server/src/memory/service.ts
apps/game-server/src/observability/internalLatency.ts
apps/game-server/src/routes/chat.ts
apps/game-server/src/routes/internal-memories.ts
apps/game-server/src/routes/internal.ts
apps/game-server/src/routes/rooms.ts
apps/game-server/src/sse/hub.ts
apps/game-server/src/util/concurrencyGate.ts
apps/game-server/src/world/lore-chunk-cache.test.ts
apps/game-server/src/world/lore-chunk-cache.ts
docs/CONTRACTS.md
docs/LLM-E2E-FLOW-AND-LATENCY.md
packages/npc-memory/src/collective/repository.test.ts
packages/npc-memory/src/collective/repository.ts
scripts/uat-phase7-reset-snap.mjs
scripts/verify-phase10.mjs
scripts/verify-phase12.mjs
scripts/verify-phase2.mjs
scripts/verify-phase4.mjs
scripts/verify-phase5.mjs
workers/agent-worker/src/collective/social_turn.py
workers/agent-worker/src/graph/ambient_intent.py
workers/agent-worker/src/graph/nodes/llm_social_turn.py
workers/agent-worker/src/graph/npc_loop.py
workers/agent-worker/src/graph/social_edge_fast_lane.py
workers/agent-worker/src/graph/speak_intent.py
workers/agent-worker/src/main.py
workers/agent-worker/src/memory/client.py
workers/agent-worker/tests/test_fetch_state_and_memory.py
workers/agent-worker/tests/test_social_turn.py
workers/agent-worker/tests/test_speak_intent.py
workers/agent-worker/tests/test_tool_gate.py

coderabbitai · 2026-06-11T16:39:37Z

+    const anchor = map.player;
+    state.players.forEach((player, sessionId) => {
+      const grid = buildMoveGrid(map, state, sessionId, loader);
+      const spawn = findNearestWalkableCell(anchor.x, anchor.y, grid);
+      player.x = spawn.x;


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use state.players/home spawn for resets; don’t anchor from map.player in multiplayer.

Line 200 reintroduces singleton RoomState.player as the position source for resetting all humans, which can desync multiplayer semantics and snap everyone from the wrong anchor.

Suggested fix

- } else { - const anchor = map.player; - state.players.forEach((player, sessionId) => { - const grid = buildMoveGrid(map, state, sessionId, loader); - const spawn = findNearestWalkableCell(anchor.x, anchor.y, grid); + } else { + state.players.forEach((player, sessionId) => { + const grid = buildMoveGrid(map, state, sessionId, loader); + const spawn = findNearestWalkableCell( + HOME_DEFAULT_PLAYER_SPAWN.x, + HOME_DEFAULT_PLAYER_SPAWN.y, + grid, + ); player.x = spawn.x; player.y = spawn.y; }); }

As per coding guidelines: “Multiplayer: human positions live in players, not lone RoomState.player.”

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const anchor = map.player;

state.players.forEach((player, sessionId) => {

const grid = buildMoveGrid(map, state, sessionId, loader);

const spawn = findNearestWalkableCell(anchor.x, anchor.y, grid);

player.x = spawn.x;

const grid = buildMoveGrid(map, state, sessionId, loader);

state.players.forEach((player, sessionId) => {

const grid = buildMoveGrid(map, state, sessionId, loader);

const spawn = findNearestWalkableCell(

HOME_DEFAULT_PLAYER_SPAWN.x,

HOME_DEFAULT_PLAYER_SPAWN.y,

grid,

);

player.x = spawn.x;

player.y = spawn.y;

});

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/game-server/src/colyseus/bridge.ts` around lines 200 - 204, The reset currently uses the singleton map.player as the anchor which can desync multiplayer; instead, for each entry in state.players use that player's own position or home/spawn coordinates when finding a reset location. Replace the anchor = map.player usage in the loop with the per-player anchor (e.g., use player.x/player.y or player.homeX/player.homeY if a spawn field exists) before calling buildMoveGrid/findNearestWalkableCell so each player's spawn is computed from their own state rather than the global map.player.

Source: Coding guidelines

coderabbitai · 2026-06-11T16:39:37Z

+    expect(stateRes.body.recentEvents).toHaveLength(1);
+    const attitudeA = stateRes.body.attitudes?.find(
+      (a: { playerId: string }) => a.playerId === "p-a",
+    );
+    expect(attitudeA).toBeUndefined();


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert surviving player attitude explicitly in the reset test.

Line 307 and Line 311 verify event retention and p-a removal, but they do not prove p-b attitude still exists; this case can pass if attitudes are accidentally cleared wholesale.

Suggested test hardening

expect(stateRes.status).toBe(200); expect(stateRes.body.recentEvents).toHaveLength(1); const attitudeA = stateRes.body.attitudes?.find( (a: { playerId: string }) => a.playerId === "p-a", ); expect(attitudeA).toBeUndefined(); + const attitudeB = stateRes.body.attitudes?.find( + (a: { playerId: string }) => a.playerId === "p-b", + ); + expect(attitudeB).toBeDefined();

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

expect(stateRes.body.recentEvents).toHaveLength(1);

const attitudeA = stateRes.body.attitudes?.find(

(a: { playerId: string }) => a.playerId === "p-a",

);

expect(attitudeA).toBeUndefined();

expect(stateRes.body.recentEvents).toHaveLength(1);

const attitudeA = stateRes.body.attitudes?.find(

(a: { playerId: string }) => a.playerId === "p-a",

);

expect(attitudeA).toBeUndefined();

const attitudeB = stateRes.body.attitudes?.find(

(a: { playerId: string }) => a.playerId === "p-b",

);

expect(attitudeB).toBeDefined();

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/game-server/src/index.test.ts` around lines 307 - 311, The test currently asserts that "p-a" attitude is removed but doesn't assert that "p-b" remains; update the reset test around stateRes.body.attitudes to explicitly find the attitude for playerId "p-b" (similar to how attitudeA is found) and assert it's present (not undefined) and/or has expected properties, e.g., check that attitudeB !== undefined or matches expected values; locate this logic near the existing expectations using stateRes and attitudes in index.test.ts.

coderabbitai · 2026-06-11T16:39:37Z

+export function getCachedMemoryContext(key: string): MemoryContext | null {
+  const entry = cache.get(key);
+  if (!entry) return null;
+  if (Date.now() > entry.expiresAt) {
+    cache.delete(key);
+    return null;
+  }
+  return entry.context;
+}
+
+export function setCachedMemoryContext(key: string, context: MemoryContext): void {
+  cache.set(key, { expiresAt: Date.now() + TTL_MS, context });
+}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bound the memory-context cache to avoid unbounded growth.

Expired entries are only cleaned on reads (Line 27). Entries written once and never read again remain indefinitely, so unique playerMessage traffic can grow memory without bound.

Proposed fix

const TTL_MS = 5000; +const MAX_ENTRIES = 2000; type CacheEntry = { expiresAt: number; context: MemoryContext; }; const cache = new Map<string, CacheEntry>(); + +function pruneExpired(now = Date.now()): void { + for (const [k, v] of cache) { + if (now > v.expiresAt) cache.delete(k); + } +} export function setCachedMemoryContext(key: string, context: MemoryContext): void { + pruneExpired(); + if (cache.size >= MAX_ENTRIES) { + const oldestKey = cache.keys().next().value as string | undefined; + if (oldestKey) cache.delete(oldestKey); + } cache.set(key, { expiresAt: Date.now() + TTL_MS, context }); }

coderabbitai · 2026-06-11T16:39:37Z

+    const speakHotPath = req.header("x-speak-hot-path") === "1";
+
+    const cacheKey = memoryContextCacheKey(roomId, playerId, npcId, playerMessage, skipEmbed);
+    const started = Date.now();


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use X-Player-Id as the primary identity source for this hot path before cache-keying.

On Line 67, cache partitioning depends on playerId, but playerIdFromInternal(...) does not read the player-id header. That can mis-scope context/caching when query/body values are absent or inconsistent.

As per coding guidelines, “spatial NL needs X-Player-Id header.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/game-server/src/routes/internal-memories.ts` around lines 65 - 68, The cache key is being computed using playerId that may not consider the X-Player-Id header; before calling memoryContextCacheKey, read the primary identity from req.header("x-player-id") and use that as playerId (falling back to playerIdFromInternal(...) only if the header is missing) so cache partitioning always prefers the X-Player-Id value; update the code around the speakHotPath/cacheKey lines to derive playerId from req.header("x-player-id") first and then pass that into memoryContextCacheKey.

Source: Coding guidelines

coderabbitai · 2026-06-11T16:39:37Z

+  const key = cacheKey(worldId, cx, cy);
+  const hit = cache.get(key);
+  if (hit && Date.now() <= hit.expiresAt) {
+    return hit.row;
+  }
+  const row = await runWithLoreConcurrencyLimit(() => getChunkLore(worldId, cx, cy));
+  cache.set(key, { expiresAt: Date.now() + TTL_MS, row });
+  return row;


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Deduplicate in-flight fetches per chunk key to prevent cache-miss stampedes.

On Line 29, concurrent calls for the same key all execute getChunkLore(...) before any result is cached. In hot-path fan-out this can multiply DB load and latency under burst traffic.

Suggested fix

const cache = new Map<string, CacheEntry>(); +const inFlight = new Map<string, Promise<ChunkLoreRow | null>>(); export async function getChunkLoreCached( worldId: string, cx: number, cy: number, ): Promise<ChunkLoreRow | null> { const key = cacheKey(worldId, cx, cy); const hit = cache.get(key); if (hit && Date.now() <= hit.expiresAt) { return hit.row; } - const row = await runWithLoreConcurrencyLimit(() => getChunkLore(worldId, cx, cy)); + const existing = inFlight.get(key); + if (existing) return existing; + const pending = runWithLoreConcurrencyLimit(() => getChunkLore(worldId, cx, cy)) + .then((row) => { + cache.set(key, { expiresAt: Date.now() + TTL_MS, row }); + return row; + }) + .finally(() => { + inFlight.delete(key); + }); + inFlight.set(key, pending); + const row = await pending; cache.set(key, { expiresAt: Date.now() + TTL_MS, row }); return row; }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/game-server/src/world/lore-chunk-cache.ts` around lines 24 - 31, Concurrent callers all call getChunkLore for the same cache key causing stampedes; fix by adding an in-flight promise deduplication map keyed by cacheKey(worldId, cx, cy): before awaiting runWithLoreConcurrencyLimit(() => getChunkLore(...)) check and return an existing in-flight promise result, otherwise create and store the promise (the result of runWithLoreConcurrencyLimit(...)), await it, then cache.set(key, { expiresAt: Date.now() + TTL_MS, row }) and remove the entry from the in-flight map in both success and failure paths so later callers either reuse the in-flight promise or trigger a fresh fetch.

coderabbitai · 2026-06-11T16:39:37Z

+| **Memory 热路径** | `GET .../memory-context`：`skipEmbed=1` 用于 CASUAL/SOCIAL_EDGE/NARRATIVE；RECALL 须 full embed；5s 进程内 cache；header `X-Speak-Hot-Path: 1` 优先 embed 队列 |
 | **Prompt** | 「我 / 下方 / 旁边」仅相对 **本次** `state.player`（见 `workers/.../prompt.py`） |
-| **执行** | `POST .../apply-actions` body: `actingNpcId`, `actions[]`, `initiatorPlayerId` |
+| **执行** | 唯一 HTTP 写入口：`POST /internal/rooms/:id/apply-actions`（Bearer `INTERNAL_WORKER_TOKEN` + header `X-Player-Id`；body `initiatorPlayerId` 仅 header 缺失时 fallback）。公开 `/rooms/.../apply-actions` 已移除。body: `actingNpcId`, `actions[]` |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document optional jobId body field for completeness.

The contract specifies body: actingNpcId, actions[], but the implementation also accepts an optional jobId field used for audit trail tracking (recordSuccessfulMutation). While jobId is optional and its omission doesn't break functionality, the contract should mention it for completeness and to prevent confusion during integration.

📝 Suggested contract clarification

-| **执行** | 唯一 HTTP 写入口：`POST /internal/rooms/:id/apply-actions`（Bearer `INTERNAL_WORKER_TOKEN` + header `X-Player-Id`；body `initiatorPlayerId` 仅 header 缺失时 fallback）。公开 `/rooms/.../apply-actions` 已移除。body: `actingNpcId`, `actions[]` | +| **执行** | 唯一 HTTP 写入口：`POST /internal/rooms/:id/apply-actions`（Bearer `INTERNAL_WORKER_TOKEN` + header `X-Player-Id`；body `initiatorPlayerId` 仅 header 缺失时 fallback）。公开 `/rooms/.../apply-actions` 已移除。body: `actingNpcId`, `actions[]`, `jobId?`（可选，用于审计） |

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

| **执行** | 唯一 HTTP 写入口：`POST /internal/rooms/:id/apply-actions`（Bearer `INTERNAL_WORKER_TOKEN` + header `X-Player-Id`；body `initiatorPlayerId` 仅 header 缺失时 fallback）。公开 `/rooms/.../apply-actions` 已移除。body: `actingNpcId`, `actions[]` |

| **执行** | 唯一 HTTP 写入口：`POST /internal/rooms/:id/apply-actions`（Bearer `INTERNAL_WORKER_TOKEN` + header `X-Player-Id`；body `initiatorPlayerId` 仅 header 缺失时 fallback）。公开 `/rooms/.../apply-actions` 已移除。body: `actingNpcId`, `actions[]`, `jobId?`（可选，用于审计） |

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/CONTRACTS.md` at line 20, Update the contract entry for POST /internal/rooms/:id/apply-actions in docs/CONTRACTS.md to mention the optional body field jobId (used by recordSuccessfulMutation for audit tracking); explicitly mark it optional and show where it fits alongside actingNpcId and actions[] so integrators know it can be supplied but is not required.

Add JSON-safe fetch/tool-call fallbacks for parallel speak, gate hostile ambient enqueue on hub rejection, and adapt verify scripts for Phase 16 world plus phase8 NL staging with spawn crowd clear. Co-authored-by: Cursor <cursoragent@cursor.com>

Move the second browser tab within cached home chunk before cache-nav so pageA can reach (34,13) without triggering extra lore enqueues. Co-authored-by: Cursor <cursoragent@cursor.com>

Assert hostile NL move block on speak-path coords; allow ≤1 cell HTTP drift from ambient wander after the turn completes. Co-authored-by: Cursor <cursoragent@cursor.com>

coderabbitai Bot reviewed Jun 11, 2026

View reviewed changes

moyunzero and others added 3 commits June 12, 2026 13:25

fix(e2e): clear pageB from homestead spawn in verify-phase11

277780b

Move the second browser tab within cached home chunk before cache-nav so pageA can reach (34,13) without triggering extra lore enqueues. Co-authored-by: Cursor <cursoragent@cursor.com>

fix(e2e): tolerate ambient npc drift in verify-phase12 hostile gate

94bbf8f

Assert hostile NL move block on speak-path coords; allow ≤1 cell HTTP drift from ambient wander after the turn completes. Co-authored-by: Cursor <cursoragent@cursor.com>

moyunzero merged commit 27e7043 into main Jun 12, 2026
1 of 2 checks passed

moyunzero deleted the feat/phase17-speak-sla branch June 12, 2026 05:59

coderabbitai Bot mentioned this pull request Jun 12, 2026

refactor(18): modularize RoomScene and code-health batch #6

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(17): speak SLA — caches, JSON-safe fetch, social fast lane#5

feat(17): speak SLA — caches, JSON-safe fetch, social fast lane#5
moyunzero merged 4 commits into
mainfrom
feat/phase17-speak-sla

moyunzero commented Jun 11, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot commented Jun 11, 2026

Uh oh!

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading

Review failed

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	\| 执行 \| 唯一 HTTP 写入口：`POST /internal/rooms/:id/apply-actions`（Bearer `INTERNAL_WORKER_TOKEN` + header `X-Player-Id`；body `initiatorPlayerId` 仅 header 缺失时 fallback）。公开 `/rooms/.../apply-actions` 已移除。body: `actingNpcId`, `actions[]` \|
	\| 执行 \| 唯一 HTTP 写入口：`POST /internal/rooms/:id/apply-actions`（Bearer `INTERNAL_WORKER_TOKEN` + header `X-Player-Id`；body `initiatorPlayerId` 仅 header 缺失时 fallback）。公开 `/rooms/.../apply-actions` 已移除。body: `actingNpcId`, `actions[]`, `jobId?`（可选，用于审计） \|

Conversation

moyunzero commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Notes

Uh oh!

chatgpt-codex-connector Bot commented Jun 11, 2026

Uh oh!

coderabbitai Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

moyunzero commented Jun 11, 2026 •

edited

Loading

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading