Feat/phase20 memory speak trust#9
Conversation
- Nickname refusal override case (RED) - Non-recall question leaves LLM reply unchanged Co-authored-by: Cursor <cursoragent@cursor.com>
- Add scripts/lib/dialogue-engage.mjs shared by verify-phase19 and benchmark - Replace drawer .message--thinking with Phase 19 overlay T_think/T_first selectors - Add B_recall case with optional seed; write speak-browser-latest.json Co-authored-by: Cursor <cursoragent@cursor.com>
- extract_nickname + format_recall_answer for 我叫/叫我 patterns - Add 叫什么 recall marker for SOLO-01 nickname questions - Social system prompt: direct answer when memory_summary present Co-authored-by: Cursor <cursoragent@cursor.com>
Cross-session password/nickname recall with overlay speak, refusal guard, latency smoke, and 60s single-retry on 429/timeout/5xx per D-13. Co-authored-by: Cursor <cursoragent@cursor.com>
Wire pnpm verify:phase20 npm script and diff triggers for recall_merge and phase 20 E2E scripts in agent-verify-map. Co-authored-by: Cursor <cursoragent@cursor.com>
Wait for corner-menu connected state, open/close shell drawer for memory callback, extend memory poll default, and fix consecutive-turn baseline detection so verify:phase20 and benchmark:speak-browser pass on Phase 19 shell. Co-authored-by: Cursor <cursoragent@cursor.com>
Malformed schedule zoneIds (e.g. village-square@v1 without :localId) no longer take down Colyseus during ambient ticks; pickZoneTarget falls back to the NPC position so verify:phase20 can complete. Co-authored-by: Cursor <cursoragent@cursor.com>
ISSUE-050: password/nickname recall picks player-seed rows, rejects token echo and ambiguous LLM replies, extends RECALL memory-context timeout, and adds regression tests. Adds OpenAI-compatible GSD review providers (Agnes + NVIDIA NIM) and speak SLA playtest scripts for Phase 20 gates. Co-authored-by: Cursor <cursoragent@cursor.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (11)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (5)
📝 WalkthroughWalkthroughThis PR fixes Phase 20 cross-session memory recall (ISSUE-050) through deterministic password/nickname selection and reply merging in ChangesPhase 20 cross-session memory recall fix and worker integration
Phase 20 E2E verification and playtest pipeline
Web UI streaming reply rendering and dialogue-overlay enhancements
NPC behavior optimization: casual fast-lane short-circuit and social perception refinements
Movement and action intent pattern expansion
LLM model configuration and constraints
GSD cross-AI plan review CLI tooling
Zone wander malformed ID safety
Infrastructure, configuration, and documentation updates
Sequence Diagram(s)sequenceDiagram
participant E2E as verify-phase20.mjs
participant Browser as Playwright
participant UI as Web UI (Dialogue)
participant GameServer as Game Server
participant AgentWorker as Agent Worker
participant MemoryDB as Memory Storage
E2E->>GameServer: healthOk() GET /health
E2E->>Browser: launch chromium → navigate room
Browser->>UI: waitRoomReady() canvas visible
E2E->>UI: engageDialogue(page)
E2E->>UI: sendSpeakOverlay("密码是 sunbet42")
UI->>AgentWorker: speak turn + player_message
AgentWorker->>MemoryDB: store memory token
E2E->>GameServer: waitForMemoryContext(needle="sunbet42")
GameServer-->>E2E: memory context contains token
E2E->>Browser: page.reload()
E2E->>UI: engageDialogue(page)
E2E->>UI: sendSpeakOverlay("门禁密码是多少?")
UI->>AgentWorker: speak turn SpeakIntent.RECALL
AgentWorker->>MemoryDB: load_memory_context (timeout=18s, attempts=2)
MemoryDB-->>AgentWorker: retrieved memories + recency
AgentWorker->>AgentWorker: pick_recall_memory → merge_recall_into_reply
AgentWorker-->>UI: reply with "sunbet42"
UI-->>E2E: NPC reply verified (no refusal)
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 15
🧹 Nitpick comments (1)
scripts/benchmark-speak-browser.mjs (1)
136-145: 💤 Low valueAcceptable duplication for self-containment.
These selectors duplicate constants from
e2e-memory-helpers.mjs. This is intentional to keep the benchmark script self-contained without importing the full helpers module. Consider adding a comment noting the canonical source to help maintainers keep them in sync.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/benchmark-speak-browser.mjs` around lines 136 - 145, The THINKING_LOCATOR and OVERLAY_NPC_REPLY constants in the benchmark script intentionally duplicate selectors from e2e-memory-helpers.mjs for self-containment. Add a comment above these constant definitions documenting that they are duplicated from the canonical source in e2e-memory-helpers.mjs and noting that they should be kept in sync with the originals if either location is updated.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/assert-refusal-markers-parity.mjs`:
- Around line 20-21: The regex pattern in the matchAll call at line 20 only
matches double-quoted strings, causing it to miss single-quoted Python literals.
Modify the regex pattern to capture both single-quoted and double-quoted strings
by updating it to match either quote type, then ensure the captured marker is
extracted correctly from whichever quote type was matched (accounting for the
fact that alternation with multiple capture groups will have one group undefined
depending on which pattern matched).
In `@scripts/gsd-review-providers.mjs`:
- Around line 33-40: Update the ESLint configuration file (eslint.config.js) to
properly cover all root-level script files. Modify the files glob pattern on
line 7 from the current narrow pattern that only covers scripts/lib/**/*.mjs and
scripts/agent-verify.mjs to the broader pattern ["scripts/**/*.mjs"] to include
all .mjs files in the scripts directory. Additionally, add AbortController:
"readonly" to the globals section (lines 11-19) to properly declare the
AbortController global that is used in scripts/verify-llm-models.mjs and
scripts/lib/gsd-openai-review.mjs.
In `@scripts/lib/agent-verify-map.mjs`:
- Around line 156-161: The triggers array for the GF-10 check (in the
configuration starting at line 156) is missing patterns for three Phase 20
helper surface files that directly affect Phase 20 E2E behavior. Add regex
patterns to the triggers array to include
`/scripts\/lib\/speak-browser-round\.mjs$/`,
`/scripts\/lib\/speak-browser-stack\.mjs$/`, and
`/scripts\/lib\/dialogue-engage\.mjs$/` so that changes to these helper files
will automatically select the verify:phase20 check.
In `@scripts/lib/gsd-openai-review.mjs`:
- Around line 91-93: The `await res.json()` call at line 91 in the
`runOpenAiReview` function can throw an exception when the response body is not
valid JSON (e.g., HTML returned by a proxy with HTTP 200 status), causing the
entire multi-provider run to crash. Wrap the res.json() call and the subsequent
content extraction logic in a try-catch block, and return a structured failure
object (matching the expected return type of the function) instead of allowing
the exception to propagate. This ensures that JSON parsing errors are handled
gracefully and do not terminate the provider loop in
scripts/gsd-review-providers.mjs.
In `@scripts/lib/gsd-review-merge.mjs`:
- Around line 16-18: Remove the unused variable declarations to resolve ESLint
no-unused-vars errors. In scripts/lib/gsd-review-merge.mjs at lines 16-18,
delete the entire paddedPrefix variable declaration that checks phaseNum and
performs string manipulation. At line 54, delete the heading variable
declaration. Both variables are unused and their removal will clear the static
analysis violations.
- Around line 134-145: The `reviewers_failed` section in the frontmatter becomes
stale because the code intentionally skips updating it when the block already
exists. Currently, the condition that checks for existing `reviewers_failed:`
with the regex test at line 138 only has a comment explaining why the merge is
skipped, but does not actually update the block. To fix this, replace the
existing `reviewers_failed:` block in the frontmatter with the new `failLines`
content whenever it's detected, rather than leaving it unchanged. This ensures
the list stays current with actual failures from each run, preventing stale
entries from persisting and ensuring new failures are captured. Apply the same
replacement logic (using the replace method with the pattern) in both the if and
else branches to keep the behavior consistent.
In `@scripts/lib/gsd-review-providers-config.mjs`:
- Around line 82-87: The code at lines 82-87 wraps the numeric values for
timeoutMs and maxTokens with Number() without validating the result, which can
produce NaN if the input values are invalid. After calling Number() on the
override and defaults values for both timeoutMs and maxTokens, add validation to
ensure the result is not NaN. If the conversion results in NaN, either throw a
descriptive error indicating invalid configuration or fall back to a valid
default value. This ensures that only valid numeric values are assigned to
timeoutMs and maxTokens, preventing downstream issues like immediate aborts from
setTimeout or invalid max_tokens values sent to external services.
- Around line 108-111: The code at lines 108-111 has a path traversal
vulnerability where the normalized phase parameter could contain traversal
segments like `../` that allow escaping from the `.planning/phases` directory.
When calling `resolve(phasesRoot, normalized)`, validate that the resulting
directory path stays within `phasesRoot` before accepting it. Use the
`relative()` function to check that the resolved path does not escape the root,
or verify that the resolved absolute path starts with the `phasesRoot` absolute
path, and only return the directory if this validation passes.
In `@scripts/lib/speak-browser-round.mjs`:
- Around line 32-40: The function `waitForFirstNpcReply` currently has a
signature mismatch with its caller in `scripts/lib/e2e-memory-helpers.mjs`. The
caller invokes it with only 2 arguments (page and timeoutMs) and expects a
timestamp return value, but the current function signature requires 3 parameters
(page, baseline, timeoutMs) and returns a string. Restore the correct contract
by removing the baseline parameter, capturing an initial baseline within the
function before the polling loop, and changing the return value from
`current.slice(0, 200)` to `Date.now()` so that the function returns the
timestamp when the first NPC reply change is detected.
- Around line 87-92: The performance.getEntriesByName("speak_partial") call is
checking all historical entries globally, which allows old entries from previous
rounds to incorrectly indicate this round had a partial response. Modify the
evaluation to filter the returned entries so that only those with timing values
that occurred after the start of this round (comparing against t0) are counted.
This ensures the partial check is scoped to the current round only and prevents
old entries from masking real first-reply timeouts.
In `@scripts/lib/speak-browser-stack.mjs`:
- Around line 27-30: The requireFromWeb function call on line 29 uses invalid
Node.js require() syntax by including a version specifier in the module string
format `@playwright/test@${ver}`. Node.js require() does not support version
specifiers in require strings. Remove the version specifier portion from the
module string passed to requireFromWeb, changing it to just require the base
module name without the @${ver} suffix. Alternatively, if the intention is to
handle different versions dynamically, consider a different approach that aligns
with Node.js module resolution, or verify whether this fallback mechanism is
viable given that `@playwright/test` is not listed as a dependency in
apps/web/package.json.
In `@scripts/playtest-speak-sla.mjs`:
- Line 86: The error message in the throw statement that guides users to install
playwright dependencies currently directs them to use `npm install`, but per
monorepo guidelines this should use `pnpm` instead. Update the error message
string to replace `npm install` with `pnpm` to align with the monorepo's package
management standards.
- Around line 47-52: The numeric environment variable parsing on these lines can
produce NaN values when environment variables contain invalid input (non-numeric
strings), which breaks control flow in retry loops and timeouts. Wrap each
Number.parseInt call with validation to ensure it returns a valid number,
providing a sensible default fallback when parsing fails. For variables like
retryBackoffMs, turnGapMs, and minSessionMs that currently lack Math.max()
guards, add explicit NaN checks or use a helper function that validates the
parsed result and returns the default value if NaN is produced. This ensures
speakAttempts, retryBackoffMs, sessionCount, turnGapMs, and minSessionMs all
receive valid numeric values regardless of invalid environment input.
In `@workers/agent-worker/src/graph/nodes/llm_social_turn.py`:
- Line 310: The string on line 310 in llm_social_turn.py triggers a RUF001 lint
warning due to confusable fullwidth comma characters (,) embedded in the Chinese
text. Replace all fullwidth commas in this string with regular ASCII commas (,)
to resolve the warning. The string beginning with "若玩家追问 Memory summary 中已有的事实"
contains at least two fullwidth commas that need conversion.
In `@workers/agent-worker/src/graph/recall_merge.py`:
- Around line 168-173: The extract_nickname() function currently extracts
nicknames from any memory text without distinguishing between question rows and
fact rows. When a question row like "player: 我叫什么?" is processed, it incorrectly
extracts "什么" as a nickname, causing pick_recall_memory() to select questions as
facts. Modify extract_nickname() to return None if the memory_text is a question
row (contains question markers like ? or ?), ensuring nicknames are only
extracted from factual memory rows, not question rows.
---
Nitpick comments:
In `@scripts/benchmark-speak-browser.mjs`:
- Around line 136-145: The THINKING_LOCATOR and OVERLAY_NPC_REPLY constants in
the benchmark script intentionally duplicate selectors from
e2e-memory-helpers.mjs for self-containment. Add a comment above these constant
definitions documenting that they are duplicated from the canonical source in
e2e-memory-helpers.mjs and noting that they should be kept in sync with the
originals if either location is updated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 544d5021-a6d7-45be-9943-39b698d34bd9
📒 Files selected for processing (28)
apps/game-server/src/ambient/zone-wander.test.tsapps/game-server/src/ambient/zone-wander.tsdocs/ISSUE-LOG.mdpackage.jsonscripts/assert-refusal-markers-parity.mjsscripts/benchmark-speak-browser.mjsscripts/gsd-review-providers.mjsscripts/lib/agent-verify-map.mjsscripts/lib/dialogue-engage.mjsscripts/lib/e2e-memory-helpers.mjsscripts/lib/gsd-openai-review.mjsscripts/lib/gsd-review-merge.mjsscripts/lib/gsd-review-providers-config.mjsscripts/lib/playtest-findings-patch.mjsscripts/lib/speak-browser-round.mjsscripts/lib/speak-browser-stack.mjsscripts/playtest-speak-sla.mjsscripts/verify-phase19.mjsscripts/verify-phase20.mjsworkers/agent-worker/src/graph/memory_quote.pyworkers/agent-worker/src/graph/nodes/llm_social_turn.pyworkers/agent-worker/src/graph/npc_loop.pyworkers/agent-worker/src/graph/recall_merge.pyworkers/agent-worker/src/main.pyworkers/agent-worker/src/memory/client.pyworkers/agent-worker/tests/test_fetch_state_and_memory.pyworkers/agent-worker/tests/test_memory_quote.pyworkers/agent-worker/tests/test_recall_merge.py
| console.error(`Usage: | ||
| node scripts/gsd-review-providers.mjs --phase <N> [--provider agnes|nvidia|all] [--prompt-file PATH] [--merge] [--probe] | ||
|
|
||
| Examples: | ||
| pnpm gsd:review:providers -- --phase 20 --merge | ||
| pnpm gsd:review:providers -- --phase 20 --provider nvidia | ||
| pnpm gsd:review:providers -- --probe`); | ||
| process.exit(1); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== ESLint config files =="
fd -HI 'eslint.config.*'
fd -HI '.eslintrc*'
echo
echo "== Node/global declarations in ESLint config =="
rg -n --hidden --iglob 'eslint.config.*' --iglob '.eslintrc*' 'env|globals|node|process|console|AbortController|files|scripts'
echo
echo "== Current static-analysis-reported symbols in changed scripts =="
rg -n '\b(process|console|AbortController)\b' scripts/gsd-review-providers.mjs scripts/lib/gsd-openai-review.mjsRepository: moyunzero/AetherLife
Length of output: 5228
🏁 Script executed:
cat -n eslint.config.jsRepository: moyunzero/AetherLife
Length of output: 860
🏁 Script executed:
fd -maxdepth 1 '\.mjs$' scripts/Repository: moyunzero/AetherLife
Length of output: 295
🏁 Script executed:
fd --max-depth 1 '\.mjs$' scripts/Repository: moyunzero/AetherLife
Length of output: 1617
🏁 Script executed:
rg -l '\b(console|process|AbortController)\b' scripts/*.mjs | head -20Repository: moyunzero/AetherLife
Length of output: 612
🏁 Script executed:
rg -l '\b(console|process|AbortController)\b' scripts/*.mjs | wc -lRepository: moyunzero/AetherLife
Length of output: 67
🏁 Script executed:
rg -l '\bAbortController\b' scripts/*.mjsRepository: moyunzero/AetherLife
Length of output: 94
Update ESLint config to cover all root-level script files and declare AbortController global.
The files glob in eslint.config.js line 7 is too narrow: it covers only scripts/lib/**/*.mjs and scripts/agent-verify.mjs, leaving 48+ other root-level .mjs scripts (including scripts/gsd-review-providers.mjs) uncovered. Additionally, AbortController is used in scripts/verify-llm-models.mjs and scripts/lib/gsd-openai-review.mjs but is not declared in the globals (lines 11–19).
Expand the files glob to ["scripts/**/*.mjs"] and add AbortController: "readonly" to the globals section.
🧰 Tools
🪛 ESLint
[error] 33-33: 'console' is not defined.
(no-undef)
[error] 40-40: 'process' is not defined.
(no-undef)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/gsd-review-providers.mjs` around lines 33 - 40, Update the ESLint
configuration file (eslint.config.js) to properly cover all root-level script
files. Modify the files glob pattern on line 7 from the current narrow pattern
that only covers scripts/lib/**/*.mjs and scripts/agent-verify.mjs to the
broader pattern ["scripts/**/*.mjs"] to include all .mjs files in the scripts
directory. Additionally, add AbortController: "readonly" to the globals section
(lines 11-19) to properly declare the AbortController global that is used in
scripts/verify-llm-models.mjs and scripts/lib/gsd-openai-review.mjs.
Source: Linters/SAST tools
| system_text = f"{system_text}\n\nMemory summary:\n{memory}" | ||
| system_text = ( | ||
| f"{system_text}\n\nMemory summary:\n{memory}\n" | ||
| "若玩家追问 Memory summary 中已有的事实,reply 须直接给出答案,勿拒绝或说「不记得」。" |
There was a problem hiding this comment.
Resolve Ruff RUF001 on Line 310 by replacing the confusable fullwidth comma.
This string currently triggers RUF001; a plain ASCII comma removes the lint warning without changing behavior.
Proposed surgical diff
- "若玩家追问 Memory summary 中已有的事实,reply 须直接给出答案,勿拒绝或说「不记得」。"
+ "若玩家追问 Memory summary 中已有的事实, reply 须直接给出答案, 勿拒绝或说「不记得」。"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "若玩家追问 Memory summary 中已有的事实,reply 须直接给出答案,勿拒绝或说「不记得」。" | |
| "若玩家追问 Memory summary 中已有的事实, reply 须直接给出答案, 勿拒绝或说「不记得」。" |
🧰 Tools
🪛 Ruff (0.15.17)
[warning] 310-310: String contains ambiguous , (FULLWIDTH COMMA). Did you mean , (COMMA)?
(RUF001)
[warning] 310-310: String contains ambiguous , (FULLWIDTH COMMA). Did you mean , (COMMA)?
(RUF001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@workers/agent-worker/src/graph/nodes/llm_social_turn.py` at line 310, The
string on line 310 in llm_social_turn.py triggers a RUF001 lint warning due to
confusable fullwidth comma characters (,) embedded in the Chinese text. Replace
all fullwidth commas in this string with regular ASCII commas (,) to resolve the
warning. The string beginning with "若玩家追问 Memory summary 中已有的事实" contains at
least two fullwidth commas that need conversion.
Source: Linters/SAST tools
…nchmarks Harden speak latency path (casual fast lane, fetch_state cache, phaseTimingMs), fix relay move phrases and apply_tools physical inject (ISSUE-051/052), show speakPartial in DialogueOverlay, and make speak-browser benchmark resilient to timeouts and room reset failures. Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
workers/agent-worker/src/main.py (1)
290-294:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUse original payload message as fallback for recall quote routing.
At Line 293,
pick_memory_quote()only receivesresult.get("player_message"). If a fast-lane result omits this field, recall detection won’t trigger and quote selection can regress to non-recall behavior.Proposed surgical diff
memory_quote = pick_memory_quote( result.get("retrieved_memories"), int(result.get("memory_count") or 0), - player_message=str(result.get("player_message") or ""), + player_message=str(result.get("player_message") or player_message or ""), )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/agent-worker/src/main.py` around lines 290 - 294, The pick_memory_quote function call is passing player_message from the result without a fallback mechanism, which causes recall detection to fail when fast-lane results omit this field. Modify the player_message parameter passed to pick_memory_quote to use the original payload message as a fallback when result.get("player_message") is not available, ensuring recall behavior works consistently regardless of whether the fast-lane result includes the player_message field.scripts/benchmark-speak-browser.mjs (1)
132-132:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winSwitch the install hint to
pnpm.Line 132 still recommends
npm install, which is inconsistent with monorepo command conventions.As per coding guidelines, "Use
pnpminstead ofnpmoryarnfor all package management commands in the monorepo."Suggested patch
- throw new Error("playwright 未安装:cd scripts/.pw-deps && npm install"); + throw new Error("playwright 未安装:cd scripts/.pw-deps && pnpm install");🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/benchmark-speak-browser.mjs` at line 132, The error message in the throw statement at line 132 recommends using npm install, which violates the monorepo convention of using pnpm for all package management commands. Replace the npm install reference with pnpm install in the error message to align with project standards.Source: Coding guidelines
🧹 Nitpick comments (4)
apps/web/src/components/DialogueOverlay.test.ts (1)
27-68: ⚡ Quick winAdd assertions for streaming ARIA contract (
role="status"+aria-live="polite").These attributes were added in the component but are not validated in tests, so accessibility regressions could slip through.
Suggested test diff
it("shows streaming text instead of thinking-only placeholder", () => { const html = renderToStaticMarkup( createElement(DialogueOverlay, { ...baseProps, streamingReply: "你好呀,今天想聊点什么?", }), ); expect(html).toContain('data-testid="dialogue-overlay-streaming"'); + expect(html).toContain('role="status"'); + expect(html).toContain('aria-live="polite"'); + expect(html).toContain("dialogue-overlay__last-line--streaming"); expect(html).toContain("你好呀,今天想聊点什么?"); expect(html).not.toContain("dialogue-overlay__thinking"); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/src/components/DialogueOverlay.test.ts` around lines 27 - 68, The tests for DialogueOverlay component are not validating the ARIA accessibility attributes (role="status" and aria-live="polite") that should be present on the streaming dialogue overlay element. Add assertions in the test cases that involve streamingReply content (specifically in "shows streaming text instead of thinking-only placeholder" and "prefers streaming partial over stale last message while thinking") to verify that the rendered HTML contains both role="status" and aria-live="polite" attributes on the streaming element to ensure the accessibility contract is maintained and prevent future regressions.workers/agent-worker/tests/test_action_intent.py (1)
222-234: ⚡ Quick winAdd one negative boundary case for “那里/那边/那儿” to prevent regressions.
These new assertions are all positive paths. Please add a non-move sentence (e.g., “那里有什么故事?”) asserting
player_requests_move(...)isFalseand intent is notPHYSICAL.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/agent-worker/tests/test_action_intent.py` around lines 222 - 234, The test cases shown all verify positive paths where movement is requested. Add a negative boundary case to prevent regressions by including a test assertion for a sentence that contains the location words "那里/那边/那儿" but does NOT request movement (such as "那里有什么故事?"). After creating this non-move test message variable, assert that player_requests_move() returns False for this message, and assert that classify_speak_intent() returns a value other than SpeakIntent.PHYSICAL, similar to how the existing positive test cases verify these conditions for their respective messages.workers/agent-worker/tests/test_casual_fast_lane.py (1)
83-107: ⚡ Quick winAdd a physical-action non-short-circuit regression test.
This new test only locks the short-circuit path. Add one case where
player_requests_physical_action(...)is true and assert the pipeline does not short-circuit (i.e., social/tool stages run).Suggested diff
+def test_run_casual_fast_lane_physical_request_does_not_short_circuit(): + preview = _preview_for("你好,用一句话简短回复") + settings = Settings(game_server_url="http://127.0.0.1:2567") + + with patch("src.graph.casual_fast_lane.fetch_state") as fetch_state: + fetch_state.return_value = {"room_id": "default", "room_snapshot": {"npcs": []}} + with patch("src.graph.casual_fast_lane.player_requests_physical_action", return_value=True): + with patch("src.graph.casual_fast_lane.apply_social_event", side_effect=lambda s: s) as apply_social: + with patch("src.graph.casual_fast_lane.apply_tools", side_effect=lambda s, **_: s) as apply_tools: + with patch("src.graph.casual_fast_lane.compose_reply", side_effect=lambda s: {**s, "reply": "ok"}): + with patch("src.graph.job_context.record_phase_ms"): + run_casual_fast_lane( + room_id="default", + player_message="请你走到那边", + npc_id="npc-1", + player_id="p1", + recent_turns=[], + preview=preview, + settings=settings, + ) + apply_social.assert_called_once() + apply_tools.assert_called_once()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/agent-worker/tests/test_casual_fast_lane.py` around lines 83 - 107, Add a new test function to verify the non-short-circuit regression case for when physical actions are requested. Create a test that sets up the same mocks and calls to run_casual_fast_lane as the existing test_run_casual_fast_lane_short_circuits_zero_side_effect, but configure the preview or player state to make player_requests_physical_action return true. In this new test, assert that apply_social and apply_tools ARE called (not assert_not_called), demonstrating that the pipeline proceeds through the normal stages instead of short-circuiting when a physical action is requested.workers/agent-worker/tests/test_social_turn.py (1)
130-133: ⚡ Quick winCover the
请假normalization branch explicitly.The new helper strips both
回复and请假, but this test only verifies the回复case. Add a请假assertion to lock the second false-positive guard.Suggested diff
def test_infer_help_not_triggered_by_hui_fu_substring(): assert infer_social_from_message("你好,用一句话简短回复") is None - assert infer_social_from_message("请帮帮我") is not None - assert infer_social_from_message("请帮帮我").kind == "help" + assert infer_social_from_message("我今天请假") is None + inferred = infer_social_from_message("请帮帮我") + assert inferred is not None + assert inferred.kind == "help"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/agent-worker/tests/test_social_turn.py` around lines 130 - 133, The test function test_infer_help_not_triggered_by_hui_fu_substring currently only verifies the 回复 substring normalization case but does not cover the 请假 substring normalization case. Add a new assertion in this test function that calls infer_social_from_message with a message containing 请假 to verify that this substring is also properly normalized and does not trigger false-positive help detection, ensuring both false-positive guards are properly tested.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@eslint.config.js`:
- Around line 7-14: Add the missing script file `scripts/verify-phase20.mjs` to
the `files` array in the ESLint override configuration at the location shown in
the diff. This script uses globals like `AbortSignal` and `localStorage` that
need the override rules to avoid triggering `no-undef` errors. Simply include
the string "scripts/verify-phase20.mjs" as an additional entry in the files
array along with the other listed scripts.
In `@packages/shared/src/speakIntent.ts`:
- Line 56: The regex pattern at line 56 matching `/那边|那里|那儿/` is too broad and
matches plain location references without movement context, causing incorrect
PHYSICAL intent classification. The pattern should be refined to only match
these location terms when they appear in contexts that indicate actual movement
or directional intent, such as after motion verbs or directional markers. Update
the regex to require contextual indicators (like preceding directional or
movement-related characters/patterns) before matching these location terms, so
that standalone queries like "那里有什么历史?" are not misclassified as movement
intents at the PHYSICAL classification stage around line 115.
In `@scripts/benchmark-speak-browser.mjs`:
- Around line 149-150: The ROOM_ID variable is being interpolated directly into
URL path segments without proper encoding, which can cause issues if it contains
special characters like `/`, `?`, or `#`. Wrap the ROOM_ID with
encodeURIComponent() when interpolating it into the fetch URL template literals.
This applies to both the fetch call in the reset endpoint (line 149) and the
other occurrence mentioned at line 160, ensuring the ROOM_ID is properly
URL-encoded before being used in the path.
In `@scripts/lib/speak-browser-stack.mjs`:
- Around line 46-48: The error message in the throw statement indicates users
should run npm install, but this conflicts with the monorepo's package
management standards. In the error message within the throw new Error call,
replace the text "npm install" with "pnpm install" to align with the monorepo
convention of using pnpm for all package management commands.
In `@scripts/verify-overlay-streaming.mjs`:
- Around line 53-63: The browser cleanup is deferred too long, leaving the
Chromium process open if chromium.launch(), newContext(), addInitScript(), or
newPage() throws an error. Restructure the code by moving the try-finally block
to immediately wrap everything starting after chromium.launch(), so that the
finally block containing browser.close() always executes regardless of which
operation fails. This ensures the Chromium process is properly cleaned up even
if any initialization step throws an exception.
- Line 31: The variable tPartialWarnMs is parsed from the
OVERLAY_STREAMING_T_PARTIAL_MS environment variable without validation, which
means invalid values like "abc" result in NaN instead of failing fast. After
parsing the environment variable with Number.parseInt in the tPartialWarnMs
assignment, add a validation check to ensure the result is a valid number (not
NaN), and throw or exit with an error message if the environment variable
contains an invalid value, preventing silent failures where NaN disables
downstream warning comparisons.
- Around line 29-32: The file `scripts/verify-overlay-streaming.mjs` is not
included in the ESLint configuration's `files` array in `eslint.config.js`,
causing no-undef errors for Node.js globals like `process`. Update the
`eslint.config.js` file to add `scripts/verify-overlay-streaming.mjs` (or an
appropriate glob pattern to cover all scripts) to the `files` array in the
configuration object that handles Node.js environment globals, ensuring the
ESLint rules properly recognize process and other Node.js built-in globals in
this file.
In `@workers/agent-worker/src/graph/action_intent.py`:
- Line 26: The regex pattern containing the standalone Chinese location words
`那边|那里|那儿` on line 26 is too permissive when used with `re.search()` in the
`player_requests_move` function (line 54), causing it to match non-movement
sentences and incorrectly classify them as move intents. Add word boundary
constraints or contextual patterns to these Chinese location words so they only
match when they are actually part of a movement action request, keeping the
implementation consistent with the shared TypeScript matcher to reduce false
positives.
---
Outside diff comments:
In `@scripts/benchmark-speak-browser.mjs`:
- Line 132: The error message in the throw statement at line 132 recommends
using npm install, which violates the monorepo convention of using pnpm for all
package management commands. Replace the npm install reference with pnpm install
in the error message to align with project standards.
In `@workers/agent-worker/src/main.py`:
- Around line 290-294: The pick_memory_quote function call is passing
player_message from the result without a fallback mechanism, which causes recall
detection to fail when fast-lane results omit this field. Modify the
player_message parameter passed to pick_memory_quote to use the original payload
message as a fallback when result.get("player_message") is not available,
ensuring recall behavior works consistently regardless of whether the fast-lane
result includes the player_message field.
---
Nitpick comments:
In `@apps/web/src/components/DialogueOverlay.test.ts`:
- Around line 27-68: The tests for DialogueOverlay component are not validating
the ARIA accessibility attributes (role="status" and aria-live="polite") that
should be present on the streaming dialogue overlay element. Add assertions in
the test cases that involve streamingReply content (specifically in "shows
streaming text instead of thinking-only placeholder" and "prefers streaming
partial over stale last message while thinking") to verify that the rendered
HTML contains both role="status" and aria-live="polite" attributes on the
streaming element to ensure the accessibility contract is maintained and prevent
future regressions.
In `@workers/agent-worker/tests/test_action_intent.py`:
- Around line 222-234: The test cases shown all verify positive paths where
movement is requested. Add a negative boundary case to prevent regressions by
including a test assertion for a sentence that contains the location words
"那里/那边/那儿" but does NOT request movement (such as "那里有什么故事?"). After creating
this non-move test message variable, assert that player_requests_move() returns
False for this message, and assert that classify_speak_intent() returns a value
other than SpeakIntent.PHYSICAL, similar to how the existing positive test cases
verify these conditions for their respective messages.
In `@workers/agent-worker/tests/test_casual_fast_lane.py`:
- Around line 83-107: Add a new test function to verify the non-short-circuit
regression case for when physical actions are requested. Create a test that sets
up the same mocks and calls to run_casual_fast_lane as the existing
test_run_casual_fast_lane_short_circuits_zero_side_effect, but configure the
preview or player state to make player_requests_physical_action return true. In
this new test, assert that apply_social and apply_tools ARE called (not
assert_not_called), demonstrating that the pipeline proceeds through the normal
stages instead of short-circuiting when a physical action is requested.
In `@workers/agent-worker/tests/test_social_turn.py`:
- Around line 130-133: The test function
test_infer_help_not_triggered_by_hui_fu_substring currently only verifies the 回复
substring normalization case but does not cover the 请假 substring normalization
case. Add a new assertion in this test function that calls
infer_social_from_message with a message containing 请假 to verify that this
substring is also properly normalized and does not trigger false-positive help
detection, ensuring both false-positive guards are properly tested.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: a20e7207-69b2-423f-9a83-644036f862b4
📒 Files selected for processing (37)
apps/web/src/ChatPage.tsxapps/web/src/components/DialogueOverlay.test.tsapps/web/src/components/DialogueOverlay.tsxapps/web/src/index.cssdocs/ISSUE-LOG.mdeslint.config.jspackage.jsonpackages/shared/src/speakIntent.tsscripts/assert-refusal-markers-parity.mjsscripts/benchmark-speak-browser.mjsscripts/lib/agent-verify-map.mjsscripts/lib/e2e-memory-helpers.mjsscripts/lib/gsd-openai-review.mjsscripts/lib/gsd-review-merge.mjsscripts/lib/gsd-review-providers-config.mjsscripts/lib/speak-browser-round.mjsscripts/lib/speak-browser-stack.mjsscripts/playtest-speak-sla.mjsscripts/verify-overlay-streaming.mjsworkers/agent-worker/src/collective/schemas.pyworkers/agent-worker/src/collective/social_turn.pyworkers/agent-worker/src/config.pyworkers/agent-worker/src/graph/action_intent.pyworkers/agent-worker/src/graph/casual_fast_lane.pyworkers/agent-worker/src/graph/nodes/llm_social_turn.pyworkers/agent-worker/src/graph/npc_loop.pyworkers/agent-worker/src/graph/recall_merge.pyworkers/agent-worker/src/llm/factory.pyworkers/agent-worker/src/main.pyworkers/agent-worker/tests/test_action_intent.pyworkers/agent-worker/tests/test_casual_fast_lane.pyworkers/agent-worker/tests/test_fetch_state_and_memory.pyworkers/agent-worker/tests/test_llm_social_memory.pyworkers/agent-worker/tests/test_recall_merge.pyworkers/agent-worker/tests/test_social_stream_extract.pyworkers/agent-worker/tests/test_social_turn.pyworkers/agent-worker/tests/test_tool_gate.py
✅ Files skipped from review due to trivial changes (3)
- workers/agent-worker/src/config.py
- apps/web/src/index.css
- docs/ISSUE-LOG.md
🚧 Files skipped from review as they are similar to previous changes (9)
- scripts/lib/gsd-openai-review.mjs
- scripts/lib/agent-verify-map.mjs
- scripts/lib/speak-browser-round.mjs
- package.json
- scripts/lib/gsd-review-merge.mjs
- scripts/lib/e2e-memory-helpers.mjs
- workers/agent-worker/src/graph/npc_loop.py
- scripts/lib/gsd-review-providers-config.mjs
- workers/agent-worker/src/graph/recall_merge.py
| files: [ | ||
| "scripts/lib/**/*.mjs", | ||
| "scripts/agent-verify.mjs", | ||
| "scripts/gsd-review-providers.mjs", | ||
| "scripts/assert-refusal-markers-parity.mjs", | ||
| "scripts/playtest-speak-sla.mjs", | ||
| "scripts/benchmark-speak-browser.mjs", | ||
| ], |
There was a problem hiding this comment.
Include scripts/verify-phase20.mjs in this ESLint override.
That script uses AbortSignal and localStorage, but this files set doesn’t match it, so it can still hit no-undef under default rules.
Suggested surgical diff
files: [
"scripts/lib/**/*.mjs",
"scripts/agent-verify.mjs",
+ "scripts/verify-phase20.mjs",
"scripts/gsd-review-providers.mjs",
"scripts/assert-refusal-markers-parity.mjs",
"scripts/playtest-speak-sla.mjs",
"scripts/benchmark-speak-browser.mjs",
],📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| files: [ | |
| "scripts/lib/**/*.mjs", | |
| "scripts/agent-verify.mjs", | |
| "scripts/gsd-review-providers.mjs", | |
| "scripts/assert-refusal-markers-parity.mjs", | |
| "scripts/playtest-speak-sla.mjs", | |
| "scripts/benchmark-speak-browser.mjs", | |
| ], | |
| files: [ | |
| "scripts/lib/**/*.mjs", | |
| "scripts/agent-verify.mjs", | |
| "scripts/verify-phase20.mjs", | |
| "scripts/gsd-review-providers.mjs", | |
| "scripts/assert-refusal-markers-parity.mjs", | |
| "scripts/playtest-speak-sla.mjs", | |
| "scripts/benchmark-speak-browser.mjs", | |
| ], |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@eslint.config.js` around lines 7 - 14, Add the missing script file
`scripts/verify-phase20.mjs` to the `files` array in the ESLint override
configuration at the location shown in the diff. This script uses globals like
`AbortSignal` and `localStorage` that need the override rules to avoid
triggering `no-undef` errors. Simply include the string
"scripts/verify-phase20.mjs" as an additional entry in the files array along
with the other listed scripts.
| const res = await fetch(`${GS}/rooms/${ROOM_ID}/reset`, { method: "POST", headers }); | ||
| lastStatus = res.status; |
There was a problem hiding this comment.
Encode ROOM_ID before interpolating it into path URLs.
Line 149 and Line 160 use raw ROOM_ID in path segments. Since ROOM_ID is env-driven, reserved characters (e.g. /, ?, #) can break route matching.
Suggested patch
- const res = await fetch(`${GS}/rooms/${ROOM_ID}/reset`, { method: "POST", headers });
+ const res = await fetch(`${GS}/rooms/${encodeURIComponent(ROOM_ID)}/reset`, { method: "POST", headers });
@@
- const res = await fetch(`${GS}/rooms/${ROOM_ID}/state`, { headers });
+ const res = await fetch(`${GS}/rooms/${encodeURIComponent(ROOM_ID)}/state`, { headers });Also applies to: 160-160
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/benchmark-speak-browser.mjs` around lines 149 - 150, The ROOM_ID
variable is being interpolated directly into URL path segments without proper
encoding, which can cause issues if it contains special characters like `/`,
`?`, or `#`. Wrap the ROOM_ID with encodeURIComponent() when interpolating it
into the fetch URL template literals. This applies to both the fetch call in the
reset endpoint (line 149) and the other occurrence mentioned at line 160,
ensuring the ROOM_ID is properly URL-encoded before being used in the path.
| throw new Error( | ||
| "playwright not installed — run: cd scripts/.pw-deps && npm install", | ||
| ); |
There was a problem hiding this comment.
Use pnpm in the Playwright install hint.
Line 47 still tells users to run npm install; this conflicts with monorepo command conventions.
As per coding guidelines, "Use pnpm instead of npm or yarn for all package management commands in the monorepo."
Suggested patch
throw new Error(
- "playwright not installed — run: cd scripts/.pw-deps && npm install",
+ "playwright not installed — run: cd scripts/.pw-deps && pnpm install",
);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/lib/speak-browser-stack.mjs` around lines 46 - 48, The error message
in the throw statement indicates users should run npm install, but this
conflicts with the monorepo's package management standards. In the error message
within the throw new Error call, replace the text "npm install" with "pnpm
install" to align with the monorepo convention of using pnpm for all package
management commands.
Source: Coding guidelines
| const roomId = process.env.OVERLAY_STREAMING_ROOM_ID || `overlay-stream-${runId}`; | ||
| const speakTimeoutMs = Math.max(90_000, e2eSpeakTimeoutMs()); | ||
| const tPartialWarnMs = Number.parseInt(process.env.OVERLAY_STREAMING_T_PARTIAL_MS || "15000", 10); | ||
| const requireStreaming = process.env.OVERLAY_STREAMING_REQUIRE === "1"; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check if the file exists and inspect its structure
git ls-files | grep -E "(verify-overlay-streaming|eslint)" | head -20Repository: moyunzero/AetherLife
Length of output: 118
🏁 Script executed:
# Find the file and check its existence
fd "verify-overlay-streaming" --type fRepository: moyunzero/AetherLife
Length of output: 101
🏁 Script executed:
# Find ESLint config files
fd "eslint" --type fRepository: moyunzero/AetherLife
Length of output: 81
🏁 Script executed:
# Check for ESLint-related configs in root
ls -la | grep -E "eslint|\.eslintrc"Repository: moyunzero/AetherLife
Length of output: 198
🏁 Script executed:
# Inspect the ESLint config
cat eslint.config.jsRepository: moyunzero/AetherLife
Length of output: 1077
🏁 Script executed:
# Inspect the actual script file
cat -n scripts/verify-overlay-streaming.mjs | head -140Repository: moyunzero/AetherLife
Length of output: 5744
Add scripts/verify-overlay-streaming.mjs to ESLint configuration for Node.js globals coverage.
The file is missing from the ESLint config's files array. Without it, ESLint will report no-undef errors for process, console, and localStorage since these are not defined in the default configuration.
Update eslint.config.js to include this script in the covered files list:
Configuration fix
files: [
"scripts/lib/**/*.mjs",
"scripts/agent-verify.mjs",
"scripts/gsd-review-providers.mjs",
"scripts/assert-refusal-markers-parity.mjs",
"scripts/playtest-speak-sla.mjs",
"scripts/benchmark-speak-browser.mjs",
+ "scripts/verify-overlay-streaming.mjs",
],🧰 Tools
🪛 ESLint
[error] 29-29: 'process' is not defined.
(no-undef)
[error] 31-31: 'process' is not defined.
(no-undef)
[error] 32-32: 'process' is not defined.
(no-undef)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/verify-overlay-streaming.mjs` around lines 29 - 32, The file
`scripts/verify-overlay-streaming.mjs` is not included in the ESLint
configuration's `files` array in `eslint.config.js`, causing no-undef errors for
Node.js globals like `process`. Update the `eslint.config.js` file to add
`scripts/verify-overlay-streaming.mjs` (or an appropriate glob pattern to cover
all scripts) to the `files` array in the configuration object that handles
Node.js environment globals, ensuring the ESLint rules properly recognize
process and other Node.js built-in globals in this file.
Source: Linters/SAST tools
| const runId = Date.now(); | ||
| const roomId = process.env.OVERLAY_STREAMING_ROOM_ID || `overlay-stream-${runId}`; | ||
| const speakTimeoutMs = Math.max(90_000, e2eSpeakTimeoutMs()); | ||
| const tPartialWarnMs = Number.parseInt(process.env.OVERLAY_STREAMING_T_PARTIAL_MS || "15000", 10); |
There was a problem hiding this comment.
Validate OVERLAY_STREAMING_T_PARTIAL_MS before using it.
A typo such as OVERLAY_STREAMING_T_PARTIAL_MS=abc makes tPartialWarnMs become NaN, which disables the late-partial warning comparison and writes a misleading partialWithinWarn: false verdict instead of failing fast on bad config.
🛡️ Proposed config parser
-const tPartialWarnMs = Number.parseInt(process.env.OVERLAY_STREAMING_T_PARTIAL_MS || "15000", 10);
+const tPartialWarnMs = parseNonNegativeIntEnv("OVERLAY_STREAMING_T_PARTIAL_MS", 15_000);
const requireStreaming = process.env.OVERLAY_STREAMING_REQUIRE === "1";
@@
/** `@param` {number | null} a `@param` {number | null} b */
function deltaMs(a, b) {
if (a == null || b == null) return null;
return b - a;
}
+
+function parseNonNegativeIntEnv(name, fallback) {
+ const raw = process.env[name];
+ if (raw == null || raw === "") return fallback;
+ const value = Number.parseInt(raw, 10);
+ if (!Number.isFinite(value) || value < 0) {
+ throw new Error(`${name} must be a non-negative integer, got ${JSON.stringify(raw)}`);
+ }
+ return value;
+}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const tPartialWarnMs = Number.parseInt(process.env.OVERLAY_STREAMING_T_PARTIAL_MS || "15000", 10); | |
| const tPartialWarnMs = parseNonNegativeIntEnv("OVERLAY_STREAMING_T_PARTIAL_MS", 15_000); | |
| const requireStreaming = process.env.OVERLAY_STREAMING_REQUIRE === "1"; | |
| /** `@param` {number | null} a `@param` {number | null} b */ | |
| function deltaMs(a, b) { | |
| if (a == null || b == null) return null; | |
| return b - a; | |
| } | |
| function parseNonNegativeIntEnv(name, fallback) { | |
| const raw = process.env[name]; | |
| if (raw == null || raw === "") return fallback; | |
| const value = Number.parseInt(raw, 10); | |
| if (!Number.isFinite(value) || value < 0) { | |
| throw new Error(`${name} must be a non-negative integer, got ${JSON.stringify(raw)}`); | |
| } | |
| return value; | |
| } |
🧰 Tools
🪛 ESLint
[error] 31-31: 'process' is not defined.
(no-undef)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/verify-overlay-streaming.mjs` at line 31, The variable tPartialWarnMs
is parsed from the OVERLAY_STREAMING_T_PARTIAL_MS environment variable without
validation, which means invalid values like "abc" result in NaN instead of
failing fast. After parsing the environment variable with Number.parseInt in the
tPartialWarnMs assignment, add a validation check to ensure the result is a
valid number (not NaN), and throw or exit with an error message if the
environment variable contains an invalid value, preventing silent failures where
NaN disables downstream warning comparisons.
Merge in-session dialogue turns into recall so food preferences are available before async memory tail persists. Harden food extraction and relay summon patterns. Skip LLM stream partials on recall questions and emit merged reply once to fix overlay flicker (ISSUE-053). Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
workers/agent-worker/src/graph/npc_loop.py (1)
205-211:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRefresh the hot snapshot cache after applied actions.
fetch_state()can skip HTTP for 3 seconds using_stale_worker_snapshots, butapply_tools()returns the game-server’s updated state without caching it. A second speak right after a move can therefore plan from the pre-action snapshot.🐛 Proposed fix
- body = safe_response_json(res) + body = safe_response_json(res) + updated_snapshot = body.get("state") + if not isinstance(updated_snapshot, dict): + updated_snapshot = state.get("room_snapshot", {}) or {} + _remember_worker_snapshot(room_id, player_id, updated_snapshot) record_phase_ms("t_apply_ms", int((time.perf_counter() - t0) * 1000)) return { **state, - "room_snapshot": body.get("state", state.get("room_snapshot", {})), + "room_snapshot": updated_snapshot, "pending_actions": actions, }Also applies to: 677-683
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/agent-worker/src/graph/npc_loop.py` around lines 205 - 211, The issue is that apply_tools() returns updated game-server state but does not refresh the hot snapshot cache, causing subsequent fetch_state() calls to continue using stale pre-action snapshots from _stale_worker_snapshots for up to 3 seconds. After apply_tools() returns the updated state from the game-server, refresh the cached snapshot in _stale_worker_snapshots (or whichever cache storage mechanism is used) with the new state so that the next fetch_state() call returns the fresh post-action snapshot instead of a stale one. Apply the same fix in the second location mentioned around lines 677-683.workers/agent-worker/src/graph/recall_merge.py (1)
216-223:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFilter incompatible password topics, not just rank them lower.
If the user asks for a computer password and only a door-password row is retrieved, it remains a candidate and can be formatted as
电脑密码是 ...; door-lock prompts have the symmetric issue because topic handling only special-cases电脑.🐛 Proposed fix
+def _password_topic(text: str) -> str | None: + if "电脑" in text: + return "computer" + if "门锁" in text or "门禁" in text: + return "door" + return None + + def _password_topic_score(player_message: str, memory_text: str) -> int: - if "电脑" not in player_message: + asked = _password_topic(player_message) + if asked is None: return 0 - if "电脑" in memory_text or "电脑密码" in memory_text: + remembered = _password_topic(memory_text) + if remembered == asked: return 2 - if "门锁" in memory_text or "门禁" in memory_text: - return 0 + if remembered is not None: + return -1 return 1if not candidates: return None + explicit_topic = _password_topic(player_message) + if explicit_topic is not None: + candidates = [ + item + for item in candidates + if _password_topic_score(player_message, _memory_item_text(item)) > 0 + ] + if not candidates: + return None def sort_key(item: dict[str, Any]) -> tuple[int, int, int, float]:Also applies to: 354-360
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/agent-worker/src/graph/recall_merge.py` around lines 216 - 223, The _password_topic_score function currently ranks incompatible password topics lower instead of filtering them out completely. When a user asks for a computer password ("电脑") but only door-lock related memory ("门锁" or "门禁") is retrieved, it returns 1 instead of 0, keeping it as a candidate. Refactor this function to return 0 (incompatible/filtered out) when password topic types are mismatched between the player message and memory text, ensuring symmetric handling for both computer and door-lock password topic pairs. Check if the player message indicates one type of password topic and the memory text indicates a different incompatible type, then filter it out by returning 0.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@workers/agent-worker/src/graph/recall_merge.py`:
- Line 51: The `_FOOD_RECALL_RE` regex pattern currently matches both
first-person and second-person food questions, and it is only evaluated after
the top-level recall guard, causing first-person prompts like `我喜欢吃什么?` to be
missed. Refine the regex pattern to exclude second-person variations (e.g., ones
containing `你`) by making it more specific to only first-person food
preferences, then integrate the `_FOOD_RECALL_RE` check into the
`is_recall_question` function so that first-person food recall questions are
properly routed through the deterministic food merging logic earlier in the
flow.
- Around line 142-145: The issue is that the food-specific branch with
_is_food_recall_question() returns before applying the min_score filter,
allowing below-threshold food items to be selected while other item types would
be rejected. Move the score filtering logic (the line that creates the scored
list by filtering items where score >= min_score) to execute before the
_is_food_recall_question check, so that _pick_food_memory receives
already-filtered items that meet the min_score threshold, ensuring consistent
filtering behavior across all code paths.
---
Outside diff comments:
In `@workers/agent-worker/src/graph/npc_loop.py`:
- Around line 205-211: The issue is that apply_tools() returns updated
game-server state but does not refresh the hot snapshot cache, causing
subsequent fetch_state() calls to continue using stale pre-action snapshots from
_stale_worker_snapshots for up to 3 seconds. After apply_tools() returns the
updated state from the game-server, refresh the cached snapshot in
_stale_worker_snapshots (or whichever cache storage mechanism is used) with the
new state so that the next fetch_state() call returns the fresh post-action
snapshot instead of a stale one. Apply the same fix in the second location
mentioned around lines 677-683.
In `@workers/agent-worker/src/graph/recall_merge.py`:
- Around line 216-223: The _password_topic_score function currently ranks
incompatible password topics lower instead of filtering them out completely.
When a user asks for a computer password ("电脑") but only door-lock related
memory ("门锁" or "门禁") is retrieved, it returns 1 instead of 0, keeping it as a
candidate. Refactor this function to return 0 (incompatible/filtered out) when
password topic types are mismatched between the player message and memory text,
ensuring symmetric handling for both computer and door-lock password topic
pairs. Check if the player message indicates one type of password topic and the
memory text indicates a different incompatible type, then filter it out by
returning 0.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: f7995973-f174-4ec3-b2af-554e8edfe234
📒 Files selected for processing (9)
docs/ISSUE-LOG.mdworkers/agent-worker/src/graph/action_intent.pyworkers/agent-worker/src/graph/nodes/llm_social_turn.pyworkers/agent-worker/src/graph/npc_loop.pyworkers/agent-worker/src/graph/recall_merge.pyworkers/agent-worker/tests/test_action_intent.pyworkers/agent-worker/tests/test_recall_merge.pyworkers/agent-worker/tests/test_social_stream_extract.pyworkers/agent-worker/tests/test_tool_gate.py
🚧 Files skipped from review as they are similar to previous changes (1)
- workers/agent-worker/src/graph/action_intent.py
| r"(?:最喜欢|最爱|喜欢|爱)吃([^。,,.!?;;\s~~吗呢吧啊??]+)", | ||
| re.IGNORECASE, | ||
| ) | ||
| _FOOD_RECALL_RE = re.compile(r"喜欢吃什么|爱吃什么|最爱吃|口味|爱吃什么", re.IGNORECASE) |
There was a problem hiding this comment.
Route first-person food recall through is_recall_question.
_FOOD_RECALL_RE is only used after the top-level recall guard, so prompts like 我喜欢吃什么? never reach deterministic food merging. Tighten the regex so it does not match second-person seed turns like 我喜欢吃西瓜,你喜欢吃什么?, then include it in is_recall_question.
🐛 Proposed fix
-_FOOD_RECALL_RE = re.compile(r"喜欢吃什么|爱吃什么|最爱吃|口味|爱吃什么", re.IGNORECASE)
+_FOOD_RECALL_RE = re.compile(
+ r"(?:我|我的).{0,8}(?:最喜欢|最爱|喜欢|爱)吃什么|(?:我|我的).{0,8}口味",
+ re.IGNORECASE,
+)
def is_recall_question(message: str) -> bool:
msg = message.strip()
if not msg:
return False
if any(marker in msg for marker in _DISCLOSURE_MARKERS):
return False
+ if _FOOD_RECALL_RE.search(msg) and _has_question_cue(msg):
+ return True
if any(marker in msg for marker in _STRONG_RECALL_MARKERS):
return TrueAlso applies to: 68-78
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@workers/agent-worker/src/graph/recall_merge.py` at line 51, The
`_FOOD_RECALL_RE` regex pattern currently matches both first-person and
second-person food questions, and it is only evaluated after the top-level
recall guard, causing first-person prompts like `我喜欢吃什么?` to be missed. Refine
the regex pattern to exclude second-person variations (e.g., ones containing
`你`) by making it more specific to only first-person food preferences, then
integrate the `_FOOD_RECALL_RE` check into the `is_recall_question` function so
that first-person food recall questions are properly routed through the
deterministic food merging logic earlier in the flow.
| msg = player_message.strip() | ||
| if _is_food_recall_question(msg): | ||
| return _pick_food_memory(items, min_score=min_score) | ||
| scored = [item for item in items if float(item.get("score") or 0) >= min_score] |
There was a problem hiding this comment.
Apply min_score before food-specific selection.
The food branch returns before the shared score gate, so a below-threshold food-looking row can answer recall even though password/nickname rows would be rejected.
🐛 Proposed fix
"""Pick the retrieved row that matches the recall question type, not only top score."""
items = retrieved_memories or []
msg = player_message.strip()
+ scored = [item for item in items if float(item.get("score") or 0) >= min_score]
if _is_food_recall_question(msg):
- return _pick_food_memory(items, min_score=min_score)
- scored = [item for item in items if float(item.get("score") or 0) >= min_score]
+ return _pick_food_memory(scored, min_score=min_score)
if not scored:
return None🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@workers/agent-worker/src/graph/recall_merge.py` around lines 142 - 145, The
issue is that the food-specific branch with _is_food_recall_question() returns
before applying the min_score filter, allowing below-threshold food items to be
selected while other item types would be rejected. Move the score filtering
logic (the line that creates the scored list by filtering items where score >=
min_score) to execute before the _is_food_recall_question check, so that
_pick_food_memory receives already-filtered items that meet the min_score
threshold, ensuring consistent filtering behavior across all code paths.
…sword recall CodeRabbit PR #9 fixes: apply_tools now updates worker hot cache from apply-actions state; password recall rejects door-lock rows for computer questions; contextual location regex avoids misclassifying narrative questions. Verified with pytest, agent:verify, verify:phase20, and verify:overlay-streaming. Co-authored-by: Cursor <cursoragent@cursor.com>
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests
Chores