Skip to content

claude-code-chat-browser: YAML escape fix — handle colons, hashes, tabs, bool literals in frontmatter export#107

Merged
wpak-ai merged 5 commits into
masterfrom
fix/yaml-frontmatter-escape
Jul 1, 2026
Merged

claude-code-chat-browser: YAML escape fix — handle colons, hashes, tabs, bool literals in frontmatter export#107
wpak-ai merged 5 commits into
masterfrom
fix/yaml-frontmatter-escape

Conversation

@clean6378-max-it

@clean6378-max-it clean6378-max-it commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

Closes #102

Summary

Problem

Unquoted frontmatter values broke on colons, # comments, tabs, and YAML boolean literals (true, off, etc.) in real session titles and paths.

Test plan

  • pytest tests/test_md_exporter_yaml.py — 5 passed
  • pytest -q — 424 passed, 13 skipped
  • mypy api utils models — pass
  • ruff check . — pass
  • Manual: export session titled Fix: handle edge case — confirm yaml.safe_load() parses frontmatter

Summary by CodeRabbit

  • New Features
    • Markdown exports now generate richer, consistently ordered YAML frontmatter metadata.
  • Bug Fixes
    • Improved YAML escaping/serialization for exported metadata, including multiline titles and special characters (tabs, #, quotes, backslashes, newlines).
  • Tests
    • Added automated round-trip and fuzz coverage to verify YAML frontmatter correctness and escaping behavior.
  • Chores
    • Updated development dependencies to include YAML parsing support for tests.

Refactor frontmatter export through _session_frontmatter_dict and
_append_yaml_value so all string fields are double-quoted or block-scalars.
_escape_yaml handles tabs, hashes, colons, bool-like literals, and non-printable
Unicode. Add round-trip and Hypothesis property tests; pin PyYAML in dev deps.
@clean6378-max-it clean6378-max-it self-assigned this Jul 1, 2026
@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 22c408c4-edec-4dca-94b9-1113a7f439e6

📥 Commits

Reviewing files that changed from the base of the PR and between eaa4a0d and e1fef70.

📒 Files selected for processing (2)
  • tests/test_md_exporter_yaml.py
  • utils/md_exporter.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_md_exporter_yaml.py

📝 Walkthrough

Walkthrough

This PR adds YAML dev dependencies, rewrites frontmatter serialization in utils/md_exporter.py, and adds tests that parse and round-trip exported YAML frontmatter.

Changes

YAML Frontmatter Escaping

Layer / File(s) Summary
Dev dependencies
requirements-dev.txt
Adds PyYAML>=6.0 and types-PyYAML>=6.0 to development requirements.
Frontmatter builder and YAML escaping
utils/md_exporter.py
Replaces manual frontmatter assembly with _session_frontmatter_dict, adds YAML key formatting, and rewrites _escape_yaml and _append_yaml_value for nested values and safe scalar emission.
YAML round-trip tests
tests/test_md_exporter_yaml.py
Adds frontmatter parsing helpers, round-trip assertions against the canonical frontmatter dict, multiline/title edge-case checks, and Hypothesis fuzz tests for YAML escaping.

Estimated code review effort: 3 (Moderate) | ~25 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The escape and round-trip fix is implemented, but multiline values use quoted scalars instead of the block scalar syntax required by #102. Update multiline serialization to emit YAML block scalars (| or >), or revise the issue criteria if quoted scalars are intended.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly names the YAML frontmatter escape fix and its affected values.
Out of Scope Changes check ✅ Passed The changes stay within the YAML frontmatter exporter, tests, and dev dependency needed for PyYAML.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/yaml-frontmatter-escape

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/test_md_exporter_yaml.py (2)

70-83: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Inconsistent frontmatter extraction: naive split("---") vs. the robust helper defined above.

Line 82 extracts the YAML block with md.split("---")[1], a plain substring split, while line 83 uses the delimiter-line-aware _extract_frontmatter_dict for the same assertion. The naive split would misbehave if any frontmatter value ever contains the literal substring "---" (not triggered by current fixed test data, but a footgun for future edits). Prefer using _extract_frontmatter_dict consistently instead of duplicating the assertion with two different extraction strategies.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_md_exporter_yaml.py` around lines 70 - 83, The YAML frontmatter
roundtrip test uses two different extraction paths, and the direct
md.split("---")[1] approach in test_yaml_frontmatter_roundtrip is brittle.
Update the assertion to use the same robust helper, _extract_frontmatter_dict,
consistently for frontmatter parsing so the test relies on delimiter-line-aware
extraction instead of naive substring splitting.

97-104: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Prefer hypothesis.assume() over early return to skip invalid examples.

Returning early when "\n" in s or "\r" in s makes the test pass trivially for those inputs rather than telling Hypothesis to discard/reject them, which weakens shrinking and example diversity for the intended single-line case.

♻️ Suggested change
-from hypothesis import given, settings, strategies as st
+from hypothesis import assume, given, settings, strategies as st
@@
 def test_escape_yaml_roundtrip(s: str) -> None:
     """Double-quoted scalars round-trip for arbitrary single-line text."""
-    if "\n" in s or "\r" in s:
-        return
+    assume("\n" not in s and "\r" not in s)
     loaded = yaml.safe_load(f"key: {_escape_yaml(s)}")
     assert loaded["key"] == s
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_md_exporter_yaml.py` around lines 97 - 104, Update
test_escape_yaml_roundtrip to use hypothesis.assume() instead of returning early
when the generated string contains newline or carriage return characters. Keep
the single-line constraint in place by rejecting invalid examples with assume,
so Hypothesis correctly discards them while still exercising _escape_yaml and
yaml.safe_load on valid inputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_md_exporter_yaml.py`:
- Around line 107-114: The YAML round-trip fuzz test in
test_yaml_string_field_roundtrip is exposing that _append_yaml_value does not
preserve strings ending with trailing newlines because the current block scalar
style strips them on load. Update the string serialization logic in
_append_yaml_value so values like "\n" survive a safe_load round-trip, using a
representation that preserves trailing line breaks or an equivalent escaping
strategy, and keep the behavior verified by test_yaml_string_field_roundtrip.

In `@utils/md_exporter.py`:
- Around line 525-534: The multiline `|-` fast-path in `md_exporter.py` is lossy
for some string values, especially those with trailing newlines, `\r\n`, or
leading indentation, so narrow the `value.splitlines()` branch in the YAML
export logic to only cases that round-trip safely. Update the condition inside
the value-handling block to use `splitlines()` only when the exact original
content can be preserved, otherwise fall back to `_escape_yaml(value)`; use the
existing `content_lines` handling in `md_exporter.py` as the place to fix this.

---

Nitpick comments:
In `@tests/test_md_exporter_yaml.py`:
- Around line 70-83: The YAML frontmatter roundtrip test uses two different
extraction paths, and the direct md.split("---")[1] approach in
test_yaml_frontmatter_roundtrip is brittle. Update the assertion to use the same
robust helper, _extract_frontmatter_dict, consistently for frontmatter parsing
so the test relies on delimiter-line-aware extraction instead of naive substring
splitting.
- Around line 97-104: Update test_escape_yaml_roundtrip to use
hypothesis.assume() instead of returning early when the generated string
contains newline or carriage return characters. Keep the single-line constraint
in place by rejecting invalid examples with assume, so Hypothesis correctly
discards them while still exercising _escape_yaml and yaml.safe_load on valid
inputs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3f333a4a-d6f4-4ca9-98b1-b44b7741fec9

📥 Commits

Reviewing files that changed from the base of the PR and between 554b2cf and 6579447.

📒 Files selected for processing (3)
  • requirements-dev.txt
  • tests/test_md_exporter_yaml.py
  • utils/md_exporter.py

Comment on lines +107 to +114
@FUZZ_SETTINGS
@given(st.text())
def test_yaml_string_field_roundtrip(s: str) -> None:
"""Frontmatter string serializer round-trips arbitrary text."""
lines: list[str] = []
_append_yaml_value(lines, "key", s)
loaded = yaml.safe_load("\n".join(lines))
assert loaded["key"] == s

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the block-scalar chomping round-trip bug for a trailing-newline string.
python3 - <<'PY'
import sys
sys.path.insert(0, "utils")
from md_exporter import _append_yaml_value
import yaml
lines = []
_append_yaml_value(lines, "key", "\n")
text = "\n".join(lines)
print(repr(text))
print(repr(yaml.safe_load(text)))
PY

Repository: cppalliance/claude-code-chat-browser

Length of output: 200


test_yaml_string_field_roundtrip fails for strings ending in \n _append_yaml_value(lines, "key", "\n") serializes to key: |-\n , which loads back as "" because |- strips trailing line breaks. st.text() can shrink to this case, so the fuzz test is expected to fail unless trailing-newline strings are handled differently (for example, | or quoted escaping).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_md_exporter_yaml.py` around lines 107 - 114, The YAML round-trip
fuzz test in test_yaml_string_field_roundtrip is exposing that
_append_yaml_value does not preserve strings ending with trailing newlines
because the current block scalar style strips them on load. Update the string
serialization logic in _append_yaml_value so values like "\n" survive a
safe_load round-trip, using a representation that preserves trailing line breaks
or an equivalent escaping strategy, and keep the behavior verified by
test_yaml_string_field_roundtrip.

Comment thread utils/md_exporter.py Outdated
Block scalars cannot preserve newline-only strings on Python 3.12
where splitlines() yields ['']. Always emit double-quoted scalars.
Widen test_escape_yaml_roundtrip to cover multiline text.
Drop brittle md.split('---') assertion; _extract_frontmatter_dict already
covers the same check with proper delimiter handling.
@timon0305

Copy link
Copy Markdown
Collaborator

The PR body claims "Multiline printable strings use |- block scalars; non-printable multiline falls back to quotes escapes." But _escape_yaml always emits a double-quoted scalar with \n for newlines - there's no block-scalar code anywhere test_multiline_title_uses_quoted_scalar even asserts the quoted from.
The quoted approach is fine and round-trips, so just fix the description - right now it describes behavior the code doesn't have

Comment thread utils/md_exporter.py Outdated
Store list fields as lists in _session_frontmatter_dict and serialize them
as YAML block sequences in _append_yaml_value instead of comma-joined
strings that consumers cannot parse as arrays.
@clean6378-max-it

Copy link
Copy Markdown
Collaborator Author

The PR body claims "Multiline printable strings use |- block scalars; non-printable multiline falls back to quotes escapes." But _escape_yaml always emits a double-quoted scalar with \n for newlines - there's no block-scalar code anywhere test_multiline_title_uses_quoted_scalar even asserts the quoted from. The quoted approach is fine and round-trips, so just fix the description - right now it describes behavior the code doesn't have

Good catch — block scalars were removed in df37c92, updated the PR description

@clean6378-max-it clean6378-max-it requested a review from wpak-ai July 1, 2026 21:19
@wpak-ai wpak-ai merged commit 4345d69 into master Jul 1, 2026
16 checks passed
@wpak-ai wpak-ai deleted the fix/yaml-frontmatter-escape branch July 1, 2026 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

claude-code-chat-browser: YAML escape fix — handle colons, hashes, tabs, bool literals in frontmatter export

3 participants