Conversation
…on-7fu.1)
- Create lib/generate-search-index.js to extract searchable content from all .md files
- Extract title, description, headings (h2/h3), and body text from each page
- Strip markdown syntax, code blocks, Markdoc tags, and frontmatter from body
- Map paths to sections (Get Started, Core Concepts, Tools, etc.)
- Generate public/search-index.json with 43 entries (~121KB)
- Update package.json to run search index generator before build/dev
- Home page includes title only (no body content)
- Body text limited to 5000 chars per page
…erts-atproto-documentation-7fu.2)
- Add flexsearch dependency
- Replace fuzzy title matching with FlexSearch Document index
- Load search index from /search-index.json on first dialog open
- Search across title, description, headings, and body fields
- Prioritize results: title matches first, then description, headings, body
- Generate context snippets for body matches with highlighted terms
- Add loading state while fetching index
- Keep keyboard navigation, quick links, and all existing UX
- Remove flattenNavigation import and fuzzyMatch function
…atproto-documentation-7fu.3)
📝 Walkthrough

This PR introduces FlexSearch-based search capabilities by creating a build-time search index generator, updating SearchDialog to use the persistent index with debounced search, adding necessary styling, and updating build scripts accordingly.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
🧹 Nitpick comments (4)
components/SearchDialog.js (1)
11-22: Snippet generation only matches the first query term.

For multi-word queries like "create record", `getSnippet` uses `indexOf`, which finds only the first occurrence of the entire query string. If the user searches "create record" but those words don't appear consecutively in the body, no snippet is generated.

🔧 Consider matching individual terms for better coverage
```diff
 function getSnippet(body, query, contextChars = 60) {
   const lower = body.toLowerCase();
-  const idx = lower.indexOf(query.toLowerCase());
+  const queryLower = query.toLowerCase();
+  // Try exact phrase first, then fall back to first word
+  let idx = lower.indexOf(queryLower);
+  if (idx === -1) {
+    const firstWord = queryLower.split(/\s+/)[0];
+    if (firstWord) idx = lower.indexOf(firstWord);
+  }
   if (idx === -1) return '';
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/SearchDialog.js` around lines 11 - 22, getSnippet currently searches for the whole query string with indexOf, so multi-word queries like "create record" fail when words are not consecutive; update getSnippet to split the query into individual terms (e.g., by whitespace), ignore empty terms, search for the first occurrence of any term (case-insensitive) in body, and use that term's index/length when computing start/end and building the snippet so snippets are produced for matches of any query term; keep the existing contextChars logic and ellipsis behavior.

lib/generate-search-index.js (3)
26-27: Quoted YAML values won't be parsed correctly.

The regex `/^title:\s*(.+)$/m` captures the entire value including quotes. A title like `title: "Getting Started: Quick Guide"` would include the surrounding quotes in the extracted value.

🔧 Optional fix to strip surrounding quotes
```diff
+function stripQuotes(str) {
+  if ((str.startsWith('"') && str.endsWith('"')) ||
+      (str.startsWith("'") && str.endsWith("'"))) {
+    return str.slice(1, -1);
+  }
+  return str;
+}
+
 function extractFrontmatter(content) {
   const fmMatch = content.match(/^---\n([\s\S]*?)\n---/);
   if (!fmMatch) return { title: "", description: "" };
   const frontmatter = fmMatch[1];
   const titleMatch = frontmatter.match(/^title:\s*(.+)$/m);
   const descMatch = frontmatter.match(/^description:\s*(.+)$/m);
   return {
-    title: titleMatch ? titleMatch[1].trim() : "",
-    description: descMatch ? descMatch[1].trim() : "",
+    title: titleMatch ? stripQuotes(titleMatch[1].trim()) : "",
+    description: descMatch ? stripQuotes(descMatch[1].trim()) : "",
   };
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/generate-search-index.js` around lines 26 - 27, The frontmatter regexes titleMatch and descMatch capture quoted YAML values including surrounding quotes; update the extraction logic to strip optional surrounding single or double quotes (and trim whitespace) from the captured value (frontmatter.match results) so values like title: "Getting Started: Quick Guide" and description: '...' return the inner string; you can either adjust the regex to allow optional surrounding quotes in the capture or post-process the matched group to remove leading/trailing quotes and whitespace for both titleMatch and descMatch.
8-19: Consider handling symbolic links to prevent potential infinite loops.

If the `pages` directory contains symlinks pointing to parent directories, `walkDir` could recurse infinitely. This is unlikely in a typical docs setup but worth noting.

🛡️ Optional defensive fix
```diff
 function walkDir(dir) {
   const results = [];
   for (const entry of readdirSync(dir)) {
     const full = join(dir, entry);
-    if (statSync(full).isDirectory()) {
+    const stat = lstatSync(full); // lstat, so symlinks are not followed
+    if (stat.isSymbolicLink()) continue;
+    if (stat.isDirectory()) {
       results.push(...walkDir(full));
     } else if (full.endsWith(".md")) {
       results.push(full);
     }
   }
   return results;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/generate-search-index.js` around lines 8 - 19, walkDir can follow symlinks and potentially recurse forever; modify walkDir to detect and skip symbolic links (use lstatSync(...).isSymbolicLink()) or resolve entries with realpathSync and maintain a visited Set of real paths to avoid revisiting the same directory, replacing the current statSync/recursion logic in walkDir to first check lstatSync for symlinks (or check visited realpaths) before calling statSync and recursing.
119-121: Body truncation may cut mid-word.

`substring(0, MAX_BODY_LENGTH)` could truncate in the middle of a word, potentially affecting search matches at the boundary. Consider truncating at a word boundary.

🔧 Optional fix to truncate at word boundary
```diff
 if (body.length > MAX_BODY_LENGTH) {
-  body = body.substring(0, MAX_BODY_LENGTH);
+  body = body.substring(0, MAX_BODY_LENGTH);
+  const lastSpace = body.lastIndexOf(' ');
+  if (lastSpace > MAX_BODY_LENGTH - 100) {
+    body = body.substring(0, lastSpace);
+  }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/generate-search-index.js` around lines 119 - 121, The current truncation uses body.substring(0, MAX_BODY_LENGTH) which can cut mid-word; change the logic in the block that checks body.length > MAX_BODY_LENGTH (referencing the body variable and MAX_BODY_LENGTH) to truncate at the last word boundary before the limit: when body.length > MAX_BODY_LENGTH, find the last whitespace (e.g., lastIndexOf(/\s/ or ' ') before MAX_BODY_LENGTH and, if found, slice up to that index, otherwise fall back to substring(0, MAX_BODY_LENGTH); ensure you preserve trimming and avoid returning an empty string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.beads/issues.jsonl:
- Line 170: There is a duplicate JSONL record for the primary id
"hypercerts-atproto-documentation-woj" which will cause overwrite/rejection;
remove the duplicate entry (or merge its unique fields into the existing record)
so only one JSON object with id "hypercerts-atproto-documentation-woj" remains,
ensuring fields like title, description, status, owner, timestamps and labels
are preserved or consolidated as appropriate and the resulting JSONL contains a
single well-formed line per record.
In `@components/SearchDialog.js`:
- Around line 71-75: The current catch in the search index load only logs the
error and leaves searchData null, causing an empty quickLinks UI without
feedback; update the catch to set a visible error state (e.g., setSearchError or
setError), flip setLoading(false) and isLoading = false, and set searchData to
an empty array or a sentinel so quickLinks rendering can show a user-facing
error message; update the render logic that reads searchData / quickLinks to
display the new error state (use the new setSearchError flag) instead of
silently showing empty quick links.
---
Nitpick comments:
In `@components/SearchDialog.js`:
- Around line 11-22: getSnippet currently searches for the whole query string
with indexOf, so multi-word queries like "create record" fail when words are not
consecutive; update getSnippet to split the query into individual terms (e.g.,
by whitespace), ignore empty terms, search for the first occurrence of any term
(case-insensitive) in body, and use that term's index/length when computing
start/end and building the snippet so snippets are produced for matches of any
query term; keep the existing contextChars logic and ellipsis behavior.
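The per-term matching the prompt above describes can be sketched as a standalone helper. This is a minimal illustration, not the shipped `components/SearchDialog.js` code; the function name and `contextChars` default mirror the review comment, everything else is assumed:

```javascript
// Sketch: snippet helper that matches any individual query term,
// not only the exact phrase. Illustrative, not the actual implementation.
function getSnippet(body, query, contextChars = 60) {
  const lower = body.toLowerCase();
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);

  // Find the earliest occurrence of any single term.
  let idx = -1;
  let matchLen = 0;
  for (const term of terms) {
    const i = lower.indexOf(term);
    if (i !== -1 && (idx === -1 || i < idx)) {
      idx = i;
      matchLen = term.length;
    }
  }
  if (idx === -1) return '';

  const start = Math.max(0, idx - contextChars);
  const end = Math.min(body.length, idx + matchLen + contextChars);
  return (
    (start > 0 ? '…' : '') +
    body.slice(start, end).trim() +
    (end < body.length ? '…' : '')
  );
}
```

With this shape, a query like "create record" produces a snippet as long as either word appears somewhere in the body, preserving the existing context-window and ellipsis behavior.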
In `@lib/generate-search-index.js`:
- Around line 26-27: The frontmatter regexes titleMatch and descMatch capture
quoted YAML values including surrounding quotes; update the extraction logic to
strip optional surrounding single or double quotes (and trim whitespace) from
the captured value (frontmatter.match results) so values like title: "Getting
Started: Quick Guide" and description: '...' return the inner string; you can
either adjust the regex to allow optional surrounding quotes in the capture or
post-process the matched group to remove leading/trailing quotes and whitespace
for both titleMatch and descMatch.
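The quote-stripping described above can be sketched end to end. This is a hedged sketch of the post-processing approach, assuming the regex-based frontmatter extraction shown in the diff; the real `lib/generate-search-index.js` may differ:

```javascript
// Sketch: frontmatter extraction with optional surrounding-quote stripping.
function stripQuotes(str) {
  if (str.length >= 2 &&
      ((str.startsWith('"') && str.endsWith('"')) ||
       (str.startsWith("'") && str.endsWith("'")))) {
    return str.slice(1, -1);
  }
  return str;
}

function extractFrontmatter(content) {
  const fmMatch = content.match(/^---\n([\s\S]*?)\n---/);
  if (!fmMatch) return { title: '', description: '' };
  const fm = fmMatch[1];
  const titleMatch = fm.match(/^title:\s*(.+)$/m);
  const descMatch = fm.match(/^description:\s*(.+)$/m);
  return {
    title: titleMatch ? stripQuotes(titleMatch[1].trim()) : '',
    description: descMatch ? stripQuotes(descMatch[1].trim()) : '',
  };
}
```

A title written as `title: "Getting Started: Quick Guide"` then indexes as `Getting Started: Quick Guide`, without the literal quotes polluting search results.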
- Around line 8-19: walkDir can follow symlinks and potentially recurse forever;
modify walkDir to detect and skip symbolic links (use
lstatSync(...).isSymbolicLink()) or resolve entries with realpathSync and
maintain a visited Set of real paths to avoid revisiting the same directory,
replacing the current statSync/recursion logic in walkDir to first check
lstatSync for symlinks (or check visited realpaths) before calling statSync and
recursing.
- Around line 119-121: The current truncation uses body.substring(0,
MAX_BODY_LENGTH) which can cut mid-word; change the logic in the block that
checks body.length > MAX_BODY_LENGTH (referencing the body variable and
MAX_BODY_LENGTH) to truncate at the last word boundary before the limit: when
body.length > MAX_BODY_LENGTH, find the last whitespace (e.g., lastIndexOf(/\s/
or ' ') before MAX_BODY_LENGTH and, if found, slice up to that index, otherwise
fall back to substring(0, MAX_BODY_LENGTH); ensure you preserve trimming and
avoid returning an empty string.
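The word-boundary truncation described above can be sketched as a small helper. The 100-character look-back window mirrors the suggested diff; the function name is illustrative, not the shipped code:

```javascript
// Sketch: truncate at the last word boundary near the limit.
// The look-back window keeps pathological space-free text from
// being truncated to nothing.
const MAX_BODY_LENGTH = 5000;

function truncateAtWordBoundary(body, maxLen = MAX_BODY_LENGTH) {
  if (body.length <= maxLen) return body;
  let out = body.substring(0, maxLen);
  const lastSpace = out.lastIndexOf(' ');
  if (lastSpace > maxLen - 100) {
    out = out.substring(0, lastSpace);
  }
  return out.trimEnd();
}
```

Searches for a word that straddles the 5000-character cut then either match the whole word or miss it cleanly, instead of matching a meaningless fragment.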
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c0f10428-739f-4714-a067-174a378e27fb
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (7)
`.beads/issues.jsonl`, `components/SearchDialog.js`, `lib/generate-search-index.js`, `lib/lastUpdated.json`, `package.json`, `public/search-index.json`, `styles/globals.css`
{"id":"hypercerts-atproto-documentation-w96.5","title":"Fix wrong lexicon NSIDs in data-flow-and-lifecycle page","description":"## Files\n- pages/architecture/data-flow-and-lifecycle.md (modify)\n\n## What to do\nFix two incorrect NSIDs:\n\n### Fix 1: Line 59\nCurrently says: `org.hypercerts.claim.contributionDetails`\nChange to: `org.hypercerts.claim.contribution`\n\n### Fix 2: Line 79\nCurrently says: `org.hypercerts.claim.collection`\nChange to: `org.hypercerts.collection`\n\nThe actual lexicon IDs are:\n- org.hypercerts.claim.contribution (file: contribution.json)\n- org.hypercerts.collection (file: collection.json, no .claim. segment)\n\n## Dont\n- Do not change any surrounding text or page structure\n- Only change the two NSID strings","acceptance_criteria":"1. Line 59 (or equivalent) contains `org.hypercerts.claim.contribution` (not contributionDetails)\n2. Line 79 (or equivalent) contains `org.hypercerts.collection` (not org.hypercerts.claim.collection)\n3. No other content is changed\n4. File parses as valid Markdown","status":"closed","priority":1,"issue_type":"task","assignee":"karma.gainforest.id","owner":"karma.gainforest.id","estimated_minutes":10,"created_at":"2026-03-05T19:58:14.602853024+06:00","created_by":"karma.gainforest.id","updated_at":"2026-03-05T20:04:27.406244913+06:00","closed_at":"2026-03-05T20:04:27.406244913+06:00","close_reason":"91f3ecd Fix wrong lexicon NSIDs in data-flow-and-lifecycle page","labels":["scope:trivial"],"dependencies":[{"issue_id":"hypercerts-atproto-documentation-w96.5","depends_on_id":"hypercerts-atproto-documentation-w96","type":"parent-child","created_at":"2026-03-05T19:58:14.60526901+06:00","created_by":"karma.gainforest.id"}]}
{"id":"hypercerts-atproto-documentation-w96.6","title":"Fix wrong collection NSID in roadmap page","description":"## Files\n- pages/roadmap.md (modify)\n\n## What to do\nFix incorrect collection NSID on line 69.\n\nCurrently says: `org.hypercerts.claim.collection`\nChange to: `org.hypercerts.collection`\n\nThe actual lexicon ID is org.hypercerts.collection (no .claim. segment).\n\n## Dont\n- Do not change any other content on the roadmap page\n- Only change the one NSID string","acceptance_criteria":"1. Line 69 (or the collection row in the table) shows `org.hypercerts.collection` (not org.hypercerts.claim.collection)\n2. No other content is changed\n3. File parses as valid Markdown","status":"closed","priority":1,"issue_type":"task","assignee":"karma.gainforest.id","owner":"karma.gainforest.id","estimated_minutes":5,"created_at":"2026-03-05T19:58:19.75432948+06:00","created_by":"karma.gainforest.id","updated_at":"2026-03-05T20:04:11.56608607+06:00","closed_at":"2026-03-05T20:04:11.56608607+06:00","close_reason":"3352ec1 Fix wrong collection NSID in roadmap page","labels":["scope:trivial"],"dependencies":[{"issue_id":"hypercerts-atproto-documentation-w96.6","depends_on_id":"hypercerts-atproto-documentation-w96","type":"parent-child","created_at":"2026-03-05T19:58:19.756668539+06:00","created_by":"karma.gainforest.id"}]}
{"id":"hypercerts-atproto-documentation-w96.7","title":"Rewrite contributor model description and add missing record types in core data model page","description":"## Files\n- pages/core-concepts/hypercerts-core-data-model.md (modify)\n\n## What to do\nThe core data model page has structural inaccuracies about how contributors work. Fix the contributor model description and the \"How records connect\" tree.\n\n### 1. Fix the \"Additional details\" section (lines 26-33)\nThe current description says ContributorInformation and ContributionDetails are \"separate records with their own AT-URI\" that \"can be referenced from the activity claim.\" This is misleading.\n\nRewrite to explain the actual model: The activity claim has a `contributors` array where each entry is a contributor object containing:\n- `contributorIdentity`: either an inline identity string (DID) via `#contributorIdentity`, OR a strong reference to an `org.hypercerts.claim.contributorInformation` record\n- `contributionWeight`: optional relative weight string\n- `contributionDetails`: either an inline role string via `#contributorRole`, OR a strong reference to an `org.hypercerts.claim.contribution` record\n\nEmphasize the dual inline/reference pattern — simple cases use inline strings, richer profiles use separate records.\n\nUpdate the table to reflect the correct lexicon name `org.hypercerts.claim.contribution` (not contributionDetails).\n\n### 2. Fix the \"How records connect\" tree (lines 70-82)\nThe tree currently shows ContributorInformation and ContributionDetails as separate child records. Update it to show them as embedded within contributor objects, reflecting the actual structure. Example:\n\n```text\nActivity Claim (the core record)\n├── contributors[0]\n│ ├── contributorIdentity: Alice (inline DID or ref to ContributorInformation)\n│ ├── contributionWeight: \"1\"\n│ └── contributionDetails: Lead author (inline role or ref to Contribution)\n├── contributors[1]\n│ ├── contributorIdentity: → ContributorInformation record (Bob)\n│ └── contributionDetails: → Contribution record (Technical reviewer, Jan-Mar)\n├── Attachment: GitHub repository link\n├── Measurement: 12 pages written\n├── Measurement: 8,500 words\n└── Evaluation: \"High-quality documentation\" (by Carol)\n```\n\n## Dont\n- Do not change the \"The core record: activity claim\" section (lines 12-23) — the four dimensions table is correct\n- Do not change the \"Grouping hypercerts\" section\n- Do not change the \"Mutability\" or \"What happens next\" sections\n- Do not add new record types (rights, acknowledgement, funding receipt) — this task is fixes only\n- Do not add code examples or API usage — this is a conceptual page\n- Keep the writing style consistent with the rest of the page (concise, factual, no marketing language)","acceptance_criteria":"1. The \"Additional details\" section accurately describes the dual inline/reference pattern for contributors\n2. The contributor table shows `org.hypercerts.claim.contribution` (not contributionDetails)\n3. The \"How records connect\" tree shows contributors as embedded objects with inline/reference options\n4. The four dimensions table in \"The core record\" section is unchanged\n5. The \"Grouping hypercerts\", \"Mutability\", and \"What happens next\" sections are unchanged\n6. No new record types are added to the page\n7. File parses as valid Markdown","status":"closed","priority":2,"issue_type":"task","assignee":"karma.gainforest.id","owner":"karma.gainforest.id","estimated_minutes":45,"created_at":"2026-03-05T19:58:48.627566144+06:00","created_by":"karma.gainforest.id","updated_at":"2026-03-05T20:05:12.194006503+06:00","closed_at":"2026-03-05T20:05:12.194006503+06:00","close_reason":"e55324a Rewrite contributor model description and fix records tree","labels":["scope:small"],"dependencies":[{"issue_id":"hypercerts-atproto-documentation-w96.7","depends_on_id":"hypercerts-atproto-documentation-w96","type":"parent-child","created_at":"2026-03-05T19:58:48.630865221+06:00","created_by":"karma.gainforest.id"},{"issue_id":"hypercerts-atproto-documentation-w96.7","depends_on_id":"hypercerts-atproto-documentation-w96.4","type":"blocks","created_at":"2026-03-05T19:58:48.635719127+06:00","created_by":"karma.gainforest.id"}]}
{"id":"hypercerts-atproto-documentation-woj","title":"Epic: Resolve CodeRabbit review for PR #78","description":"Resolve all open CodeRabbit inline review comments on PR #78. 3 comments across 1 file (pages/tools/scaffold.md).","status":"closed","priority":1,"issue_type":"epic","owner":"kzoepa@gmail.com","created_at":"2026-03-05T15:57:13.054348747+06:00","created_by":"kzoeps","updated_at":"2026-03-06T18:07:23.476296+08:00","closed_at":"2026-03-06T18:07:23.476296+08:00","close_reason":"4e902e4 Both children closed — CodeRabbit review items resolved","labels":["needs-integration-review","scope:small"]}
Remove the duplicate issue entry.
id: "hypercerts-atproto-documentation-woj" already exists earlier in this file. Duplicating a primary identifier in the JSONL store can make downstream readers overwrite one record with the other or reject the file entirely.
Suggested fix
```diff
-{"id":"hypercerts-atproto-documentation-woj","title":"Epic: Resolve CodeRabbit review for PR `#78`","description":"Resolve all open CodeRabbit inline review comments on PR `#78`. 3 comments across 1 file (pages/tools/scaffold.md).","status":"closed","priority":1,"issue_type":"epic","owner":"kzoepa@gmail.com","created_at":"2026-03-05T15:57:13.054348747+06:00","created_by":"kzoeps","updated_at":"2026-03-06T18:07:23.476296+08:00","closed_at":"2026-03-06T18:07:23.476296+08:00","close_reason":"4e902e4 Both children closed — CodeRabbit review items resolved","labels":["needs-integration-review","scope:small"]}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
{"id":"hypercerts-atproto-documentation-woj","title":"Epic: Resolve CodeRabbit review for PR #78","description":"Resolve all open CodeRabbit inline review comments on PR #78. 3 comments across 1 file (pages/tools/scaffold.md).","status":"closed","priority":1,"issue_type":"epic","owner":"kzoepa@gmail.com","created_at":"2026-03-05T15:57:13.054348747+06:00","created_by":"kzoeps","updated_at":"2026-03-06T18:07:23.476296+08:00","closed_at":"2026-03-06T18:07:23.476296+08:00","close_reason":"4e902e4 Both children closed — CodeRabbit review items resolved","labels":["needs-integration-review","scope:small"]}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.beads/issues.jsonl at line 170, There is a duplicate JSONL record for the
primary id "hypercerts-atproto-documentation-woj" which will cause
overwrite/rejection; remove the duplicate entry (or merge its unique fields into
the existing record) so only one JSON object with id
"hypercerts-atproto-documentation-woj" remains, ensuring fields like title,
description, status, owner, timestamps and labels are preserved or consolidated
as appropriate and the resulting JSONL contains a single well-formed line per
record.
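The "keep a single well-formed line per record" fix can be sketched as a small dedupe pass over the JSONL text. This is an illustrative sketch that keeps the first occurrence of each id; the actual `.beads/issues.jsonl` fix may instead merge fields from the duplicate:

```javascript
// Sketch: deduplicate JSONL records by primary id, keeping the
// first occurrence of each id and dropping later duplicates.
function dedupeJsonl(jsonlText) {
  const seen = new Set();
  const out = [];
  for (const line of jsonlText.split('\n')) {
    if (!line.trim()) continue;       // skip blank lines
    const record = JSON.parse(line);  // each line must be valid JSON
    if (seen.has(record.id)) continue;
    seen.add(record.id);
    out.push(line);
  }
  return out.join('\n') + '\n';
}
```

Running this over the store would leave exactly one `hypercerts-atproto-documentation-woj` record; a merge-based variant would additionally copy any fields unique to the dropped line into the surviving one.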
```js
.catch(err => {
  console.error('Failed to load search index:', err);
  setLoading(false);
  isLoading = false;
});
```
Silent failure when index fails to load.
If the fetch fails, the error is only logged to the console. Users see an empty quick links state (since searchData remains null, quickLinks will be empty) with no indication that something went wrong.
🛡️ Proposed fix to show error state
+ const [error, setError] = useState(null);
+
// Load search index on first open
useEffect(() => {
if (isOpen && !searchData && !isLoading) {
isLoading = true;
setLoading(true);
+ setError(null);
fetch('/search-index.json')
.then(res => res.json())
// ... existing success handling ...
.catch(err => {
console.error('Failed to load search index:', err);
+ setError('Failed to load search index');
setLoading(false);
isLoading = false;
});
}
}, [isOpen]);

Then in the render:
{loading ? (
<div className="search-no-results">Loading...</div>
+) : error ? (
+ <div className="search-no-results">{error}</div>
) : !hasQuery ? (

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/SearchDialog.js` around lines 71 - 75, The current catch in the
search index load only logs the error and leaves searchData null, causing an
empty quickLinks UI without feedback; update the catch to set a visible error
state (e.g., setSearchError or setError), flip setLoading(false) and isLoading =
false, and set searchData to an empty array or a sentinel so quickLinks
rendering can show a user-facing error message; update the render logic that
reads searchData / quickLinks to display the new error state (use the new
setSearchError flag) instead of silently showing empty quick links.
Summary
This PR implements full-text search functionality for the documentation site, replacing the existing title-only fuzzy search with FlexSearch-powered keyword search across page titles, descriptions, headings, and body content.
Changes
1. Build-time search index generator (7fu.1)
- `lib/generate-search-index.js` script that runs during build
- `public/search-index.json` (117KB) with structured data for 43 pages
- `package.json`

2. FlexSearch integration (7fu.2)
- `flexsearch` dependency
- `components/SearchDialog.js` to:
- `lib/navigation.js` for search (still used for sidebar)

3. Search result styling (7fu.3)
- `.search-result-snippet` class
- `<mark>` tags for highlighted search terms

Testing
All acceptance criteria met:
- `pnpm build` succeeds
Closes hypercerts-atproto-documentation-7fu.1
Closes hypercerts-atproto-documentation-7fu.2
Closes hypercerts-atproto-documentation-7fu.3
Part of epic hypercerts-atproto-documentation-7fu
Summary by CodeRabbit
New Features
Refactor
Style