…s/2026/02/)
- Move 935 articles from flat news/ to news/2026/02/
- Update vite.config.js discoverNewsArticles() for recursive scanning
- Update scripts/generate-news-indexes/helpers.ts scanNewsArticles() for recursive scanning with relative slug paths
- Update scripts/generate-sitemap.ts getNewsArticles() for recursive scanning
- Update scripts/extract-news-metadata.ts for recursive scanning
- Update tests to match new URL format and use articlePath() helper
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
…sive scan for future files
Existing 935 articles moved back to news/ to preserve shared URLs (LinkedIn, X, etc).
The recursive scanning code in vite.config.js, helpers.ts, generate-sitemap.ts, and
extract-news-metadata.ts is retained so future articles in news/{year}/{month}/ are
automatically discovered alongside the existing flat files.
Tests updated: sitemap URL pattern matches both flat and subdirectory paths;
news-evening-analysis.test.ts reverted to direct path.join(NEWS_DIR, file) calls.
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Pull request overview
Restores URL stability by moving previously relocated news articles back into the flat news/ directory, while keeping the new recursive discovery logic so future articles can live under news/{year}/{month}/ without being missed by builds, sitemap generation, or metadata extraction.
Changes:
- Keep recursive news-article discovery in Vite build inputs so nested news/YYYY/MM/*.html are included automatically.
- Keep recursive scanning for sitemap generation and news index generation to support both flat and nested article layouts.
- Broaden sitemap language URL matching in tests to accept either flat or date-based paths.
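
The recursive discovery described in this overview could look roughly like the sketch below. It is an illustration only, not the repository's actual `discoverNewsArticles()` code: the helper name `collectHtmlFiles`, the temporary directory, and the sample file names are all assumptions made for the demo.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: recursively collect every .html file under `dir`,
// so both flat files and files in year/month subfolders are found.
function collectHtmlFiles(dir: string): string[] {
  const found: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Descend into subdirectories such as 2026/02/.
      found.push(...collectHtmlFiles(fullPath));
    } else if (entry.isFile() && entry.name.endsWith(".html")) {
      found.push(fullPath);
    }
  }
  return found;
}

// Demo on a throwaway directory (file names are made up for illustration).
const root = fs.mkdtempSync(path.join(os.tmpdir(), "news-demo-"));
fs.mkdirSync(path.join(root, "2026", "02"), { recursive: true });
fs.writeFileSync(path.join(root, "2026-01-15-flat-article-en.html"), "");
fs.writeFileSync(path.join(root, "2026", "02", "2026-02-01-nested-article-en.html"), "");
fs.writeFileSync(path.join(root, "README.md"), ""); // ignored: not .html

const discovered = collectHtmlFiles(root);
console.log(discovered);
```

Because the flat files and the nested files go through the same walk, existing URLs keep working while new articles can be placed in dated subfolders.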
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| vite.config.js | Recursively discovers news article HTML files for Rollup/Vite inputs. |
| tests/sitemap-generation.test.ts | Updates language-specific URL matching to allow optional YYYY/MM/ path segment. |
| scripts/generate-sitemap.ts | Recursively scans news/ when building sitemap article groups and hreflang alternates. |
| scripts/generate-news-indexes/helpers.ts | Recursively collects article files and sets metadata.slug to the relative path (supports nested folders). |
| scripts/extract-news-metadata.ts | Recursively discovers news HTML files and builds metadata DB using relative paths. |

      // Check for language-specific news articles (flat or date-based directory structure)
      const languages: readonly string[] = ['en', 'sv', 'da', 'no', 'fi', 'de', 'fr', 'es', 'nl', 'ar', 'he', 'ja', 'ko', 'zh'];
      const foundLanguages = new Set<string>();

      languages.forEach(lang => {
    -   const pattern = new RegExp(`news/\\d{4}-\\d{2}-\\d{2}-.+-${lang}\\.html`);
    +   const pattern = new RegExp(`news/(?:\\d{4}/\\d{2}/)?\\d{4}-\\d{2}-\\d{2}-.+-${lang}\\.html`);
        if (pattern.test(sitemapContent)) {

The language-specific pattern was broadened for flat + date-based layouts, but the earlier "should include news articles" regex still only matches /news/YYYY-MM-DD-... and will fail once any articles are actually placed under news/YYYY/MM/. Please broaden that earlier assertion in the same way so tests reflect the intended dual-layout support.
Fixed in bcb38c2. The "should include news articles" URL regex now accepts both flat (news/YYYY-MM-DD-*) and date-based (news/YYYY/MM/YYYY-MM-DD-*) paths, matching the pattern already used in the language-variant test below it.
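
To illustrate the dual-layout matching discussed here, the sketch below wraps the broadened pattern from the diff in a hypothetical helper, `articleUrlPattern`, and runs it against made-up `example.com` URLs. The old flat-only pattern is included for comparison.

```typescript
// Hypothetical helper wrapping the broadened pattern from the test diff.
// The optional group (?:\d{4}/\d{2}/)? accepts a YYYY/MM/ path segment.
function articleUrlPattern(lang: string): RegExp {
  return new RegExp(`news/(?:\\d{4}/\\d{2}/)?\\d{4}-\\d{2}-\\d{2}-.+-${lang}\\.html`);
}

// The earlier flat-only pattern, kept here only to show the difference.
function flatOnlyPattern(lang: string): RegExp {
  return new RegExp(`news/\\d{4}-\\d{2}-\\d{2}-.+-${lang}\\.html`);
}

// Sample URLs are assumptions for the demo, not taken from the real sitemap.
const flatUrl = "https://example.com/news/2026-02-14-budget-debate-en.html";
const nestedUrl = "https://example.com/news/2026/02/2026-02-14-budget-debate-en.html";
const swedishUrl = "https://example.com/news/2026-02-14-budget-debate-sv.html";

console.log(articleUrlPattern("en").test(flatUrl));   // flat layout
console.log(articleUrlPattern("en").test(nestedUrl)); // date-based layout
console.log(flatOnlyPattern("en").test(nestedUrl));   // old pattern on nested path
console.log(articleUrlPattern("en").test(swedishUrl)); // different language suffix
```

The old pattern rejects the nested URL because it requires the date slug to follow `news/` directly, which is why both assertions needed the same broadening.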
scripts/generate-sitemap.ts

    scanDir(NEWS_DIR);

    console.log(` Found ${articles.size} news articles`);

This log line now reports articles.size, which is the number of language-groups (base slugs), not the number of article files. Since the output label still says "news articles", it can be misleading during troubleshooting; consider renaming it to "article groups" or logging both group count and total file count.

    - console.log(` Found ${articles.size} news articles`);
    + console.log(` Found ${articles.size} news article groups`);

Fixed in bcb38c2. The log line now reads Found ${articles.size} news article groups to make clear that the count is language-group (base-slug) count, not individual HTML file count.
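
The difference between the group count and the file count can be shown with a small sketch. It assumes that a base slug is the file path minus its `-<lang>.html` suffix. The helper `groupByBaseSlug` and the sample file names are hypothetical, and the language list is copied from the test shown earlier in this thread.

```typescript
// Language suffixes taken from the test's language list.
const LANG_SUFFIX = /-(en|sv|da|no|fi|de|fr|es|nl|ar|he|ja|ko|zh)\.html$/;

// Hypothetical helper: map each base slug to the language codes found for it.
function groupByBaseSlug(files: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const file of files) {
    const match = LANG_SUFFIX.exec(file);
    if (!match) continue; // not a language-suffixed article file
    const baseSlug = file.slice(0, match.index);
    const lang = file.slice(match.index + 1, -".html".length);
    const langs = groups.get(baseSlug) ?? [];
    langs.push(lang);
    groups.set(baseSlug, langs);
  }
  return groups;
}

// Made-up sample: one article in two languages, one article in one language.
const sampleFiles = [
  "2026-02-14-budget-debate-en.html",
  "2026-02-14-budget-debate-sv.html",
  "2026/02/2026-02-20-election-preview-en.html",
];

const groups = groupByBaseSlug(sampleFiles);
const totalFiles = Array.from(groups.values()).reduce((n, langs) => n + langs.length, 0);
console.log(`Found ${groups.size} news article groups (${totalFiles} files)`);
```

Logging both numbers, as the review comment suggests as an alternative, makes it easier to spot a missing translation during troubleshooting.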
@copilot apply changes based on the comments in this thread
…st URL pattern
- generate-sitemap.ts: log message now says 'news article groups' since articles.size counts base-slug language-groups, not individual HTML files
- sitemap-generation.test.ts: 'should include news articles' URL regex now accepts both flat (news/YYYY-MM-DD-*) and date-based (news/YYYY/MM/YYYY-MM-DD-*) paths, consistent with the language-variant test in the same describe block
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>

    const allFilePaths = collectNewsFiles(newsDir);
    const files = allFilePaths.map((fp) => relative(newsDir, fp));

relative(newsDir, fp) will return platform-specific separators (e.g., \ on Windows). Since file and derived slug are persisted into data/news-articles.json and used to represent URL-like paths, this can produce backslash-containing slugs and inconsistent outputs across OSes. Normalize the relative paths to POSIX-style (/) before storing them (and before computing slug).
Fixed in 338170c. Added sep to the path named imports and changed line 74 to .split(sep).join('/') so file and derived slug always use forward-slash separators in news-articles.json, regardless of OS.
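
The separator issue and the `.split(sep).join('/')` fix can be reproduced on any OS by using Node's `path.win32` and `path.posix` variants explicitly. The helper `toPosixRelative`, its extra `pathApi` parameter, and the sample paths are assumptions added for the demo. The real script calls `relative()` and `sep` from `path` directly.

```typescript
import * as path from "node:path";

// Hypothetical helper: relative path with forward slashes regardless of platform.
// `pathApi` exists only so the Windows behaviour can be demonstrated on any OS.
function toPosixRelative(from: string, to: string, pathApi: typeof path = path): string {
  return pathApi.relative(from, to).split(pathApi.sep).join("/");
}

// Raw Windows output contains backslashes, which would leak into slugs.
const rawWindows = path.win32.relative(
  "C:\\repo\\news",
  "C:\\repo\\news\\2026\\02\\2026-02-14-budget-debate-en.html",
);

const windowsSlugPath = toPosixRelative(
  "C:\\repo\\news",
  "C:\\repo\\news\\2026\\02\\2026-02-14-budget-debate-en.html",
  path.win32,
);
const posixSlugPath = toPosixRelative(
  "/repo/news",
  "/repo/news/2026/02/2026-02-14-budget-debate-en.html",
  path.posix,
);

console.log(rawWindows);
console.log(windowsSlugPath);
console.log(posixSlugPath);
```

After normalization both platforms yield the same string, so the `file` and `slug` values written to the JSON output no longer depend on where the script runs.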

    function scanDir(dir: string): void {
      const entries = fs.readdirSync(dir, { withFileTypes: true });
      for (const entry of entries) {
        if (entry.isDirectory()) {
          scanDir(path.join(dir, entry.name));

The recursive scan inserts into articles in filesystem traversal order (readdirSync), and Array.from(articles.values()) preserves that order. With recursive traversal this can make sitemap output order non-deterministic across platforms/filesystems, causing noisy diffs. Consider sorting entries (e.g., by entry.name) before iterating and/or sorting the final ArticleGroup[] by baseSlug before emitting URLs.
Fixed in 338170c. readdirSync entries are now sorted by name before iterating in scanDir(), and Array.from(articles.values()) is sorted by baseSlug before being returned — making sitemap output fully deterministic across platforms and filesystems.
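
A minimal sketch of the deterministic traversal described in this thread follows. It is not the actual `scanDir()` from `generate-sitemap.ts`: the function name `scanSorted`, the temporary directory, and the file names are assumptions, and the final sort runs over plain relative paths rather than over `ArticleGroup` objects.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical scan: sort directory entries by name before iterating so the
// visit order does not depend on the filesystem's native ordering.
function scanSorted(dir: string, root: string = dir, out: string[] = []): string[] {
  const entries = fs
    .readdirSync(dir, { withFileTypes: true })
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  for (const entry of entries) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      scanSorted(fullPath, root, out);
    } else if (entry.name.endsWith(".html")) {
      // Store POSIX-style relative paths so results match across platforms.
      out.push(path.relative(root, fullPath).split(path.sep).join("/"));
    }
  }
  return out;
}

// Demo: files are created in a scrambled order on purpose.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "sitemap-demo-"));
fs.writeFileSync(path.join(demoRoot, "b-article-en.html"), "");
fs.mkdirSync(path.join(demoRoot, "2026", "02"), { recursive: true });
fs.writeFileSync(path.join(demoRoot, "2026", "02", "c-article-en.html"), "");
fs.writeFileSync(path.join(demoRoot, "a-article-en.html"), "");

// Final sort of the collected list, analogous to sorting groups by baseSlug.
const ordered = scanSorted(demoRoot).sort();
console.log(ordered);
```

The comparator uses plain string comparison rather than `localeCompare`, a choice made in this sketch so the order does not vary with the machine's locale settings.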
@copilot apply changes based on the comments in this thread
… entries deterministically
- extract-news-metadata.ts: import `sep` from 'path'; normalize relative() output
with .split(sep).join('/') so file/slug fields in news-articles.json always use
forward slashes regardless of OS
- generate-sitemap.ts scanDir(): sort readdirSync entries by name before iterating
so Map insertion order is consistent across filesystems/platforms; sort the
returned ArticleGroup[] by baseSlug for fully deterministic sitemap output
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
- … `news/2026/02/` back to flat `news/` (preserving existing shared URLs)
- … `path.join(NEWS_DIR, file)` calls; removed `articlePath()` helper
- `articles.size` counts language-groups (base slugs), so label is now "news article groups" instead of "news articles"
- … `relative()` output to POSIX separators in `extract-news-metadata.ts` (fixes Windows backslash issue in JSON output / slug derivation)
- … `readdirSync` entries before iterating in `generate-sitemap.ts` `scanDir()` and sort final `ArticleGroup[]` by `baseSlug` for deterministic sitemap output