Add end-to-end integration tests for news article generation pipeline#707
Add end-to-end integration tests for news article generation pipeline#707
Conversation
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
There was a problem hiding this comment.
Pull request overview
Adds an end-to-end Vitest integration test suite that exercises the news article generation pipeline as a single flow (mock MCP data → generator → HTML template output → pipeline HTML validation), improving confidence beyond the existing unit-level coverage.
Changes:
- Adds
tests/pipeline/news-generation-e2e.test.tswith 41 tests covering all 8 news-type generators and validating generated HTML viapipeline/validation. - Verifies all 14 language variants for correct
<html lang="…">, RTLdir="rtl"behavior, and structural validity. - Adds assertions for Schema.org JSON-LD parseability/shape and hreflang alternate-link consistency.
| it('contains hreflang tags for all 14 language codes', () => { | ||
| const html = articleTemplate.generateArticleHTML(makeArticleData('en')); | ||
| ALL_14_LANGS.forEach(lang => { | ||
| expect(html, `hreflang="${lang}" missing`).toContain(`hreflang="${lang}"`); |
There was a problem hiding this comment.
The hreflang assertions treat Norwegian as hreflang="no", but elsewhere in the repo hreflang is explicitly mapped to BCP-47 nb for Norwegian (no filename suffix → nb hreflang). These tests should follow that convention (expect hreflang="nb" while keeping the href ending in -no.html) so the E2E suite doesn’t codify an SEO inconsistency.
| expect(html, `hreflang="${lang}" missing`).toContain(`hreflang="${lang}"`); | |
| const expectedHreflang = lang === 'no' ? 'nb' : lang; | |
| expect(html, `hreflang="${expectedHreflang}" missing`).toContain(`hreflang="${expectedHreflang}"`); |
There was a problem hiding this comment.
Fixed in commit 9e976b9. The template (scripts/article-template/template.ts line 112) now maps no → nb in the hreflang attribute (URL path stays -no.html), consistent with generate-sitemap.ts, generate-rss.ts, and generate-sitemap-html.ts. The test now checks hreflang="nb" for Norwegian.
| // Generator should still succeed (degraded article or error, but not throw) | ||
| expect(typeof result.success).toBe('boolean'); | ||
| }); | ||
|
|
||
| it('committee-reports handles empty reports gracefully', async () => { | ||
| mockClientInstance.fetchCommitteeReports.mockResolvedValue([]); | ||
|
|
||
| const result = await committeeReportsMod.generateCommitteeReports({ languages: ['en'] }); | ||
| expect(typeof result.success).toBe('boolean'); | ||
| }); | ||
|
|
||
| it('propositions handles empty data gracefully', async () => { | ||
| mockClientInstance.fetchPropositions.mockResolvedValue([]); | ||
|
|
||
| const result = await propositionsMod.generatePropositions({ languages: ['en'] }); | ||
| expect(typeof result.success).toBe('boolean'); | ||
| }); | ||
|
|
||
| it('motions handles empty data gracefully', async () => { | ||
| mockClientInstance.fetchMotions.mockResolvedValue([]); | ||
|
|
||
| const result = await motionsMod.generateMotions({ languages: ['en'] }); | ||
| expect(typeof result.success).toBe('boolean'); |
There was a problem hiding this comment.
The “empty MCP data” edge-case tests only assert that result.success is a boolean, which will still pass even if a generator regresses into always failing (or returning success without generating any HTML). To meaningfully validate graceful degradation, assert the expected outcome per generator (e.g., success: true with files: 0 for “skip” generators, or success: true + articles[0].html passing validateArticleHTML for generators that still emit a degraded article).
| // Generator should still succeed (degraded article or error, but not throw) | |
| expect(typeof result.success).toBe('boolean'); | |
| }); | |
| it('committee-reports handles empty reports gracefully', async () => { | |
| mockClientInstance.fetchCommitteeReports.mockResolvedValue([]); | |
| const result = await committeeReportsMod.generateCommitteeReports({ languages: ['en'] }); | |
| expect(typeof result.success).toBe('boolean'); | |
| }); | |
| it('propositions handles empty data gracefully', async () => { | |
| mockClientInstance.fetchPropositions.mockResolvedValue([]); | |
| const result = await propositionsMod.generatePropositions({ languages: ['en'] }); | |
| expect(typeof result.success).toBe('boolean'); | |
| }); | |
| it('motions handles empty data gracefully', async () => { | |
| mockClientInstance.fetchMotions.mockResolvedValue([]); | |
| const result = await motionsMod.generateMotions({ languages: ['en'] }); | |
| expect(typeof result.success).toBe('boolean'); | |
| // Generator should still succeed and emit a degraded but valid article | |
| expect(result.success).toBe(true); | |
| const articles = result.articles as GeneratedArticle[]; | |
| expect(Array.isArray(articles)).toBe(true); | |
| expect(articles.length).toBeGreaterThan(0); | |
| const validation = pipelineValidation.validateArticleHTML(articles[0]!.html); | |
| expect(validation.passed, validation.errors.join(', ')).toBe(true); | |
| }); | |
| it('committee-reports handles empty reports gracefully', async () => { | |
| mockClientInstance.fetchCommitteeReports.mockResolvedValue([]); | |
| const result = await committeeReportsMod.generateCommitteeReports({ languages: ['en'] }); | |
| expect(result.success).toBe(true); | |
| const articles = result.articles as GeneratedArticle[]; | |
| expect(Array.isArray(articles)).toBe(true); | |
| expect(articles.length).toBeGreaterThan(0); | |
| const validation = pipelineValidation.validateArticleHTML(articles[0]!.html); | |
| expect(validation.passed, validation.errors.join(', ')).toBe(true); | |
| }); | |
| it('propositions handles empty data gracefully', async () => { | |
| mockClientInstance.fetchPropositions.mockResolvedValue([]); | |
| const result = await propositionsMod.generatePropositions({ languages: ['en'] }); | |
| expect(result.success).toBe(true); | |
| const articles = result.articles as GeneratedArticle[]; | |
| expect(Array.isArray(articles)).toBe(true); | |
| expect(articles.length).toBeGreaterThan(0); | |
| const validation = pipelineValidation.validateArticleHTML(articles[0]!.html); | |
| expect(validation.passed, validation.errors.join(', ')).toBe(true); | |
| }); | |
| it('motions handles empty data gracefully', async () => { | |
| mockClientInstance.fetchMotions.mockResolvedValue([]); | |
| const result = await motionsMod.generateMotions({ languages: ['en'] }); | |
| expect(result.success).toBe(true); | |
| const articles = result.articles as GeneratedArticle[]; | |
| expect(Array.isArray(articles)).toBe(true); | |
| expect(articles.length).toBeGreaterThan(0); | |
| const validation = pipelineValidation.validateArticleHTML(articles[0]!.html); | |
| expect(validation.passed, validation.errors.join(', ')).toBe(true); |
There was a problem hiding this comment.
Strengthened in commit 9e976b9. Each generator now has specific assertions:
week-ahead(empty events):success: true+articles.length > 0+ HTML passesvalidateArticleHTML(withrequireSections: falsesince no events → no<h2>sections is expected)committee-reports,propositions,motions(empty data):success: true+files: 0(they skip article generation when no source data is available)
|
@copilot apply changes based on the comments in this thread |
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
…case assertions Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
| <!-- Hreflang for language alternatives --> | ||
| ${ALL_LANG_CODES.map(l => ` <link rel="alternate" hreflang="${l}" href="https://riksdagsmonitor.com/news/${baseSlug}-${l}.html">`).join('\n')} | ||
| ${ALL_LANG_CODES.map(l => ` <link rel="alternate" hreflang="${l === 'no' ? 'nb' : l}" href="https://riksdagsmonitor.com/news/${baseSlug}-${l}.html">`).join('\n')} | ||
| <link rel="alternate" hreflang="x-default" href="https://riksdagsmonitor.com/news/${baseSlug}-en.html"> |
There was a problem hiding this comment.
The <link rel="alternate"> hreflang mapping now emits nb for Norwegian, but the page still renders the language switcher links with hreflang="no" via generateArticleLanguageSwitcher() (helpers.ts). This creates inconsistent hreflang values for the same Norwegian variant within one document. Consider centralizing a hreflangCode() mapping (no→nb) in the article-template layer and using it for both the <link rel="alternate"> tags and the language-switcher <a hreflang> attributes (or removing hreflang from the switcher entirely if it’s not needed).
There was a problem hiding this comment.
Fixed in commit 10d1947. Added a hreflangCode() helper to helpers.ts that centralizes the no → nb BCP-47 mapping. Both generateArticleLanguageSwitcher() (the in-page switcher) and the <link rel="alternate"> tags in template.ts now use this shared helper, ensuring consistent hreflang="nb" for Norwegian across the entire document.
| it('contains hreflang tags for all 14 language codes', () => { | ||
| const html = articleTemplate.generateArticleHTML(makeArticleData('en')); | ||
| ALL_14_LANGS.forEach(lang => { | ||
| const expectedHreflang = lang === 'no' ? 'nb' : lang; | ||
| expect(html, `hreflang="${expectedHreflang}" missing`).toContain(`hreflang="${expectedHreflang}"`); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
The hreflang assertions only validate the <link rel="alternate"> tags. Since the template also emits hreflang attributes in the in-page language switcher, this suite won’t catch the current mismatch where Norwegian renders as hreflang="no" in the switcher but hreflang="nb" in the head alternates. Add an assertion that the language-switcher link for the -no.html variant uses hreflang="nb" (or whatever the intended mapping is) to prevent regressions.
There was a problem hiding this comment.
Added in commit 10d1947. A new test 'language switcher Norwegian link uses hreflang="nb" (BCP-47), not "no"' asserts that no anchor with -no.html href uses hreflang="no", and that hreflang="nb" is present in the switcher markup.
|
@copilot apply changes based on the comments in this thread |
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
tests/pipeline/news-generation-e2e.test.ts:345
ltrLangsis a subset of LTR languages, but the test name says it covers “LTR languages”. As written, a regression that incorrectly addsdir="rtl"to e.g.da,no,fi,es, ornlwould not be caught. Consider deriving this list fromALL_14_LANGS(exclude onlyar/he) so the assertion matches the intent.
it('does not set dir="rtl" for LTR languages', () => {
const ltrLangs: Language[] = ['en', 'sv', 'de', 'fr', 'ja', 'ko', 'zh'];
ltrLangs.forEach(lang => {
const html = articleTemplate.generateArticleHTML(makeArticleData(lang));
expect(html, `${lang} should not have rtl`).not.toContain('dir="rtl"');
});
});
| const results = pipelineValidation.validateArticleBatch(articles); | ||
|
|
||
| results.forEach(r => { | ||
| expect(r.passed, `Language ${r.filename} failed: ${r.errors.join(', ')}`).toBe(true); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
In this test, validateArticleBatch(articles) is only iterated via results.forEach(...). If validateArticleBatch regresses to return an empty array, the loop won’t run and the test will still pass. Add an assertion that results.length === articles.length (or toHaveLength(articles.length)) before iterating so the batch validator itself is actually validated.
There was a problem hiding this comment.
Fixed in commit 2250320. Added expect(results).toHaveLength(articles.length) before the forEach so a regressed validateArticleBatch returning an empty array will now fail fast.
| const successful = results.filter(r => r.success && r.articles?.length); | ||
| expect(successful.length).toBeGreaterThan(0); | ||
|
|
||
| const batch = successful.map(r => { |
There was a problem hiding this comment.
This test is titled “across all 8 article types”, but it filters to successful and only asserts successful.length > 0. That means the test can still pass even if several generators start failing (or stop producing articles), which defeats the purpose of covering all 8 types. Consider asserting each results[i].success is true and that you have 8 generated articles in the batch before calling validateArticleBatch.
| const successful = results.filter(r => r.success && r.articles?.length); | |
| expect(successful.length).toBeGreaterThan(0); | |
| const batch = successful.map(r => { | |
| // Ensure all 8 generators succeeded and produced at least one article each | |
| expect(results).toHaveLength(8); | |
| results.forEach((r, index) => { | |
| expect(r.success, `Generator index ${index} failed`).toBe(true); | |
| expect( | |
| r.articles && r.articles.length > 0, | |
| `Generator index ${index} produced no articles`, | |
| ).toBe(true); | |
| }); | |
| const batch = results.map(r => { |
There was a problem hiding this comment.
Fixed in commit 2250320. Replaced the filter + successful.length > 0 pattern with explicit assertions: expect(results).toHaveLength(8), per-generator expect(r.success).toBe(true), and expect(r.articles && r.articles.length > 0).toBe(true) before building the batch.
|
@copilot apply changes based on the comments in this thread and TypeScript & JavaScript Testing / Unit Tests (Vitest) (pull_request) |
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
… all 8 generators succeed Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Applied in commit |
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
tests/pipeline/news-generation-e2e.test.ts:305
- This test file hardcodes the full 14-language list in
ALL_14_LANGSeven though the template layer already exposes a single source of truth (ALL_LANG_CODES). To avoid the E2E suite drifting if languages are added/removed/reordered, consider importing and usingALL_LANG_CODESdirectly instead of duplicating the list here.
// All 14 supported language codes (mirrors ALL_LANG_CODES from constants)
// ---------------------------------------------------------------------------
const ALL_14_LANGS: readonly Language[] = [
'en', 'sv', 'da', 'no', 'fi', 'de', 'fr', 'es', 'nl', 'ar', 'he', 'ja', 'ko', 'zh',
];
| ); | ||
| expect(html).toContain('<!DOCTYPE html>'); |
There was a problem hiding this comment.
The test name says it validates handling of HTML entities/special characters, but it doesn’t assert that potentially unsafe characters (like </>) are escaped in the rendered HTML. With the current inputs (subtitle contains <key>), the test will still pass even if the template outputs raw HTML, so it won’t catch XSS/markup-injection regressions. Consider asserting that <key> is rendered as <key> (or, if raw HTML is intended, rename the test and avoid implying escaping).
| ); | |
| expect(html).toContain('<!DOCTYPE html>'); | |
| ); | |
| // Basic HTML structure check | |
| expect(html).toContain('<!DOCTYPE html>'); | |
| // Ensure potentially unsafe characters are HTML-escaped in the rendered output. | |
| // The raw `<key>` sequence should not appear, but its escaped form should. | |
| expect(html).not.toContain('Analysis of <key> policy areas & legislative priorities'); | |
| expect(html).toContain('Analysis of <key> policy areas & legislative priorities'); | |
| // JSON-LD blocks should still be valid |
There was a problem hiding this comment.
Fixed in commit 921a709. Renamed the test to "generates valid HTML when title and subtitle contain special characters" (removes the misleading "HTML entities" implication). Removed <key> from the subtitle (the template renders the lede raw by design — escaping is applied to meta tags and JSON-LD, as covered by article-template.test.ts:511-522). Added expect(html).toContain('Budget & Defence') to assert that & IS escaped in the meta/JSON-LD context.
|
@copilot apply changes based on the comments in this thread and TypeScript & JavaScript Testing / Unit Tests (Vitest) (pull_request) |
…pecial-chars e2e test Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Fixed in commit
|
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
tests/pipeline/news-generation-e2e.test.ts(42 tests)no → nb) in template.tshreflangCode()helper in helpers.tsvalidateArticleBatchtest (length guard + per-item assertions)date→datum,title→rubrik, addtid)expect(results).toHaveLength(articles.length)guard (line 318)tests/fix-article-navigation.test.ts— mapno → nbin hreflang assertion and updatebuildCompleteArticlefixture tohreflang="nb"for Norwegian<key>from subtitle; assert&in meta/JSON-LD contextOriginal prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.