Status: draft
Owner: @jdevalk/emdash-plugin-seo
Depends on: @jdevalk/astro-seo-graph (new dep) — see seo-graph/packages/astro-seo-graph/SPEC-alternates.md for the shared helper
New files: src/hreflang.ts, src/urls.ts, test/hreflang.test.ts
Modified files: src/metadata.ts, src/canonical.ts, package.json
EmDash supports multilingual content via Astro's i18n config and a translation_group ULID shared across locale siblings (core migration 019_i18n.ts). It exposes getTranslations(type, id) and getI18nConfig() in the public emdash API, and it accepts <link rel="alternate" hreflang="…"> contributions through the page:metadata hook.
Nothing in EmDash core emits those tags automatically. A multilingual EmDash site ships without hreflang annotations unless the site author writes them by hand in a layout.
This plugin already owns page-level SEO metadata for EmDash (canonical, OG, JSON-LD via src/metadata.ts). Hreflang is in scope.
When seoPlugin is installed on a site with Astro i18n configured and content that uses translation_group, emit a complete, correct set of <link rel="alternate" hreflang="…"> tags in <head> for every content page that has translations, including an x-default entry and a self-referential entry for the current page.
- Rendering logic. URL normalization, BCP 47 tag casing,
x-defaultselection, dedup, and absolute-URL enforcement live in@jdevalk/astro-seo-graph'sbuildAlternateLinkshelper. This plugin is a thin adapter. - Custom (non-content) pages. They aren't in a
translation_group. Site owners own their routing. - Sitemap hreflang. Separate concern.
- Emitting hreflang for draft/scheduled/unpublished translations. Only
status === "published"siblings count. - BCP 47 region-tag distinction (
frvsfr-CA) via the Astro object-form locale. See "Region tag limitation" below.
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ emdash-plugin-seo │ │ astro-seo-graph │
│ │ │ │
│ hreflang.ts │ │ alternates.ts │
│ - generateHreflang() │──────▶│ - buildAlternateLinks() │
│ - calls getTranslations() │ │ (pure, normalizes tags, │
│ - calls getI18nConfig() │ │ adds x-default, │
│ - builds per-locale URLs │ │ validates URLs) │
│ via buildPageUrl() │ │ │
│ - feeds to helper │ │ │
│ - wraps output as │ │ │
│ PageMetadataContribution │ │ │
│ │ │ │
│ urls.ts │ │ │
│ - buildPageUrl() │ │ │
│ (shared by canonical.ts) │ │ │
└──────────────────────────────┘ └──────────────────────────────┘
The plugin's responsibility is data sourcing and URL construction. The helper's responsibility is normalization, validation, and x-default. Don't duplicate rules across the boundary.
Emit hreflang contributions only when all of the following hold:
isI18nEnabled()returnstrue(there's more than one locale configured).event.page.kind === "content"andevent.page.contentis defined.getTranslations(collection, id)returns at least two published siblings (including the current entry). The shared helper also short-circuits onentries.length < 2, but gating here avoids the DB round-trip on single-locale pages.- The current page itself is
status === "published". Don't leak alternate URLs for draft previews.
If any condition fails, return []. Do not log, do not warn — multilingual-off is the common case.
From getTranslations():
- Keep only
status === "published". - Keep only entries whose
localeis present ini18nConfig.locales. Defensive: drop siblings whose stored locale is no longer in the active Astro config. - Keep only entries with a non-null
slug. - Always include the current page itself. Self-referential hreflang is required by Google.
- Verify the current entry is in the returned list. If not, that signals a translation-group data integrity issue — return
[]rather than emitting partial annotations.
Extract a buildPageUrl helper into src/urls.ts. Signature:
import type { I18nConfig } from "emdash";
export function buildPageUrl(input: {
collection: string;
locale: string;
slug: string;
siteUrl: string; // from ctx.site.url
cfg: I18nConfig; // from getI18nConfig()
urlPattern: string; // collection's urlPattern, e.g. "/{slug}" or "/blog/{slug}"
}): string;Rules:
- Substitute
{slug}intourlPatternto get the locale-less path (e.g./blog/hello/). - Prepend the locale prefix:
- If
locale === cfg.defaultLocaleandcfg.prefixDefaultLocale === false: no prefix. The path stays at/blog/hello/. - Otherwise: prefix is
/{locale}. Result:/fr/blog/bonjour/.
- If
- Normalize: lowercase, collapse duplicate slashes, enforce trailing slash. This must match
canonical.ts's existing normalization exactly — that's why it's a shared helper, not two copies. - Combine with
siteUrl(strip trailing slash from siteUrl first) to produce an absolute URL.
canonical.ts must be refactored to call buildPageUrl too. The current canonical.ts (src/canonical.ts:29-42) reimplements URL building from page.url. After the refactor, both canonical and hreflang produce byte-identical URLs for any (collection, locale, slug) triple. This is the single most important invariant — Google silently invalidates hreflang annotations where the hreflang URL doesn't match the target page's canonical.
Lock this with a test (see §6).
With the filtered sibling list and URLs in hand:
import { buildAlternateLinks } from "@jdevalk/astro-seo-graph";
import type { PageMetadataContribution, PublicPageContext, PluginContext } from "emdash";
import { getI18nConfig, isI18nEnabled, getTranslations } from "emdash";
import { buildPageUrl } from "./urls.js";
export async function generateHreflang(
page: PublicPageContext,
ctx: PluginContext,
siteUrl: string,
): Promise<PageMetadataContribution[]> {
if (!isI18nEnabled()) return [];
if (page.kind !== "content" || !page.content) return [];
const cfg = getI18nConfig();
if (!cfg) return [];
let result;
try {
result = await getTranslations(page.content.collection, page.content.id);
} catch (error) {
ctx.logger?.warn?.("[seo] getTranslations failed", { error });
return [];
}
if (result.error || result.translations.length < 2) return [];
const currentInList = result.translations.some((t) => t.id === page.content!.id);
if (!currentInList) return [];
const urlPattern = await loadUrlPattern(ctx, page.content.collection);
if (!urlPattern) return [];
const entries = result.translations
.filter((t) => t.status === "published" && t.slug && cfg.locales.includes(t.locale))
.map((t) => ({
hreflang: t.locale,
href: buildPageUrl({
collection: page.content!.collection,
locale: t.locale,
slug: t.slug!,
siteUrl,
cfg,
urlPattern,
}),
}));
const alternates = buildAlternateLinks({
entries,
defaultLocale: cfg.defaultLocale,
});
return alternates.map((a) => ({
kind: "link" as const,
rel: "alternate" as const,
href: a.href,
hreflang: a.hreflang,
key: `hreflang:${a.hreflang}`,
}));
}The explicit key: "hreflang:..." prefix avoids collision with any future rel="alternate" contribution from another module (e.g. RSS discovery).
PublicPageContext.content exposes { collection, id, slug } but not urlPattern. The plugin needs to look it up. Options:
- Cache via
ctx.kv. Load all_emdash_collectionsrows once, cache by collection slug. Invalidate on schema changes (or just accept eventual consistency — urlPatterns change rarely). - Direct query via EmDash's public API. If EmDash exposes a
getCollectionMeta(slug)API, use it. Checkpackages/core/src/index.tsat implementation time. - Pass-through from page context. Propose adding
urlPatterntoPublicPageContext.contentin EmDash core. Cleanest long-term, but requires a core change.
v1 goes with the ctx.kv cache. Plugin-local, no core change required.
After the canonical contribution (current src/metadata.ts:44-47):
// 5b. hreflang alternates (multilingual content sites only)
const hreflangContributions = await generateHreflang(page, ctx, siteUrl);
contributions.push(...hreflangContributions);metadataHandler is already async. No changes to its signature.
getTranslations()throws or returns{ error }: swallow, log viactx.logger?.warn?., return[]. SEO metadata must never crash page rendering.buildAlternateLinksdrops all entries (malformed URLs): it returns[], so the plugin also returns[]. No action needed.- Missing
contenton the page context: return[]. - Current entry not in translation list: return
[]. urlPatternmissing for the collection: return[]. Log at debug — collections withouturlPatterngenuinely have no canonical URL, so there's nothing to emit.
Hreflang requires each sibling to point back with hreflang. Because getTranslations() returns the same group from any sibling, running this plugin on every page in the group automatically produces reciprocal annotations. Lock this with a reciprocity test (see §9).
No settings in v1. hreflang is either correct or wrong — there are no meaningful toggles. Site owners who want to disable it can disable the SEO plugin entirely.
Deferred: per-locale URL overrides for sites that rewrite URLs outside Astro routing. Not needed for any known EmDash use case.
EmDash cannot currently distinguish fr from fr-CA via Astro's object-form locale definition. The integration at packages/core/src/astro/integration/index.ts:180 flattens Astro's { path, codes } locale entries to just the path string:
locales: astroConfig.i18n.locales.map((l) => (typeof l === "string" ? l : l.path)),The codes array is dropped. So { path: "fr", codes: ["fr-CA", "fr-FR"] } arrives at the plugin as just "fr", and there is no round-trip path from Astro's codes back into hreflang output.
Sites needing region-specific hreflang should use the BCP 47 code as the locale path directly:
// astro.config.mjs
i18n: {
defaultLocale: "en",
locales: ["en", "fr-ca", "fr-fr"],
routing: { prefixDefaultLocale: false },
}- URLs become
/fr-ca/…and/fr-fr/…. - Stored
localeon content rows is"fr-ca"/"fr-fr". buildAlternateLinksnormalizes on output:fr-ca→fr-CAin the hreflang attribute.getTranslations()correctly treats them as separate translation groups.
Cost: the URL carries a visible region code. Fine for most sites.
Expand EmDash's virtual module shape to preserve codes:
locales: Array<{ path: string; codes?: readonly string[] }>Then resolve the emitted hreflang code per request from Accept-Language or a content-level override. This requires changes to:
packages/core/src/astro/integration/index.ts:178-184— preservecodes.packages/core/src/i18n/config.ts— widenI18nConfig.localestype.packages/core/src/astro/middleware/request-context.ts:73— optionally resolve region fromAccept-Language.- Admin manifest at
packages/core/src/emdash-runtime.ts:1301-1306— exposecodesfor the admin locale switcher.
Do not attempt this from the plugin side. File an EmDash discussion if a site owner asks for it.
File: test/hreflang.test.ts.
Mock getTranslations, isI18nEnabled, getI18nConfig, and buildPageUrl. Use a fake PluginContext and PublicPageContext.
- i18n disabled → returns
[], no DB call. page.kind === "custom"→[].- Missing
page.content→[]. getTranslationsthrows →[], warning logged.getTranslationsreturns{ error }→[].- Only one published sibling (self) →
[]. - Current entry not in returned translation list →
[]. - Three published siblings,
prefixDefaultLocale: false, defaulten→ 4 contributions (3 locales + x-default). Assert URLs, hreflang codes,keyvalues, all absolute. - Three published siblings,
prefixDefaultLocale: true→ default-locale URL also carries/en/prefix.x-defaultmatches theenURL. - Draft + published siblings → drafts excluded from entries.
- Sibling with null slug → excluded.
- Sibling with locale not in
cfg.locales→ excluded. urlPatternmissing for collection →[].- BCP 47 region tags (
fr-ca,fr-fr) as locale paths → hreflang emitted asfr-CA,fr-FR.
Set up three fake siblings in the same translation group (en/fr/nl), run generateHreflang as if each were the current page, assert:
- Each output contains the same set of URLs (ignoring order and x-default).
- Each output contains every locale's hreflang.
- Each output's
x-defaultURL matches across all three runs.
This is the real-world invariant search engines check.
For a given (collection, locale, slug), assert that the URL produced by generateCanonical (refactored to call buildPageUrl) and the URL produced by generateHreflang for that same sibling are byte-identical. Run for:
- Default locale without prefix.
- Default locale with prefix.
- Non-default locale.
- Collection with multi-segment
urlPattern(/blog/{slug}). - Collection with root
urlPattern(/{slug}).
This test is what prevents the most common hreflang bug in the wild.
Add to package.json:
"dependencies": {
"@jdevalk/astro-seo-graph": "^0.2.0",
"@jdevalk/seo-graph-core": "^0.1.0-alpha.0"
}@jdevalk/astro-seo-graph's main entry is pure TypeScript (astro is only a peer dep), so importing buildAlternateLinks does not drag in any Astro runtime. EmDash plugin consumers already have astro installed via EmDash itself, so the peer dep is satisfied transitively.
-
@jdevalk/astro-seo-graph@^0.2.0added as a dependency. -
src/urls.tsexists and exportsbuildPageUrl. -
src/canonical.tsrefactored to callbuildPageUrl(no behaviour change to existing canonical output — lock with snapshot tests first). -
src/hreflang.tsexists and exportsgenerateHreflang. -
src/metadata.tscallsgenerateHreflangand spreads the result after the canonical contribution. - All 14 unit tests above pass.
- Reciprocity test passes for a 3-locale fixture.
- URL-correctness test passes for all 5 configurations.
- Manual check on a demo site with
en/fr/nlcontent: view-source on/hello/shows 3rel="alternate"entries +x-default, all absolute, all pointing at URLs that return 200 on the site. - Plugin behaviour on a non-multilingual site is unchanged (zero hreflang tags emitted, zero extra DB queries — guarded by
isI18nEnabled()before anygetTranslationscall). - README gains a "hreflang" section documenting the
frvsfr-CAworkaround.
-
urlPatternaccess. Is there a cleaner path than caching viactx.kv? Investigate whether EmDash core exposes a public collection-metadata API at implementation time. If yes, use it. If no, thectx.kvcache is fine for v1. -
Refactor scope for
canonical.ts. The refactor to callbuildPageUrlis strictly necessary for correctness (canonical must equal hreflang target). Lock existing canonical output with a snapshot test before refactoring. If snapshots change, the existingcanonical.tsis subtly different from the new shared helper and we have a bug to file, not ignore. -
Should the plugin emit hreflang when there's only one published sibling after filtering out drafts? Current spec says no (short-circuit at count < 2). Alternative: still emit a self-referential hreflang for SEO consistency. Google's guidance is "only emit hreflang when there are real alternates", so
[]is correct, but worth confirming against a multilingual demo.