feat(etl): normalize genre on entity_manager track writes #367

Merged: dylanjeffers merged 2 commits into main from feat/genre-normalization on Jun 18, 2026

Conversation

@dylanjeffers (Contributor)

Summary

Adds genre normalization to the go-openaudio entity_manager ETL write path. This is intended as the canonical source of truth for genre normalization across the Audius platform.

NormalizeGenre(genre string) string (in pkg/etl/processors/entity_manager/genre_normalize.go):

  • Trims and collapses whitespace; matches case-insensitively.
  • Maps known community variants to their canonical GenreAllowlist form, e.g.:
    • hip-hop, hiphop, hip hop, hip hop/rap, rap → Hip-Hop/Rap
    • r&b, rnb, r and b, r&b/soul → R&B/Soul
    • drum & bass, drum and bass, dnb → Drum & Bass
    • edm, electronic dance music → Electronic
    • lo-fi, lofi, lo fi → Lo-Fi
  • Already-canonical allowlist entries round-trip regardless of input case.
  • Anything unrecognized falls back to a title-cased baseline.

The variant table is seeded from GenreAllowlist itself, so canonical spellings are always valid targets. GenreAllowlist is left unchanged.
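
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in genre_normalize.go: genreAllowlist here is a small hypothetical subset of the real GenreAllowlist, communityVariants holds only the mappings listed in this description, and buildVariantTable and titleCase are assumed helper names. The title-casing rule for unrecognized input (first letter of each space-separated word upper-cased, the rest lower-cased) is also an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// genreAllowlist is an illustrative subset only; the real GenreAllowlist
// is larger and is not reproduced here.
var genreAllowlist = []string{
	"Electronic",
	"Hip-Hop/Rap",
	"R&B/Soul",
	"Drum & Bass",
	"Lo-Fi",
}

// communityVariants maps lower-cased community spellings to canonical
// forms. Entries are the ones listed in the PR description.
var communityVariants = map[string]string{
	"hip-hop": "Hip-Hop/Rap", "hiphop": "Hip-Hop/Rap", "hip hop": "Hip-Hop/Rap",
	"hip hop/rap": "Hip-Hop/Rap", "rap": "Hip-Hop/Rap",
	"r&b": "R&B/Soul", "rnb": "R&B/Soul", "r and b": "R&B/Soul",
	"drum and bass": "Drum & Bass", "dnb": "Drum & Bass",
	"edm": "Electronic", "electronic dance music": "Electronic",
	"lofi": "Lo-Fi", "lo fi": "Lo-Fi",
}

var variantTable = buildVariantTable()

// buildVariantTable seeds the lookup from the allowlist first, so every
// canonical spelling maps to itself, then layers the variants on top.
func buildVariantTable() map[string]string {
	t := make(map[string]string, len(genreAllowlist)+len(communityVariants))
	for _, g := range genreAllowlist {
		t[strings.ToLower(g)] = g
	}
	for variant, canonical := range communityVariants {
		t[variant] = canonical
	}
	return t
}

// titleCase upper-cases the first rune of each word and lower-cases the rest.
func titleCase(s string) string {
	words := strings.Fields(s)
	for i, w := range words {
		r := []rune(strings.ToLower(w))
		r[0] = unicode.ToUpper(r[0])
		words[i] = string(r)
	}
	return strings.Join(words, " ")
}

// NormalizeGenre trims and collapses whitespace, looks the result up
// case-insensitively, and falls back to title case when unrecognized.
func NormalizeGenre(genre string) string {
	collapsed := strings.Join(strings.Fields(genre), " ")
	if canonical, ok := variantTable[strings.ToLower(collapsed)]; ok {
		return canonical
	}
	return titleCase(collapsed)
}

func main() {
	for _, in := range []string{"  hip   hop ", "RNB", "lo-fi", "vaporwave"} {
		fmt.Printf("%q -> %q\n", in, NormalizeGenre(in))
	}
}
```

Seeding the table from the allowlist is what makes canonical entries round-trip regardless of input case: "lo-fi" resolves through the seeded entry rather than an explicit variant row.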

Wiring

NormalizeGenre is called from the track create and update validate paths (track_create.go, track_update.go), normalizing params.Metadata["genre"] in place so the canonical value is what gets inserted/merged into PostgreSQL.
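
As a rough sketch of what "in place" means here: the validate step overwrites the genre key in the metadata map before anything is written. The trackParams type, the map[string]any shape of Metadata, and the normalizeGenreInPlace helper are all assumptions made for illustration; the actual types in track_create.go and track_update.go may differ. normalizeGenre below is a placeholder that only collapses whitespace, standing in for the real NormalizeGenre.

```go
package main

import (
	"fmt"
	"strings"
)

// trackParams is a hypothetical stand-in for the params passed to the
// track create/update validate paths.
type trackParams struct {
	Metadata map[string]any
}

// normalizeGenre is a placeholder for NormalizeGenre; it only trims and
// collapses whitespace so this example stays self-contained.
func normalizeGenre(genre string) string {
	return strings.Join(strings.Fields(genre), " ")
}

// normalizeGenreInPlace rewrites Metadata["genre"] when the key is present
// and holds a string. Missing keys and non-string values are left untouched.
func normalizeGenreInPlace(p *trackParams) {
	raw, ok := p.Metadata["genre"]
	if !ok {
		return
	}
	genre, ok := raw.(string)
	if !ok {
		return
	}
	p.Metadata["genre"] = normalizeGenre(genre)
}

func main() {
	p := &trackParams{Metadata: map[string]any{"title": "Demo", "genre": "  Drum   &  Bass "}}
	normalizeGenreInPlace(p)
	fmt.Printf("%q\n", p.Metadata["genre"])
}
```

Leaving absent or non-string values alone keeps an update that does not touch genre from gaining a spurious key; whether the real validate paths behave this way is not stated in this PR.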

Notes

  • No backfill — normalization applies to new writes only.
  • Unit tests cover the key variant mappings plus a guard that every variant target is a real GenreAllowlist entry. DB-backed integration tests were skipped because Docker was unavailable in this environment.
  • go build ./..., go vet, and go test -run TestNormalize all pass.
  • Display-layer normalization for /v1/genres/popular is handled separately in AudiusProject/api#962 ("feat: normalize genre variants to a canonical form"), which currently uses different canonical forms; the two are to be aligned in a follow-up.
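
The guard mentioned in the notes (every variant target must be a real GenreAllowlist entry) could look something like the check below. This is a sketch under assumptions: invalidTargets is a hypothetical helper name, and the sample tables in main are made up to show a failing case. The actual unit test in this PR is not shown here.

```go
package main

import (
	"fmt"
	"sort"
)

// invalidTargets returns, sorted and de-duplicated, every canonical target
// in variants that is not present in allowlist.
func invalidTargets(variants map[string]string, allowlist []string) []string {
	allowed := make(map[string]struct{}, len(allowlist))
	for _, g := range allowlist {
		allowed[g] = struct{}{}
	}
	seen := make(map[string]struct{})
	var bad []string
	for _, target := range variants {
		if _, ok := allowed[target]; ok {
			continue
		}
		if _, dup := seen[target]; dup {
			continue
		}
		seen[target] = struct{}{}
		bad = append(bad, target)
	}
	sort.Strings(bad)
	return bad
}

func main() {
	allowlist := []string{"Hip-Hop/Rap", "Lo-Fi"}
	variants := map[string]string{
		"hiphop": "Hip-Hop/Rap",
		"lofi":   "Lo-Fi",
		"rap":    "Hip Hop", // deliberately wrong: not an allowlist entry
	}
	fmt.Println(invalidTargets(variants, allowlist))
}
```

In a unit test, the same helper would be asserted to return an empty slice for the real variant table, so a typo in a canonical target fails the build rather than reaching PostgreSQL.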

🤖 Generated with Claude Code

dylanjeffers and others added 2 commits June 9, 2026 14:16
Exposes three new JSON endpoints from the explorer console layer:

  GET /api/v1/events               - paginated list of active events
  GET /api/v1/events/:id           - single event by numeric ID
  GET /api/v1/events/:handle/:slug - resolve contest permalink to event

Each response includes a permalink field constructed as
/<handle>/contest/<slug> via a LEFT JOIN on the event_routes table
(added in migration 0034) and the users table for the owner handle.

New files:
  pkg/etl/db/event_reads.go  - hand-written db.Queries methods (GetEventByID,
                                GetEventBySlug, ListEvents) using the same
                                sqlc pattern as reads.sql.go
  pkg/explorer/event_api.go  - Echo handlers + eventResponse JSON type

All existing tests pass; go build ./... is clean.

Add NormalizeGenre, the canonical genre-normalization function for the
Audius platform. It maps known community variants (hiphop, r and b, dnb,
edm, lofi, etc.) to their canonical GenreAllowlist form and applies a
title-cased baseline to anything unrecognized. Matching is
case-insensitive with whitespace trimmed/collapsed.

Wired into the track create and update validate paths so every genre
written to PostgreSQL is normalized in place. No backfill — new writes
only. GenreAllowlist is used as the canonical target set and left
unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dylanjeffers dylanjeffers merged commit 5aa118b into main Jun 18, 2026
5 checks passed
@dylanjeffers dylanjeffers deleted the feat/genre-normalization branch June 18, 2026 01:26