Fix Device UUID alignment between ConfigDB and Sparkplug Instance_UUID #627

Merged
AlexGodbehere merged 17 commits into main from fix-uuids on Apr 9, 2026

AlexGodbehere commented Apr 8, 2026

Summary

  • Helm pre-upgrade backup hook: runs pg_dump of the ConfigDB before every helm upgrade, stores timestamped backups in a PVC, keeps last 5. Aborts the upgrade if the backup fails. Configurable via configdb.backup.* in values.yaml (default: enabled, 2Gi storage).
  • ConfigDB v13 migration: aligns existing Sparkplug Device object UUIDs with their Instance_UUID from the origin map. Includes a pre-flight check for duplicate Instance_UUIDs that fails cleanly and allows retry after fixing. Also wires in the previously un-wired v12 migration.
  • Admin UI fix: prepareModelForSaving() in OriginMapEditor.vue now uses the device's ConfigDB object UUID as the top-level Instance_UUID instead of generating a random UUID. This prevents Instance_UUID regeneration on schema change and ensures new devices are aligned from creation.
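
The v13 migration itself is SQL (v13.sql) and is not reproduced in this conversation. As a rough illustration of the logic described above, the following Python sketch checks for duplicate Instance_UUIDs before planning any change, then lists the devices whose object UUID differs from their Instance_UUID. The `Device` shape and the function names are hypothetical and not taken from the ConfigDB code.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    # Hypothetical shape: the real ConfigDB schema is not shown in this PR.
    object_id: int
    object_uuid: str
    instance_uuid: str  # top-level Instance_UUID from the origin map


def find_duplicate_instance_uuids(devices):
    """Return Instance_UUIDs claimed by more than one device, sorted."""
    counts = Counter(d.instance_uuid for d in devices)
    return sorted(u for u, n in counts.items() if n > 1)


def plan_alignment(devices):
    """Return (object_id, old_uuid, new_uuid) for each device needing a change.

    Raises ValueError before planning anything if duplicates exist, so a
    failed run changes nothing and can be retried once the data is fixed.
    """
    duplicates = find_duplicate_instance_uuids(devices)
    if duplicates:
        raise ValueError(f"duplicate Instance_UUIDs: {duplicates}")
    return [
        (d.object_id, d.object_uuid, d.instance_uuid)
        for d in devices
        if d.object_uuid != d.instance_uuid
    ]
```

Running the check before any write is what makes the failure clean: either every device can be aligned, or nothing is touched.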

Test Plan

  • Run helm template and verify pre-upgrade-backup Job and PVC render correctly
  • Build dev release and install on development cluster
  • Deploy to a test cluster, run helm upgrade, verify a backup appears in the configdb-backups PVC
  • Verify a second upgrade rotates backups correctly (keeps last 5)
  • Create a new device in the admin UI, save the origin map, verify Instance_UUID in ConfigDB config matches the device's object UUID
  • Change a device schema, reconfigure, save — verify Instance_UUID is unchanged
  • On a test cluster with existing devices, verify the v13 migration runs and aligns device UUIDs without data loss
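
When checking the rotation step by hand, it can help to compute which files should have been removed. This is a minimal sketch that assumes a backup naming scheme of `configdb-YYYYmmddTHHMMSS.sql`; that scheme is an assumption made for illustration, and the hook's actual file names are not shown in this PR.

```python
from datetime import datetime

KEEP = 5  # the PR description says the last 5 backups are kept


def backups_to_delete(filenames, keep=KEEP):
    """Return the backup filenames that fall outside the newest `keep`."""

    def stamp(name):
        # Assumed naming scheme (hypothetical): configdb-YYYYmmddTHHMMSS.sql
        return datetime.strptime(name, "configdb-%Y%m%dT%H%M%S.sql")

    newest_first = sorted(filenames, key=stamp, reverse=True)
    return newest_first[keep:]
```

Given a listing of the configdb-backups PVC, an empty result means the rotation has already left at most five backups in place.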

AlexGodbehere marked this pull request as draft April 8, 2026 13:36

AlexGodbehere commented Apr 8, 2026

Testing on dev cluster in progress...

Before

image image

During

Kerberos initialization for op1pgadmin@REALM
Starting migration...
Dropping existing permissions for "sv1configdb"
psql:/home/node/app/sql/migration.sql:8: NOTICE:  relation "version" already exists, skipping
psql:/home/node/app/sql/v13.sql:96: NOTICE:  Migrating database schema to version 13
psql:/home/node/app/sql/v13.sql:96: NOTICE:  Aligning Device 476: object UUID c631cfe0-7b02-4462-8cde-21da34d78063 -> Instance_UUID 567b1f23-7356-429c-9d92-16a395d94c7b
psql:/home/node/app/sql/v13.sql:96: NOTICE:  v13 migration complete: Sparkplug Device UUIDs aligned with Instance_UUIDs.
Migration complete.
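
On a cluster with many devices, the alignment notices can be collected from the migration output rather than read by eye. The sketch below assumes the NOTICE format shown in the log above; the function name is hypothetical.

```python
import re

# Matches lines such as:
#   ... NOTICE:  Aligning Device 476: object UUID <old> -> Instance_UUID <new>
ALIGN_RE = re.compile(
    r"NOTICE:\s+Aligning Device (?P<id>\d+): "
    r"object UUID (?P<old>[0-9a-f-]{36}) -> "
    r"Instance_UUID (?P<new>[0-9a-f-]{36})"
)


def aligned_devices(log_text):
    """Return (device_id, old_uuid, new_uuid) for each alignment notice."""
    return [
        (int(m["id"]), m["old"], m["new"])
        for m in ALIGN_RE.finditer(log_text)
    ]
```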

After

image

Notes

Although the upgrade appears to have changed the UUID, I'm seeing intermittent connectivity issues with Grafana and devices dropping. I need to investigate the logs on fpd-ago.

@amrc-benmorrow FYI.

Update:
Restarting all services (including edge services) seemed to solve this.

amrc-benmorrow previously approved these changes Apr 8, 2026

LGTM assuming it tests OK

The PVC must be a hook resource so it exists before the Job runs.
No delete policy on the PVC so it persists across upgrades.

The backup PVC is now a regular chart resource so it persists
across upgrades. The hook Job uses Helm lookup to skip gracefully
if the PVC doesn't exist yet (first install or upgrade from old version).
AlexGodbehere marked this pull request as ready for review April 9, 2026 09:25
AlexGodbehere merged commit f46612c into main Apr 9, 2026
1 check passed
AlexGodbehere deleted the fix-uuids branch April 9, 2026 09:35