Add retry logic to role synchronization #3100

Open
jorsol wants to merge 1 commit into zalando:master from jorsol:fix-sync-roles-recovery

@jorsol jorsol commented Jun 4, 2026

Contributor

Prevents syncRoles() from failing during cluster startup, crash recovery, or primary failover windows.

While a PostgreSQL instance is starting up or replaying WAL, it accepts connections but rejects write operations with a 25006 (read_only_sql_transaction) error. Because role syncing runs inside a reconciliation loop, failing outright during this phase creates unnecessary error noise.

This adds retry logic around c.userSyncStrategy.ExecuteSyncRequests so that it does not fail immediately.

Closes #3099

@FxKu FxKu added this to the 2.0.0 milestone Jun 4, 2026
@FxKu FxKu added the minor label Jun 4, 2026
@FxKu FxKu modified the milestones: 2.0.0, wishlist Jun 11, 2026

FxKu commented Jun 11, 2026

Member

Syncing roles is not the only time write operations happen. How about implementing retry logic around the db.Exec(query) calls in the users.go file, as you mentioned in your issue? Search for:

retryutil.Retry(
			constants.PostgresConnectTimeout,
			constants.PostgresConnectRetryTimeout,
			func() (bool, error) {
			...

to see how it was done in other places.

@FxKu FxKu modified the milestones: wishlist, 2.0.0 Jun 11, 2026
@FxKu FxKu moved this to Open Questions in Postgres Operator Jun 11, 2026
@FxKu FxKu moved this from Open Questions to WIP / currently reviewed in Postgres Operator Jun 11, 2026
@jorsol jorsol force-pushed the fix-sync-roles-recovery branch from 2cb0f47 to dc147e3 Compare June 11, 2026 14:47

jorsol commented Jun 11, 2026

Contributor Author

Hi @FxKu, can you review again?

@jorsol jorsol changed the title fix: skip role sync when database is in recovery mode Add retry logic to role synchronization Jun 11, 2026
@FxKu FxKu modified the milestones: 2.0.0, wishlist Jun 22, 2026
Labels

Projects

Status: WIP / currently reviewed

Development

Successfully merging this pull request may close these issues.

Race-condition when doing syncRoles()

2 participants