DB corruption upon restart #598

@Resousse

Description

Inside a Percona pg-db pod, the database container restarted cleanly, but after the restart the database was corrupted. All my tables are encrypted with pg_tde.

Expected Results

A REINDEX resolves the issue, but since this has occurred 3-5 times in the past week, I suspect it can happen on any restart of the DB.

Actual Results

After the restart, queries fail with the following error:

tenants=# select * from sessions;
ERROR:  invalid page in block 0 of relation "base/16593/17596"
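The path in this error has the form base/<database OID>/<relfilenode>, so it can be mapped back to a database and a relation, which helps confirm whether the damaged file is an index (where a REINDEX would be expected to help) or a table. Below is a minimal sketch, assuming the relation lives in the database's default tablespace (which the base/ prefix suggests). The helper name lookup_queries is hypothetical; pg_filenode_relation and the pg_database and pg_class catalogs are standard PostgreSQL.

```python
import re

# Matches the storage path PostgreSQL prints for a relation in the default
# tablespace: base/<database OID>/<relfilenode>.
_REL_PATH = re.compile(r'relation "base/(\d+)/(\d+)"')


def lookup_queries(error_line: str) -> dict:
    """Build catalog queries that identify the relation named in the error.

    Hypothetical helper for illustration; it only builds SQL text and does
    not connect to the database.
    """
    match = _REL_PATH.search(error_line)
    if match is None:
        raise ValueError("no base/<db>/<filenode> path found in error line")
    db_oid, filenode = (int(group) for group in match.groups())
    return {
        # Which database the file belongs to (can be run from any database).
        "database": f"SELECT datname FROM pg_database WHERE oid = {db_oid};",
        # Which relation owns the file, and whether it is a table ('r') or an
        # index ('i'). Run inside that database; tablespace 0 means the
        # database's default tablespace (an assumption here).
        "relation": (
            "SELECT c.oid::regclass AS relation, c.relkind "
            "FROM pg_class c "
            f"WHERE c.oid = pg_filenode_relation(0, {filenode});"
        ),
    }


if __name__ == "__main__":
    line = 'ERROR:  invalid page in block 0 of relation "base/16593/17596"'
    for name, sql in lookup_queries(line).items():
        print(f"-- {name}\n{sql}")
```

The generated statements can be pasted into psql. If relkind comes back as 'i', the damaged file is an index; if it is 'r', the table heap itself is affected.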

Version

Percona PostgreSQL 18 installed via Helm charts (both pg-db and pg-operator at 2.8.2). I don't know which version of pg_tde is installed.
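The installed pg_tde version can usually be read from the standard pg_extension catalog. Below is a minimal sketch, assuming a DB-API cursor (for example from psycopg2, which already appears in the logs) connected to the affected database. The function name extension_version is hypothetical.

```python
# Reads an installed extension's version from the pg_extension catalog.
VERSION_SQL = "SELECT extversion FROM pg_extension WHERE extname = %s;"


def extension_version(cursor, name="pg_tde"):
    """Return the installed version of an extension, or None if it is absent.

    `cursor` is assumed to be any DB-API cursor using the %s parameter style
    (psycopg2 does); this helper is illustrative, not part of pg_tde.
    """
    cursor.execute(VERSION_SQL, (name,))
    row = cursor.fetchone()
    return row[0] if row else None
```

The same query can be run directly in psql by replacing %s with the literal 'pg_tde'.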

Steps to reproduce

Not always reproducible, but a single restart of the DB with some writes in it can trigger it.

Relevant logs

Log viewer timestamps 2026-01-26 11:30:00.294 to 11:30:00.297, entries shown oldest first:

2026-01-26 10:29:59,475 ERROR: Can not fetch local timeline and lsn from replication connection
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/patroni/postgresql/__init__.py", line 1100, in get_replica_timeline
    with self.get_replication_connection_cursor(**self.config.local_replication_address) as cur:
  File "/usr/lib64/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/usr/lib/python3.9/site-packages/patroni/postgresql/__init__.py", line 1095, in get_replication_connection_cursor
    with get_connection_cursor(**conn_kwargs) as cur:
  File "/usr/lib64/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/usr/lib/python3.9/site-packages/patroni/postgresql/connection.py", line 158, in get_connection_cursor
    conn = psycopg.connect(**kwargs)
  File "/usr/lib/python3.9/site-packages/patroni/psycopg.py", line 136, in connect
    ret = _connect(*args, **kwargs)
  File "/usr/lib64/python3.9/site-packages/psycopg2/__init__.py", line 127, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "localhost" (::1), port 5432 failed: timeout expired
connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL:  the database system is shutting down

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Metadata

Labels

bug (Something isn't working)
