What happened?
Problem
When a Redis pod restarts or is rescheduled in Kubernetes (and receives a new IP), LibreChat keeps attempting to connect to the old, cached IP address indefinitely instead of re-resolving the DNS hostname.
Evidence
- Redis pod moved from one node to another.
- LibreChat attempted connections to 172.20.201.154:6379 for 51+ minutes straight.
- 167 failed connection attempts, all targeting the same stale IP.
- A manual LibreChat restart resolved the issue immediately.
Log examples:
14:00:08 → "connect ECONNREFUSED 172.20.201.154:6379"
14:51:30 → "connect ECONNREFUSED 172.20.201.154:6379" // Still same IP
Theory
The Redis client caches the resolved IP address on initial connection and never re-performs DNS lookups during reconnection attempts.
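A minimal sketch of the suspected failure mode (hypothetical illustration, not LibreChat's actual client code): the hostname is resolved exactly once, and every retry reuses the raw IP.

const dns = require('dns').promises;
const net = require('net');

// Hypothetical: resolve the hostname once, then loop retries against the cached IP.
async function connectWithCachedIp(hostname, port) {
  const { address } = await dns.lookup(hostname); // resolved exactly once
  const attempt = () => {
    const socket = net.createConnection({ host: address, port }); // raw IP, no hostname
    socket.on('error', () => setTimeout(attempt, 5000)); // retries never re-resolve
  };
  attempt();
}

If the pod behind the hostname moves, address stays pinned to the dead IP (here, 172.20.201.154) until the process restarts, which matches the observed behavior.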
Proposed Solution
Add a DNS re-resolution handler on the Redis client's connection-loss ('error') event:
const { createClient } = require('redis');

let redisClient = createClient({ url: process.env.REDIS_URL });

redisClient.on('error', async (err) => {
  if (err.code === 'ECONNREFUSED' || err.code === 'EHOSTUNREACH') {
    // Trigger a DNS re-lookup by recreating the client:
    // a fresh client resolves the hostname from scratch.
    await redisClient.disconnect();
    redisClient = createClient({ url: process.env.REDIS_URL });
    await redisClient.connect();
  }
});
The connection-loss handler should trigger a fresh DNS lookup of the original hostname before attempting to reconnect, rather than reusing the cached IP.
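A sketch of that idea with Node's built-in dns module (hostnameMoved is an illustrative name, not an existing LibreChat helper):

const dns = require('dns').promises;

let lastResolvedIp = null;

// Illustrative: re-resolve the hostname and report whether its address
// changed since the last lookup.
async function hostnameMoved(hostname) {
  const { address } = await dns.lookup(hostname);
  const moved = lastResolvedIp !== null && address !== lastResolvedIp;
  lastResolvedIp = address;
  return moved; // true → the pod got a new IP; recreate the client
}

Gating the client rebuild on an actual address change avoids tearing down the connection for transient refusals where the IP is still correct.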
Impact
- Affects all session management (SAML, OpenID)
Reproduction
- Deploy LibreChat + Redis in K8s
- Delete Redis pod:
kubectl delete pod redis-xxx
- LibreChat keeps retrying the old IP indefinitely (see the verification note after this list)
- Restart LibreChat → works immediately
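To confirm the IP is stale rather than Redis being down, compare the address in the logs with the live cluster state: kubectl get pod -o wide shows the new pod IP, and kubectl get endpoints <redis-service> (the Service name depends on your deployment) lists the pod IPs currently backing the Service, which is what a headless Service hostname resolves to.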
Expected: LibreChat should automatically recover by re-resolving DNS during reconnection attempts.
Version Information
v0.8.1
Steps to Reproduce
- Deploy LibreChat + Redis in K8s
- Delete Redis pod: kubectl delete pod redis-xxx
- LibreChat keeps retrying the old IP indefinitely
- Restart LibreChat → works immediately
What browsers are you seeing the problem on?
No response
Relevant log output
14:00:08 → "connect ECONNREFUSED 172.20.201.154:6379"
14:51:30 → "connect ECONNREFUSED 172.20.201.154:6379" // Still same IP
Screenshots
No response
Code of Conduct
- I agree to follow this project's Code of Conduct