HDDS-14794. Design doc for Incremental container replication #9913
echonesis wants to merge 2 commits into apache:master
I didn't read this in detail, just quickly skimmed it, but I believe the Container Reconciler largely solves this problem, or if it doesn't solve it completely, it could easily be extended to do so. Have you seen the design and implementation of the reconciler?
@echonesis Thanks for raising this patch and @sodonnel for pointing to the container reconciler patch. This is just an idea to allow QUASI_CLOSED container replicas with a lower sequence ID to catch up with the replica that has a higher sequence ID. I am raising this in the context of a multi-DC stretch cluster setup (as opposed to cross-region DCs), where a pipeline has a main pipeline of 3 replicas in the main DC plus one replica in the target DC that uses incremental block replication and listens to the main pipeline regardless of whether the Ratis group has been closed. @echonesis Let me think about this first.
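As a rough illustration of the catch-up idea above, here is a minimal Python sketch. It is not Ozone code: the `Replica` class, its fields, and `plan_catch_up` are all hypothetical names, and it assumes only that each replica reports a state and a sequence ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    # Hypothetical model of a container replica; not an Ozone class.
    datanode: str
    state: str
    sequence_id: int


def plan_catch_up(replicas):
    """Pick the replica with the highest sequence ID as the source and
    list the QUASI_CLOSED replicas that are behind it."""
    if not replicas:
        return None, []
    # max() returns the first replica among those tied for the highest ID.
    source = max(replicas, key=lambda r: r.sequence_id)
    lagging = [
        r for r in replicas
        if r.state == "QUASI_CLOSED" and r.sequence_id < source.sequence_id
    ]
    return source, lagging
```

Each replica in `lagging` would then need only the data written after its own sequence ID from `source`, rather than a full copy of the container.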
The idea of the reconciler is to take unhealthy replicas (e.g. those with block corruptions) and fix the unhealthy state without replication from RM. In the initial version I believe it does this as part of the container scanner, but the idea was to extend it to be triggered by RM when RM detects something like a QUASI_CLOSED or unhealthy replica, so it can try to repair them rather than do a full replication. This sounds a lot like what you want to do, so it would be really great if you could look into building on the work already started on the reconciler.
What changes were proposed in this pull request?
Incremental container replication design doc
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14794
How was this patch tested?
NA