From 30d1d80ef8ac8375d112448730c9360c26e8aad2 Mon Sep 17 00:00:00 2001
From: schmidt-scaled
Date: Fri, 19 Jun 2026 16:18:56 +0300
Subject: [PATCH 1/2] docs(erasure-coding): replace per-scheme sections with overview table

Replace the six separate '### Scheme:' sections with a single overview
table (schema, tolerated node failures, data protection overhead, IOPS
r/w performance & latency, minimum nodes), and add the missing 1+0
schema. Minimum-node counts: 1+0=1, 1+1=3, 2+1=4, 4+1=6, 1+2=5, 2+2=6,
4+2=8.
---
 .../erasure-coding-scheme.md                  | 92 ++++---------------
 1 file changed, 18 insertions(+), 74 deletions(-)

diff --git a/docs/deployments/deployment-preparation/erasure-coding-scheme.md b/docs/deployments/deployment-preparation/erasure-coding-scheme.md
index 0ac1697a..da8465a8 100644
--- a/docs/deployments/deployment-preparation/erasure-coding-scheme.md
+++ b/docs/deployments/deployment-preparation/erasure-coding-scheme.md
@@ -6,7 +6,7 @@ weight: 30100
 
 Choosing the appropriate **erasure coding scheme** is crucial when deploying a simplyblock storage cluster, as it
 directly impacts **data redundancy, storage efficiency, and overall system performance**. Simplyblock currently supports
-the following erasure coding schemes: **1+1**, **2+1**, **4+1**, **1+2**, **2+2**, and **4+2**. Understanding the
+the following erasure coding schemes: **1+0**, **1+1**, **2+1**, **4+1**, **1+2**, **2+2**, and **4+2**. Understanding the
 trade-offs between redundancy and storage utilization will help determine the best option for your workload. All schemas
 have been performance-optimized by specialized algorithms. There is, however, a remaining capacity-to-performance
 trade-off.
@@ -22,79 +22,23 @@ allowing data recovery in case of hardware failures. The notation **k+m** repres
 
 If you need more information on erasure coding, see the dedicated concept page for [erasure coding](../../architecture/concepts/erasure-coding.md).
 
-### Scheme: 1+1
-
-- **Description:** In the _1+1 scheme_, data is mirrored, effectively creating an exact copy of every data block.
-- **Redundancy Level:** Can tolerate the failure of **one** storage node.
-- **Raw-to-Effective Ratio:** **200%**
-- **Available Storage Capacity:** **50%**
-- **Performance Considerations:** Offers **fast recovery and high read performance** due to data mirroring.
-- **Best Use Cases:**
-    - Workloads requiring **high availability and minimal recovery time**.
-    - Applications where **performance is prioritized over storage efficiency**.
-    - Requires 3 or more nodes for full redundancy.
-
-### Scheme: 2+1
-
-- **Description:** In the _2+1 scheme_, data is divided into two fragments with one parity fragment, offering a
-  balance between performance and storage efficiency.
-- **Redundancy Level:** Can tolerate the failure of **one** storage node.
-- **Raw-to-Effective Ratio:** **150%**
-- **Available Storage Capacity:** **66.6%**
-- **Performance Considerations:** For writes of 8K or higher, **lower write amplification** compared to **1+1**, as data is distributed across multiple nodes. This typically results in similar or higher IOPS. However, for small random writes (4K), the write performance is worse than **1+1**. Write latency is somewhat higher than with **1+1**. Read performance is similar to **1+1**, if local node affinity is disabled. With node affinity enabled, read performance is slightly worse (up to 25%). In a degraded state (one node offline / unavailable or failed disk), the performance is worse than with **1+1**. Recovery time to full redundancy from single disk error is slightly higher than with **1+1**.
-- **Best Use Cases:**
-    - Deployments where **storage efficiency is relevant** without significantly compromising performance.
-    - Requires 4 or more nodes for full redundancy.
-
-
-### Scheme: 4+1
-
-- **Description:** In the _4+1 scheme_, data is divided into four fragments with one parity fragment, offering
-  optimal storage efficiency.
-- **Redundancy Level:** Can tolerate the failure of **one** storage node.
-- **Raw-to-Effective Ratio:** **125%**
-- **Available Storage Capacity:** **80%**
-- **Performance Considerations:** For writes of 16K or higher, **lower write amplification** compared to **2+1**, as data is distributed across more nodes. This typically results in similar or higher write IOPS. However, for 4-8K random writes, the write performance is typically worse than **2+1**. Write latency is somewhat similar to **2+1**. Read performance is similar to **2+1**, if local node affinity is disabled. With node affinity enabled, read performance is slightly worse (up to 13%). In a degraded state (one node offline / unavailable or failed disk), the performance is worse than with **2+1**. Recovery time to full redundancy from single disk error is slightly higher than with **2+1**.
-- **Best Use Cases:**
-    - Deployments where **storage efficiency is a priority** without significantly compromising performance.
-    - Requires 6 or more nodes for full redundancy.
-
-### Scheme: 1+2
-
-- **Description:** In the _1+2 scheme_, data is replicated twice, effectively creating multiple copies of every data block.
-- **Redundancy Level:** Can tolerate the failure of **two** storage nodes.
-- **Raw-to-Effective Ratio:** **300%**
-- **Available Storage Capacity:** **33.3%**
-- **Performance Considerations:** Offers **fast recovery and high read performance** due to data replication, but write performance is lower than with **1+1** in all cases (~33%).
-- **Best Use Cases:**
-    - Workloads requiring **high redundancy and minimal recovery time**.
-    - Applications where **performance is prioritized over storage efficiency**.
-    - Requires 4 or more nodes for full redundancy.
-
-### Scheme: 2+2
-
-- **Description:** In the _2+2 scheme_, data is divided into two fragments with two parity fragments, offering a great
-  balance between redundancy and storage efficiency.
-- **Redundancy Level:** Can tolerate the failure of **two** storage nodes.
-- **Raw-to-Effective Ratio:** **200%**
-- **Available Storage Capacity:** **50%**
-- **Performance Considerations:** Similar to **2+1**, but with higher write latencies and lower effective write IOPS due to higher write amplification.
-- **Best Use Cases:**
-    - Deployments where **high redundancy and storage efficiency is important** without compromising redundancy.
-    - Applications that can tolerate slightly **higher recovery times** compared to **1+2**.
-    - Requires 6 or more nodes for full redundancy.
-
-### Scheme: 4+2
-
-- **Description:** In the _4+2 scheme_, data is divided into four fragments with two parity fragments, offering a great
-  balance between redundancy and storage efficiency.
-- **Redundancy Level:** Can tolerate the failure of **two** storage nodes.
-- **Raw-to-Effective Ratio:** **150%**
-- **Available Storage Capacity:** **66.6%**
-- **Performance Considerations:** Similar to **4+1**, but with higher write latencies and lower effective write IOPS due to higher write amplification.
-- **Best Use Cases:**
-    - Deployments where **high redundancy and storage efficiency is a priority**.
-    - Requires 8 or more nodes in a cluster.
+The table below gives an overview of the supported schemes:
+
+- **Tolerated node failures** is the number of storage nodes that can fail with no data loss (`m`).
+- **Data protection overhead** is the additional raw capacity stored on top of the usable data (e.g. 100% means the
+  cluster holds 2× the raw capacity for the usable amount; 0% means no protection).
+- **IOPS r/w performance and latency** is a relative rating of read/write IOPS and latency.
+- **Minimum nodes** is the number of storage nodes required for full redundancy.
+
+| Schema | Tolerated node failures | Data protection overhead | IOPS r/w performance & latency | Minimum nodes |
+|--------|-------------------------|--------------------------|--------------------------------|---------------|
+| 1+0    | 0 (no redundancy)       | 0%                       | Very good                      | 1             |
+| 1+1    | 1                       | 100%                     | Excellent                      | 3             |
+| 2+1    | 1                       | 50%                      | Very good                      | 4             |
+| 4+1    | 1                       | 25%                      | Very good                      | 6             |
+| 1+2    | 2                       | 200%                     | Very good                      | 5             |
+| 2+2    | 2                       | 100%                     | Very good                      | 6             |
+| 4+2    | 2                       | 50%                      | Very good                      | 8             |
 
 ## Choosing the Scheme
 

From e53b1d961eff1e801ffff35579a0141020e7b2be Mon Sep 17 00:00:00 2001
From: schmidt-scaled
Date: Fri, 19 Jun 2026 16:32:16 +0300
Subject: [PATCH 2/2] docs(erasure-coding): rename tolerated-failures column to FTT

FTT (failures to tolerate) is the established term; same values (0 for
1+0, 1 for n+1, 2 for n+2).
---
 .../erasure-coding-scheme.md                  | 20 +++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/docs/deployments/deployment-preparation/erasure-coding-scheme.md b/docs/deployments/deployment-preparation/erasure-coding-scheme.md
index da8465a8..e58cc586 100644
--- a/docs/deployments/deployment-preparation/erasure-coding-scheme.md
+++ b/docs/deployments/deployment-preparation/erasure-coding-scheme.md
@@ -24,21 +24,21 @@ If you need more information on erasure coding, see the dedicated concept page f
 
 The table below gives an overview of the supported schemes:
 
-- **Tolerated node failures** is the number of storage nodes that can fail with no data loss (`m`).
+- **FTT** (failures to tolerate) is the number of storage nodes that can fail with no data loss (`m`).
 - **Data protection overhead** is the additional raw capacity stored on top of the usable data (e.g. 100% means the
   cluster holds 2× the raw capacity for the usable amount; 0% means no protection).
 - **IOPS r/w performance and latency** is a relative rating of read/write IOPS and latency.
 - **Minimum nodes** is the number of storage nodes required for full redundancy.
 
-| Schema | Tolerated node failures | Data protection overhead | IOPS r/w performance & latency | Minimum nodes |
-|--------|-------------------------|--------------------------|--------------------------------|---------------|
-| 1+0    | 0 (no redundancy)       | 0%                       | Very good                      | 1             |
-| 1+1    | 1                       | 100%                     | Excellent                      | 3             |
-| 2+1    | 1                       | 50%                      | Very good                      | 4             |
-| 4+1    | 1                       | 25%                      | Very good                      | 6             |
-| 1+2    | 2                       | 200%                     | Very good                      | 5             |
-| 2+2    | 2                       | 100%                     | Very good                      | 6             |
-| 4+2    | 2                       | 50%                      | Very good                      | 8             |
+| Schema | FTT | Data protection overhead | IOPS r/w performance & latency | Minimum nodes |
+|--------|-----|--------------------------|--------------------------------|---------------|
+| 1+0    | 0   | 0%                       | Very good                      | 1             |
+| 1+1    | 1   | 100%                     | Excellent                      | 3             |
+| 2+1    | 1   | 50%                      | Very good                      | 4             |
+| 4+1    | 1   | 25%                      | Very good                      | 6             |
+| 1+2    | 2   | 200%                     | Very good                      | 5             |
+| 2+2    | 2   | 100%                     | Very good                      | 6             |
+| 4+2    | 2   | 50%                      | Very good                      | 8             |
 
 ## Choosing the Scheme
 
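
The FTT and data protection overhead columns of the new table can be cross-checked against the **k+m** notation. The following is a minimal Python sketch, not part of the patch, assuming that FTT equals `m` and that the overhead is `m / k` expressed as a percentage. The helper names (`parse_scheme`, `ftt`, `overhead_percent`, `usable_fraction`) are hypothetical and exist only for this illustration; the `TABLE` values are copied from the table above.

```python
from fractions import Fraction

# Values copied from the overview table in the patch:
# scheme -> (FTT, data protection overhead in %, minimum nodes)
TABLE = {
    "1+0": (0, 0, 1),
    "1+1": (1, 100, 3),
    "2+1": (1, 50, 4),
    "4+1": (1, 25, 6),
    "1+2": (2, 200, 5),
    "2+2": (2, 100, 6),
    "4+2": (2, 50, 8),
}


def parse_scheme(scheme: str) -> tuple[int, int]:
    """Split a 'k+m' string into data chunks k and parity chunks m."""
    k, m = scheme.split("+")
    return int(k), int(m)


def ftt(scheme: str) -> int:
    """Failures to tolerate: assumed to equal the parity count m."""
    return parse_scheme(scheme)[1]


def overhead_percent(scheme: str) -> Fraction:
    """Extra raw capacity on top of usable data, assumed to be m / k in percent."""
    k, m = parse_scheme(scheme)
    return Fraction(100 * m, k)


def usable_fraction(scheme: str) -> Fraction:
    """Share of raw capacity that holds usable data: k / (k + m)."""
    k, m = parse_scheme(scheme)
    return Fraction(k, k + m)


# Cross-check the derived values against the table rows.
for scheme, (table_ftt, table_overhead, _min_nodes) in TABLE.items():
    assert ftt(scheme) == table_ftt
    assert overhead_percent(scheme) == table_overhead
    print(
        f"{scheme}: FTT={ftt(scheme)}, "
        f"overhead={float(overhead_percent(scheme)):.0f}%, "
        f"usable={float(usable_fraction(scheme)):.1%}"
    )
```

The minimum-node counts are kept as plain data because the patch states them without a formula. `usable_fraction` corresponds to the "Available Storage Capacity" figures in the removed per-scheme sections (for example, 2/3 for 2+1).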