Move all variants to EROFS #4727

@yeazelm

What I'd like:
Currently, only the newest variants use EROFS for their root volume. The older variants therefore miss out on the space savings and boot time improvements that EROFS provides.

For NVIDIA variants, moving to the 580 driver is problematic because the 580 driver is much larger than the 570 (1.1GB vs ~817MB), which leaves a few variants without enough space to fit the new driver. The space savings from EROFS are even more important in this case.

We will move all existing variants to EROFS so that they can benefit from the improvements in EROFS.

I ran some analysis on a few k8s variants to show the general improvements in boot time:

Details

x86-m7i (aws-k8s-1.29)

| Phase | Control | EROFS | Change | Statistical result |
|---|---|---|---|---|
| firmware | 0.625s | 0.622s | -0.4% | No difference |
| loader | 0.760s | 0.760s | -0.1% | No difference |
| kernel | 2.009s | 1.104s | -45.1% | Large improvement (d=-3.38) |
| userspace | 7.360s | 4.259s | -42.1% | Large improvement (d=-5.58) |

arm-m7g (aws-k8s-1.29)

| Phase | Control | EROFS | Change | Statistical result |
|---|---|---|---|---|
| firmware | 0.418s | 0.416s | -0.3% | No difference |
| loader | 0.497s | 0.497s | +0.1% | No difference |
| kernel | 1.695s | 0.908s | -46.4% | Large improvement (d=-3.24) |
| userspace | 7.142s | 4.188s | -41.4% | Large improvement (d=-4.13) |
| boottime | 9.75s | 6.01s | -38.4% (3.7s faster) | Large improvement (d=-4.61) |

nvidia-x86-g4dn (aws-k8s-1.29-nvidia)

| Phase | Control | EROFS | Change | Statistical result |
|---|---|---|---|---|
| firmware | 0.949s | 0.830s | -12.5% | Large improvement (d=-1.38) |
| loader | 1.653s | 1.020s | -38.3% | Large improvement (d=-1.31) |
| kernel | 1.946s | 1.078s | -44.6% | Large improvement (d=-2.63) |
| userspace | 19.31s | 14.02s | -27.4% | Large improvement (d=-1.61) |
| boottime | 23.52s | 17.00s | -27.7% (6.5s faster) | Large improvement (d=-2.11) |

Summary

| Platform | Baseline | EROFS | Improvement |
|---|---|---|---|
| x86 (m7i) | 11.11s | 6.72s | -39.5% (~4.4s faster) |
| ARM (m7g) | 9.73s | 6.00s | -38.3% (~3.7s faster) |
| NVIDIA (g4dn) | 23.56s | 17.03s | -27.7% (~6.5s faster) |
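
For anyone who wants to re-derive the Improvement column, here is a minimal sketch that recomputes it from the baseline and EROFS totals in the summary. The helper name `relative_change` is made up for illustration and is not part of any tooling used for the measurements.

```python
# Total boot times in seconds (baseline, EROFS), copied from the summary.
summary = {
    "x86 (m7i)": (11.11, 6.72),
    "ARM (m7g)": (9.73, 6.00),
    "NVIDIA (g4dn)": (23.56, 17.03),
}


def relative_change(baseline, erofs):
    """Return (percent change relative to baseline, seconds saved)."""
    saved = baseline - erofs
    return -100.0 * saved / baseline, saved


for platform, (baseline, erofs) in summary.items():
    pct, saved = relative_change(baseline, erofs)
    print(f"{platform}: {pct:+.1f}% (~{saved:.1f}s faster)")
```

The printed percentages and time savings should match the Improvement column after rounding to one decimal place.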

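The total boot time improvements on all three platforms are statistically significant with large effect sizes (Cohen's |d| > 0.8, p < 0.00001). The biggest gains are in the kernel phase (~45% faster) and the userspace phase (~41-42% faster on the non-NVIDIA variants, ~27% faster on the NVIDIA variant).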
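
As a rough illustration of how an effect size like the d values above can be computed, here is a sketch of Cohen's d using a pooled standard deviation. The issue does not say which variant of the formula or which significance test was used, so the pooled form is an assumption, and the sample timings below are hypothetical placeholders rather than the measured data.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(control, treatment):
    """Cohen's d with a pooled standard deviation (assumed variant).

    A negative value means the treatment group has the lower mean,
    which for boot times means the treatment is faster.
    """
    n1, n2 = len(control), len(treatment)
    s1, s2 = stdev(control), stdev(treatment)
    pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled


# Hypothetical kernel-phase timings in seconds, NOT the real measurements.
control_runs = [2.1, 1.9, 2.0, 2.2, 1.8]
erofs_runs = [1.1, 1.0, 1.2, 1.1, 1.1]

d = cohens_d(control_runs, erofs_runs)
print(f"d = {d:.2f}")
print("large effect" if abs(d) > 0.8 else "not a large effect")
```

By the usual convention, |d| above 0.8 counts as a large effect, which is the threshold quoted above.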

Any alternatives you've considered:

Let the existing variants stay on ext4 and only add new variants with EROFS. This would be fine, except the NVIDIA variants would still run out of space when moving to the 580 driver.

Labels: area/core (Issues core to the OS, variant independent), type/enhancement (New feature or request)