
[SPARK-55330][K8S] Make spark.kubernetes.legacy.useReadWriteOnceAccessMode a public config#54963

Open
nitrajen wants to merge 2 commits into apache:master from nitrajen:fix/SPARK-55330-flat-pvc-recovery

@nitrajen

What changes were proposed in this pull request?

spark.kubernetes.legacy.useReadWriteOnceAccessMode was introduced in SPARK-46945 as an .internal() config. This PR removes the .internal() marker and improves the doc string so the config appears in the public documentation and can be used by operators.

Why are the changes needed?

SPARK-46786 (Spark 4.0) changed the default PVC access mode from ReadWriteOnce to ReadWriteOncePod. The Kubernetes documentation states that whether fsGroup is applied to a mounted volume depends on the CSI driver's fsGroupPolicy setting, and certain policies do not apply fsGroup for ReadWriteOncePod volumes. This can cause non-root executor processes to fail at job startup when trying to create the blockmgr-xxx directory under a PVC-backed local dir:

ERROR JavaUtils: Failed to create directory /apps/application/data/blockmgr-2469dd51-bfd2-478d-9b3c-d5593bf21c26
java.nio.file.AccessDeniedException: /apps/application/data/blockmgr-2469dd51-bfd2-478d-9b3c-d5593bf21c26

The escape hatch config (spark.kubernetes.legacy.useReadWriteOnceAccessMode=true) already exists in the codebase to restore ReadWriteOnce behavior, but because it is marked internal it does not appear in the docs and users cannot find it.
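For operators who hit the failure above, opting back into the old behavior amounts to passing the flag at submit time. The following is a minimal sketch of building that argument; the helper name `legacy_pvc_conf_args` is hypothetical and only the config key itself comes from this PR.

```python
# Config key taken from this PR; everything else here is illustrative.
LEGACY_RWO_KEY = "spark.kubernetes.legacy.useReadWriteOnceAccessMode"


def legacy_pvc_conf_args(enable_legacy=True):
    """Return spark-submit arguments that set the legacy access-mode flag.

    Hypothetical helper: it only formats a standard `--conf key=value` pair.
    """
    value = "true" if enable_legacy else "false"
    return ["--conf", f"{LEGACY_RWO_KEY}={value}"]


# Append these to an existing spark-submit command line.
args = legacy_pvc_conf_args()
print(" ".join(args))
```

The resulting pair would be appended to whatever `spark-submit` invocation the job already uses.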

Does this PR introduce any user-facing change?

Yes. spark.kubernetes.legacy.useReadWriteOnceAccessMode becomes a public, documented configuration option (versioned as 4.2.0). Previously it was internal and invisible to users.

How was this patch tested?

Added two unit tests to MountVolumesFeatureStepSuite:

  • SPARK-55330: OnDemand PVC uses ReadWriteOncePod access mode by default — verifies the default PVC access mode is ReadWriteOncePod
  • SPARK-55330: OnDemand PVC uses ReadWriteOnce when legacy access mode is enabled — verifies that setting spark.kubernetes.legacy.useReadWriteOnceAccessMode=true results in PVCs being created with ReadWriteOnce

Both tests use getAdditionalKubernetesResources() on MountVolumesFeatureStep and assert the accessModes field on the returned PersistentVolumeClaim.
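The behavior the two tests pin down can be modeled roughly as below. This is a language-neutral sketch in Python, not the Scala test code in the PR; `pvc_access_modes` is a hypothetical stand-in for reading `accessModes` off the generated PersistentVolumeClaim, and the flag's default of `false` is assumed from the description above.

```python
# Config key taken from this PR; the function below is a simplified model.
LEGACY_RWO_KEY = "spark.kubernetes.legacy.useReadWriteOnceAccessMode"


def pvc_access_modes(conf):
    """Model the access modes of an OnDemand PVC for a given config dict.

    Assumption: the flag defaults to false, so ReadWriteOncePod is the default.
    """
    legacy = conf.get(LEGACY_RWO_KEY, "false").strip().lower() == "true"
    return ["ReadWriteOnce"] if legacy else ["ReadWriteOncePod"]


# Case 1: no flag set, so the default access mode applies.
default_modes = pvc_access_modes({})
assert default_modes == ["ReadWriteOncePod"]

# Case 2: legacy flag enabled, so the PVC falls back to ReadWriteOnce.
legacy_modes = pvc_access_modes({LEGACY_RWO_KEY: "true"})
assert legacy_modes == ["ReadWriteOnce"]
```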

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6

@nitrajen force-pushed the fix/SPARK-55330-flat-pvc-recovery branch from 0f02261 to 4733a29 on March 23, 2026 21:24
@nitrajen marked this pull request as ready for review on March 23, 2026 23:27
@dongjoon-hyun (Member) left a comment

Do you know that ReadWriteOnce means multiple executors on the same node can share a single PVC in the worst case? The Apache Spark community wants to avoid those situations completely. For those cases, please use hostPath.

Thus, -1 for this proposal, @nitrajen.
