What happened?
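When I restart spark-operator, a job sometimes fails with the error "driver pod already exist". I checked the Kubernetes events after seeing this.
My guess is that the bug happens when spark-operator exits abruptly after the spark-submit command has completed but before updateSparkApplicationStatus has run, so the SparkApplication status is still "new" (""). When the new spark-operator comes up, it sees the "new" status and tries to submit the application again, even though the driver pod was already created.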
Reproduction Code
Keep submitting a large number of jobs, and restart the spark-operator while submissions are in progress.
Expected behavior
All jobs start successfully, even if the spark-operator restarts.
Actual behavior
Some jobs fail with the error "driver pod already exist".
Environment & Versions
- Kubernetes Version: 1.33
- Spark Operator Version: 2.3.0
- Apache Spark Version:
Additional context
No response
Impacted by this bug?
Give it a 👍 We prioritize the issues with most 👍