Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
52 changes: 50 additions & 2 deletions pkg/controllers/v1alpha1/dataload/status_handler.go
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@

finishedJobCondition := kubeclient.GetFinishedJobCondition(job)
if finishedJobCondition == nil {
ctx.Log.V(1).Info("DataLoad job still running", "namespace", ctx.Namespace, "jobName", jobName)

Check failure on line 78 in pkg/controllers/v1alpha1/dataload/status_handler.go

View check run for this annotation

SonarQubeCloud / SonarCloud Code Analysis

Define a constant instead of duplicating this literal "DataLoad job still running" 3 times.

See more on https://sonarcloud.io/project/issues?id=fluid-cloudnative_fluid&issues=AZ6IP-6jSLWvIaArpaaW&open=AZ6IP-6jSLWvIaArpaaW&pullRequest=5945
return
}
isJobSucceed := finishedJobCondition.Type == batchv1.JobComplete
Expand Down Expand Up @@ -188,6 +188,54 @@
}

func (o *OnEventStatusHandler) GetOperationStatus(ctx cruntime.ReconcileRequestContext, opStatus *datav1alpha1.OperationStatus) (result *datav1alpha1.OperationStatus, err error) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost a line-for-line copy of OnceStatusHandler.GetOperationStatus above (lines 54-111), and the same duplication shows up in the datamigrate package. Could the shared body be pulled into a small helper (e.g. one that takes the client plus releaseName/jobName and returns the updated OperationStatus) so the two handlers do not drift over time? Not blocking, but it would make future status-handling fixes a single edit instead of four.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged . I'll keep this fix focused and open a separate issue + PR for the refactor to avoid expanding scope here. Happy to do it as a follow-up once this merges.

//TODO implement me
return nil, nil
result = opStatus.DeepCopy()
releaseName := utils.GetDataLoadReleaseName(o.dataLoad.GetName())
jobName := utils.GetDataLoadJobName(releaseName)

ctx.Log.V(1).Info("DataLoad chart already existed, check its running status")
job, err := kubeclient.GetJob(o.Client, jobName, ctx.Namespace)
if err != nil {
if utils.IgnoreNotFound(err) == nil {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This job-missing branch (the helm.DeleteReleaseIfExists call plus early return) isn't exercised by any test. Since it's a real control-flow path, a case where the job is NotFound would be worth adding.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The job-missing path is hard to cover in unit tests because DeleteReleaseIfExists shells out to the ddc-helm binary which isn't present in the test environment the same gap exists in OnceStatusHandler. I attempted to add a NotFound test but it fails with exec: "ddc-helm": executable file not found. Happy to add it if there's a recommended way to mock helm calls in this codebase, otherwise I'll leave it consistent with the existing OnceStatusHandler pattern.

ctx.Log.Info("Related Job missing, will delete helm chart and retry", "namespace", ctx.Namespace, "jobName", jobName)
if err = helm.DeleteReleaseIfExists(releaseName, ctx.Namespace); err != nil {
ctx.Log.Error(err, "can't delete dataload release", "namespace", ctx.Namespace, "releaseName", releaseName)
return
}
return
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call adding the explicit return here after DeleteReleaseIfExists succeeds. The OnceStatusHandler version falls through and ends up logging a nil error, so this branch is genuinely cleaner and matches the Cron handler. Worth keeping.

}
ctx.Log.Error(err, "can't get dataload job", "namespace", ctx.Namespace, "jobName", jobName)
return
}
Comment on lines +197 to +208
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There is a missing return statement when the related job is not found and the helm release is successfully deleted. If helm.DeleteReleaseIfExists succeeds, err is set to nil. Without an early return, the execution flows to ctx.Log.Error(err, ...) which will log a nil error as an error, and then return result, nil instead of returning early.

Suggested change
if err != nil {
if utils.IgnoreNotFound(err) == nil {
ctx.Log.Info("Related Job missing, will delete helm chart and retry", "namespace", ctx.Namespace, "jobName", jobName)
if err = helm.DeleteReleaseIfExists(releaseName, ctx.Namespace); err != nil {
ctx.Log.Error(err, "can't delete dataload release", "namespace", ctx.Namespace, "releaseName", releaseName)
return
}
}
ctx.Log.Error(err, "can't get dataload job", "namespace", ctx.Namespace, "jobName", jobName)
return
}
if err != nil {
if utils.IgnoreNotFound(err) == nil {
ctx.Log.Info("Related Job missing, will delete helm chart and retry", "namespace", ctx.Namespace, "jobName", jobName)
if err = helm.DeleteReleaseIfExists(releaseName, ctx.Namespace); err != nil {
ctx.Log.Error(err, "can't delete dataload release", "namespace", ctx.Namespace, "releaseName", releaseName)
return
}
return
}
ctx.Log.Error(err, "can't get dataload job", "namespace", ctx.Namespace, "jobName", jobName)
return
}


finishedJobCondition := kubeclient.GetFinishedJobCondition(job)
if finishedJobCondition == nil {
ctx.Log.V(1).Info("DataLoad job still running", "namespace", ctx.Namespace, "jobName", jobName)
return
}
isJobSucceed := finishedJobCondition.Type == batchv1.JobComplete

if result.NodeAffinity == nil && isJobSucceed {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The success-path tests use jobs without the affinity-inject annotation, so GenerateNodeAffinity short-circuits to (nil, nil) and this label-generation block (plus its wrapped-error path) never actually runs. A test with the annotation present would cover it.

result.NodeAffinity, err = dataflow.GenerateNodeAffinity(job)
if err != nil {
return nil, errors.Wrap(err, "error to generate the node labels")
}
}

result.Conditions = []datav1alpha1.Condition{
{
Type: common.ConditionType(finishedJobCondition.Type),
Status: finishedJobCondition.Status,
Reason: finishedJobCondition.Reason,
Message: finishedJobCondition.Message,
LastProbeTime: finishedJobCondition.LastProbeTime,
LastTransitionTime: finishedJobCondition.LastTransitionTime,
},
}
if isJobSucceed {
result.Phase = common.PhaseComplete
} else {
result.Phase = common.PhaseFailed
}
result.Duration = utils.CalculateDuration(job.CreationTimestamp.Time, finishedJobCondition.LastTransitionTime.Time)
return
}
Loading