[CASCL-1304] Add e2e assertion for dd-cluster-info ConfigMap #3025

Open: L3n41c wants to merge 2 commits into main from lenaic/CASCL-1304-e2e-dd-cluster-info-configmap

L3n41c (Member) commented May 18, 2026

What does this PR do?

Adds a Verify dd-cluster-info ConfigMap sub-test to TestAutoscalingDefault
in test/e2e/tests/autoscaling_suite/autoscaling_test.go. The sub-test
fetches the dd-cluster-info ConfigMap that
kubectl datadog autoscaling cluster install writes in the dd-karpenter
namespace, unmarshals its YAML payload, and asserts every deterministic
field of the snapshot:

  • ConfigMap metadata (name, namespace, app.kubernetes.io/managed-by label, data key).
  • Schema version, ClusterName, ClusterARN (matched against a regex anchored
    on s.clusterName), Region, and a 30-minute freshness bound on GeneratedAt.
  • NodeManagement.eksManagedNodeGroup contains exactly two entries whose
    names match the linux and linux-arm patterns provisioned by the
    e2e-framework.
  • NodeManagement.fargate contains the install-time dd-karpenter-<cluster>
    profile, flagged as ManagedByDatadog.
  • NodeManagement.karpenter is non-empty and every entry has the
    dd-karpenter- name prefix and ManagedByDatadog: true.
  • Autoscaling.ClusterAutoscaler.Present == false, EKSAutoMode.Enabled == false,
    and Karpenter is present in dd-karpenter, named karpenter, with
    ManagedByDatadog, a populated InstallerVersion, and a SemVer-shaped
    Version.

The full YAML payload is logged unconditionally so any assertion failure
can be debugged from CI logs alone.
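
The field-level checks listed above (the ClusterARN regex anchored on the cluster name, the SemVer-shaped Version, and the 30-minute freshness bound on GeneratedAt) could be sketched roughly as follows. This is not the code from the PR: the helper names, the exact regular expressions, and the choice to pass timestamps as RFC 3339 strings are all assumptions made to keep the sketch self-contained and deterministic.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// maxSnapshotAge mirrors the 30-minute freshness bound from the PR description.
const maxSnapshotAge = 30 * time.Minute

// semverRe is a simplified, assumed "SemVer-shaped" pattern: optional "v",
// MAJOR.MINOR.PATCH, then optional pre-release and build metadata.
var semverRe = regexp.MustCompile(`^v?\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$`)

// isClusterARN reports whether arn looks like an EKS cluster ARN that ends
// with exactly clusterName. The ARN layout used here is an assumption; the
// cluster name is quoted so it cannot inject regex syntax.
func isClusterARN(arn, clusterName string) bool {
	re := regexp.MustCompile(
		`^arn:aws[a-z-]*:eks:[a-z0-9-]+:\d{12}:cluster/` + regexp.QuoteMeta(clusterName) + `$`)
	return re.MatchString(arn)
}

// isSemVerShaped reports whether v matches the simplified SemVer pattern.
func isSemVerShaped(v string) bool {
	return semverRe.MatchString(v)
}

// isFresh reports whether generatedAt is no later than now and at most
// maxSnapshotAge older than it. Both arguments are RFC 3339 timestamps.
func isFresh(generatedAt, now string) (bool, error) {
	g, err := time.Parse(time.RFC3339, generatedAt)
	if err != nil {
		return false, fmt.Errorf("parsing GeneratedAt: %w", err)
	}
	n, err := time.Parse(time.RFC3339, now)
	if err != nil {
		return false, fmt.Errorf("parsing now: %w", err)
	}
	age := n.Sub(g)
	return age >= 0 && age <= maxSnapshotAge, nil
}
```

In a real test the current time would come from time.Now() and the results would feed the test framework's assertions; anchoring the ARN pattern with `$` right after the quoted cluster name is what prevents a longer cluster name with the same prefix from matching.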

Motivation

PRs #2945 and #2980 introduced the dd-cluster-info ConfigMap as the source of truth consumed by the
follow-up Karpenter migration tool. Until now, no e2e test asserted that
the ConfigMap is actually written with the expected shape after a real
install on EKS. This PR adds that coverage as a sub-test of the existing
default install flow, so it shares the cluster provisioning and install
steps with the other autoscaling tests at no extra infrastructure cost.

Jira: CASCL-1304.

Additional Notes

The dd-cluster-info payload is unmarshaled into a locally-defined
wire-format struct rather than importing the producer's
cmd/kubectl-datadog/autoscaling/cluster/common/clusterinfo package, to
avoid pulling controller-runtime, karpenter, and the AWS autoscaling SDK
transitive dependencies into the e2e test module. The duplication is
documented and kept minimal.
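
As an illustration of the locally-defined wire-format approach, here is a minimal sketch of what a local mirror of the Autoscaling block and its checks might look like. The type names, the nesting of EKSAutoMode under the Autoscaling block, the Namespace and Name fields, and the omission of YAML tags are assumptions; the actual struct in the PR may differ, and the YAML decoding step is left out so the sketch needs only the standard library.

```go
package main

import "fmt"

// Hypothetical local mirror of the Autoscaling block of dd-cluster-info.
// Only the fields needed by the checks are declared, which is the point of
// duplicating the wire format instead of importing the producer package.
type clusterAutoscaler struct{ Present bool }

type eksAutoMode struct{ Enabled bool }

type karpenterInfo struct {
	Present          bool
	Namespace        string // assumed field name
	Name             string // assumed field name
	ManagedByDatadog bool
	InstallerVersion string
	Version          string
}

type autoscaling struct {
	ClusterAutoscaler clusterAutoscaler
	EKSAutoMode       eksAutoMode
	Karpenter         karpenterInfo
}

// checkAutoscaling returns one message per expectation that does not hold,
// so a single run reports every mismatch instead of stopping at the first.
func checkAutoscaling(a autoscaling) []string {
	var problems []string
	if a.ClusterAutoscaler.Present {
		problems = append(problems, "ClusterAutoscaler.Present should be false")
	}
	if a.EKSAutoMode.Enabled {
		problems = append(problems, "EKSAutoMode.Enabled should be false")
	}
	k := a.Karpenter
	if !k.Present {
		problems = append(problems, "Karpenter.Present should be true")
	}
	if k.Namespace != "dd-karpenter" || k.Name != "karpenter" {
		problems = append(problems,
			fmt.Sprintf("Karpenter found as %s/%s, want dd-karpenter/karpenter", k.Namespace, k.Name))
	}
	if !k.ManagedByDatadog {
		problems = append(problems, "Karpenter.ManagedByDatadog should be true")
	}
	if k.InstallerVersion == "" {
		problems = append(problems, "Karpenter.InstallerVersion should be populated")
	}
	return problems
}
```

The trade-off of such a mirror is that it can silently drift from the producer's schema, which is why logging the raw payload alongside the assertions is useful.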

Minimum Agent Versions

No agent-side change.

  • Agent: N/A
  • Cluster Agent: N/A

Describe your test plan

  • The new sub-test runs as part of the existing autoscaling e2e suite on
    the EKS GitLab job; no new pipeline wiring required.
  • Locally, the additions pass go vet, go build, and golangci-lint
    for the test/e2e module.
  • The full ConfigMap YAML is logged via t.Logf so any future field
    drift surfaces directly in CI logs.

Checklist

  • PR has at least one valid label: enhancement
  • PR has a milestone or the qa/skip-qa label (qa/skip-qa: this is
    test-only and doesn't affect user-facing behavior)
  • All commits are signed

Add a `Verify dd-cluster-info ConfigMap` sub-test to
TestAutoscalingDefault that fetches the ConfigMap written by
`kubectl datadog autoscaling cluster install`, unmarshals its YAML
payload, and asserts every deterministic field of the snapshot:
APIVersion, ClusterName, ClusterARN (regex), Region, GeneratedAt
freshness, the two EKS managed node groups, the install-time
`dd-karpenter-<cluster>` Fargate profile flagged as ManagedByDatadog,
the Datadog-managed NodePools, and the Autoscaling block reporting
the freshly installed Karpenter as the only autoscaler. The full
YAML payload is logged so CI failures are debuggable from logs.

The wire-format struct is duplicated rather than imported to keep
the e2e test module free of controller-runtime, karpenter, and the
AWS autoscaling SDK transitive dependencies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L3n41c added the enhancement (New feature or request) and qa/skip-qa labels May 18, 2026
@L3n41c L3n41c changed the title [CASCL-1304] Add e2e assertion for dd-cluster-info ConfigMap [CASCL-1304] Add e2e assertion for dd-cluster-info ConfigMap May 18, 2026
codecov-commenter commented May 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 41.50%. Comparing base (20ecb9e) to head (fad64b6).
⚠️ Report is 1 commit behind head on main.

@@           Coverage Diff           @@
##             main    #3025   +/-   ##
=======================================
  Coverage   41.50%   41.50%           
=======================================
  Files         335      335           
  Lines       28714    28714           
=======================================
  Hits        11919    11919           
  Misses      16001    16001           
  Partials      794      794           

Flag        Coverage Δ
unittests   41.50% <ø> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
datadog-datadog-prod-us1-2 (Bot) commented May 18, 2026

Code Coverage

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 41.82% (+0.00%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: fad64b6

The initial CI run surfaced that Pulumi's DisplayName truncates the
linux-arm node group name to `linux-a-n` once the cluster name
consumes most of the 37-char prefix budget, breaking the substring
check. Replace the linux/linux-arm distinction with structural
assertions that hold regardless of the truncation:

- EKS managed node group bucket: 2 entries, each with exactly 1 node
  (DesiredSize=1) matching the kubelet AWS hostname pattern.
- Fargate bucket: 2 entries — the harness' own profile and the
  install-time `dd-karpenter-<cluster>`; only the latter is
  ManagedByDatadog; both carry at least one `fargate-ip-…` node.
- Karpenter bucket: 2 NodePools `dd-karpenter-*`, ManagedByDatadog,
  with empty Nodes lists (the snapshot is written at install time,
  before Karpenter has provisioned its first node).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
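
The structural assertions described in the commit message above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this PR: the `entry` type, the function names, and the EC2 hostname regex (private-IP names such as `ip-10-0-1-23.ec2.internal`) are hypothetical, and the checks return errors rather than calling a test framework so the block stays standard-library only.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// entry is a hypothetical local mirror of one node-management bucket entry.
type entry struct {
	Name             string
	ManagedByDatadog bool
	Nodes            []string
}

// ec2HostnameRe is an assumed shape for kubelet node names on EC2.
var ec2HostnameRe = regexp.MustCompile(`^ip-\d{1,3}(-\d{1,3}){3}\..+\.internal$`)

// checkManagedNodeGroups expects 2 groups with exactly 1 EC2-named node each,
// without relying on the (possibly truncated) group names.
func checkManagedNodeGroups(groups []entry) error {
	if len(groups) != 2 {
		return fmt.Errorf("want 2 managed node groups, got %d", len(groups))
	}
	for _, g := range groups {
		if len(g.Nodes) != 1 {
			return fmt.Errorf("group %q: want 1 node, got %d", g.Name, len(g.Nodes))
		}
		if !ec2HostnameRe.MatchString(g.Nodes[0]) {
			return fmt.Errorf("group %q: unexpected node name %q", g.Name, g.Nodes[0])
		}
	}
	return nil
}

// checkFargateProfiles expects 2 profiles, each with a fargate-ip- node, and
// exactly one Datadog-managed profile named dd-karpenter-<cluster>.
func checkFargateProfiles(profiles []entry, clusterName string) error {
	if len(profiles) != 2 {
		return fmt.Errorf("want 2 Fargate profiles, got %d", len(profiles))
	}
	managed := 0
	for _, p := range profiles {
		if p.ManagedByDatadog {
			managed++
			if p.Name != "dd-karpenter-"+clusterName {
				return fmt.Errorf("unexpected Datadog-managed profile %q", p.Name)
			}
		}
		hasFargateNode := false
		for _, n := range p.Nodes {
			if strings.HasPrefix(n, "fargate-ip-") {
				hasFargateNode = true
				break
			}
		}
		if !hasFargateNode {
			return fmt.Errorf("profile %q has no fargate-ip- node", p.Name)
		}
	}
	if managed != 1 {
		return fmt.Errorf("want 1 Datadog-managed profile, got %d", managed)
	}
	return nil
}

// checkNodePools expects 2 Datadog-managed dd-karpenter-* NodePools with no
// nodes yet, since the snapshot is written at install time.
func checkNodePools(pools []entry) error {
	if len(pools) != 2 {
		return fmt.Errorf("want 2 NodePools, got %d", len(pools))
	}
	for _, p := range pools {
		if !strings.HasPrefix(p.Name, "dd-karpenter-") || !p.ManagedByDatadog {
			return fmt.Errorf("NodePool %q is not a Datadog-managed dd-karpenter-* pool", p.Name)
		}
		if len(p.Nodes) != 0 {
			return fmt.Errorf("NodePool %q: want no nodes, got %d", p.Name, len(p.Nodes))
		}
	}
	return nil
}
```

Because none of these checks compares node group names, a truncated name such as `linux-a-n` passes as long as the counts and node-name shapes hold.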
L3n41c (Member, Author) commented May 18, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep it up!

@L3n41c L3n41c marked this pull request as ready for review May 19, 2026 12:24
@L3n41c L3n41c requested a review from a team May 19, 2026 12:24
@L3n41c L3n41c requested a review from a team as a code owner May 19, 2026 12:24