Fix Flaky TestScripts #22838

Open
kalverra wants to merge 7 commits into develop from testScriptsStable

@kalverra kalverra commented Jun 12, 2026

Collaborator

TestScripts is one of the flakiest areas in /chainlink. I ran the fix-flaky-tests skill on it with opus 4.8. It cost ~$2.00 and took ~30 minutes (excluding the time spent re-running the tests to verify the fixes).

Changes

  • Potentially breaking: StartUpHealthReport now returns 503 for non-/health paths instead of the old 404. A 503 is more in line with what most clients expect from a service that is still starting up. TestScripts uses curl --retry to keep calling these endpoints until they are ready, and curl does not retry when it gets a 404.
  • Added tests for tools/txtar/visitor.go and updated it to use more modern Go patterns, which should help performance slightly.

Results

# Before
❯ make test ARGS="diagnose --iterations 100 -- -run TestScripts ."
   Estimated flake rate = 1.6% - 4.2%

# After
❯ make test ARGS="diagnose --iterations 600 -- -run TestScripts ."
   Estimated flake rate = 0% - 0.6%
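
As an independent sanity check on how much a clean run can prove, the sketch below computes the one-sided upper confidence bound on a failure rate after n runs with no failures. It rests on assumptions: the diagnose tool's estimation method is not described here, the runs are treated as independent, and the helper zeroFailureUpperBound is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// zeroFailureUpperBound returns the one-sided upper confidence bound on the
// true failure rate after n independent runs with no failures: the largest
// rate p for which seeing n straight passes, (1-p)^n, still has probability
// of at least 1-confidence.
func zeroFailureUpperBound(n int, confidence float64) float64 {
	return 1 - math.Pow(1-confidence, 1/float64(n))
}

func main() {
	for _, n := range []int{100, 600} {
		bound := zeroFailureUpperBound(n, 0.95)
		fmt.Printf("n=%d: flake rate below %.2f%% at 95%% confidence\n", n, 100*bound)
	}
}
```

Under these assumptions, 100 clean runs can only bound the rate at about 3%, while 600 clean runs bring the bound down to about 0.5%. That is why verifying a fix takes many more iterations than detecting the original flake.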

@kalverra kalverra requested review from a team as code owners June 12, 2026 22:35
@github-actions

Contributor

👋 kalverra, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@github-actions

Contributor

I see you updated files related to core. Please run make gocs in the root directory to add a changeset, and include at least one of the following tags in its text:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

✅ No conflicts with other open PRs targeting develop

@cl-sonarqube-production


Quality Gate failed

Failed conditions
C Security Rating on New Code (required ≥ A)

See analysis details on SonarQube


Comment thread tools/txtar/visitor.go
// If we're not recursing, skip all other directories except the root.
if !bool(d.recurse) && !isRootDir {
return nil
if !bool(d.recurse) && filepath.Clean(path) != root {

Contributor

Is it safe to use this less strict filepath.Clean comparison instead of os.SameFile?
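
For context on this question, the sketch below contrasts the two checks. It is not the visitor.go code: lexicallyEqual, sameFile, and demo are hypothetical helpers, and the symlink setup is only an assumption about where the checks could diverge. filepath.Clean compares path strings, while os.SameFile compares the identity of the files the paths resolve to.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// lexicallyEqual compares two paths as strings after normalizing them.
// It never touches the filesystem.
func lexicallyEqual(a, b string) bool {
	return filepath.Clean(a) == filepath.Clean(b)
}

// sameFile reports whether both paths resolve to the same underlying file.
// os.Stat follows symlinks, so a link and its target compare as equal.
func sameFile(a, b string) (bool, error) {
	ai, err := os.Stat(a)
	if err != nil {
		return false, err
	}
	bi, err := os.Stat(b)
	if err != nil {
		return false, err
	}
	return os.SameFile(ai, bi), nil
}

// demo creates a directory and a symlink to it, then compares them both ways.
func demo() (lexical, identity bool, err error) {
	base, err := os.MkdirTemp("", "visitor-sketch-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(base)

	root := filepath.Join(base, "root")
	if err := os.Mkdir(root, 0o755); err != nil {
		return false, false, err
	}
	link := filepath.Join(base, "link")
	if err := os.Symlink(root, link); err != nil {
		return false, false, err
	}

	identity, err = sameFile(root, link)
	return lexicallyEqual(root, link), identity, err
}

func main() {
	lexical, identity, err := demo()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("lexically equal:", lexical)
	fmt.Println("same file:", identity)
}
```

The lexical check is cheaper because it avoids a stat call per directory, but it only recognizes the root when the path string matches. Since filepath.WalkDir passes paths that carry the root argument as a prefix, the two checks should agree on the root entry as long as root is stored in cleaned form; whether that holds in visitor.go is for the author to confirm.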

"github.com/stretchr/testify/require"
)

func writeTreeFile(t *testing.T, path string) {

Contributor

Would it make sense to use embedded testdata/ dir files rather than creating them along the way?

@trunk-io

trunk-io Bot commented Jun 12, 2026


Failed Test: TestCCIPReader_MsgsBetweenSeqNums
Failure Summary: The test failed because it could not find the expected message or log within the specified sequence number range.


Comment thread main_test.go
t.Parallel()

require.NoError(t, os.Setenv("TMPDIR", "/tmp")) // osx default is too long for go-plugin sockets
require.NoError(t, os.Setenv("GOTMPDIR", "/tmp")) // keep workspaces in /tmp

Contributor

Should we use t.TempDir() here?

Comment thread main_test.go
ContinueOnError: true,
Files: filesToRun,
Setup: commonEnv(t),
ContinueOnError: false,

Contributor

Why not? If there are multiple failing tests, we would rather know about them all at once.

2 participants