Skip to content

CAMEL-23216: Fix flaky mina sftp tests#22983

Open
gnodet wants to merge 1 commit intoapache:mainfrom
gnodet:CAMEL-23216-fix-flaky-mina-sftp-tests
Open

CAMEL-23216: Fix flaky mina sftp tests#22983
gnodet wants to merge 1 commit intoapache:mainfrom
gnodet:CAMEL-23216-fix-flaky-mina-sftp-tests

Conversation

@gnodet
Copy link
Copy Markdown
Contributor

@gnodet gnodet commented May 6, 2026

CAMEL-23216

Summary

Several mina-sftp integration tests were flaky due to SSH connection/authentication timeouts under CI load. The root causes:

  1. No server readiness check — the embedded SFTP server restarted per test but tests could attempt connections before the server was fully accepting them
  2. Too-tight @Timeout(30) — tests making multiple sequential SFTP connections (each with 10s default connectTimeout + 10s auth timeout) could easily exceed the 30-second JUnit timeout, causing thread interruption that propagated as InterruptedIOException in the SSH library

Changes

  • Add server readiness probe in SftpEmbeddedInfraService.setUpServer() — after starting the embedded SSHD server, verify it accepts TCP connections before returning. This eliminates cold-start race conditions.
  • Increase @Timeout from 30s to 60s across all mina-sftp integration tests (7 test classes), aligned with MinaSftpProtocolIT which already used 60s/120s for similar operations.

Test plan

  • All 26 previously-flaky tests pass (MinaSftpAdvancedFileOperationsIT + MinaSftpAuthenticationIT)
  • Full mina-sftp integration test suite passes (249 tests, 0 failures)

@gnodet gnodet requested review from apupier, davsclaus and oscerd May 6, 2026 05:38
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run
  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
  • You can label PRs using skip-tests and test-dependents to fine-tune the checks executed by this PR.
  • Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

🧪 CI tested the following changed modules:

  • components/camel-mina-sftp
  • test-infra/camel-test-infra-ftp
All tested modules (14 modules)
  • Camel :: Azure :: Files
  • Camel :: FTP
  • Camel :: JBang :: MCP
  • Camel :: JBang :: Plugin :: Route Parser
  • Camel :: JBang :: Plugin :: TUI
  • Camel :: JBang :: Plugin :: Validate
  • Camel :: Jsch
  • Camel :: Launcher :: Container
  • Camel :: MINA SFTP
  • Camel :: Test Infra :: All test services
  • Camel :: Test Infra :: Ftp
  • Camel :: YAML DSL :: Validator
  • Camel :: YAML DSL :: Validator Maven Plugin
  • Camel :: Zookeeper Master

⚙️ View full build and test results

- Add server readiness probe in SftpEmbeddedInfraService to verify the
  embedded SFTP server accepts connections before tests run
- Increase test @timeout from 30s to 60s across all mina-sftp integration
  tests to accommodate SSH connection + authentication overhead when
  tests make multiple sequential SFTP connections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gnodet gnodet force-pushed the CAMEL-23216-fix-flaky-mina-sftp-tests branch from 64356c7 to f8b0158 Compare May 6, 2026 06:50
Copy link
Copy Markdown
Contributor

@apupier apupier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the increase of timeout doesn't sound very useful. I looked to the time spent for several of the tests and the max amount of time on Jenkins Ci for one test is 12 seconds. With most of them around 2-3 seconds. When the tests are failing it was all of them. If it was due to the timeout, we should have only the ones with teime spent close to timeout impacted.
In case, the sftpservice is not fixing the issue, it will slow down the whole build.

Thread.onSpinWait();
}
}
LOG.warn("SFTP server may not be ready after timeout on port {}", port);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it a log?
Shouldn't we throw an exception? or use an assertion to prevent the test from going further?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants