AlexJones0 commented Feb 9, 2026

This PR introduces a large number of tests specifically targeting the scheduler. The primary goals are to:

  1. Provide a better understanding of the existing features of the scheduler.
  2. Show a clearer picture of what is and isn't working within the scheduler, so we know what may need to be fixed.
  3. Enable refactoring and/or rewriting of the scheduler with reduced chances of introducing unintentional changes or regressions in functionality.

To those ends, this PR:

  1. Creates mocks of the scheduled jobs / Launcher class, allowing us to test the scheduler's behaviour by querying the mocked data.
  2. Introduces additional test dev dependencies and project/CI config to allow better testing of the scheduler.
  3. Adds a wide variety of tests covering: smoke tests, job launching / polling, job weighting / priority, parallelism, dependency resolution, error cases, edge cases, etc. as well as tests for the scheduler's signal handlers.

See the commit messages and comments for more information. The intention is to address issues in follow up PRs (either with bug fixes or complete refactors/rewrites), rather than overloading this PR with even more content.

AlexJones0 force-pushed the scheduler_tests branch 3 times, most recently from 121a336 to 5bccb81 on February 9, 2026
Add mock implementations of the DVSim launcher that can be used to test
the scheduler implementation. This provides functionality for mocking
each job (by name, despite the scheduler only being provided the class),
with the ability to change the reported status, vary the reported status
over a number of polls, emulate launcher errors and launcher busy
errors, and emulate a job that takes some time to be killed. We can also
track the number of `launch()`, `poll()` and `kill()` calls on a per-job
basis.

The mock launcher itself maintains a central context which can be used
to control the job configuration, but also track the maximum number of
concurrent jobs, as well as the order in which jobs started running and
were completed. These features are useful for writing a variety of unit
tests for the scheduler via its public APIs, while remaining opaque to
the scheduler's own operation.

Also define some fixtures that will be commonly used across the
different scheduler tests.
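
As a rough sketch of this pattern (all names here, such as
`MockLauncherContext` and `mock_launcher_ctx`, are illustrative rather
than the actual identifiers introduced by this commit):

  import pytest

  class MockLauncherContext:
      """Central shared state used to configure and observe mock jobs."""
      def __init__(self):
          self.statuses = {}           # job name -> statuses to report, in order
          self.launch_counts = {}      # job name -> number of launch() calls
          self.poll_counts = {}        # job name -> number of poll() calls
          self.running = set()         # names of currently-running jobs
          self.max_concurrent = 0      # high-water mark of concurrent jobs
          self.completion_order = []   # job names, in order of completion

  class MockLauncher:
      """Stand-in for the DVSim launcher; behaviour is keyed by job name."""
      def __init__(self, job, ctx):
          self.job, self.ctx = job, ctx

      def launch(self):
          name = self.job.name
          self.ctx.launch_counts[name] = self.ctx.launch_counts.get(name, 0) + 1
          self.ctx.running.add(name)
          self.ctx.max_concurrent = max(self.ctx.max_concurrent,
                                        len(self.ctx.running))

      def poll(self):
          name = self.job.name
          self.ctx.poll_counts[name] = self.ctx.poll_counts.get(name, 0) + 1
          seq = self.ctx.statuses.get(name, ["P"])
          status = seq.pop(0) if len(seq) > 1 else seq[0]
          if status in ("P", "F", "K"):  # a terminal status: passed/failed/killed
              self.ctx.running.discard(name)
              self.ctx.completion_order.append(name)
          return status

  @pytest.fixture
  def mock_launcher_ctx():
      return MockLauncherContext()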

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Add 3 initial tests for the scheduler which act as basic smoke tests to
ensure that the scheduler at least appears to work on a basic level.

The tests check that the scheduler can handle being given no jobs, 1
job, and 5 jobs, where the jobs are just basic mock jobs.

To help define these tests (and the creation of future tests), a variety
of factories and helper functions / utilities are introduced for common
test patterns, including the ability to define a single job spec or
multiple job specifications that vary in some pre-determined way, and to
create paths for tests where the scheduler logic makes use of file I/O
operations and therefore expects output paths to exist.
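
For illustration, a factory along these lines captures both patterns
(the name `make_job_specs` and the spec fields are hypothetical, not
the actual helpers added by this commit):

  from pathlib import Path

  def make_job_specs(count, name_fmt="job{}", weight=1, odir=None):
      """Illustrative factory: build `count` job specs varying only by name."""
      specs = []
      for i in range(count):
          name = name_fmt.format(i)
          out = Path(odir) / name if odir is not None else None
          if out is not None:
              # The scheduler's file I/O expects output paths to already exist.
              out.mkdir(parents=True, exist_ok=True)
          specs.append({"name": name, "weight": weight, "odir": out})
      return specs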

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Add the `--strict` flag to the Python tests in CI.

This flag makes pytest run in strict mode, which does a few
useful things for us. Importantly:
* For tests that are expected to fail (due to known failures being
  addressed in the future), it will error if the test unexpectedly
  passes and is not correspondingly marked strict=False. This lets us
  still support flaky tests but ensures that test xfail markers are kept
  up-to-date with the code itself.
* For any markers which pytest doesn't recognize / hasn't been informed
  about, it will explicitly error.
* For any unknown command-line options, pytest will error.

This lets us catch typos and stale configuration, and ensures that the
tests' markers stay in sync with the state of the code itself.
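
For illustration, a known-failing test might be marked like this (the
test name and reason here are invented):

  import pytest

  # Under strict xfail semantics, an unexpected pass is an error;
  # strict=False opts an individual (e.g. flaky) test out of that check.
  @pytest.mark.xfail(reason="known scheduler hang, to be fixed in a follow-up",
                     strict=False)
  def test_dependency_cycle_is_rejected():
      ...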

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
For changes to the scheduler, which is a core part of DVSim, we write
tests beforehand to be sure that we are not breaking core functionality.
Some of these test cases find issues where DVSim cannot handle its given
inputs, but we do not want to make changes to fix DVSim at this stage.

In such cases, where tests may get caught in infinite loops, it makes
sense to have the ability to specify a test timeout. This can already be
done as a pytest option, but we want to enable a default timeout so
that anyone running the test suite without knowledge of these issues
does not get stuck, and can still see the "expected fail" (xfail)
result rather than having to skip the test.

The `pytest-timeout` dependency lets us mark individual tests with
timeout values so that we can do this.
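
For example, a test for a known hang might look roughly like this (the
test body is a stand-in, not an actual DVSim test); a suite-wide
default can also be configured via pytest-timeout's `timeout` ini
option:

  import time

  import pytest

  @pytest.mark.timeout(5)  # pytest-timeout: abort this test after 5 seconds
  @pytest.mark.xfail(reason="scheduler currently loops forever on this input")
  def test_known_infinite_loop():
      while True:          # stand-in for a scheduler call that never returns
          time.sleep(0.1)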

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
This commit introduces many more tests (roughly 25 unique test cases)
aimed at covering a variety of different aspects and functionalities of
DVSim's scheduler, particularly relating to:
- How jobs are dispatched in parallel, and whether `max_parallel` is
  respected.
- Checking that jobs are repeatedly polled until they are completed,
  and then not polled further (see the sketch after this list).
- Different launcher error / launcher busy error cases, and checking
  that launcher errors and failed jobs appropriately propagate to
  subsequent / dependent jobs.
- How job dependencies are handled across and within targets.
- Checking that jobs are prioritised according to their (target's)
  assigned weighting, and that high-weight jobs with unfulfilled
  dependencies can't starve low-weight jobs with fulfilled
  dependencies.
- Various edge cases (needs / does not need all passing with no
  dependencies, dependency cycles, zero weight sum, etc.).
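
Building on the earlier hypothetical `MockLauncher` sketch, the polling
behaviour can, for example, be asserted against the mock context
roughly as follows (a real test would drive the actual Scheduler rather
than poking the mock directly):

  def test_poll_sequence_is_recorded():
      ctx = MockLauncherContext()
      ctx.statuses["job0"] = ["D", "D", "P"]     # dispatched twice, then passes
      job = type("Job", (), {"name": "job0"})()  # minimal stand-in job object
      launcher = MockLauncher(job, ctx)
      launcher.launch()
      assert [launcher.poll() for _ in range(3)] == ["D", "D", "P"]
      assert ctx.poll_counts["job0"] == 3
      assert ctx.completion_order == ["job0"]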


Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
These packages are added as optional `test` dependencies to aid in the
development of tests. `pytest-repeat` is a pytest plugin that makes it
easy to repeat tests a number of times. This can be useful for
checking for test flakiness and non-idempotency. `pytest-xdist` is a
plugin for pytest to allow you to run tests in parallel, distributing
tests across multiple CPUs to speed up execution.

The combination of these two plugins lets us run many iterations of
tests in parallel to quickly catch potential issues with flaky behaviour
in tests. Consider running e.g.
  pytest -n auto --count=10 --timeout=5

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Add tests for the Scheduler's signal handler, which allows you to quit
DVSim while the scheduler is running. We test a few different things
using different parameters:
* Test that sending either SIGTERM or SIGINT causes DVSim to gracefully
  exit, killing ongoing (or to-be-dispatched) processes.
* Test that sending SIGINT twice will call the original installed SIGINT
  handler, i.e. it will cause DVSim to instantly die instead.
* Test the above cases with both short and long polls, where long polls
  mean that the scheduler will enter a sleep/wait for ~100 hours between
  polls. As long as this does not time out, this tests that the scheduler
  is correctly woken from its sleep/wait by the signal and does not have
  to wait for a `poll_freq` duration to handle the signal.

Note the currently marked expected failure: from local testing during
development, the use of a `threading.Event` (and perhaps logging) in the
signal handler is not async-signal-safe and therefore, especially when
configured with a poll frequency of 0 (i.e. poll as fast as possible),
we sometimes see tests enter deadlocks and thus fail a small percentage
of the time.
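
As a rough sketch of the separate-process pattern these tests rely on
(the child script here is a trivial stand-in for actually running
DVSim's scheduler):

  import signal
  import subprocess
  import sys
  import time

  def test_sigterm_exits_gracefully(tmp_path):
      # Stand-in child: installs a SIGTERM handler that exits cleanly,
      # which is the behaviour we expect from the scheduler's handler.
      script = tmp_path / "child.py"
      script.write_text(
          "import signal, sys, time\n"
          "signal.signal(signal.SIGTERM, lambda *a: sys.exit(0))\n"
          "time.sleep(60)\n"
      )
      proc = subprocess.Popen([sys.executable, str(script)])
      time.sleep(0.5)  # crude: give the child time to install its handler
      proc.send_signal(signal.SIGTERM)
      assert proc.wait(timeout=10) == 0  # clean exit, not killed by signal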

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
The signal handler tests do not count towards coverage because they run
in a separate process (to avoid the signals being intercepted by
`pytest` instead). We can still capture this coverage, however, by
configuring pytest-cov to expect tests that execute in multiple
processes via `multiprocessing`.
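
For reference, the relevant coverage.py settings look roughly like this
(shown as a hypothetical `.coveragerc`; the same keys can also live
under `[tool.coverage.run]` in `pyproject.toml`):

  [run]
  concurrency = multiprocessing
  parallel = true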

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>