Add many scheduler unit & integration tests #93
Open: AlexJones0 wants to merge 8 commits into lowRISC:master from AlexJones0:scheduler_tests
+1,171 −1
Conversation
Force-pushed from 121a336 to 5bccb81
Add mock implementations of the DVSim launcher that can be used to test the scheduler implementation. These provide functionality for mocking each job by name (despite the scheduler only being provided the class), with the ability to change the reported status, vary the reported status over a number of polls, emulate launcher errors and launcher-busy errors, and emulate a job that takes some time to be killed. We can also track the number of `launch()`, `poll()` and `kill()` calls on a per-job basis.

The mock launcher itself maintains a central context which can be used to control the job configuration, and also to track the maximum number of concurrent jobs and the order in which jobs started running and completed. These features are useful for writing a variety of unit tests for the scheduler through its public APIs, without depending on the scheduler's internal operation.

Also define some fixtures that will be commonly used across the different scheduler tests.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
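The mock launcher described above can be sketched roughly as follows. This is an illustrative model only: the class and status names (`MockLauncher`, `"D"`/`"P"`/`"F"`, etc.) are assumptions, not DVSim's actual API.

```python
# Hypothetical sketch of a mock launcher with a shared context; names and
# status codes are illustrative, not DVSim's real interface.
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MockJobConfig:
    """Per-job behaviour, keyed by job name in the shared context."""
    # Statuses reported on successive polls, e.g. dispatched then passed.
    statuses: list = field(default_factory=lambda: ["D", "P"])


class MockLauncherContext:
    """Central context shared by all mock launchers in one test."""

    def __init__(self):
        self.configs = {}                       # job name -> MockJobConfig
        self.launch_calls = defaultdict(int)    # per-job call counters
        self.poll_calls = defaultdict(int)
        self.kill_calls = defaultdict(int)
        self.running = set()
        self.max_concurrent = 0
        self.completion_order = []


class MockLauncher:
    def __init__(self, job_name, ctx):
        self.job_name = job_name
        self.ctx = ctx
        cfg = ctx.configs.get(job_name, MockJobConfig())
        self._statuses = iter(cfg.statuses)
        self._last = None

    def launch(self):
        self.ctx.launch_calls[self.job_name] += 1
        self.ctx.running.add(self.job_name)
        self.ctx.max_concurrent = max(self.ctx.max_concurrent,
                                      len(self.ctx.running))

    def poll(self):
        # Report the next configured status; repeat the last one when the
        # configured sequence is exhausted.
        self.ctx.poll_calls[self.job_name] += 1
        self._last = next(self._statuses, self._last)
        if self._last in ("P", "F"):  # terminal: passed / failed
            self.ctx.running.discard(self.job_name)
            if self.job_name not in self.ctx.completion_order:
                self.ctx.completion_order.append(self.job_name)
        return self._last

    def kill(self):
        self.ctx.kill_calls[self.job_name] += 1
        self.ctx.running.discard(self.job_name)
```

A test can then configure each job's status sequence up front and assert on the recorded call counts and completion order afterwards.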
Add 3 initial tests for the scheduler which act as basic smoke tests to ensure that the scheduler at least appears to work on a basic level. The tests check that the scheduler can handle being given no jobs, 1 job, and 5 jobs, where the jobs are just basic mock jobs.

To help define these tests (and future tests), a variety of factories and helper functions / utilities are introduced for common test patterns, including the ability to define a single job spec or multiple job specifications that vary in some pre-determined way, and to create paths for tests where the scheduler logic makes use of file I/O operations and therefore expects output paths to exist.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
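The kind of factory helpers described above might look like the following sketch. The function names and the shape of a "job spec" are assumptions for illustration; the PR's actual helpers may differ.

```python
# Hypothetical test-helper sketches; the helper names and the dict-based
# job spec shape are assumptions, not the PR's actual code.
import tempfile
from pathlib import Path


def make_job_specs(count, name_fmt="job_{}", **overrides):
    """Create `count` job specs that vary only in name (plus overrides)."""
    return [{"name": name_fmt.format(i), "weight": 1, **overrides}
            for i in range(count)]


def make_output_dir(root=None):
    """Create an output directory so scheduler file I/O paths exist."""
    root = Path(root) if root is not None else Path(tempfile.mkdtemp())
    root.mkdir(parents=True, exist_ok=True)
    return root
```

A smoke test can then build its 5 mock jobs with `make_job_specs(5)` and point the scheduler's output at `make_output_dir()`.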
Add the `--strict` flag to the Python tests in CI. This flag makes pytest run in strict mode, which does a few useful things for us. Importantly:

* For tests that are expected to fail (due to known failures being addressed in the future), it will error if a test unexpectedly passes and is not correspondingly marked `strict=False`. This lets us still support flaky tests while ensuring that `xfail` markers are kept up to date with the code itself.
* For any markers which pytest doesn't recognize / hasn't been informed about, it will explicitly error.
* For any unknown command-line options, pytest will error.

This lets us catch typos / stale configuration, and keeps the code more in sync with the statuses of the tests.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
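The strict/non-strict `xfail` distinction above can be illustrated with a minimal sketch (the test names and reasons are made up for illustration):

```python
# Illustrative xfail markers; test names and reasons are hypothetical.
import pytest


@pytest.mark.xfail(strict=True, reason="known scheduler bug, fix planned")
def test_known_failure():
    # Under strict xfail, this test *passing* is reported as an error,
    # so the marker cannot silently go stale.
    assert False


@pytest.mark.xfail(strict=False, reason="flaky under load")
def test_flaky():
    # Non-strict xfail: an unexpected pass is reported as XPASS, not an
    # error, which is the right behaviour for genuinely flaky tests.
    assert True
```

With pytest's strict behaviour enabled globally, only tests explicitly marked `strict=False` keep the lenient XPASS treatment.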
For changes to the scheduler, which is a core part of DVSim, we write tests beforehand to be sure that we are not breaking core functionality. Some of these test cases find issues where DVSim cannot handle its given inputs, but we do not want to make changes to fix DVSim at this stage. In such cases, where tests may get caught in infinite loops, it makes sense to be able to specify a test timeout.

This can already be done as a pytest option, but we want to enable a default timeout so that anyone running the test suite without knowledge of these issues does not get stuck, and can still see the "expected fail" (xfail) result rather than the test being skipped. The `pytest-timeout` dependency lets us mark individual tests with timeout values so that we can do this.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
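A default timeout plus per-test overrides might be configured along these lines; the concrete values here are assumptions, not the project's actual settings. `timeout` is pytest-timeout's documented ini option, and `@pytest.mark.timeout(...)` overrides it per test:

```ini
# Hypothetical pytest.ini fragment (the same keys work under
# [tool.pytest.ini_options] in pyproject.toml).
[pytest]
# pytest-timeout: fail any test still running after 60 s by default.
timeout = 60
```

An individual test that is known to hang can then be tightened with `@pytest.mark.timeout(5)` so its xfail result appears quickly.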
This commit introduces many more tests (approx. 25 unique test cases) aimed at covering a variety of different aspects and functionalities of DVSim's scheduler, particularly:

- How jobs are dispatched in parallel, and whether `max_parallel` is respected.
- Checks that jobs are repeatedly polled until they are completed, and not polled further afterwards.
- Tests for different launcher error / launcher-busy error cases, and that launcher errors and failed jobs appropriately propagate to subsequent / dependent jobs.
- Various tests for job dependencies and how they are handled across and within targets.
- Tests that jobs are prioritised according to their (target's) assigned weighting, and that high-weight jobs with unfulfilled dependencies can't starve low-weight jobs with fulfilled dependencies.
- Checks for various edge cases (needs / does not need all passing with no dependencies, dependency cycles, zero weight sum, etc.).

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
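The failure-propagation property listed above can be modelled with a small toy function: once a job fails, every transitive dependent must be cancelled rather than dispatched. This is a sketch of the property under test, not DVSim's implementation.

```python
# Toy model of failure propagation through a dependency graph; this
# models the property being tested, not DVSim's scheduler code.
def propagate_failures(jobs, deps, failed):
    """Return the set of jobs that must not run, given failed jobs.

    `deps` maps each job to the list of jobs it depends on.
    """
    cancelled = set(failed)
    changed = True
    while changed:  # iterate to a fixed point over the graph
        changed = False
        for job in jobs:
            if job not in cancelled and any(
                    d in cancelled for d in deps.get(job, [])):
                cancelled.add(job)
                changed = True
    # Report only the jobs cancelled *because of* the failures.
    return cancelled - set(failed)
```

A dependency test then asserts that, with `a -> b -> c` and `a` failing, the mock launchers for `b` and `c` never see a `launch()` call.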
These packages are added as optional `test` dependencies to aid in the development of tests.

`pytest-repeat` is a plugin for pytest that makes it easy to repeat tests a number of times. This can be useful for checking for test flakiness and non-idempotency. `pytest-xdist` is a plugin for pytest that allows you to run tests in parallel, distributing tests across multiple CPUs to speed up execution. The combination of these two plugins lets us run many iterations of tests in parallel to quickly catch potential flaky behaviour in tests. Consider running e.g. `pytest -n auto --count=10 --timeout=5`.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
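Declared as an optional extra, these plugins would appear in project metadata roughly as follows; the extra name `test` matches the commit message, but the exact pins and file layout are assumptions:

```toml
# Hypothetical pyproject.toml fragment; pins and layout are assumptions.
[project.optional-dependencies]
test = [
    "pytest",
    "pytest-repeat",   # --count=N: repeat each test to surface flakiness
    "pytest-xdist",    # -n auto: distribute tests across CPUs
    "pytest-timeout",  # --timeout=S / default timeout for hanging tests
]
```

Installing with `pip install .[test]` then makes the `pytest -n auto --count=10 --timeout=5` invocation above available.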
Add tests for the Scheduler's signal handler, which allows you to quit DVSim while the scheduler is running. We test a few different things using different parameters:

* Test that sending either SIGTERM or SIGINT causes DVSim to gracefully exit, killing ongoing (or to-be-dispatched) processes.
* Test that sending SIGINT twice will call the originally installed SIGINT handler, i.e. it will cause DVSim to die instantly instead.
* Test the above cases with both short and long polls, where long polls mean that the scheduler will enter a sleep/wait of ~100 hours between polls. As long as this does not time out, it shows that the scheduler is correctly woken from its sleep/wait by the signal and does not have to wait for a `poll_freq` duration to handle the signal.

Note the current marked expected failure: from local testing during development, the use of a `threading.Event` (and perhaps logging) in the signal handler is not async-signal-safe. Therefore, especially when configured with a poll frequency of 0 (i.e. poll as fast as possible), we sometimes see tests enter deadlocks and thus fail a small percentage of the time.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
The signal handler tests do not count towards coverage because they run in a separate process (to avoid signals being intercepted by `pytest` instead). We can still capture this coverage, however, by configuring pytest-cov to account for tests that execute in multiple processes via `multiprocessing`.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
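In coverage.py's configuration this corresponds to the documented `concurrency` and `parallel` settings; the fragment below is a sketch, and the project's actual coverage config may differ:

```ini
# Hypothetical .coveragerc fragment using coverage.py's documented options.
[run]
# Write one data file per process, then combine them.
parallel = True
# Also measure code running in multiprocessing child processes.
concurrency = multiprocessing
```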
Force-pushed from 5bccb81 to a7e172f
This PR introduces a large number of tests specifically targeting the scheduler. To that end, it:

* Adds a mock `Launcher` class to allow us to test the behaviour of the scheduler by querying the mocked data.
* Adds `test` dev dependencies and project/CI config to allow better testing of the scheduler.

See the commit messages and comments for more information. The intention is to address discovered issues in follow-up PRs (either with bug fixes or complete refactors/rewrites), rather than overloading this PR with even more content.