Before running any tests locally, make sure you have run pip install -r requirements-dev.txt in your environment.
Unit tests are designed to test specific Eval components or features in isolation. In general, new code should add or modify unit tests.
All unit tests currently live in the tests/ directory and are run with pytest via tox.
To run the unit tests, run tox -e unit, or tox -e unitcov if you also want to generate coverage metrics.
In CI, the tests are run on Ubuntu and macOS runners - you can see the details here
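As a rough illustration of what testing a component in isolation can look like, here is a minimal pytest-style sketch. The `exact_match_score` function and both tests are hypothetical and are not part of the Eval codebase; they only show the shape of a small test with no external dependencies.

```python
# Hypothetical example: none of these names exist in InstructLab Eval.


def exact_match_score(predictions: list[str], references: list[str]) -> float:
    """Return the fraction of predictions that equal their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Compare pairwise, ignoring leading and trailing whitespace.
    matches = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return matches / len(references)


def test_exact_match_score_counts_matches():
    # Two of the three predictions match once whitespace is stripped.
    score = exact_match_score(["a", " b", "x"], ["a", "b", "c"])
    assert score == 2 / 3


def test_exact_match_score_handles_empty_input():
    assert exact_match_score([], []) == 0.0
```

Placed in a test_*.py file under tests/, functions like these would normally be collected by pytest, assuming the tox environment relies on pytest's default discovery rules.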
Functional tests are designed to test Eval components or features in tandem, but not necessarily as part of a complex workflow. New code does not always need a functional test, but contributors should add one where possible.
The functional test script is Shell-based and can be found at scripts/functional-tests.sh.
To run the functional tests, you can run tox -e functional.
In CI, the tests are run on Ubuntu and macOS runners - you can see the details here
InstructLab Eval has several end-to-end jobs that run to ensure compatibility with the InstructLab Core project. You can see details about the types of jobs being run in the matrix below.
For more details about the E2E scripts themselves, see the InstructLab Core documentation.
| Name | T-Shirt Size | Runner Host | Instance Type | OS | GPU Type | Script | Flags | Runs when? | Slack/Discord reporting? |
|---|---|---|---|---|---|---|---|---|---|
| e2e-nvidia-l4-x1.yml | Medium | AWS | g6.8xlarge | CentOS Stream 9 | 1 x NVIDIA L4 w/ 24 GB VRAM | e2e-ci.sh | m | Pull Requests, Push to main or release-* branch | No |
| e2e-nvidia-l40s-x4.yml | Large | AWS | g6e.12xlarge | CentOS Stream 9 | 4 x NVIDIA L40S w/ 48 GB VRAM (192 GB) | e2e-ci.sh | l | Manually by Maintainers, Automatically against main branch at 4PM UTC | Yes |
Some E2E jobs send their results to the channel #e2e-ci-results via the Son of Jeeves bot in both Discord and Slack. You can see which jobs currently have reporting via the "Current E2E Jobs" table above.
In Slack, this has been implemented via the official Slack GitHub Action. In Discord, we use actions/actions-status-discord and the built-in channel webhooks feature.
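To give a sense of what a channel webhook message looks like, here is a minimal sketch that builds, but deliberately does not send, a Discord webhook request. It is not how the CI jobs report results (they use the GitHub Actions named above); the function name, message wording, and placeholder URL are all assumptions made for illustration.

```python
import json
import urllib.request


def build_discord_message(webhook_url: str, job_name: str, status: str) -> urllib.request.Request:
    """Build (but do not send) a webhook request announcing a job result."""
    # Discord webhooks accept a JSON body whose "content" field is the message text.
    payload = {"content": f"E2E job {job_name} finished with status: {status}"}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder URL: a real one comes from the channel's webhook settings.
message = build_discord_message(
    "https://discord.com/api/webhooks/WEBHOOK_ID/WEBHOOK_TOKEN",
    job_name="e2e-nvidia-l40s-x4.yml",
    status="success",
)
print(message.get_method(), message.data.decode("utf-8"))
```

Sending is left out on purpose, so the sketch can be run without a real webhook or network access.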
E2E jobs that can be launched manually take an input field that specifies the PR number or git branch to run against. If you run a job against a PR, it automatically posts a comment to the PR when the tests begin and end, so it's easier for those involved in the PR to follow the results.
- Visit the Actions tab.
- Click on one of the E2E workflows on the left side of the page.
- Click on the Run workflow button on the right side of the page.
- Enter a branch name or a PR number in the input field.
- Click the green Run workflow button.
Note
Only users with "Write" permissions to the repo can run CI jobs manually
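The same manual launch can, in principle, be scripted through GitHub's REST API for `workflow_dispatch` events. The sketch below only builds the request and does not send it. The `OWNER` and `REPO` placeholders, the input name `pr_or_branch`, and the token value are assumptions; check the workflow file for the actual input name before using anything like this.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_dispatch_request(owner, repo, workflow_file, ref, inputs, token):
    """Build (but do not send) a workflow_dispatch request for one workflow."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/actions/workflows/{workflow_file}/dispatches"
    # "ref" selects the branch holding the workflow file; "inputs" carries its input fields.
    body = json.dumps({"ref": ref, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Assumed values: OWNER, REPO, the "pr_or_branch" input name, and the token
# are placeholders, not taken from the project's workflow definitions.
request = build_dispatch_request(
    owner="OWNER",
    repo="REPO",
    workflow_file="e2e-nvidia-l40s-x4.yml",
    ref="main",
    inputs={"pr_or_branch": "main"},
    token="<your-token>",
)
print(request.get_method(), request.full_url)
# To actually trigger the run: urllib.request.urlopen(request)
```

As with the web UI, the token would need to belong to a user with "Write" permissions to the repo.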