Merged
15 changes: 15 additions & 0 deletions .claude/commands/red.md
@@ -94,6 +94,21 @@ This phase is **not part of the regular TDD workflow** and must only be applied
- Once sufficient understanding is achieved, all spike code is discarded, and normal TDD resumes starting from the **Red Phase**.
- A Spike is justified only when it is impossible to define a meaningful failing test due to technical uncertainty or unknown system behavior.

### If a New Test Passes Immediately

If a newly written test passes without any implementation change, do not assume it is correct. Verify it actually exercises the intended behavior:

1. Identify the implementation line most likely responsible for the pass
2. Temporarily remove that line
3. Run the **full test suite** (not just the new test)

Then interpret the result:

- **Only the new test fails** — the line was never driven by a prior test. This is accidental over-implementation: delete the line permanently and proceed to the green phase to reintroduce it properly.
- **Other existing tests also fail** — the line was already legitimately required by prior work. The new test is valid regression coverage. Restore the line; the test is confirmed correct as written.

In both cases, confirm the new test fails for the expected reason before proceeding (the right assertion, not a syntax or import error).
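The interpretation step above can be sketched as a small decision function. This is only an illustration: the names `Verdict` and `interpret_removal` are hypothetical, and the `INCONCLUSIVE` outcome (the new test still passes after the line is removed) is an assumption that the text above does not cover.

```python
from enum import Enum


class Verdict(Enum):
    OVER_IMPLEMENTATION = "delete the line permanently, then re-drive it in the green phase"
    REGRESSION_COVERAGE = "restore the line; the new test is valid as written"
    # Assumed extra outcome, not described in the text above.
    INCONCLUSIVE = "the new test still passes; the removed line is not what it exercises"


def interpret_removal(new_test: str, failed_tests: set[str]) -> Verdict:
    """Classify a full-suite run made after temporarily removing one implementation line."""
    if new_test not in failed_tests:
        return Verdict.INCONCLUSIVE
    if failed_tests == {new_test}:
        # Only the new test failed: no prior test ever drove this line.
        return Verdict.OVER_IMPLEMENTATION
    # Other tests failed too: the line was already required by prior work.
    return Verdict.REGRESSION_COVERAGE
```

In either of the two documented outcomes, the failure message of the new test still needs a manual look to confirm it is the intended assertion and not an import or syntax error.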

### General Information

- Sometimes the test output reports that no tests were run when a new test fails because of a missing import or constructor. In such cases, allow the agent to create simple stubs. If the agent is stuck, ask whether it forgot to create a stub.
21 changes: 20 additions & 1 deletion .claude/settings/permissions/bash.jsonc
@@ -74,9 +74,17 @@
"Bash(tail *)",
// Search
"Bash(rg *)",
// Research
"Bash(gh issue list *)",
"Bash(gh pr view *)",
"Bash(gh pr diff *)"
],
"ask": [
"Bash(gh *)", // let's hold off before we let it use the github CLI in any free running allow mode...I don't want it somehow approving PRs with the user's credentials
// let's hold off before we let it use the github CLI in any free running allow mode...I don't want it somehow approving PRs with the user's credentials
"Bash(gh repo *)",
"Bash(gh release *)",
"Bash(gh secret *)",
"Bash(gh ruleset *)",
"Bash(aws *)", // let's hold off before we let it use AWS CLI in any free running allow mode. We need to be very sure we don't have any access to staging or production credentials in our dev environment (...which we shouldn't...but we need to double check that or consider any other safeguards first)
"Bash(curl *)",
"Bash(ln *)",
@@ -85,6 +93,17 @@
"deny": [
// Exceptions to generally allowed AI tooling
"Bash(bd init*)", // we need to control the init process, don't let AI do that in the background
// Github
// Claude should not ever interfere with the PR process, that is how we gate AI's work
"Bash(gh pr create *)",
"Bash(gh pr edit *)",
"Bash(gh pr ready *)",
"Bash(gh pr review *)",
"Bash(gh pr merge *)",
"Bash(gh pr close *)",
"Bash(gh pr comment *)",
"Bash(gh pr update-branch *)",

// Destructive File Operations
"Bash(chmod -R *)",
"Bash(chown -R *)",
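To see how the new rules partition `gh` commands, here is a rough sketch that classifies a command against abbreviated copies of the lists in this hunk. It assumes a deny, then ask, then allow precedence with first match winning, and it approximates the pattern syntax with Python's `fnmatch`; the real permission matcher may behave differently, and `classify` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Abbreviated rule lists from the hunk above, with the "Bash(...)" wrapper stripped.
DENY = [
    "bd init*",
    "gh pr create *", "gh pr edit *", "gh pr ready *", "gh pr review *",
    "gh pr merge *", "gh pr close *", "gh pr comment *", "gh pr update-branch *",
]
ASK = ["gh repo *", "gh release *", "gh secret *", "gh ruleset *", "aws *", "curl *", "ln *"]
ALLOW = ["tail *", "rg *", "gh issue list *", "gh pr view *", "gh pr diff *"]


def classify(command: str) -> str:
    """Return the first matching bucket; the precedence order here is an assumption."""
    for verdict, patterns in (("deny", DENY), ("ask", ASK), ("allow", ALLOW)):
        if any(fnmatchcase(command, pattern) for pattern in patterns):
            return verdict
    return "unmatched"


print(classify("gh pr merge 42"))
print(classify("gh pr view 42"))
```

Under these assumptions the read-only `gh pr view` and `gh pr diff` commands run freely, while anything that changes a pull request is refused outright.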
2 changes: 2 additions & 0 deletions .coderabbit.yaml
@@ -7,6 +7,8 @@ reviews:
instructions: "These files came from a vendor and we're not allowed to change them. Refer to it if you need to understand how the main code interacts with it, but do not make comments about it."
- path: "**/*.py"
instructions: "Check the `ruff.toml` and `ruff-test.toml` for linting rules we've explicitly disabled and don't suggest changes to please conventions we've disabled. Do not express concerns about ruff rules; a pre-commit hook already runs a ruff check. Do not warn about unnecessary super().__init__() calls; pyright prefers those to be present. Do not warn about missing type hints; a pre-commit hook already checks for that."
- path: "**/.copier-answers.yml"
instructions: "Do not comment about the `_commit` value needing to be a clean release tag. A CI job will fail if that is not the case."
tools:
eslint: # when the code contains typescript, eslint will be run by pre-commit, and coderabbit often generates false positives
enabled: false
2 changes: 1 addition & 1 deletion .copier-answers.yml
@@ -1,5 +1,5 @@
# Changes here will be overwritten by Copier
_commit: v0.0.107
_commit: v0.0.110
_src_path: gh:LabAutomationAndScreening/copier-base-template.git
description: A web app that is hosted within a local intranet. Nuxt frontend, python
backend, docker-compose
2 changes: 1 addition & 1 deletion .devcontainer/devcontainer.json
@@ -65,5 +65,5 @@
"initializeCommand": "sh .devcontainer/initialize-command.sh",
"onCreateCommand": "sh .devcontainer/on-create-command.sh",
"postStartCommand": "sh .devcontainer/post-start-command.sh"
// Devcontainer context hash (do not manually edit this, it's managed by a pre-commit hook): 80d9f36a # spellchecker:disable-line
// Devcontainer context hash (do not manually edit this, it's managed by a pre-commit hook): 69f80248 # spellchecker:disable-line
}
23 changes: 23 additions & 0 deletions .devcontainer/manual-setup-deps.py
@@ -11,6 +11,7 @@

REPO_ROOT_DIR = Path(__file__).parent.parent.resolve()
ENVS_CONFIG = REPO_ROOT_DIR / ".devcontainer" / "envs.json"
PULUMI_CLI_INSTALL_SCRIPT = REPO_ROOT_DIR / ".devcontainer" / "install-pulumi-cli.sh"
UV_PYTHON_ALREADY_CONFIGURED = "UV_PYTHON" in os.environ
parser = argparse.ArgumentParser(description="Manual setup for dependencies in the repo")
_ = parser.add_argument(
@@ -44,6 +45,12 @@
    default=False,
    help="Allow uv to install new versions of Python on the fly. This is typically only needed when instantiating the copier template.",
)
_ = parser.add_argument(
    "--skip-installing-pulumi-cli",
    action="store_true",
    default=False,
    help="Do not install the Pulumi CLI even if the lock file references it",
)


class PackageManager(str, enum.Enum):
@@ -127,6 +134,22 @@ def main():
        check=True,
        env=uv_env,
    )
    if (
        not generate_lock_file_only
        and not args.skip_installing_pulumi_cli
        and platform.system() == "Linux"
        and env.lock_file.exists()
        and '"pulumi"' in env.lock_file.read_text()
    ):
        if not PULUMI_CLI_INSTALL_SCRIPT.exists():
            print(
                f"Pulumi CLI install script not found at {PULUMI_CLI_INSTALL_SCRIPT}, skipping Pulumi CLI installation"
            )
        else:
            _ = subprocess.run(
                ["sh", str(PULUMI_CLI_INSTALL_SCRIPT), str(env.lock_file)],
                check=True,
            )
elif env.package_manager == PackageManager.PNPM:
    pnpm_command = ["pnpm", "install", "--dir", str(env.path)]
    if env_check_lock:
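The guard added in this hunk combines five conditions. As a reading aid, the sketch below restates it as a pure function; `should_install_pulumi_cli` and its parameters are hypothetical names, and `lock_file_text=None` stands in for a lock file that does not exist.

```python
from __future__ import annotations


def should_install_pulumi_cli(
    *,
    generate_lock_file_only: bool,
    skip_installing_pulumi_cli: bool,
    system: str,
    lock_file_text: str | None,
) -> bool:
    """Mirror the guard from the diff; None means the lock file is missing."""
    return (
        not generate_lock_file_only
        and not skip_installing_pulumi_cli
        and system == "Linux"
        and lock_file_text is not None
        # The quoted substring matches `"pulumi"` but not `"pulumi-aws"`.
        and '"pulumi"' in lock_file_text
    )


print(
    should_install_pulumi_cli(
        generate_lock_file_only=False,
        skip_installing_pulumi_cli=False,
        system="Linux",
        lock_file_text='name = "pulumi"',
    )
)
```

One consequence of the quoted-substring check is that a lock file listing only provider packages such as `"pulumi-aws"` would not trigger the CLI install.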
7 changes: 6 additions & 1 deletion .github/actions/install_deps/action.yml
@@ -44,6 +44,11 @@ inputs:
    description: Whether to skip updating the hash when running manual-setup-deps.py
    default: true
    required: false
  skip-installing-pulumi-cli:
    type: boolean
    description: Whether to skip installing the Pulumi CLI even if the lock file references it
    default: false
    required: false


runs:
@@ -83,5 +88,5 @@ runs:
    - name: Install dependencies
      # the funky syntax is github action ternary
      if: ${{ inputs.install-deps }}
      run: python .devcontainer/manual-setup-deps.py ${{ inputs.python-version == 'notUsing' && '--no-python' || '' }} ${{ inputs.node-version == 'notUsing' && '--no-node' || '' }} ${{ inputs.skip-updating-devcontainer-hash && '--skip-updating-devcontainer-hash' || '' }}
      run: python .devcontainer/manual-setup-deps.py ${{ inputs.python-version == 'notUsing' && '--no-python' || '' }} ${{ inputs.node-version == 'notUsing' && '--no-node' || '' }} ${{ inputs.skip-updating-devcontainer-hash && '--skip-updating-devcontainer-hash' || '' }} ${{ inputs.skip-installing-pulumi-cli && '--skip-installing-pulumi-cli' || '' }}
      shell: pwsh
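The `cond && 'flag' || ''` idiom in the `run:` line can be mimicked in Python, whose `and`/`or` operators also return one of their operands. The sketch below is an emulation for illustration only; `ternary` and `build_command` are hypothetical helpers, not part of the action.

```python
def ternary(condition, when_true, when_false):
    """Evaluate like `${{ condition && when_true || when_false }}`."""
    return condition and when_true or when_false


def build_command(*, python_not_used: bool, node_not_used: bool, skip_hash: bool, skip_pulumi: bool) -> str:
    flags = [
        ternary(python_not_used, "--no-python", ""),
        ternary(node_not_used, "--no-node", ""),
        ternary(skip_hash, "--skip-updating-devcontainer-hash", ""),
        ternary(skip_pulumi, "--skip-installing-pulumi-cli", ""),
    ]
    # Unlike the workflow line, empty results are dropped here instead of leaving extra spaces.
    return " ".join(["python .devcontainer/manual-setup-deps.py", *[flag for flag in flags if flag]])


print(build_command(python_not_used=False, node_not_used=True, skip_hash=True, skip_pulumi=True))
```

The idiom only behaves like a ternary when the middle operand is truthy: `ternary(True, "", "fallback")` falls through to `"fallback"`. That is harmless here because every middle operand is a non-empty flag.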
2 changes: 2 additions & 0 deletions .github/workflows/ci.yaml
@@ -13,6 +13,7 @@ env:

permissions:
  id-token: write # needed to assume OIDC roles (e.g. for downloading from CodeArtifact)
  contents: read # need to explicitly provide this whenever defining permissions because the default value is 'none' for anything not explicitly set when permissions are defined

jobs:
  get-values:
@@ -22,6 +23,7 @@ jobs:

  check-skip-duplicate:
    runs-on: ubuntu-24.04
    timeout-minutes: 2
    outputs:
      should-run: ${{ steps.check.outputs.should-run }}
    steps:
1 change: 1 addition & 0 deletions .github/workflows/pre-commit.yaml
@@ -50,6 +50,7 @@ jobs:
python-version: ${{ inputs.python-version }}
node-version: ${{ inputs.node-version }}
skip-installing-ssm-plugin-manager: true
skip-installing-pulumi-cli: true

- name: Set up mutex # Github concurrency management is horrible, things get arbitrarily cancelled if queued up. So using mutex until github fixes itself. When multiple jobs are modifying cache at once, weird things can happen. possible issue is https://github.com/actions/toolkit/issues/658
if: ${{ runner.os != 'Windows' }} # we're just gonna have to YOLO on Windows, because this action doesn't support it yet https://github.com/ben-z/gh-action-mutex/issues/14
17 changes: 13 additions & 4 deletions AGENTS.md
@@ -17,24 +17,32 @@ This project is a Copier template used to generate applications that are able to

## Testing

- Always run tests with an explicit path (e.g. uv run pytest tests/unit) — test runners discover all types by default.
- Always run tests with an explicit path (e.g. uv run pytest tests/unit) — test runners discover all types (unit, integration, E2E...) by default.
- When iterating on a single test, run that test in isolation first and confirm it is in the expected state (red or green) before widening to the full suite. Use the most targeted invocation available: a specific test function for Python (e.g. `uv run pytest path/to/test.py::test_name --no-cov`) or a file path and name filter for TypeScript (e.g. `pnpm test-unit -- path/to/test.spec.ts -t "test name" --no-coverage`). Only run the full suite once the target test behaves as expected.
- Test coverage requirements are usually at 100%, so when running a subset of tests, always disable test coverage to avoid the test run failing for insufficient coverage.
- Avoid magic values in comparisons in tests in all languages (like ruff rule PLR2004 specifies)
- Prefer using random values in tests rather than arbitrary ones (e.g. the faker library, uuids, random.randint) when possible. For enums, pick randomly rather than hardcoding one value.
- Avoid loops in tests — assert each item explicitly so failures pinpoint the exact element. When verifying a condition across all items in a collection, collect the violations into a list and assert it's empty (e.g., assert [x for x in items if bad_condition(x)] == []).
- Key `data-testid` selectors off unique IDs (e.g. UUIDs), not human-readable names which may collide or change.
- When a test's final assertion is an absence (e.g., element is `null`, list is empty, modal is closed), include a prior presence assertion confirming the expected state existed before the action that removed it. A test whose only assertion is an absence check can pass vacuously if setup silently failed.
- When asserting a mock or spy was called with specific arguments, always constrain as tightly as possible. In order of preference: (1) assert called exactly once with those args (`assert_called_once_with` in Python, `toHaveBeenCalledExactlyOnceWith` in Vitest/Jest); (2) if multiple calls are expected, assert the total call count and use a positional or last-call assertion (`nthCalledWith`, `lastCalledWith` / `assert_has_calls` with `call_args_list[n]`); (3) plain "called with at any point" (`toHaveBeenCalledWith`, `assert_any_call`; note that Python's `assert_called_with` checks only the most recent call) is a last resort only when neither the call count nor the call order can reasonably be constrained.
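
The mock-assertion preference order in the last bullet might look like this with the standard library's `unittest.mock`. The `notifier` and `audit` collaborators are hypothetical, and the literal arguments are used only to keep the sketch short.

```python
from unittest.mock import Mock, call

# (1) Preferred: exactly one call, with exactly these arguments.
notifier = Mock()
notifier.send("alice", urgent=True)
notifier.send.assert_called_once_with("alice", urgent=True)

# (2) Several expected calls: pin the total count, then each position.
audit = Mock()
audit.record("created")
audit.record("updated")
assert audit.record.call_count == 2
assert audit.record.call_args_list[0] == call("created")
assert audit.record.call_args_list[1] == call("updated")

# (3) Last resort: true if any call matched, regardless of count or order.
audit.record.assert_any_call("created")
```

`assert_any_call` passes even if the mock was also called many other times, which is why it sits last in the order.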

### Python Testing

- When using `mocker.spy` on a class-level method (including inherited ones), the spy records the unbound call, so assertions need `ANY` as the first argument to match self: `spy.assert_called_once_with(ANY, expected_arg)`
- When using `mocker.spy` on a class-level method (including inherited ones), the spy records the unbound call, so assertions need `ANY` as the first argument to match self: `spy.assert_called_once_with(ANY, expected_arg)`
- Before writing new mock/spy helpers, check the `tests/unit/` folder for pre-built helpers in files like `fixtures.py` or `*mocks.py`
- When a test needs a fixture only for its side effects (not its return value), use `@pytest.mark.usefixtures(fixture_name.__name__)` instead of adding an unused parameter with a noqa comment
- Use `__name__` instead of string literals when referencing functions/methods (e.g., `mocker.patch.object(MyClass, MyClass.method.__name__)`, `pytest.mark.usefixtures(my_fixture.__name__)`). This enables IDE refactoring tools to catch renames.
- When using the faker library, prefer the pytest fixture (provided by the faker library) over instantiating instances of Faker.
- **Choosing between cassettes and mocks:** At the layer that directly wraps an external API or service, strongly prefer VCR cassette-recorded interactions (via pytest-recording/vcrpy) — they capture real HTTP traffic and verify the wire format, catching integration issues that mocks would miss. At layers above that (e.g. business logic, route handlers), mock the wrapper layer instead (e.g. `mocker.patch.object(ThresholdsRepository, ...)`) — there is no value in re-testing the HTTP interaction from higher up.
- **Never hand-write VCR cassette YAML files.** Cassettes must be recorded from real HTTP interactions by running the test once with `--record-mode=once` against a live external service: `uv run pytest --record-mode=once <test path> --no-cov`. The default mode is `none` — a missing cassette will cause an error, which is expected until recorded.
- **Never hand-edit syrupy snapshot files.** Snapshots are auto-generated — to create or update them, run `uv run pytest --snapshot-update <test path> --no-cov`. A missing snapshot causes the test to fail, which is expected until you run with `--snapshot-update`. When a snapshot mismatch occurs, fix the code if the change was unintentional; run `--snapshot-update` if it was intentional.
- **Never hand-write or hand-edit pytest-reserial `.jsonl` recording files.** Recordings must be captured from real serial port traffic by running the test with `--record` while the device is connected: `uv run pytest --record <test path> --no-cov`. The default mode replays recordings — a missing recording causes an error, which is expected until recorded against a live device.
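
The `ANY`-for-`self` and `__name__` bullets above can be illustrated without pytest. This sketch uses `unittest.mock.patch.object` with `autospec=True` as a stand-in for `mocker.spy`: it is analogous in how the instance is recorded, but unlike a spy it does not call through to the real method. `Repository` and `store` are hypothetical names.

```python
from unittest.mock import ANY, patch


class Repository:  # hypothetical class under test
    def save(self, record: str) -> str:
        return f"saved {record}"


def store(repo: Repository, record: str) -> str:
    return repo.save(record)


# Referencing the method via __name__ lets rename refactorings update the patch target.
with patch.object(Repository, Repository.save.__name__, autospec=True) as mocked_save:
    mocked_save.return_value = "stubbed"
    result = store(Repository(), "row-1")

# The class-level mock records the instance as the first positional argument.
mocked_save.assert_called_once_with(ANY, "row-1")
```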

### Frontend Testing

- Key `data-testid` selectors off unique IDs (e.g. UUIDs), not human-readable names which may collide or change.
- In DOM-based tests, scope queries to the tightest relevant container. Only query `document` or `document.body` directly to find the top-level portal/popup element (e.g. a Reka UI dialog via `[role="dialog"][data-state="open"]`); all further queries should run on that element, not on `document.body` again.

# Agent Implementations & Configurations

## Memory and Rules
@@ -49,7 +57,8 @@ This project is a Copier template used to generate applications that are able to
- For frontend tests, run commands via `pnpm` scripts from `frontend/package.json` — never invoke tools directly (not pnpm exec <tool>, npx <tool>, etc.). ✅ pnpm test-unit ❌ pnpm vitest ... or npx vitest ...
- For linting and type-checking, prefer `pre-commit run <hook-id>` over invoking tools directly — this matches the permission allow-list and mirrors what CI runs. Key hook IDs: `typescript-check`, `eslint`, `pyright`, `ruff`, `ruff-format`.
- Never rely on IDE diagnostics for ruff warnings — the IDE may not respect the project's ruff.toml config. Run `pre-commit run ruff -a` to get accurate results.
- When running terminal commands, execute exactly one command per tool call. Do not chain commands with &&, ||, ;, or & — this prohibition has no exceptions, even for `cd && ...` patterns. Use absolute paths instead of `cd` to avoid needing to chain. Pipes (|) are allowed for output transformation (e.g., head, tail, grep). If two sequential commands are needed, run them in separate tool calls. Chained commands break the permission allow-list matcher and cause unnecessary permission prompts
- When running terminal commands, execute exactly one command per tool call. Do not chain commands with &&, ||, ;, or & — this prohibition has no exceptions, even for `cd && ...` patterns. Use `cd` to change to the directory you want before running the command, avoiding the need to chain. Pipes (|) are allowed for output transformation (e.g., head, tail, grep). If two sequential commands are needed, run them in separate tool calls. Chained commands break the permission allow-list matcher and cause unnecessary permission prompts
- Never use `pnpm --prefix <path>` or `uv --directory <path>` to target a different directory — these flags break the permission allow-list matcher the same way chained `cd &&` commands do. Instead, rely on the working directory already being correct (the cwd persists between Bash tool calls), or issue a plain `cd <path>` as a separate prior tool call to reposition before running the command.
- Never use backslash line continuations in shell commands — always write the full command on a single line. Backslashes break the permission allow-list matcher.
- **Never manually edit files in any `generated/` folder.** These files are produced by codegen tooling (typically Kiota) and any manual changes will be overwritten. If a generated file needs to change, update the source (e.g. the OpenAPI schema) and re-run the generator.
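
The command-shape rules above (no chaining, no backslash continuations, pipes allowed) could be checked mechanically. The following is a rough sketch of such a lint, not the permission matcher itself; `is_single_command` is a hypothetical helper, and it relies on `shlex` tokenizing runs of shell punctuation as separate tokens.

```python
import shlex

CHAIN_OPERATORS = {"&&", "||", ";", "&"}


def is_single_command(command: str) -> bool:
    """Return True when the command has no chaining operators or line continuations."""
    if "\\\n" in command:
        # A backslash line continuation is disallowed outright.
        return False
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    # Pipes ("|") are deliberately not in CHAIN_OPERATORS, so they pass.
    return not any(token in CHAIN_OPERATORS for token in lexer)


print(is_single_command("cd frontend && pnpm test-unit"))
print(is_single_command("rg TODO src | head"))
```

Operators inside quotes, as in `echo 'a && b'`, are treated as ordinary text because the lexer resolves quoting before the tokens are compared.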

6 changes: 3 additions & 3 deletions extensions/context.py
@@ -21,12 +21,12 @@ def hook(self, context: dict[Any, Any]) -> dict[Any, Any]:
context["copier_version"] = "==9.14.0"
context["copier_template_extensions_version"] = "==0.3.3"
context["sphinx_version"] = "9.0.4"
context["pulumi_version"] = ">=3.226.0"
context["pulumi_version"] = ">=3.228.0"
context["pulumi_aws_version"] = ">=7.23.0"
context["pulumi_aws_native_version"] = ">=1.57.0"
context["pulumi_aws_native_version"] = ">=1.59.0"
context["pulumi_command_version"] = ">=1.2.1"
context["pulumi_github_version"] = ">=6.12.1"
context["pulumi_okta_version"] = ">=6.2.3"
context["pulumi_okta_version"] = ">=6.4.0"
context["boto3_version"] = ">=1.42.53"
context["ephemeral_pulumi_deploy_version"] = ">=0.0.6"
context["pydantic_version"] = ">=2.12.5"
15 changes: 15 additions & 0 deletions template/.claude/commands/red.md
@@ -94,6 +94,21 @@ This phase is **not part of the regular TDD workflow** and must only be applied
- Once sufficient understanding is achieved, all spike code is discarded, and normal TDD resumes starting from the **Red Phase**.
- A Spike is justified only when it is impossible to define a meaningful failing test due to technical uncertainty or unknown system behavior.

### If a New Test Passes Immediately

If a newly written test passes without any implementation change, do not assume it is correct. Verify it actually exercises the intended behavior:

1. Identify the implementation line most likely responsible for the pass
2. Temporarily remove that line
3. Run the **full test suite** (not just the new test)

Then interpret the result:

- **Only the new test fails** — the line was never driven by a prior test. This is accidental over-implementation: delete the line permanently and proceed to the green phase to reintroduce it properly.
- **Other existing tests also fail** — the line was already legitimately required by prior work. The new test is valid regression coverage. Restore the line; the test is confirmed correct as written.

In both cases, confirm the new test fails for the expected reason before proceeding (the right assertion, not a syntax or import error).

### General Information

- Sometimes the test output reports that no tests were run when a new test fails because of a missing import or constructor. In such cases, allow the agent to create simple stubs. If the agent is stuck, ask whether it forgot to create a stub.