Merged
17 changes: 17 additions & 0 deletions .agents/TOOLING.md
@@ -0,0 +1,17 @@
# Agent Tooling Notes

These notes are for humans maintaining repository agent setup. They are not part
of the always-loaded agent instructions.

## Shared Instructions

Update `AGENTS.md` for repository-wide agent instructions. `CLAUDE.md` is
symlinked to `AGENTS.md`, so changes there apply to both Codex and Claude Code.

## Local Overrides

For private local instructions, use the tool-specific override file:

- Claude Code: `CLAUDE.local.md` is additive; it is read after `CLAUDE.md`.
- Codex: `AGENTS.override.md` replaces `AGENTS.md` in the same directory, so it
is not additive. Restate any shared instructions that should still apply.
58 changes: 58 additions & 0 deletions .agents/developer-guidelines.md
@@ -0,0 +1,58 @@
# Coding Principles

Guidelines for production code in ModelOpt. Key values: simplicity, modularity,
and conciseness.

## Principles

- **Prefer simple, surgical changes.** Touch only what the task requires. Avoid speculative
refactors, broad rewrites, and "while we're here" cleanups.
- **Design for simplicity and readability.** Choose the design that is easiest to understand and maintain.
Code is read top to bottom: put high-level behavior first, hide lower-level details behind well-named helpers,
and treat heavy branching as a signal to reconsider the design.
- **Prefer modular, composable solutions.** Avoid input-specific or case-specific hard-coding.
Use existing extension points when they fit. If none fit, add a simple, focused helper,
class, or plugin that cleanly captures the new behavior. Keep scope limited to known cases.
- **Respect inheritance boundaries.** Parent abstractions should define shared contracts and
shared behavior, not child-specific special cases.
- **Don't repeat yourself; keep a single source of truth.** Consolidate repeated logic or intent with a shared helper, API,
or abstraction when doing so keeps the design simpler. Avoid duplication that can drift out of sync.
- **Comment cautiously.** Comments should add context, not translate code into English.
Prefer making the code self-explanatory first. Use comments only for non-obvious
intent or constraints that remain unclear from the code. Apply this guidance to new
comments only; do not rewrite or delete existing comments just for style.
- **Document public APIs.** Public and higher-level APIs should have docstrings, including examples when useful.
Internal helpers should usually be self-documenting through clear names and structure.
- **Fix the root cause, not the symptom.** For bug fixes, find and fix the root cause instead of patching its side effects.
- **Validate external input once.** Check types and values at the interface boundary. Internal code can trust those
checks and avoid redundant assertions.
- **Remove dead code.** Delete unused imports, unreachable branches, and obsolete helpers.
- **Use relative paths** from the repo root in commands and file references.

## Testing

- **Develop with focused tests.** During development, write as many focused
tests as needed, including lower-level unit tests or internal probes, to
understand and harden behavior.
- **Curate production tests and keep them lean.** Before staging or committing,
decide which tests should be checked in. Checked-in tests should document
expected behavior, protect against regressions, or flag backward-incompatible
behavior changes. Remove redundant lower-level tests when a higher-level test
already covers the same behavior, keeping CI/CD fast and lean.

## Performant AI Code

- **Keep tensor work on the GPU and avoid unnecessary CPU-GPU syncs.** Reading metadata such as `tensor.shape` is fine.
Avoid Python scalar extraction and operators such as `tensor.item()`, `float(tensor)`, or `min(tensor)` because they
can trigger CPU-GPU syncs. Use PyTorch tensor ops such as `tensor.min()` by default, and only extract Python scalars
when the CPU needs the value. Tensor-value-based Python branching can also break CUDA graphs.
- **Develop with distributed processing in mind.** For example, use `print_rank_0` or `warn_rank_0`
when possible to avoid noisy logs, and guard shared side effects, such as
file writes or shared state updates, against race conditions between ranks.
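
The distributed-processing guidance above can be sketched with the standard library alone. This is an illustration under assumptions, not ModelOpt code: `get_rank`, `log_rank_0`, and `write_summary_once` are hypothetical helpers, and the rank is read from the `RANK` environment variable that launchers such as `torchrun` set.

```python
import os
from pathlib import Path


def get_rank() -> int:
    """Return this process's rank from the RANK env var; default to 0 when unset."""
    return int(os.environ.get("RANK", "0"))


def log_rank_0(message: str) -> None:
    """Hypothetical stand-in for a rank-0-only logger: only rank 0 prints."""
    if get_rank() == 0:
        print(message)


def write_summary_once(path: Path, text: str) -> bool:
    """Write a shared file from rank 0 only; return True if this rank wrote it."""
    if get_rank() != 0:
        return False
    # Write to a temporary file, then rename, so readers never see a partial file.
    tmp_path = path.with_suffix(path.suffix + ".tmp")
    tmp_path.write_text(text)
    os.replace(tmp_path, path)
    return True
```

In a real multi-process job, the other ranks would typically also wait on a barrier (for example `torch.distributed.barrier()`) before reading the file that rank 0 wrote.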

## Compatibility

- **Preserve config and checkpoint backward compatibility.** ModelOpt checkpoints include serialized
`ModeloptBaseConfig` instances such as `QuantizeConfig`. If these Pydantic-based configs change
without backward compatibility handling, older checkpoints may no longer load. Make breaking changes
explicit and intentional.
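
To make the backward-compatibility concern concrete, here is a minimal sketch of loading an older serialized config into a newer schema. It deliberately uses a plain standard-library dataclass as a stand-in for the Pydantic-based configs so that it runs without dependencies; `ExampleQuantConfig`, `_RENAMED_FIELDS`, and `load_config` are hypothetical and do not reflect ModelOpt's actual `QuantizeConfig`.

```python
from dataclasses import dataclass, fields


@dataclass
class ExampleQuantConfig:
    """Hypothetical stand-in for a serialized config; not ModelOpt's QuantizeConfig."""

    num_bits: int = 8
    # Added in a later version: the default keeps older payloads loadable.
    block_size: int = 128


# Hypothetical rename made in a later version: old field name -> new field name.
_RENAMED_FIELDS = {"bits": "num_bits"}


def load_config(payload: dict) -> ExampleQuantConfig:
    """Build the current config from a payload written by an earlier version."""
    migrated = {_RENAMED_FIELDS.get(key, key): value for key, value in payload.items()}
    known = {f.name for f in fields(ExampleQuantConfig)}
    unknown = set(migrated) - known
    if unknown:
        # Fail loudly instead of silently dropping settings from the checkpoint.
        raise ValueError(f"Unknown config fields: {sorted(unknown)}")
    return ExampleQuantConfig(**migrated)
```

The two ideas shown, defaults for newly added fields and an explicit mapping for renamed ones, are one way to keep old payloads loading; a deliberate breaking change would instead reject the old payload with a clear error.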
3 changes: 2 additions & 1 deletion .github/workflows/claude_review.yml
@@ -81,7 +81,8 @@ jobs:

Mandatory workflow — never skip or reorder:
1. Read the PR diff first (gh pr diff).
-2. Read CLAUDE.md and CONTRIBUTING.md for project conventions and architecture.
2. Read AGENTS.md, .agents/developer-guidelines.md,
and CONTRIBUTING.md for project conventions, coding principles, and architecture.
3. For changed files under `modelopt/torch/<sub-package>/`, read the sub-package's
`__init__.py` plus any `mode.py` / `config.py` to understand mode registration
and config schema.
2 changes: 2 additions & 0 deletions .gitignore
@@ -61,6 +61,8 @@ venv/

# Ignore claude local settings
.claude/settings.local.json
CLAUDE.local.md
AGENTS.override.md

# Ignore SonarQube analysis
.sonar/
39 changes: 39 additions & 0 deletions AGENTS.md
@@ -0,0 +1,39 @@
# Agent Instructions for ModelOpt

These instructions apply to AI-assisted work in this repository.

## Repository orientation

- Start with `README.md` for project overview and install.
- Use `modelopt/` for source, `tests/` for focused test coverage, and
`examples/` or `docs/` for usage patterns.

## Coding guidelines

- **Coding guide:** Code development and review require reading and following
[.agents/developer-guidelines.md](.agents/developer-guidelines.md);
do not skip this step.

## Iterative development

- **Running tests:** Follow the
[writing and running tests](CONTRIBUTING.md#-writing-and-running-tests)
instructions. For fast initial iteration, choose focused tests for the
changed area from `tests/`.
- **Running pre-commit:** Follow the
[pre-commit hook instructions](CONTRIBUTING.md#pre-commit-hooks). Hooks may
modify files; review and re-stage those changes before committing.
- **Signed commit:** Use `git commit -s -S -m "<message>"` for commits so they
follow the [signing your work](CONTRIBUTING.md#-signing-your-work)
requirements.
- **Never `git push` without explicit approval in the current turn.** Committing
locally is fine; publishing to a remote is not.
- After `git commit`, stop and wait for the user to say "push", "publish",
"ship", or equivalent before running `git push`, `gh pr create`, or any
push-option flags like `-o merge_request.create`.

## Contributing and PR readiness

- Before opening or marking a PR ready for review, read the
[submitting your code](CONTRIBUTING.md#submitting-your-code) guidance.
- Read `.github/PULL_REQUEST_TEMPLATE.md` and satisfy the checklist.
133 changes: 0 additions & 133 deletions CLAUDE.md

This file was deleted.

1 change: 1 addition & 0 deletions CLAUDE.md
14 changes: 12 additions & 2 deletions CONTRIBUTING.md
@@ -79,7 +79,7 @@ If you are an external contributor, seek guidance from `@NVIDIA/modelopt-setup-c

See [`modelopt/torch/quantization/utils/calib_utils.py`](./modelopt/torch/quantization/utils/calib_utils.py) for an example of the correct license header format.

-## 📝 Writing tests
## 📝 Writing and running tests

We use [pytest](https://docs.pytest.org/) for all tests. For any new features / examples, make sure to add tests and that the coverage check in your PR passes. The tests are organized into the following directories:

@@ -89,7 +89,17 @@ We use [pytest](https://docs.pytest.org/) for all tests. For any new features /
- `tests/gpu_trtllm`: Fast GPU-based unit tests for the core ModelOpt library for TensorRT-LLM features. In most cases, they should not take more than a few seconds to run.
- `tests/examples`: Integration tests for ModelOpt examples. They should not take more than a few minutes to run. Please refer to [example test README](./tests/examples/README.md) for more details.

-Please refer to [noxfile.py](./noxfile.py) for more details on how to run the tests and their dependencies.
For lightweight focused local validation, run `pytest` directly on the relevant test path. For example:

```bash
pytest tests/unit/torch/quantization
```

For broader repo validation and dependency setup, use [noxfile.py](./noxfile.py). Run `nox -l` to list available sessions, then run the matching session with `nox -s <session>`. The `unit-3.12(torch_211, tf_latest)` session runs `tests/unit` with a specific Torch and Transformers combination:

```bash
nox -s "unit-3.12(torch_211, tf_latest)"
```

## ✍️ Signing your work

4 changes: 4 additions & 0 deletions README.md
@@ -151,6 +151,10 @@ Model Optimizer follows a structured approach to managing deprecated features:
Model Optimizer is now open source! We welcome any feedback, feature requests and PRs.
Please read our [Contributing](./CONTRIBUTING.md) guidelines for details on how to contribute to this project.

## AI Agents

For AI-assisted development setup, see the [agent tooling notes](./.agents/TOOLING.md).

### Top Contributors

[![Contributors](https://contrib.rocks/image?repo=NVIDIA/Model-Optimizer)](https://github.com/NVIDIA/Model-Optimizer/graphs/contributors)