From ce63e624d633f6ea310919029d1278756c38954a Mon Sep 17 00:00:00 2001 From: Jan Krivanek Date: Fri, 6 Mar 2026 20:30:43 +0100 Subject: [PATCH] initiate ai --- .github/agents/learn-from-pr.md | 31 ++++ .github/agents/pr.md | 126 ++++++++++++++ .github/agents/pr/SHARED-RULES.md | 72 ++++++++ .github/agents/pr/post-gate.md | 157 ++++++++++++++++++ .github/agents/write-tests-agent.md | 58 +++++++ .github/copilot-instructions.md | 146 ++++++++++++++++ .github/instructions/genai.instructions.md | 30 ++++ .github/instructions/tests.instructions.md | 81 +++++++++ .../instructions/tokenizers.instructions.md | 24 +++ .github/prompts/release-notes.prompt.md | 20 +++ .github/skills/ai-summary-comment/SKILL.md | 51 ++++++ .github/skills/find-reviewable-pr/SKILL.md | 33 ++++ .github/skills/issue-triage/SKILL.md | 47 ++++++ .github/skills/learn-from-pr/SKILL.md | 51 ++++++ .github/skills/pr-build-status/SKILL.md | 72 ++++++++ .github/skills/pr-finalize/SKILL.md | 47 ++++++ .github/skills/run-tests/SKILL.md | 67 ++++++++ .github/skills/try-fix/SKILL.md | 52 ++++++ .github/skills/verify-tests-fail/SKILL.md | 46 +++++ .github/workflows/find-similar-issues.yml | 92 ++++++++++ .github/workflows/inclusive-heat-sensor.yml | 22 +++ README-AI.md | 84 ++++++++++ 22 files changed, 1409 insertions(+) create mode 100644 .github/agents/learn-from-pr.md create mode 100644 .github/agents/pr.md create mode 100644 .github/agents/pr/SHARED-RULES.md create mode 100644 .github/agents/pr/post-gate.md create mode 100644 .github/agents/write-tests-agent.md create mode 100644 .github/copilot-instructions.md create mode 100644 .github/instructions/genai.instructions.md create mode 100644 .github/instructions/tests.instructions.md create mode 100644 .github/instructions/tokenizers.instructions.md create mode 100644 .github/prompts/release-notes.prompt.md create mode 100644 .github/skills/ai-summary-comment/SKILL.md create mode 100644 .github/skills/find-reviewable-pr/SKILL.md create mode 100644 
.github/skills/issue-triage/SKILL.md create mode 100644 .github/skills/learn-from-pr/SKILL.md create mode 100644 .github/skills/pr-build-status/SKILL.md create mode 100644 .github/skills/pr-finalize/SKILL.md create mode 100644 .github/skills/run-tests/SKILL.md create mode 100644 .github/skills/try-fix/SKILL.md create mode 100644 .github/skills/verify-tests-fail/SKILL.md create mode 100644 .github/workflows/find-similar-issues.yml create mode 100644 .github/workflows/inclusive-heat-sensor.yml create mode 100644 README-AI.md diff --git a/.github/agents/learn-from-pr.md b/.github/agents/learn-from-pr.md new file mode 100644 index 0000000000..e66afdff30 --- /dev/null +++ b/.github/agents/learn-from-pr.md @@ -0,0 +1,31 @@ +--- +name: learn-from-pr +description: "Analyzes completed PRs for lessons learned from agent behavior. Use after any PR with agent involvement to identify what worked, what failed, and what to improve in instruction files, skills, or documentation." +--- + +# Learn From PR Agent + +Analyzes a completed PR, extracts lessons, and **applies improvements** to the repo's AI infrastructure. + +## Workflow + +1. **Invoke learn-from-pr skill** to analyze the PR and get recommendations +2. **Present recommendations** to the user for approval +3. **Apply approved changes** to instruction files, skills, or documentation +4. 
**Commit** with descriptive message + +## Where to Apply Changes + +| Recommendation Category | Target File | +|------------------------|-------------| +| instruction-file | `.github/instructions/*.instructions.md` | +| copilot-instructions | `.github/copilot-instructions.md` | +| skill | `.github/skills/*/SKILL.md` | +| agent | `.github/agents/*.md` | +| code-comment | Source files | + +## Rules + +- Always get user approval before applying changes +- Make minimal, surgical edits +- Don't remove existing valid instructions — add alongside diff --git a/.github/agents/pr.md b/.github/agents/pr.md new file mode 100644 index 0000000000..546ea34cc8 --- /dev/null +++ b/.github/agents/pr.md @@ -0,0 +1,126 @@ +--- +name: pr +description: "Sequential 4-phase PR workflow: Pre-Flight, Gate, Fix (multi-model), Report. Phases MUST complete in order." +--- + +# PR Agent + +End-to-end agent that takes a GitHub issue from investigation through to a completed PR. + +## Workflow Overview + +This file covers **Phases 1-2** (Pre-Flight → Gate). + +After Gate passes, read `.github/agents/pr/post-gate.md` for **Phases 3-4** (multi-model Fix → Report). + +``` +┌──────────────────────────────┐ ┌────────────────────────────────────────┐ +│ THIS FILE: pr.md │ │ pr/post-gate.md │ +│ │ │ │ +│ 1. Pre-Flight → 2. Gate │ ──► │ 3. Fix (multi-model) → 4. Report │ +│ ⛔ │ │ │ +│ MUST PASS │ │ (Only read after Gate ✅ PASSED) │ +└──────────────────────────────┘ └────────────────────────────────────────┘ +``` + +**Read `.github/agents/pr/SHARED-RULES.md` for rules that apply across all phases**, including multi-model configuration. + +--- + +## Critical Rules + +- ❌ Never commit directly to `main`. Always create a feature branch. +- ❌ Never stop and ask the user during autonomous execution — use best judgment to continue. +- ❌ Never mark a phase ✅ with pending fields remaining. +- Phase 3 uses a multi-model exploration workflow. See `post-gate.md` after Gate passes. 
+ +--- + +## PRE-FLIGHT: Context Gathering (Phase 1) + +> **SCOPE**: Document only. No code analysis. No fix opinions. No running tests. + +### What TO Do + +- Read issue description and comments +- Note platforms/areas affected +- Identify files changed (if PR exists) +- Document disagreements and edge cases from comments + +### What NOT To Do + +| ❌ Do NOT | Why | When to do it | +|-----------|-----|---------------| +| Research git history | Root cause analysis | Phase 3: Fix | +| Look at implementation code | Understanding the bug | Phase 3: Fix | +| Design or implement fixes | Solution design | Phase 3: Fix | +| Run tests | Verification | Phase 2: Gate | + +### Steps + +**If starting from a PR:** +```bash +gh pr view XXXXX --json title,body,url,author,labels,files +gh pr diff XXXXX +gh issue view ISSUE_NUMBER --json title,body,comments +``` + +**If starting from an Issue:** +```bash +gh issue view XXXXX --json title,body,comments,labels +``` + +--- + +## GATE: Verify Tests Catch the Issue (Phase 2) + +> **SCOPE**: Verify tests exist and correctly detect the fix (for PRs) or reproduce the bug (for issues). + +**⛔ This phase MUST pass before continuing.** + +### Step 1: Check if Tests Exist + +```bash +# For PRs — check changed files for test files +gh pr view XXXXX --json files --jq '.files[].path' | grep -iE "test" + +# For issues — search for tests +find . -name "*Tests.cs" -o -name "*Test.cs" | head -10 +``` + +**If NO tests exist** → Let the user know. They can use `write-tests-agent` to create them. + +### Step 2: Run Verification + +```bash +./build.sh +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj +``` + +For PRs with a fix, ideally verify both directions (invoke `verify-tests-fail` skill): +1. Tests FAIL without fix ← proves tests catch the bug +2. 
Tests PASS with fix ← proves fix works + +### Complete Gate + +- ✅ **PASSED**: Tests fail without fix, pass with fix → Read `pr/post-gate.md` for Phases 3-4 +- ❌ **FAILED**: Tests don't catch the bug → Request changes from PR author + +--- + +## ⛔ STOP HERE + +**If Gate `✅ PASSED`** → Read `.github/agents/pr/post-gate.md` to continue with phases 3-4. + +**If Gate `❌ FAILED`** → Stop. Request changes from the PR author to fix the tests. + +--- + +## Commands + +| Action | Command | +|--------|---------| +| Build | `./build.sh` | +| Test | `dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj` | +| Format | `dotnet format Microsoft.ML.sln --no-restore` | +| CI Status | Invoke `pr-build-status` skill | diff --git a/.github/agents/pr/SHARED-RULES.md b/.github/agents/pr/SHARED-RULES.md new file mode 100644 index 0000000000..616d931b33 --- /dev/null +++ b/.github/agents/pr/SHARED-RULES.md @@ -0,0 +1,72 @@ +# PR Agent: Shared Rules + +Rules that apply across all PR agent phases. Referenced by `pr.md` and `post-gate.md`. + +--- + +## Multi-Model Configuration + +Phase 3 uses these AI models for try-fix exploration (run **SEQUENTIALLY**): + +| Order | Model | +|-------|-------| +| 1 | `claude-sonnet-4-5` | +| 2 | `gpt-4.1` | +| 3 | `gemini-2.5-pro` | + +**Note:** The `model` parameter is passed to the `task` tool's agent invocation. Each model runs try-fix independently. + +**⚠️ SEQUENTIAL ONLY**: try-fix runs modify the same files and use the same build/test environment. Never run in parallel. + +### Recommended Default Models + +If no specific models are configured, use a diverse set across providers: + +| Order | Model | Why | +|-------|-------|-----| +| 1 | `claude-sonnet-4-5` | Strong code reasoning | +| 2 | `gpt-4.1` | Fast, different perspective | +| 3 | `gemini-2.5-pro` | Different training data | + +Adjust based on available models and budget. More models = more fix diversity = better chance of finding optimal solution. 
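The sequential-only rule above can be sketched as a small shell loop. This is a minimal illustration and not part of the patch: `run_models_sequentially` and its two callback arguments are hypothetical names, and the real try-fix invocation goes through the agent's `task` tool rather than a shell command.

```bash
# Hypothetical helper (not in the repo): run one try-fix attempt per model,
# strictly one after another, and clean up after every attempt.
#   $1        = command that runs a single attempt (receives the model name)
#   $2        = command that restores the working tree
#   remaining = model names, in the order they should run
run_models_sequentially() {
  rms_attempt=$1
  rms_cleanup=$2
  shift 2
  for rms_model in "$@"; do
    # No '&' here: the next model starts only after this one returns.
    if "$rms_attempt" "$rms_model"; then
      echo "$rms_model: pass"
    else
      echo "$rms_model: fail"
    fi
    # Clean up whether the attempt passed or failed.
    "$rms_cleanup"
  done
}
```

In the PR workflow, the cleanup callback would presumably wrap the `git checkout HEAD -- .` and `git clean -fd` commands that the shared rules allow between attempts, and the model names would come from the table above.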
+ +--- + +## Phase Completion Protocol + +**Before changing ANY phase status to ✅ COMPLETE:** + +1. Review the phase checklist +2. Verify all required items are addressed +3. Then mark the phase as ✅ COMPLETE + +**Rule:** Status ✅ means "work complete and verified", not "I finished thinking about it." + +--- + +## Stop on Environment Blockers + +If you encounter a blocker that prevents completing a phase: + +1. **Try ONE retry** (install missing tool, rebuild, etc.) +2. **If still blocked after one retry**, skip the blocked phase and continue +3. **Document what was skipped and why** in the Report phase +4. **Always prefer continuing with partial results** over stopping completely + +| Blocker Type | Max Retries | Then Do | +|--------------|-------------|---------| +| Missing tool/dependency | 1 install attempt | Skip phase, continue | +| Server errors (500, timeout) | 1 retry | Skip phase, continue | +| Build failures in try-fix | 2 attempts | Skip remaining models, proceed to Report | +| Configuration issues | 1 fix attempt | Skip phase, continue | + +--- + +## No Direct Git State Changes + +The agent should not run git commands that change branch state during PR review. Use read-only commands: + +- ✅ `gh pr diff`, `gh pr view`, `gh issue view` +- ❌ `git checkout`, `git switch`, `git stash`, `git reset` + +Exception: `git checkout HEAD -- .` and `git clean -fd` are allowed for cleanup between try-fix attempts. diff --git a/.github/agents/pr/post-gate.md b/.github/agents/pr/post-gate.md new file mode 100644 index 0000000000..7272dbc4e4 --- /dev/null +++ b/.github/agents/pr/post-gate.md @@ -0,0 +1,157 @@ +# PR Agent: Post-Gate Phases (3-4) + +**⚠️ PREREQUISITE: Only read this file after 🚦 Gate shows `✅ PASSED`.** + +If Gate is not passed, go back to `.github/agents/pr.md` and complete phases 1-2 first. 
+ +--- + +## Workflow Overview + +| Phase | Name | What Happens | +|-------|------|--------------| +| 3 | **Fix** | Invoke `try-fix` skill with multiple models to explore independent fix alternatives, then compare with PR's fix | +| 4 | **Report** | Deliver result (approve PR, request changes, or create new PR) | + +**All rules from `.github/agents/pr/SHARED-RULES.md` apply**, including multi-model configuration. + +--- + +## 🔧 FIX: Multi-Model Exploration (Phase 3) + +> **SCOPE**: Explore independent fix alternatives using `try-fix` skill across multiple AI models, compare with PR's fix, select the best approach. + +### Why Multi-Model? + +Each AI model has different strengths — one may spot a root cause another misses, or propose a simpler fix. By running try-fix with 3 models sequentially, you maximize fix diversity and increase the chance of finding the optimal solution. + +### 🚨 CRITICAL: try-fix is Independent of PR's Fix + +**The PR's fix has already been validated by Gate.** Phase 3 is NOT re-testing the PR's fix — it's exploring whether a better alternative exists. + +**Do NOT let the PR's fix influence your thinking.** Generate ideas as if you hadn't seen the PR. + +### Step 1: Run try-fix with Each Model (Round 1) + +Run the `try-fix` skill **3 times sequentially**, once with each model (see `SHARED-RULES.md` for model list). + +**⚠️ SEQUENTIAL ONLY**: try-fix runs modify the same files and use the same build/test environment. Never run in parallel. + +**For each model**, invoke as a task agent with the specified model: + +``` +Invoke the try-fix skill for PR #XXXXX: +- problem: [Description of the bug — what's broken and expected behavior] +- test_command: dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj +- target_files: [files likely affected] + +Generate ONE independent fix idea before looking at the PR's fix. Afterwards, compare the two and confirm your approach is DIFFERENT. 
+``` + +**Wait for each to complete before starting the next.** + +**🧹 MANDATORY: Clean up between attempts.** After each try-fix completes (pass or fail): + +```bash +# Restore all tracked files to HEAD +git checkout HEAD -- . + +# Remove untracked files added by the previous attempt +git clean -fd +``` + +### Step 2: Cross-Pollination (Round 2+) + +After Round 1, share each model's results with the others and ask for new ideas. + +**For each model**, invoke again with this context: + +``` +Here are the fix attempts from Round 1: +[List each model's approach and result] + +Given what worked and what didn't, propose a NEW fix idea that: +- Is DIFFERENT from all attempts above +- Learns from the failures (avoid the same mistakes) +- Combines insights from passing fixes if applicable + +If you genuinely have no new idea, respond "NO NEW IDEAS" — don't force a bad attempt. +``` + +**Exhaustion criteria**: Cross-pollination is exhausted when ALL models respond "NO NEW IDEAS" via actual invocation (not assumed). + +### Step 3: Select Best Fix + +Build a comparison table of all candidates: + +```markdown +### Fix Candidates +| # | Model | Approach | Result | Files Changed | Notes | +|---|-------|----------|--------|---------------|-------| +| 1 | claude-sonnet-4-5 | [approach] | ✅/❌ | `file.cs` | [why] | +| 2 | gpt-4.1 | [approach] | ✅/❌ | `file.cs` | [why] | +| PR | PR author | [approach] | ✅ (Gate) | `file.cs` | Original | +``` + +**Selection criteria** (in order): +1. Tests pass +2. Minimal changes (fewer files, fewer lines) +3. Root cause fix (not symptom suppression) +4. 
Code quality and maintainability + +--- + +## 📋 REPORT: Deliver Result (Phase 4) + +### If Starting from PR — Write Review + +| Scenario | Recommendation | +|----------|---------------| +| PR's fix was selected | ✅ **APPROVE** — PR's approach is correct/optimal | +| Alternative fix was better | ⚠️ **REQUEST CHANGES** — suggest the better approach | +| PR's fix failed tests | ⚠️ **REQUEST CHANGES** — fix doesn't work | + +Run `pr-finalize` skill to verify PR title/description match implementation. + +### If Starting from Issue — Create PR + +Present the selected fix to the user: + +```markdown +I've implemented the fix for issue #XXXXX: +- **Selected fix**: Candidate #N — [approach] +- **Files changed**: [list] +- **Other candidates considered**: [brief summary] + +Please review the changes and create a PR when ready. +``` + +### Report Format + +```markdown +## Final Recommendation: APPROVE / REQUEST CHANGES + +### Summary +[Brief summary of the review] + +### Fix Exploration +[How many models tried, how many passed, which was selected and why] + +### Root Cause +[Root cause analysis] + +### Fix Quality +[Assessment of the selected fix] +``` + +--- + +## Common Mistakes + +- ❌ **Looking at PR's fix before generating ideas** — Generate independently first +- ❌ **Re-testing the PR's fix in try-fix** — Gate already validated it +- ❌ **Skipping models in Round 1** — All models must run before cross-pollination +- ❌ **Running try-fix in parallel** — SEQUENTIAL ONLY +- ❌ **Declaring exhaustion prematurely** — All models must confirm "NO NEW IDEAS" +- ❌ **Not cleaning up between attempts** — Always restore working directory +- ❌ **Selecting a failing fix** — Only select from passing candidates diff --git a/.github/agents/write-tests-agent.md b/.github/agents/write-tests-agent.md new file mode 100644 index 0000000000..04ea5a36a3 --- /dev/null +++ b/.github/agents/write-tests-agent.md @@ -0,0 +1,58 @@ +--- +name: write-tests-agent +description: "Determines what type of 
tests to write and creates them following repo conventions." +--- + +# Write Tests Agent + +Determines what tests are needed, finds the right test project, and writes tests following existing conventions. + +## When to Use +- "Write tests for issue #XXXXX" +- "Add test coverage for..." + +## Workflow + +### 1. Determine test type +| Scenario | Type | +|----------|------| +| Data pipeline behavior | Unit test with `TestDataPipeBase` | +| Trainer functionality | Unit test with model training | +| API surface | Unit test verifying public API | +| End-to-end scenarios | Integration test | +| Tokenizer behavior | Unit test with known token sequences | + +### 2. Find test project and conventions + +```bash +# List test projects +find . -name "*Tests.csproj" | head -20 + +# Read existing test patterns +head -50 $(find . -name "*Tests.cs" | head -3) +``` + +Mirror source project naming: `Microsoft.ML.Foo` → `test/Microsoft.ML.Foo.Tests/` + +### 3. Write tests following repo conventions + +Test framework: **xUnit** + +All tests must: +- Inherit from `TestDataPipeBase` (or `BaseTestClass` for simpler tests) +- Accept `ITestOutputHelper output` in constructor and pass to base +- Use PascalCase descriptive method names +- Include the 3-line .NET Foundation MIT license header +- Use `[Fact]` for single tests, `[Theory]` with `[InlineData]` for parameterized tests +- Use `Assert.*` (xUnit) or `AwesomeAssertions` for assertions + +Test projects: `test/Microsoft.ML.Tests/`, `test/Microsoft.ML.Core.Tests/`, `test/Microsoft.ML.Tokenizers.Tests/`, `test/Microsoft.ML.GenAI.Core.Tests/`, and 30+ others + +### 4. Run tests + +```bash +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj --filter "FullyQualifiedName~NewTestClassName" +``` + +### 5. Verify tests catch the bug +If testing a fix: tests should fail without fix, pass with fix. Use verify-tests-fail skill. 
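The "fail without fix, pass with fix" check can be sketched as a shell function. This is an illustration under stated assumptions: `verify_tests_catch_bug` is a hypothetical name, and each argument stands for whatever command runs the relevant tests in the corresponding state of the working tree (the repo's actual `verify-tests-fail` skill may work differently).

```bash
# Hypothetical helper (not in the repo).
#   $1 = command that runs the tests WITHOUT the fix applied
#   $2 = command that runs the tests WITH the fix applied
verify_tests_catch_bug() {
  # Direction 1: the tests must fail when the fix is absent.
  if "$1"; then
    echo "FAILED: tests pass without the fix, so they do not catch the bug"
    return 1
  fi
  # Direction 2: the tests must pass once the fix is present.
  if ! "$2"; then
    echo "FAILED: tests still fail with the fix applied"
    return 1
  fi
  echo "PASSED: tests fail without the fix and pass with it"
}
```

Both directions matter: a test that passes in both states proves nothing about the bug, and a test that fails in both states says nothing about the fix.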
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md new file mode 100644 index 0000000000..18a70e9132 --- /dev/null +++ b/.github/copilot-instructions.md @@ -0,0 +1,146 @@ +--- +description: "Guidance for GitHub Copilot when working on ML.NET (dotnet/machinelearning)." +--- + +# Development Instructions + +## Repository Overview + +ML.NET is a cross-platform, open-source machine learning framework for .NET. It provides APIs for training, evaluating, and deploying ML models including classification, regression, clustering, ranking, anomaly detection, time series, recommendation, and generative AI (LLaMA, Phi, Mistral via TorchSharp). + +### Key Technologies + +- .NET SDK 10.0.100 (see `global.json`) +- Build system: Microsoft Arcade SDK (`eng/common/`) +- Test framework: xUnit (with `AwesomeAssertions`, `Xunit.Combinatorial`) +- Native dependencies: MKL, OpenMP, libmf, oneDNN +- Major dependencies: TorchSharp, ONNX Runtime, TensorFlow, LightGBM, Semantic Kernel +- Central package management: `Directory.Packages.props` + +## Build & Test + +### Build + +```bash +# Linux/macOS +./build.sh + +# Windows +build.cmd + +# Build specific project +dotnet build src/Microsoft.ML.Core/Microsoft.ML.Core.csproj +``` + +The repo uses Arcade SDK — `build.sh`/`build.cmd` wraps `eng/common/build.sh`/`eng/common/build.ps1` with `--restore --build`. Native dependencies require `eng/common/native/install-dependencies.sh` on Linux. + +### Test + +```bash +# Run tests for a specific project +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj + +# Run tests with filter +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj --filter "FullyQualifiedName~ClassName.MethodName" + +# Run all tests (slow — use specific projects) +dotnet test Microsoft.ML.sln +``` + +Test projects multi-target `net8.0;net48;net9.0` on Windows, `net8.0` only on Linux/macOS/arm64. 
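One way to act on the per-OS target list is to pass `--framework` to `dotnet test` explicitly. The sketch below is an assumption-laden illustration: `frameworks_for_os` is a hypothetical helper, the patterns are guesses at what `uname -s` reports in common Windows shells, and the arm64 case mentioned above is not handled.

```bash
# Hypothetical helper (not in the repo): print the test target frameworks
# for an OS name such as the output of `uname -s`.
frameworks_for_os() {
  case "$1" in
    # Windows shells (Git Bash, MSYS2, Cygwin): all three targets.
    Windows*|MINGW*|MSYS*|CYGWIN*) echo "net8.0 net48 net9.0" ;;
    # Linux, macOS and anything else: net8.0 only.
    *) echo "net8.0" ;;
  esac
}

# Possible usage (commented out so the sketch has no side effects):
# for tfm in $(frameworks_for_os "$(uname -s)"); do
#   dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj --framework "$tfm"
# done
```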
+ +### Format + +```bash +dotnet format Microsoft.ML.sln --no-restore +``` + +The repo has `.editorconfig` and `EnforceCodeStyleInBuild=true`. + +## Project Structure + +``` +src/ +├── Microsoft.ML.Core/ ← Core types, contracts, host environment +├── Microsoft.ML.Data/ ← Data pipeline, DataView, schema +├── Microsoft.ML/ ← MLContext, public API surface +├── Microsoft.ML.StandardTrainers/ ← Built-in trainers (logistic regression, SVM, etc.) +├── Microsoft.ML.Transforms/ ← Data transforms (normalize, featurize, etc.) +├── Microsoft.ML.AutoML/ ← Automated ML pipeline selection +├── Microsoft.ML.FastTree/ ← Tree-based trainers +├── Microsoft.ML.LightGbm/ ← LightGBM integration +├── Microsoft.ML.Recommender/ ← Matrix factorization recommenders +├── Microsoft.ML.TimeSeries/ ← Time series analysis +├── Microsoft.ML.Tokenizers/ ← BPE/WordPiece/SentencePiece tokenizers +├── Microsoft.ML.GenAI.Core/ ← GenAI base types (CausalLM pipeline) +├── Microsoft.ML.GenAI.LLaMA/ ← LLaMA model support +├── Microsoft.ML.GenAI.Phi/ ← Phi model support +├── Microsoft.ML.GenAI.Mistral/ ← Mistral model support +├── Microsoft.ML.TorchSharp/ ← TorchSharp-based trainers +├── Microsoft.ML.OnnxTransformer/ ← ONNX model inference +├── Microsoft.ML.TensorFlow/ ← TensorFlow model inference +├── Microsoft.ML.Vision/ ← Image classification +├── Microsoft.ML.ImageAnalytics/ ← Image transforms +├── Microsoft.ML.CpuMath/ ← SIMD-optimized math operations +├── Microsoft.Data.Analysis/ ← DataFrame API +├── Native/ ← C/C++ native library sources +└── Common/ ← Shared internal code +test/ +├── Microsoft.ML.TestFramework/ ← Base test classes and helpers +├── Microsoft.ML.TestFrameworkCommon/ ← Shared test utilities +├── Microsoft.ML.Tests/ ← Main functional tests +├── Microsoft.ML.Core.Tests/ ← Core unit tests +├── Microsoft.ML.IntegrationTests/ ← End-to-end integration tests +├── Microsoft.ML.Tokenizers.Tests/ ← Tokenizer tests +├── Microsoft.ML.GenAI.*.Tests/ ← GenAI component tests +└── ... 
(30+ test projects) +``` + +## Conventions + +### Code Style + +- **License header**: Every `.cs` file starts with the 3-line .NET Foundation MIT license header +- **Namespaces**: Match assembly name (`Microsoft.ML`, `Microsoft.ML.Data`, `Microsoft.ML.Trainers`) +- **Usings**: `System.*` first, then `Microsoft.*`, then others +- **Visibility**: Use `[BestFriend]` attribute for internal members shared across assemblies; `private protected` where appropriate +- **Validation**: Use `Contracts.Check*` / `Contracts.Except*` for argument and state validation — not raw `throw` statements +- **XML docs**: Required on all public types and members with `<summary>` tags +- **Style priority**: Match the existing style of the file you're editing, even if it differs from general guidelines +- Follow [dotnet/runtime coding-style](https://github.com/dotnet/runtime/blob/main/docs/coding-guidelines/coding-style.md) + +### Test Conventions + +- **Framework**: xUnit (`[Fact]`, `[Theory]`, `[InlineData]`) +- **Base class**: Inherit from `TestDataPipeBase` → `BaseTestClass` (provides `ITestOutputHelper`, test data paths, locale pinning to `en-US`) +- **Constructor**: Accept `ITestOutputHelper output` and pass to base +- **Naming**: PascalCase descriptive method names (e.g., `RandomizedPcaTrainerBaselineTest`) +- **Assertions**: `Assert.*` (xUnit), `AwesomeAssertions` for fluent assertions +- **Test data**: Use `Microsoft.ML.TestDatabases` package or files in `test/data/` +- **Baseline output**: Compare against expected output in `test/BaselineOutput/` + +### Architecture + +- The main entry point is `MLContext` — it exposes catalogs for each ML task +- Data flows through `IDataView` — a lazy, columnar, cursor-based data pipeline +- Trainers implement `IEstimator` → `ITransformer` pattern (fit → transform) +- Custom trainers go in their own project under `src/` +- New test projects mirror source project naming: `Microsoft.ML.Foo` → `Microsoft.ML.Foo.Tests` + +## Git Workflow + +- Default 
branch: 
`main` +- Never commit directly to `main` — always create a feature branch +- Branch naming: `feature/description`, `fix/description` +- PRs are squash-merged +- Must reference a filed issue in PR description +- Address review feedback in additional commits (don't amend/force-push) +- Use `git rebase` for conflict resolution, not merge commits + +## CI + +- **Primary CI**: Azure DevOps Pipelines (`build/vsts-ci.yml`) — official signed build +- Builds on Windows, Linux (Ubuntu 22.04), macOS +- Test runs include both managed (.NET) and native components +- Code coverage via `coverlet.collector` +- A custom internal Roslyn analyzer (`Microsoft.ML.InternalCodeAnalyzer`) runs on all test projects diff --git a/.github/instructions/genai.instructions.md b/.github/instructions/genai.instructions.md new file mode 100644 index 0000000000..592a47bf46 --- /dev/null +++ b/.github/instructions/genai.instructions.md @@ -0,0 +1,30 @@ +--- +applyTo: + - "src/Microsoft.ML.GenAI*/**" + - "test/Microsoft.ML.GenAI*/**" + - "src/Microsoft.ML.TorchSharp/**" + - "test/Microsoft.ML.TorchSharp*/**" +--- + +# GenAI & TorchSharp Guidelines + +## Overview + +The `Microsoft.ML.GenAI.*` projects provide .NET-native support for running large language models (LLaMA, Phi, Mistral) via TorchSharp. These components integrate with Semantic Kernel and Microsoft.Extensions.AI. + +## Key Patterns + +- `CausalLMPipeline` is the core abstraction for running autoregressive text generation +- Model implementations live in separate projects per architecture: `GenAI.LLaMA`, `GenAI.Phi`, `GenAI.Mistral` +- Shared types and utilities live in `GenAI.Core` +- TorchSharp tensor operations are in `Microsoft.ML.TorchSharp` + +## Dependencies + +- `Microsoft.SemanticKernel` / `Microsoft.SemanticKernel.Abstractions` +- `Microsoft.Extensions.AI.Abstractions` +- `TorchSharp` (native PyTorch bindings for .NET) + +## Testing GenAI + +GenAI tests may require model weight files. 
Check test setup for model download steps or mock data patterns before running. diff --git a/.github/instructions/tests.instructions.md b/.github/instructions/tests.instructions.md new file mode 100644 index 0000000000..5fb30168ec --- /dev/null +++ b/.github/instructions/tests.instructions.md @@ -0,0 +1,81 @@ +--- +applyTo: + - "test/**" + - "**/*Tests.cs" + - "**/*Test.cs" +--- + +# Test Guidelines for ML.NET + +## Framework: xUnit + +### Base Class + +All test classes inherit from `TestDataPipeBase` (or `BaseTestClass` for simpler tests): + +```csharp +using Xunit; +using Xunit.Abstractions; + +namespace Microsoft.ML.Tests +{ + public class MyFeatureTests : TestDataPipeBase + { + public MyFeatureTests(ITestOutputHelper output) : base(output) { } + + [Fact] + public void MyFeatureBasicTest() + { + // ... + } + } +} +``` + +### Naming + +- Test classes: `{Feature}Tests` (e.g., `AnomalyDetectionTests`, `CachingTests`) +- Test methods: PascalCase descriptive names (e.g., `RandomizedPcaTrainerBaselineTest`) +- No `Test_` prefix or `_Should_` patterns — use direct descriptive names + +### Assertions + +- Primary: `Assert.*` (xUnit) — `Assert.Equal`, `Assert.Throws`, `Assert.Contains`, `Assert.True` +- Fluent: `AwesomeAssertions` is available for more expressive assertions +- Never use `Assert.That` (NUnit style) — this is an xUnit repo + +### Test Data + +- Use `Microsoft.ML.TestDatabases` package for standard datasets +- Test-specific data goes in `test/data/` +- Reference data path via `GetDataPath("filename")` from the base class + +### Baseline Testing + +- Expected output files in `test/BaselineOutput/` +- Use `BaseTestClass` methods to compare actual vs baseline output +- Update baselines carefully — they are the source of truth for output format stability + +### Running Tests + +```bash +# Specific test project +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj + +# Filter by test name +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj 
--filter "FullyQualifiedName~AnomalyDetectionTests" + +# Filter by single test +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj --filter "FullyQualifiedName~RandomizedPcaTrainerBaselineTest" +``` + +### Target Frameworks + +Tests multi-target `net8.0;net48;net9.0` on Windows, `net8.0` only on Linux/macOS/arm64. Make sure tests pass on all targeted frameworks. + +### Common Gotchas + +- **Locale**: Base class pins `Thread.CurrentThread.CurrentCulture` to `en-US` — don't change it +- **License header**: Every `.cs` file needs the 3-line .NET Foundation MIT header +- **Unsafe code**: `AllowUnsafeBlocks` is enabled in test projects — this is intentional for testing native interop +- **XML doc warnings**: Suppressed (CS1573, CS1591, CS1712) in test code — no need to add XML docs to tests diff --git a/.github/instructions/tokenizers.instructions.md b/.github/instructions/tokenizers.instructions.md new file mode 100644 index 0000000000..f16cc9795a --- /dev/null +++ b/.github/instructions/tokenizers.instructions.md @@ -0,0 +1,24 @@ +--- +applyTo: + - "src/Microsoft.ML.Tokenizers*/**" + - "test/Microsoft.ML.Tokenizers*/**" +--- + +# Tokenizer Guidelines + +## Overview + +`Microsoft.ML.Tokenizers` provides BPE, WordPiece, and SentencePiece tokenizer implementations. Pre-built tokenizer data packages exist for common vocabularies: `Cl100kBase`, `Gpt2`, `O200kBase`, `P50kBase`, `R50kBase`. 
+ +## Structure + +- `src/Microsoft.ML.Tokenizers/` — Core tokenizer engine +- `src/Microsoft.ML.Tokenizers.Data.*/` — Pre-built vocabulary data packages +- `test/Microsoft.ML.Tokenizers.Tests/` — Unit tests +- `test/Microsoft.ML.Tokenizers.Data.Tests/` — Data package tests + +## Conventions + +- Tokenizer implementations should be stateless where possible +- Vocabulary data is embedded as assembly resources in the `.Data.*` packages +- Test tokenizer output against known expected token sequences diff --git a/.github/prompts/release-notes.prompt.md b/.github/prompts/release-notes.prompt.md new file mode 100644 index 0000000000..39e4f3aeac --- /dev/null +++ b/.github/prompts/release-notes.prompt.md @@ -0,0 +1,20 @@ +# ML.NET Release Notes + +Generate classified release notes between two commits. + +## Categories + +1. **Product** — Bug fixes, features, improvements +2. **Dependencies** — Package/SDK updates +3. **Testing** — Test changes and infrastructure +4. **Documentation** — Docs, samples +5. **Housekeeping** — Build, CI, cleanup + +## Process + +```bash +# Get commits between two points +git log --pretty=format:"%h - %s (%an)" BRANCH1..BRANCH2 > commits.txt +``` + +Classify each commit. When uncertain, default to Housekeeping. Group related commits. Flag breaking changes with ⚠️. diff --git a/.github/skills/ai-summary-comment/SKILL.md b/.github/skills/ai-summary-comment/SKILL.md new file mode 100644 index 0000000000..11d0c84acb --- /dev/null +++ b/.github/skills/ai-summary-comment/SKILL.md @@ -0,0 +1,51 @@ +--- +name: ai-summary-comment +description: Posts or updates automated progress comments on GitHub PRs. Creates single unified comment with collapsible sections. Use after completing any agent phase to post results, or when asked to "post comment to PR", "update PR progress". +--- + +# AI Summary Comment + +Posts a single unified comment on a PR with collapsible sections for each phase of work. + +## Architecture + +One comment per PR, identified by ``. 
Each update modifies only its section. + +```markdown + +## 🤖 AI Summary + + +
<details><summary>📋 PR Review — ✅ Complete</summary> +[Review findings] +</details>
+ + + +
<details><summary>🧪 Tests — ✅ Pass</summary> +[Test results] +</details>
+ +``` + +## Usage + +```bash +# Find existing comment +COMMENT_ID=$(gh pr view PR --json comments --jq '.comments[] | select(.body | contains("")) | .databaseId') + +# Update or create +if [ -n "$COMMENT_ID" ]; then + gh api repos/OWNER/REPO/issues/comments/$COMMENT_ID --method PATCH -f body="$BODY" +else + gh pr comment PR --body "$BODY" +fi +``` + +## Rules + +1. **Self-contained** — Never reference local files in comments +2. **Idempotent** — Running twice produces same result +3. **Section isolation** — Updating one section preserves others +4. **Collapsible** — Use `<details>` tags to keep comments compact +5. **No approvals** — Never use `--approve` or `--request-changes` diff --git a/.github/skills/find-reviewable-pr/SKILL.md b/.github/skills/find-reviewable-pr/SKILL.md new file mode 100644 index 0000000000..66161966b4 --- /dev/null +++ b/.github/skills/find-reviewable-pr/SKILL.md @@ -0,0 +1,33 @@ +--- +name: find-reviewable-pr +description: Finds open PRs that are good candidates for review, prioritizing by milestone, priority labels, and community status. Use when asked to "find PRs to review", "what needs review", or "show me open PRs". +--- + +# Find Reviewable PR + +Searches for open PRs prioritized by importance. + +## Priority Order + +1. 🔴 **P/0** — Critical priority, review first +2. ✅ **Approved (not merged)** — Ready to merge +3. 📅 **Milestoned** — Has a deadline +4. ✨ **Community** — External contributions +5. 🕐 **Recent** — Created in last 2 weeks, no review yet + +## Commands + +```bash +# Priority PRs +gh pr list --search "label:p/0" --json number,title,author,labels,createdAt + +# Milestoned +gh pr list --search "milestone:*" --json number,title,author,milestone,createdAt + +# Recent needing review +# `date -v-14d` is BSD/macOS syntax; fall back to GNU `date -d` on Linux +SINCE=$(date -v-14d +%Y-%m-%d 2>/dev/null || date -d '14 days ago' +%Y-%m-%d) +gh pr list --search "review:none created:>=$SINCE" --json number,title,author,createdAt +``` + +## Output + +Group results by category. 
Include: PR number, title, author, age, complexity (Easy/Medium/Complex based on file count and additions). diff --git a/.github/skills/issue-triage/SKILL.md b/.github/skills/issue-triage/SKILL.md new file mode 100644 index 0000000000..8beb7a50e1 --- /dev/null +++ b/.github/skills/issue-triage/SKILL.md @@ -0,0 +1,47 @@ +--- +name: issue-triage +description: Queries and triages open GitHub issues that need attention. Helps identify issues needing milestones, labels, or investigation. Use when asked to "triage issues", "find issues without milestones", or "what needs attention". +--- + +# Issue Triage + +Present issues ONE AT A TIME for human triage decisions. 
+ +## Workflow + +### 1. Query issues without milestones + +```bash +gh issue list --repo OWNER/REPO \ + --search "no:milestone -label:needs-info -label:needs-repro" \ + --limit 50 --json number,title,labels,createdAt,author,comments,url +``` + +### 2. Present one issue + +```markdown +## Issue #XXXXX — [Title] +🔗 [URL] + +| Field | Value | +|-------|-------| +| Author | username | +| Labels | labels | +| Age | N days | +| Comments | N | + +**Suggestion**: `Milestone` — Reason +``` + +### 3. Wait for user decision +- Milestone name → apply it +- "skip" → next issue +- "yes" → accept suggestion + +### 4. Apply and move to next + +```bash +gh issue edit NUMBER --repo OWNER/REPO --milestone "MILESTONE" +``` + +Auto-reload when batch is exhausted. Don't ask "Load more?" — just do it. diff --git a/.github/skills/learn-from-pr/SKILL.md b/.github/skills/learn-from-pr/SKILL.md new file mode 100644 index 0000000000..071d36df37 --- /dev/null +++ b/.github/skills/learn-from-pr/SKILL.md @@ -0,0 +1,51 @@ +--- +name: learn-from-pr +description: Analyzes a completed PR to extract lessons learned from agent behavior. Use after any PR with agent involvement to identify what worked, what failed, and what to improve in instruction files, skills, or documentation. +--- + +# Learn From PR + +Extracts lessons from completed PRs and produces actionable recommendations. + +## Workflow + +### 1. Gather data + +```bash +gh pr view XXXXX --json title,body,files,commits,comments,reviews +gh pr diff XXXXX +``` + +### 2. Analyze + +- **Fix location**: Which files were changed? Which module/layer? +- **Failure modes** (if agent struggled): Wrong files targeted? Missing domain knowledge? Bad test command? +- **What worked**: What led to the successful fix? + +### 3. 
Generate recommendations + +Each recommendation: + +| Field | Description | +|-------|-------------| +| Priority | high / medium / low | +| Category | instruction-file, skill, code-comment, documentation | +| Location | Which file to update | +| Change | Specific text to add/modify | +| Why | How this prevents future failures | + +### 4. Present to user + +```markdown +## Lessons from PR #XXXXX + +### What Happened +[Problem → Attempts → Solution] + +### Recommendations +| # | Priority | Where | What to Change | +|---|----------|-------|----------------| +| 1 | High | .github/instructions/X.md | Add guidance about Y | +``` + +The learn-from-pr **agent** (separate from this skill) takes it further by actually applying the recommendations to files. diff --git a/.github/skills/pr-build-status/SKILL.md b/.github/skills/pr-build-status/SKILL.md new file mode 100644 index 0000000000..dccfae47ae --- /dev/null +++ b/.github/skills/pr-build-status/SKILL.md @@ -0,0 +1,72 @@ +--- +name: pr-build-status +description: "Read CI build results for a PR — which jobs failed, why, and what the error messages say. Use when asked about build status, CI failures, or why a PR is red. This is the 'eyes' of the CI feedback loop." +--- + +# PR Build Status + +The agent's ability to see CI results. Without this, agents push code blindly. + +## CI System: Azure Pipelines (primary) + GitHub Actions (secondary) + +ML.NET uses Azure DevOps Pipelines (`build/vsts-ci.yml`) for official CI and GitHub Actions for auxiliary workflows (backport, locker, copilot-setup-steps). 
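The Azure Pipelines requests in this skill authenticate with HTTP Basic auth, using an empty user name and the personal access token as the password. A minimal sketch of building that header value once and reusing it, assuming `AZDO_PAT` is set as the skill describes; the helper name `azdo_auth_header` is hypothetical and not part of the repository. Stripping newlines is a precaution against `base64` implementations that wrap long output.

```bash
# Hypothetical helper (not part of the repo): build the Basic auth header value
# for Azure DevOps REST calls from a personal access token passed as $1.
azdo_auth_header() {
  # Empty user name, PAT as password; tr removes any line wrapping added by base64.
  printf 'Basic %s' "$(printf ':%s' "$1" | base64 | tr -d '\n')"
}

# Usage (assumes AZDO_PAT is exported and URL holds the REST endpoint):
#   curl -s -H "Authorization: $(azdo_auth_header "$AZDO_PAT")" "$URL"
```

Using `printf` instead of `echo -n` avoids differences between shells in how `echo` treats its flags.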
+ +## For Azure Pipelines + +```bash +# List builds for a PR (requires AZDO_PAT env var) +curl -s "https://dev.azure.com/dnceng-public/public/_apis/build/builds?branchName=refs/pull/PR_NUM/merge&api-version=7.0" \ + -H "Authorization: Basic $(echo -n :$AZDO_PAT | base64)" | jq '.value[] | {id, status, result}' + +# Get build timeline (shows failed tasks) +curl -s "https://dev.azure.com/dnceng-public/public/_apis/build/builds/BUILD_ID/timeline?api-version=7.0" \ + -H "Authorization: Basic $(echo -n :$AZDO_PAT | base64)" | jq '.records[] | select(.result == "failed") | {name, issues}' + +# Get task log (actual error output) +curl -s "https://dev.azure.com/dnceng-public/public/_apis/build/builds/BUILD_ID/logs/LOG_ID?api-version=7.0" \ + -H "Authorization: Basic $(echo -n :$AZDO_PAT | base64)" | tail -100 +``` + +## For GitHub Actions + +```bash +# Find the latest run for a PR branch +gh run list --branch BRANCH_NAME --limit 3 --json databaseId,status,conclusion,name,createdAt + +# Get failed jobs from a run +gh run view RUN_ID --json jobs --jq '.jobs[] | select(.conclusion == "failure") | {name, conclusion, startedAt}' + +# Get failure logs (most important command) +gh run view RUN_ID --log-failed 2>&1 | tail -200 + +# Get specific job log +gh run view RUN_ID --log --job JOB_ID 2>&1 | tail -100 +``` + +## Output Format + +````markdown +## CI Status for PR #XXXXX + +| Job | Status | Duration | +|-----|--------|----------| +| Build | ✅ / ❌ | Nm | +| Tests | ✅ / ❌ | Nm | + +### Failures +**[Job Name]**: [Error summary] +``` +[Key error lines] +``` + +### Diagnosis +[What failed and why — extracted from logs] + +### Suggested Fix +[What to change based on the error] +```` + +## Pipeline Names + +- **Azure Pipelines**: `ML.NET Official Build` (`build/vsts-ci.yml`) +- **GitHub Actions**: `Copilot Setup Steps`, `Backport`, `Lock` diff --git a/.github/skills/pr-finalize/SKILL.md b/.github/skills/pr-finalize/SKILL.md new file mode 100644 index 0000000000..0ef41dca3f ---
/dev/null +++ b/.github/skills/pr-finalize/SKILL.md @@ -0,0 +1,47 @@ +--- +name: pr-finalize +description: Finalizes any PR for merge by verifying title/description match implementation AND performing code review. Use when asked to "finalize PR", "check PR description", "review commit message", before merging any PR. +--- + +# PR Finalize + +Verifies PR title and description accurately reflect the implementation, then reviews code for best practices. + +## Rules + +- **NEVER** use `gh pr review --approve` or `--request-changes` — approval is a human decision +- **NEVER** post comments directly — this skill is analysis only. Use ai-summary-comment to post. + +## Workflow + +### Phase 1: Title & Description + +```bash +gh pr view XXXXX --json title,body,files,commits +gh pr diff XXXXX +``` + +1. **Evaluate existing description** — Is it good? Don't replace quality with a template. +2. **Check title** — Should describe behavior, not just "Fix #123". Format: `[Scope] What changed` +3. **Check description** — Should explain what changed and why, link to issues, note breaking changes. + +### Phase 2: Code Review + +Focus on: code quality, error handling, performance, breaking changes, test coverage. + +### Output + +```markdown +## PR #XXXXX Review + +### Title: ✅ Good / ⚠️ Needs Update +**Current**: "existing" +**Suggested**: "improved" (if needed) + +### Description: ✅ Good / ⚠️ Needs Update + +### Code Review +#### 🔴 Critical: [issue in path/to/file] +#### 🟡 Suggestion: [improvement] +#### ✅ Looks Good: [positive observation] +``` diff --git a/.github/skills/run-tests/SKILL.md b/.github/skills/run-tests/SKILL.md new file mode 100644 index 0000000000..c983f13eb3 --- /dev/null +++ b/.github/skills/run-tests/SKILL.md @@ -0,0 +1,67 @@ +--- +name: run-tests +description: "Build and run tests locally with filtering. Use when asked to run tests, verify a fix, or check test results." 
+--- + +# Run Tests + +## Quick Start + +```bash +# All tests (slow — avoid unless necessary) +dotnet test Microsoft.ML.sln + +# Specific project (preferred) +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj + +# Filter by name +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj --filter "FullyQualifiedName~AnomalyDetectionTests" +``` + +## Test Projects + +| Project | Path | Area | +|---------|------|------| +| Microsoft.ML.Tests | test/Microsoft.ML.Tests/ | Main functional tests | +| Microsoft.ML.Core.Tests | test/Microsoft.ML.Core.Tests/ | Core type tests | +| Microsoft.ML.CpuMath.UnitTests | test/Microsoft.ML.CpuMath.UnitTests/ | SIMD math tests | +| Microsoft.ML.AutoML.Tests | test/Microsoft.ML.AutoML.Tests/ | AutoML tests | +| Microsoft.ML.Tokenizers.Tests | test/Microsoft.ML.Tokenizers.Tests/ | Tokenizer tests | +| Microsoft.ML.GenAI.Core.Tests | test/Microsoft.ML.GenAI.Core.Tests/ | GenAI core tests | +| Microsoft.ML.GenAI.LLaMA.Tests | test/Microsoft.ML.GenAI.LLaMA.Tests/ | LLaMA tests | +| Microsoft.ML.GenAI.Phi.Tests | test/Microsoft.ML.GenAI.Phi.Tests/ | Phi tests | +| Microsoft.ML.GenAI.Mistral.Tests | test/Microsoft.ML.GenAI.Mistral.Tests/ | Mistral tests | +| Microsoft.ML.TorchSharp.Tests | test/Microsoft.ML.TorchSharp.Tests/ | TorchSharp tests | +| Microsoft.ML.TimeSeries.Tests | test/Microsoft.ML.TimeSeries.Tests/ | Time series tests | +| Microsoft.ML.IntegrationTests | test/Microsoft.ML.IntegrationTests/ | End-to-end tests | +| Microsoft.ML.Fairlearn.Tests | test/Microsoft.ML.Fairlearn.Tests/ | Fairness tests | +| Microsoft.Data.Analysis.Tests | test/Microsoft.Data.Analysis.Tests/ | DataFrame tests | +| Microsoft.Extensions.ML.Tests | test/Microsoft.Extensions.ML.Tests/ | DI integration tests | +| Microsoft.ML.SearchSpace.Tests | test/Microsoft.ML.SearchSpace.Tests/ | Search space tests | +| Microsoft.ML.Sweeper.Tests | test/Microsoft.ML.Sweeper.Tests/ | Hyperparameter sweep tests | +| Microsoft.ML.Predictor.Tests | 
test/Microsoft.ML.Predictor.Tests/ | Predictor tests | +| Microsoft.ML.FSharp.Tests | test/Microsoft.ML.FSharp.Tests/ | F# interop tests | + +## Filtering + +```bash +# By class name +dotnet test PROJECT --filter "FullyQualifiedName~ClassName" + +# By single method +dotnet test PROJECT --filter "FullyQualifiedName~ClassName.MethodName" + +# By trait/category +dotnet test PROJECT --filter "Category=Unit" +``` + +## Prerequisites + +```bash +# Full build (includes restore) +./build.sh # Linux/macOS +build.cmd # Windows + +# Or build specific project +dotnet build src/Microsoft.ML/Microsoft.ML.csproj +``` diff --git a/.github/skills/try-fix/SKILL.md b/.github/skills/try-fix/SKILL.md new file mode 100644 index 0000000000..ba09bdda66 --- /dev/null +++ b/.github/skills/try-fix/SKILL.md @@ -0,0 +1,52 @@ +--- +name: try-fix +description: "Attempts ONE alternative fix for a bug, tests it empirically, and reports results. Always explores a DIFFERENT approach from existing fixes." +--- + +# Try Fix + +Single-shot: receive context → try ONE fix → test → report → revert. + +## Principles +1. **Single-shot** — One fix idea per invocation +2. **Alternative** — Always different from existing fixes +3. **Empirical** — Implement and test, don't theorize +4. **Clean** — Always revert after, leave repo clean + +## Workflow + +### 1. Baseline +```bash +git stash +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj # Confirm failure +``` + +### 2. Implement one fix +Minimal changes. Read prior attempts (if any) and do something different. + +### 3. Test +```bash +./build.sh +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj +``` + +### 4. Report +```markdown +## Try-Fix Attempt #N +**Approach**: [what and why] +**Changes**: `path/file` — [what changed] +**Build**: ✅/❌ +**Tests**: ✅/❌ +**Verdict**: ✅ FIX WORKS / ❌ FAILED — [reason] +``` + +### 5. Revert +```bash +git checkout -- . 
+git stash pop +``` + +## Rules +- Sequential only — never parallel +- Max 5 attempts per session +- Always revert — leave repo clean diff --git a/.github/skills/verify-tests-fail/SKILL.md b/.github/skills/verify-tests-fail/SKILL.md new file mode 100644 index 0000000000..aa933aca39 --- /dev/null +++ b/.github/skills/verify-tests-fail/SKILL.md @@ -0,0 +1,46 @@ +--- +name: verify-tests-fail-without-fix +description: "Verifies tests catch the bug — fail without fix, pass with fix. Use after writing tests for a bug fix, or when asked to prove tests are valid." +--- + +# Verify Tests Fail Without Fix + +Proves tests actually catch the bug. + +## Full Verification + +```bash +# 1. Remove the fix, keep the tests +git stash push -m "fix" -- + +# 2. Build and run — should FAIL +./build.sh +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj --filter "FullyQualifiedName~RelevantTests" +# Expected: ❌ FAIL (proves tests catch the bug) + +# 3. Restore fix +git stash pop + +# 4. Build and run — should PASS +./build.sh +dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj --filter "FullyQualifiedName~RelevantTests" +# Expected: ✅ PASS (proves fix works) +``` + +## Output + +```markdown +## Verification + +| State | Build | Tests | Expected | +|-------|-------|-------|----------| +| Without fix | ✅/❌ | ❌ FAIL | ❌ (good — catches bug) | +| With fix | ✅ | ✅ PASS | ✅ (good — fix works) | + +**Verdict**: ✅ Tests properly validate the fix / ❌ Tests don't catch the bug +``` + +## Rules +- Always restore working state after verification +- If tests pass without the fix → they don't catch the bug, report this +- If tests fail with the fix → fix is incomplete, report this diff --git a/.github/workflows/find-similar-issues.yml b/.github/workflows/find-similar-issues.yml new file mode 100644 index 0000000000..dd3934010d --- /dev/null +++ b/.github/workflows/find-similar-issues.yml @@ -0,0 +1,92 @@ +name: "Find Similar Issues with AI" + +on: + issues: + types: [opened] + 
+permissions: + contents: read + issues: write + models: read + +jobs: + find-similar-issues: + runs-on: ubuntu-latest + if: github.event_name == 'issues' + steps: + - uses: actions/setup-node@v4 + with: + node-version: '20' + + - run: npm init -y && npm install @octokit/rest + + - name: Find and post similar issues + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + ISSUE_NUMBER: ${{ github.event.issue.number }} + ISSUE_TITLE: ${{ github.event.issue.title }} + ISSUE_BODY: ${{ github.event.issue.body }} + run: | + node << 'SCRIPT' + const { Octokit } = require("@octokit/rest"); + const fs = require('fs'); + const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN }); + const endpoint = "https://models.inference.ai.azure.com"; + const model = "gpt-4o-mini"; + const token = process.env.GITHUB_TOKEN; + const issueNum = parseInt(process.env.ISSUE_NUMBER); + const title = process.env.ISSUE_TITLE; + const body = process.env.ISSUE_BODY || ''; + const [owner, repo] = process.env.GITHUB_REPOSITORY.split('/'); + + function extractWords(text) { + const stop = new Set(['the','and','for','with','this','that','from','have','not','are','was','will','can','when','what','how','use','does','issue','error','work']); + return [...new Set(text.replace(/```[\s\S]*?```/g,'').replace(/https?:\/\/\S+/g,'').replace(/[^a-z0-9\s]/gi,' ').toLowerCase().split(/\s+/).filter(w=>w.length>3&&!stop.has(w)))]; + } + function jaccard(a,b) { const i=a.filter(w=>b.includes(w)); const u=[...new Set([...a,...b])]; return u.length?i.length/u.length:0; } + + (async()=>{ + const issues=[]; + for(let p=1;p<=10;p++){ + const r=await octokit.issues.listForRepo({owner,repo,state:'all',per_page:100,page:p,sort:'updated',direction:'desc'}); + if(!r.data.length)break; + issues.push(...r.data.filter(i=>i.number!==issueNum&&!i.pull_request)); + } + const words=extractWords(`${title}\n${body}`); + const candidates=issues.map(i=>({issue:i,score:jaccard(words,extractWords(`${i.title}\n${i.body||''}`))})) + 
.filter(c=>c.score>0.1).sort((a,b)=>b.score-a.score).slice(0,30); + + const results=[]; + for(const{issue}of candidates){ + try{ + const r=await fetch(`${endpoint}/chat/completions`,{method:"POST",headers:{"Content-Type":"application/json","Authorization":`Bearer ${token}`}, + body:JSON.stringify({model,temperature:0.3,max_tokens:150,messages:[ + {role:"system",content:'Analyze GitHub issue similarity. Return JSON only: {"score":0.0,"reason":"brief"}'}, + {role:"user",content:`Current:\nTitle: ${title}\nBody: ${body}\n\nCompare:\nTitle: ${issue.title}\nBody: ${issue.body||'None'}`} + ]})}); + const d=await r.json(); + if(!d.choices?.[0])continue; + const parsed=JSON.parse(d.choices[0].message.content.trim().replace(/^```json?\s*/gm,'').replace(/```$/gm,'')); + if(parsed.score>=0.6) results.push({number:issue.number,title:issue.title,state:issue.state,url:issue.html_url,score:parsed.score,reason:parsed.reason,labels:issue.labels.map(l=>l.name)}); + await new Promise(r=>setTimeout(r,100)); + }catch(e){console.error(`#${issue.number}:`,e.message)} + } + results.sort((a,b)=>b.score-a.score); + const top=results.slice(0,5); + + let comment=''; + if(top.length){ + comment=`## 🔍 Similar Issues Found\n\n`; + top.forEach((s,i)=>{ + comment+=`
${i+1}. #${s.number}: ${s.title} (${Math.round(s.score*100)}%)\n\n`; + comment+=`**State:** ${s.state==='open'?'🟢 Open':'🔴 Closed'} \n**Labels:** ${s.labels.slice(0,5).map(l=>'`'+l+'`').join(', ')||'None'}\n`; + if(s.reason) comment+=`**Why:** ${s.reason}\n`; + comment+=`
\n\n`; + }); + comment+=`---\n*AI-powered similar issue detection*`; + } else { + comment=`## 🔍 No similar issues found with high confidence.\n\n---\n*AI-powered similar issue detection*`; + } + await octokit.issues.createComment({owner,repo,issue_number:issueNum,body:comment}); + })(); + SCRIPT diff --git a/.github/workflows/inclusive-heat-sensor.yml b/.github/workflows/inclusive-heat-sensor.yml new file mode 100644 index 0000000000..26664076b2 --- /dev/null +++ b/.github/workflows/inclusive-heat-sensor.yml @@ -0,0 +1,22 @@ +name: Inclusive Heat Sensor +on: + issues: + types: [opened, reopened] + issue_comment: + types: [created, edited] + pull_request_review_comment: + types: [created, edited] + +permissions: + contents: read + issues: write + pull-requests: write + +jobs: + detect-heat: + uses: jonathanpeppers/inclusive-heat-sensor/.github/workflows/comments.yml@v0.1.2 + secrets: inherit + with: + minimizeComment: true + offensiveThreshold: 9 + angerThreshold: 9 diff --git a/README-AI.md b/README-AI.md new file mode 100644 index 0000000000..36bf00a264 --- /dev/null +++ b/README-AI.md @@ -0,0 +1,84 @@ +# AI-Native Development Infrastructure + +This document describes the AI-native development infrastructure added to the ML.NET repository. These files teach GitHub Copilot (and other AI agents) how to navigate, build, test, and contribute to this codebase. 
+ +## What Was Created + +### Instructions (teach AI your repo) + +| File | Type | Purpose | +|------|------|---------| +| `.github/copilot-instructions.md` | Generated | Global Copilot context — repo overview, build/test commands, conventions, project structure | +| `.github/instructions/tests.instructions.md` | Generated | Test-specific guidance — xUnit patterns, base classes, assertion styles, naming | +| `.github/instructions/genai.instructions.md` | Generated | GenAI component guidance — LLaMA/Phi/Mistral via TorchSharp, Semantic Kernel | +| `.github/instructions/tokenizers.instructions.md` | Generated | Tokenizer guidance — BPE/WordPiece/SentencePiece, data packages | + +### Agents (multi-step AI workflows) + +| File | Type | Purpose | +|------|------|---------| +| `.github/agents/pr.md` | Configured | 4-phase PR workflow: Pre-Flight → Gate → Fix → Report | +| `.github/agents/pr/post-gate.md` | Configured | Multi-model fix exploration (Phase 3-4) | +| `.github/agents/pr/SHARED-RULES.md` | Configured | Shared rules and model configuration | +| `.github/agents/write-tests-agent.md` | Configured | Test writing dispatcher following xUnit conventions | +| `.github/agents/learn-from-pr.md` | Configured | Self-improvement — extract lessons from PRs | + +### Skills (focused AI capabilities) + +| File | Type | Purpose | +|------|------|---------| +| `.github/skills/pr-build-status/SKILL.md` | Configured | Read Azure Pipelines + GitHub Actions CI results | +| `.github/skills/try-fix/SKILL.md` | Configured | Single-shot fix → test → report cycle | +| `.github/skills/run-tests/SKILL.md` | Configured | Build and run tests with filtering | +| `.github/skills/verify-tests-fail/SKILL.md` | Configured | Prove tests catch bugs (fail without fix, pass with) | +| `.github/skills/pr-finalize/SKILL.md` | Universal | Verify PR title/description match implementation | +| `.github/skills/issue-triage/SKILL.md` | Universal | Triage open issues by milestone/priority | +| 
`.github/skills/find-reviewable-pr/SKILL.md` | Universal | Find PRs needing review | +| `.github/skills/learn-from-pr/SKILL.md` | Universal | Analyze PRs for lessons learned | +| `.github/skills/ai-summary-comment/SKILL.md` | Universal | Post unified progress comments on PRs | + +### Workflows (automated GitHub Actions) + +| File | Type | Purpose | +|------|------|---------| +| `.github/workflows/copilot-setup-steps.yml` | Pre-existing | Remote Copilot Coding Agent build environment | +| `.github/workflows/find-similar-issues.yml` | Universal | AI duplicate detection on new issues | +| `.github/workflows/inclusive-heat-sensor.yml` | Universal | Detects heated language in comments | + +### Prompts + +| File | Type | Purpose | +|------|------|---------| +| `.github/prompts/release-notes.prompt.md` | Configured | Generate classified release notes between commits | + +## File Types + +- **Generated** — Produced by analyzing the ML.NET repo's specific structure and conventions +- **Configured** — Template filled with ML.NET's build/test commands and project structure +- **Universal** — Works on any GitHub repo unchanged +- **Pre-existing** — Already present in the repo before onboarding + +## CI Feedback Loop + +The CI feedback loop enables AI agents to iterate on failures: + +``` +Agent writes code → Push → CI runs → Agent reads results → Agent fixes → Repeat +``` + +Three components make this work: +1. **`copilot-setup-steps.yml`** — Remote Copilot Coding Agent can build the repo (pre-existing) +2. **`pr-build-status` skill** — Agent reads Azure Pipelines/GitHub Actions results +3. **Build/test commands in instructions** — Agent knows how to build and test locally + +## Next Steps + +1. **Review `copilot-instructions.md`** — It's the highest-impact file. Verify it captures your team's conventions accurately. +2. 
**Review scoped instructions** — Check that glob patterns in `tests.instructions.md`, `genai.instructions.md`, and `tokenizers.instructions.md` match your project structure. +3. **Commit**: + ```bash + git add .github/ + git commit -m "Add AI-native development infrastructure" + ``` +4. **Test it** — Open a PR and ask Copilot to review it. +5. **Improve** — After your first PR with AI involvement, use `learn-from-pr` to refine the instructions based on real experience.
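Before the commit in step 3, it may help to confirm that the files listed in the tables above are actually present. A minimal sketch, assuming it is run from the repository root; the function name `missing_ai_files` is hypothetical, and the file list is only a sample taken from the tables in this README, not a complete inventory.

```bash
# Hypothetical helper (not part of the repo): print each expected file that is
# missing under the given root directory; return non-zero if any is missing.
missing_ai_files() {
  local root="${1:-.}"
  local rc=0
  local f
  for f in \
    .github/copilot-instructions.md \
    .github/instructions/tests.instructions.md \
    .github/instructions/genai.instructions.md \
    .github/instructions/tokenizers.instructions.md \
    .github/skills/run-tests/SKILL.md \
    .github/skills/pr-build-status/SKILL.md
  do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f"
      rc=1
    fi
  done
  return "$rc"
}

# Usage: stage the files only when nothing from the sample list is missing.
#   missing_ai_files . && git add .github/
```

Extending the list to cover the agents, workflows, and prompt file is a matter of adding their paths to the loop.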