[LSP] use source.fixAll.pyrefly for fix-all code actions #3039

Arths17 wants to merge 3 commits into facebook:main.
Conversation
Pull request overview
Note: Copilot was unable to run its full agentic suite in this review.
This PR adds a synthetic “mock_org” Python package (and a generator script) under perf/scale_test/ intended to stress import graphs and typing/indexing performance. However, the changes shown do not match the PR title/description about LSP source.fixAll.pyrefly code actions (Fixes #3024), suggesting a significant mismatch between the stated intent and the actual diff.
Changes:
- Added a code generator (generate_mock_org.py) to create a large, heavily-typed, circularly-importing package for perf scaling tests.
- Checked in several large generated modules (core.py, models.py, events.py, analytics.py) plus the package __init__.py and a README for the perf dataset.
- (Mismatch) No visible LSP/server capability changes related to source.fixAll.pyrefly are included in the provided diff.
Reviewed changes
Copilot reviewed 12 out of 26 changed files in this pull request and generated 7 comments.
Summary per file:
| File | Description |
|---|---|
| perf/scale_test/mock_org/models.py | Adds large generated “models” module with deep generic types and entity classes. |
| perf/scale_test/mock_org/events.py | Adds large generated “events” module with similar typing/import stress patterns. |
| perf/scale_test/mock_org/core.py | Adds large generated “core” module for the perf corpus. |
| perf/scale_test/mock_org/analytics.py | Adds large generated “analytics” module for the perf corpus. |
| perf/scale_test/mock_org/__init__.py | Declares package exports for the synthetic perf package. |
| perf/scale_test/generate_mock_org.py | Generator script to (re)create the synthetic perf corpus. |
| perf/scale_test/README.md | Intro README for the perf dataset and regeneration instructions. |
perf/scale_test/generate_mock_org.py (Outdated)
```python
#!/usr/bin/env python3
"""Generate a synthetic enterprise-scale Python codebase for indexing benchmarks.

This generator creates a package with many modules that intentionally stress:
- deep inheritance and large class graphs
- heavy typing usage (Generic, Protocol, Annotated, callable-heavy signatures)
- dense cross-module and circular import structure
- high metadata volume via Google-style docstrings
"""
```
The PR title/description is about LSP fix-all code action kinds (source.fixAll.pyrefly), but the diff adds a perf-scale synthetic Python corpus generator + generated modules. Please reconcile this by either (a) updating the PR description/title to match these perf changes, or (b) moving these perf corpus changes to a separate PR and including the intended LSP changes here.
perf/scale_test/mock_org/models.py (Outdated)
```python
from . import core as prev_mod
from . import services as next_mod
from . import utils as next_next_mod
```
This module imports sibling modules (services, utils) that are not included in the provided diff. As-is, importing mock_org.models will raise ModuleNotFoundError unless those modules already exist. If the intent is to check in a ready-to-import perf corpus, add the missing modules (even as lightweight stubs) or adjust the import strategy (e.g., generate/check in the full set, or gate non-essential imports behind TYPE_CHECKING).
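One way to gate the non-essential sibling imports, as the comment suggests — a minimal sketch (the `link_label` helper is illustrative, not from the PR):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by type checkers, never at runtime, so the package
    # stays importable even while the sibling modules are absent.
    from . import services as next_mod
    from . import utils as next_next_mod


def link_label() -> str:
    # At runtime, refer to the gated names only inside annotations quoted
    # as strings (or behind `from __future__ import annotations`).
    return "models -> services -> utils"
```

Since `TYPE_CHECKING` is `False` at runtime, the guarded block never executes, which sidesteps the `ModuleNotFoundError` while keeping the cross-module edges visible to the indexer.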
perf/scale_test/mock_org/events.py (Outdated)
```python
from . import auth as prev_mod
from . import api as prev_prev_mod
from . import analytics as next_mod
```
Similar to models.py, this module imports siblings (auth, api) that are not present in the provided diff. If those modules are not already in the repo, mock_org.events will fail to import at runtime. Please ensure the full referenced module set is present/checked in, or modify these imports so the package is importable.
Suggested change:
```diff
-from . import auth as prev_mod
-from . import api as prev_prev_mod
-from . import analytics as next_mod
+try:
+    from . import auth as prev_mod
+except ImportError:
+    prev_mod = None
+try:
+    from . import api as prev_prev_mod
+except ImportError:
+    prev_prev_mod = None
+try:
+    from . import analytics as next_mod
+except ImportError:
+    next_mod = None
```
perf/scale_test/mock_org/models.py (Outdated)
```python
V = TypeVar("V")
W = TypeVar("W")

ComplexCallable = Callable[[List[T], Dict[str, Any]], Awaitable[V]]
```
ComplexCallable is defined with Dict[str, Any] (free type vars: T, V), but the file later uses it as if it takes three type parameters (e.g., ComplexCallable[T, U, V], ComplexCallable[int, str, float]). With postponed evaluation, this won’t crash at import time, but type checkers will flag it (wrong number of type args) and it also makes callbacks typed with Dict[str, str] not safely compatible with Dict[str, Any] due to dict invariance. Align the alias with usage (e.g., Dict[str, U]) or update all call sites to match the actual arity.
Suggested change:
```diff
-ComplexCallable = Callable[[List[T], Dict[str, Any]], Awaitable[V]]
+ComplexCallable = Callable[[List[T], Dict[str, U]], Awaitable[V]]
```
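The arity mismatch is also observable at runtime when the alias is subscripted eagerly; a small standalone sketch of the issue and the fix:

```python
from typing import Any, Awaitable, Callable, Dict, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")
V = TypeVar("V")

# Original alias: only two free type variables (T and V), because
# Dict[str, Any] contributes none.
OldComplexCallable = Callable[[List[T], Dict[str, Any]], Awaitable[V]]

# Aligned alias: three free type variables, matching call sites such as
# ComplexCallable[int, str, float].
ComplexCallable = Callable[[List[T], Dict[str, U]], Awaitable[V]]

try:
    OldComplexCallable[int, str, float]  # wrong arity: alias expects 2 args
    old_alias_accepts_three = True
except TypeError:
    old_alias_accepts_three = False

Specialized = ComplexCallable[int, str, float]  # fine: 3 free vars
```

With postponed evaluation the wrong-arity subscript is never executed, which is why only type checkers flag it in the committed corpus.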
perf/scale_test/generate_mock_org.py (Outdated)
```python
ComplexCallable = Callable[[List[T], Dict[str, U]], Awaitable[V]]
MergeCallable = Callable[[Sequence[V], Mapping[str, U]], Awaitable[Dict[str, V]]]
AuditTag = Annotated[str, "benchmark", "{name}"]
Vector = Annotated[List[Annotated[T, "vector-item"]], "vector"]
```
The checked-in generated modules don’t appear to be reproducible from the current generator output. For example, the generator emits callback(..., {\"position\": position}) (an int), while some committed modules use str(position), and models.py’s ComplexCallable uses Dict[str, Any] rather than Dict[str, U]. This makes regeneration noisy and undermines the README instruction to regenerate the dataset. Recommend updating the generator to exactly match the committed corpus (or re-generating and committing the corpus from this generator) so the process is deterministic.
perf/scale_test/generate_mock_org.py (Outdated)
```python
"""
output: List[V] = []
for position, item in enumerate(items):
    output.append(await callback([item], {{"position": position}}))
```
The checked-in generated modules don’t appear to be reproducible from the current generator output. For example, the generator emits callback(..., {\"position\": position}) (an int), while some committed modules use str(position), and models.py’s ComplexCallable uses Dict[str, Any] rather than Dict[str, U]. This makes regeneration noisy and undermines the README instruction to regenerate the dataset. Recommend updating the generator to exactly match the committed corpus (or re-generating and committing the corpus from this generator) so the process is deterministic.
Suggested change:
```diff
-output.append(await callback([item], {{"position": position}}))
+output.append(await callback([item], {{"position": str(position)}}))
```
perf/scale_test/generate_mock_org.py (Outdated)
```python
while len(content.splitlines()) < line_target:
    filler_index = len(content.splitlines())
    content += f'''\n\n
def filler_{name}_{filler_index:04d}(\n callback: ComplexCallable[int, str, float],\n) -> Annotated[Callable[[List[int], Dict[str, str]], Awaitable[float]], "filler-signature"]:\n """Return callback unchanged to enlarge AST and annotation graph.\n\n Args:\n callback: Existing callback to pass through unchanged.\n\n Returns:\n The same callback object.\n """\n return callback\n'''
```
This loop repeatedly calls content.splitlines() (twice per iteration), which becomes increasingly expensive as content grows. Consider tracking the current line count in a variable and incrementing it based on appended filler, or computing splitlines() once per iteration, to keep generation time stable for larger corpora.
Suggested change:
```diff
-while len(content.splitlines()) < line_target:
-    filler_index = len(content.splitlines())
-    content += f'''\n\n
-def filler_{name}_{filler_index:04d}(\n callback: ComplexCallable[int, str, float],\n) -> Annotated[Callable[[List[int], Dict[str, str]], Awaitable[float]], "filler-signature"]:\n """Return callback unchanged to enlarge AST and annotation graph.\n\n Args:\n callback: Existing callback to pass through unchanged.\n\n Returns:\n The same callback object.\n """\n return callback\n'''
+current_line_count = len(content.splitlines())
+while current_line_count < line_target:
+    filler_index = current_line_count
+    filler = f'''\n\n
+def filler_{name}_{filler_index:04d}(\n callback: ComplexCallable[int, str, float],\n) -> Annotated[Callable[[List[int], Dict[str, str]], Awaitable[float]], "filler-signature"]:\n """Return callback unchanged to enlarge AST and annotation graph.\n\n Args:\n callback: Existing callback to pass through unchanged.\n\n Returns:\n The same callback object.\n """\n return callback\n'''
+    content += filler
+    current_line_count += len(filler.splitlines())
```
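The incremental approach generalizes to any append loop; a standalone sketch (function name and filler here are hypothetical, not from the generator):

```python
def pad_to_line_target(content: str, line_target: int, filler: str) -> str:
    """Append filler until content reaches at least line_target lines.

    The line count is computed once and then updated incrementally, instead
    of calling splitlines() on the ever-growing string each iteration.
    """
    current = len(content.splitlines())
    while current < line_target:
        content += filler
        current += len(filler.splitlines())
    return content
```

For example, padding a 2-line string to 6 lines with a 2-line filler appends the filler twice, with only one full `splitlines()` pass over the original content.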
Code action save hooks currently require enabling generic source.fixAll, which can trigger unrelated servers. This change advertises and emits source.fixAll.pyrefly so editors can target pyrefly-only fix-all behavior.

The code action filter now accepts both source.fixAll and source.fixAll.* requests so parent-kind clients remain compatible while pyrefly-specific requests are supported.
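With the dedicated kind advertised, editors that follow the VS Code convention could scope fix-on-save to pyrefly alone; an illustrative settings fragment (not part of this PR):

```json
{
  "editor.codeActionsOnSave": {
    "source.fixAll.pyrefly": "explicit"
  }
}
```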
When a client requests source.fixAll.*, we should not run pyrefly fix-all for sibling providers. Restrict matching to either the generic source.fixAll ancestor or source.fixAll.pyrefly descendants.
Force-pushed from 3f64a97 to 4948b1e.
Arths17 left a comment:
Force-push update: this PR now contains only the intended LSP fix for #3024 (2 files changed: server.rs and the corresponding LSP interaction test). The earlier Copilot comments on perf/scale_test/* refer to outdated threads from a prior diff and are no longer applicable to the current head commit.
pyrefly/lib/lsp/non_wasm/server.rs (Outdated)
```diff
-.any(|kind| kind == &CodeActionKind::SOURCE_FIX_ALL)
+kinds.iter().any(|kind| {
+    kind == &CodeActionKind::SOURCE_FIX_ALL
+        || kind.as_str().starts_with(SOURCE_FIX_ALL_PYREFLY)
```
I think the second condition should match `source.fixAll.pyrefly` exactly rather than using `starts_with`, because `source.fixAll.pyreflyyyyyy` shouldn't match, and neither should `source.fixAll.pyrefly.foo`.
The PR correctly updates all 4 locations (constant, capability registration, action tagging, and test). Nice clean change. Agree with @rchl's comment on the filter matching: `starts_with(SOURCE_FIX_ALL_PYREFLY)` should be an exact match instead, since `starts_with` would incorrectly match kinds like `source.fixAll.pyrefly.foo`. After making this change, the PR should be good to merge.
Reviewer feedback pointed out that prefix matching for source.fixAll.pyrefly would accept unrelated kinds such as source.fixAll.pyrefly.foo, making fix-all behavior broader than intended. Use exact matching for source.fixAll.pyrefly (while still accepting source.fixAll), and add regression tests covering both accepted and rejected kind values. This keeps code-action filtering aligned with LSP expectations and prevents false-positive fix-all triggers.
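The final matching rule can be modeled in a few lines of Python (the real implementation lives in Rust in server.rs; this is only a behavioral sketch with an illustrative function name):

```python
SOURCE_FIX_ALL = "source.fixAll"
SOURCE_FIX_ALL_PYREFLY = "source.fixAll.pyrefly"


def is_fix_all_kind_requested(kind: str) -> bool:
    # Exact matches only: a prefix test would wrongly accept kinds like
    # "source.fixAll.pyrefly.foo" or "source.fixAll.pyreflyyyyyy".
    return kind in (SOURCE_FIX_ALL, SOURCE_FIX_ALL_PYREFLY)
```

Accepting the generic parent kind keeps clients that only send `source.fixAll` working, while the exact pyrefly kind lets editors target this server alone.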
Addressed the review feedback on fix-all kind matching and pushed an update. What changed:

Validation run locally:

Latest commit:
@rchl @NathanTempest re-requesting review after the latest push.
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅ |
NathanTempest left a comment:
The PR looks solid to merge, but I have a few nitpicks:
- Missing negative test for unrelated kinds
The unit tests verify that source.fixAll and source.fixAll.pyrefly are accepted, and that source.fixAll.pyrefly.foo / source.fixAll.pyreflyyyyyy are rejected. But they don't test that completely unrelated kinds like quickfix or refactor.extract are rejected. Adding one line would improve confidence:
```rust
assert!(!is_fix_all_code_action_kind_requested(
    &CodeActionKind::QUICKFIX
));
```
- No integration test for the actual code action response kind
The basic.rs test verifies the server advertises source.fixAll.pyrefly in capabilities, but no test verifies that when a fix-all action is actually returned, it carries kind: "source.fixAll.pyrefly" (not the old "source.fixAll"). A lightweight LSP interaction test that triggers a redundant cast scenario and asserts the returned code action kind would close this gap. This is pre-existing technical debt, but since the PR is touching this exact area, it would be the ideal time to add it.
- Function naming
is_fix_all_code_action_kind_requested is verbose. Since it takes a single CodeActionKind and checks if it matches fix-all semantics, something like is_fix_all_kind or matches_fix_all_kind would be more concise while equally clear. This is subjective though.
Summary
- Advertise source.fixAll.pyrefly in server capabilities instead of the generic source.fixAll.
- When the client sends context.only, run pyrefly fix-all only when the request includes either source.fixAll or a source.fixAll.pyrefly... kind.

Fixes #3024.