Conversation
📝 WalkthroughWalkthroughIntroduces lifecycle flag and centralized result construction in CytoDataFrame to preserve widget/HTML state for pandas-derived results; adjusts transposition and wrapping behavior. Also pins two pre-commit hooks, reorders Bandit config in pyproject, and extends tests for slicing and passthrough behavior. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Pull request overview
Fixes CytoDataFrame slice access (cdf.iloc[x:y:z]) so derived frames preserve CytoDataFrame behavior and notebook HTML/widget display state (Issue #204).
Changes:
- Add
_constructor/_build_result_cdfpathway to preserve CytoDataFrame metadata across pandas operations (includingilocslicing). - Adjust transpose/HTML-generation logic to avoid relying on CytoDataFrame’s overridden transpose during rendering.
- Add tests asserting
ilocslices returnCytoDataFrameand keep HTML formatting; bump pre-commit tool versions and relocate Bandit config inpyproject.toml.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/cytodataframe/frame.py |
Introduces _constructor + _build_result_cdf to preserve CytoDataFrame state through slicing/operations; tweaks transpose + HTML generation logic. |
tests/test_frame.py |
Adds regression tests for iloc slicing returning CytoDataFrame and preserving HTML renderer/widget state. |
pyproject.toml |
Moves Bandit config block earlier in the file (plus comment retained). |
.pre-commit-config.yaml |
Updates pyproject-fmt and ruff-pre-commit hook revisions. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/cytodataframe/frame.py (1)
385-419: Fix star-arg unpacking after keyword arguments (B026).The static analysis tool correctly flags line 406-407: unpacking
*argsafter keyword arguments is strongly discouraged and can lead to confusing behavior. Since_build_result_cdfis primarily called from__getitem__and_return_cytodataframewithout extra positional args, consider removing the*argsforwarding or restructuring.♻️ Proposed fix to address B026 warning
def _build_result_cdf( self: CytoDataFrame_type, data: Union["CytoDataFrame", pd.DataFrame, pd.Series, Any], - *args: Tuple[Any, ...], **kwargs: Dict[str, Any], ) -> CytoDataFrame_type: """Construct a result frame while preserving CytoDataFrame metadata.""" if data is None and "data" in kwargs: data = kwargs.pop("data") cdf = CytoDataFrame( data=data, data_context_dir=self._custom_attrs["data_context_dir"], data_image_paths=self._custom_attrs["data_image_paths"], data_bounding_box=self._custom_attrs["data_bounding_box"], compartment_center_xy=self._custom_attrs["compartment_center_xy"], data_mask_context_dir=self._custom_attrs["data_mask_context_dir"], data_outline_context_dir=self._custom_attrs["data_outline_context_dir"], segmentation_file_regex=self._custom_attrs["segmentation_file_regex"], image_adjustment=self._custom_attrs["image_adjustment"], display_options=self._custom_attrs["display_options"], - *args, **kwargs, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cytodataframe/frame.py` around lines 385 - 419, The _build_result_cdf function currently forwards *args after many keyword args to the CytoDataFrame constructor which triggers B026 (star-arg unpacking after keywords); remove the *args forwarding and any unused positional propagation (or move positional args before the explicit keywords) so the call becomes an explicit keyword-only construction of CytoDataFrame; update callers like __getitem__ and _return_cytodataframe if they relied on passing extra positional args (validate they don't pass extras) and ensure _build_result_cdf signature can omit *args (or accept only **kwargs) to avoid the unsafe unpacking while preserving metadata assignments to cdf._custom_attrs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cytodataframe/frame.py`:
- Around line 450-463: The code computes result (variable result) then only
assigns cdf when result is a pandas DataFrame, but always returns cdf, which can
trigger UnboundLocalError if result isn't a DataFrame; change the control flow
in the method that calls _build_result_cdf so that after computing result it
checks isinstance(result, pd.DataFrame) and: if True, call
self._build_result_cdf(result), set cdf._custom_attrs["is_transposed"] when
method_name == "transpose" and not self._custom_attrs["is_transposed"], and
return that cdf; otherwise directly return result (or explicitly return result
when not a DataFrame). Ensure references to result, cdf, method_name,
_build_result_cdf and _custom_attrs["is_transposed"] are updated accordingly.
---
Nitpick comments:
In `@src/cytodataframe/frame.py`:
- Around line 385-419: The _build_result_cdf function currently forwards *args
after many keyword args to the CytoDataFrame constructor which triggers B026
(star-arg unpacking after keywords); remove the *args forwarding and any unused
positional propagation (or move positional args before the explicit keywords) so
the call becomes an explicit keyword-only construction of CytoDataFrame; update
callers like __getitem__ and _return_cytodataframe if they relied on passing
extra positional args (validate they don't pass extras) and ensure
_build_result_cdf signature can omit *args (or accept only **kwargs) to avoid
the unsafe unpacking while preserving metadata assignments to cdf._custom_attrs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3fb3a13d-7af5-40c7-addd-bc765b2cd7f7
📒 Files selected for processing (4)
.pre-commit-config.yamlpyproject.tomlsrc/cytodataframe/frame.pytests/test_frame.py
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/test_frame.py (1)
799-801: Add bracket-slice coverage too.These tests only hit
.iloc, but the production change also rewiresCytoDataFrame.__getitem__()forcdf[x:y]/cdf[x:y:z]. A regression in the direct slice path would still pass this suite, so I'd add at least one type +_repr_html_(debug=True)assertion forcdf[0:2],cdf[1:1], andcdf[0:5:2].Also applies to: 814-832
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_frame.py` around lines 799 - 801, The tests only exercise iloc slicing but the change also affects CytoDataFrame.__getitem__ for direct bracket slices; add parallel assertions for cdf[0:2], cdf[1:1], and cdf[0:5:2] that check isinstance(..., CytoDataFrame) and call _repr_html_(debug=True) to ensure the direct-slice path is covered; update the tests near the existing iloc checks (and likewise for the block around 814-832) to include these bracket-slice assertions so regressions in CytoDataFrame.__getitem__ are caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cytodataframe/frame.py`:
- Around line 459-463: The transpose handling currently only sets
cdf._custom_attrs["is_transposed"] = True when the parent isn't transposed,
which fails to flip the flag on repeated transposes (e.g., cdf.T.T); change the
logic in the block after _build_result_cdf(result) (the code checking
method_name == "transpose") to toggle the transposed flag on the resulting cdf
(e.g., set cdf._custom_attrs["is_transposed"] to the inverse of the current flag
instead of only setting True) so _generate_jupyter_dataframe_html sees the
correct transposed state.
---
Nitpick comments:
In `@tests/test_frame.py`:
- Around line 799-801: The tests only exercise iloc slicing but the change also
affects CytoDataFrame.__getitem__ for direct bracket slices; add parallel
assertions for cdf[0:2], cdf[1:1], and cdf[0:5:2] that check isinstance(...,
CytoDataFrame) and call _repr_html_(debug=True) to ensure the direct-slice path
is covered; update the tests near the existing iloc checks (and likewise for the
block around 814-832) to include these bracket-slice assertions so regressions
in CytoDataFrame.__getitem__ are caught.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 252f24c8-1e5b-4d56-a13a-9e9129d3ca76
📒 Files selected for processing (3)
pyproject.tomlsrc/cytodataframe/frame.pytests/test_frame.py
✅ Files skipped from review due to trivial changes (1)
- pyproject.toml
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/cytodataframe/frame.py (2)
4376-4381: Consider simplifying the conditional logic.The condition
if self._custom_attrs["is_transposed"] or bounding_box_externally_joineddetermines whether to reusedataor callself.copy(). However, at this point in the code:
- If
is_transposedis true,datawas already assigned at line 4360- If
bounding_box_externally_joinedis true,datawas assigned via join at lines 4369-4373The condition correctly avoids redundant
self.copy()calls whendatais already initialized, but the pattern is repeated three times (lines 4376-4381, 4398-4404, 4428-4435). Consider extracting a helper flag likedata_already_initializedset after eachdata = ...assignment to make the logic clearer.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cytodataframe/frame.py` around lines 4376 - 4381, The conditional that decides whether to reuse `data` or call `self.copy()` is repeated and unclear; introduce a boolean flag (e.g., `data_already_initialized`) set to True immediately after each `data = ...` assignment (the assignments tied to `self._custom_attrs["is_transposed"]`, the join that sets `data` when `bounding_box_externally_joined`, and any other place `data` is assigned) and then replace the three repeated ternary checks with a single check `if data_already_initialized: use data else: data = self.copy()` so that the logic around `data` reuse in the blocks referencing `data` is explicit and only calls `self.copy()` when needed.
392-393: Consider removing the defensivedatacheck.The
data is None and "data" in kwargscheck appears unreachable sincedatais a required positional parameter with no default value. IfdataisNone, it was explicitly passed asNone. Popping from kwargs would only matter if someone called_build_result_cdf(None, data=some_df), which is an unusual pattern.If this is intentional for a specific use case (e.g., pandas internal calls), consider adding a brief comment explaining when this scenario occurs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cytodataframe/frame.py` around lines 392 - 393, The conditional check `if data is None and "data" in kwargs: data = kwargs.pop("data")` in _build_result_cdf is unnecessary because `data` is a required positional parameter and this branch is effectively unreachable; remove that `if` block (or if you intentionally expect callers to pass `None` positionally and a `data` kwarg, keep it but add a concise comment above the line explaining that specific pandas/internal calling pattern) so _build_result_cdf no longer pops `data` from kwargs unexpectedly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/cytodataframe/frame.py`:
- Around line 4376-4381: The conditional that decides whether to reuse `data` or
call `self.copy()` is repeated and unclear; introduce a boolean flag (e.g.,
`data_already_initialized`) set to True immediately after each `data = ...`
assignment (the assignments tied to `self._custom_attrs["is_transposed"]`, the
join that sets `data` when `bounding_box_externally_joined`, and any other place
`data` is assigned) and then replace the three repeated ternary checks with a
single check `if data_already_initialized: use data else: data = self.copy()` so
that the logic around `data` reuse in the blocks referencing `data` is explicit
and only calls `self.copy()` when needed.
- Around line 392-393: The conditional check `if data is None and "data" in
kwargs: data = kwargs.pop("data")` in _build_result_cdf is unnecessary because
`data` is a required positional parameter and this branch is effectively
unreachable; remove that `if` block (or if you intentionally expect callers to
pass `None` positionally and a `data` kwarg, keep it but add a concise comment
above the line explaining that specific pandas/internal calling pattern) so
_build_result_cdf no longer pops `data` from kwargs unexpectedly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 90cc0305-2d5e-4f72-a629-74495ed002a6
📒 Files selected for processing (2)
src/cytodataframe/frame.pytests/test_frame.py
✅ Files skipped from review due to trivial changes (1)
- tests/test_frame.py
Description
This PR addresses an issue with CytoDataFrame when accessing slices (
CDF[x:x:x], orCDF[x:x]).Closes #204
What kind of change(s) are included?
Checklist
Please ensure that all boxes are checked before indicating that this pull request is ready for review.
Summary by CodeRabbit
Bug Fixes
Chores
Tests