docs: document restricted diagram operations (new in 2.2) by dimitri-yatsenko · Pull Request #155 · datajoint/datajoint-docs

dimitri-yatsenko · 2026-02-22T17:55:37Z

Summary

- Add Operational Methods section to diagram.md spec: cascade(), restrict(), delete(), drop(), preview(), restriction propagation rules, OR-vs-AND convergence
- Add Graph-Driven Diagram Operations section to whats-new-22.md: motivation, preview-then-execute pattern, two propagation modes, architecture note
- Add Diagram-Level Delete section to delete-data.md: build-preview-execute workflow, when to use
- Add version admonition in data-manipulation.md noting graph-driven cascade internals
- Cross-references between all four files

Test plan

- mkdocs build succeeds with no new warnings
- Review rendered pages for diagram.md, whats-new-22.md, delete-data.md, data-manipulation.md
- Verify cross-reference links resolve correctly

🤖 Generated with Claude Code

- Add Operational Methods section to diagram.md spec: cascade(), restrict(), delete(), drop(), preview(), prune(), restriction propagation rules, OR-vs-AND convergence
- Add Graph-Driven Diagram Operations section to whats-new-22.md: motivation, preview-then-execute pattern, two propagation modes, pruning empty tables
- Add Diagram-Level Delete section to delete-data.md: build-preview-execute workflow, when to use
- Add prune() to read-diagrams how-to
- Add version admonition in data-manipulation.md noting graph-driven cascade internals
- Cross-references between all files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add dry_run parameter to delete() and drop() signatures in diagram.md
- Fix trailing slashes in cross-reference paths across 3 files
- Convert inline version markers to proper admonitions in read-diagrams.ipynb
- Normalize table-cell version markers to consistent *(New in X.Y)* format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…stors Both cascade() and restrict() propagate downstream only from the seed table. Ancestors of the seed are excluded. Document this in the diagram spec (cascade and restrict method descriptions) and the whats-new explanation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Ancestors remain in the diagram but receive no restrictions and are unaffected by delete/preview. Previous wording ("excluded") was imprecise — they're not removed from the graph, just not operated on. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update diagram.md and whats-new-22.md to reflect that cascade() returns a trimmed Diagram containing only seed + descendants, while restrict() keeps the full graph intact for chaining. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Lead each description with its purpose rather than using parallel structure. cascade() prepares a delete (one-shot, trims graph, OR). restrict() selects a data subset (chainable, preserves graph, AND). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Describe how cascade propagates restrictions upward from part to master, then back downstream to all sibling parts, deleting the entire compositional unit. Updated in both diagram.md and master-part.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

dimitri-yatsenko · 2026-03-10T21:04:40Z

src/explanation/whats-new-22.md

 # What's New in DataJoint 2.2

-DataJoint 2.2 introduces **isolated instances** and **thread-safe mode** for applications that need multiple independent database connections—web servers, multi-tenant notebooks, parallel pipelines, and testing.
+DataJoint 2.2 introduces **isolated instances**, **thread-safe mode**, and **graph-driven diagram operations** for applications that need multiple independent database connections, explicit cascade control, and operational use of the dependency graph.


Restructure this sentence. Thread-safe mode is for applications that handle multiple connections, but graph-driven operations are for general use, even in single-threaded applications.

Fixed. Restructured to: "introduces isolated instances and thread-safe mode for applications that need multiple independent database connections, and graph-driven diagram operations that replace the legacy error-driven cascade with a reliable, inspectable approach for all users."

dimitri-yatsenko · 2026-03-10T21:07:29Z

src/explanation/whats-new-22.md

+
+The diagram supports two restriction propagation modes designed for fundamentally different tasks.
+
+**`cascade()` prepares a delete.** It takes a single restricted table expression, propagates the restriction downstream through all descendants, and **trims the diagram** to the resulting subgraph — ancestors and unrelated tables are removed entirely. Convergence uses OR: a descendant row is marked for deletion if *any* ancestor path reaches it, because if any reason exists to remove a row, it should be removed. `cascade()` is one-shot and is always followed by `preview()` or `delete()`.


Here include a description of how cascade behaves when it encounters a part table whose master is not yet included in the cascade.

Added a paragraph after cascade() describing part table behavior: with "enforce" (default), delete() raises an error if part rows would be deleted without their master; with "cascade", the restriction propagates upward from part to master, then back downstream to all sibling parts — deleting the entire compositional unit.
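
To make the `"cascade"` behavior described above concrete, here is a minimal sketch using plain Python collections. It is not DataJoint code: the `masters` and `parts` structures, the `Email` part table, and the `cascade_from_part` helper are hypothetical names invented for illustration. It only shows the order of the three steps (part to master, master restriction, master back to all sibling parts).

```python
# Toy data (hypothetical): each part row starts with the key of its master row.
masters = {1, 2, 3}
parts = {
    "Website": [(1, "a.org"), (2, "b.org")],
    "Email": [(1, "x@a.org"), (3, "z@c.org")],
}


def cascade_from_part(seed_part, predicate):
    """Propagate a restriction on one part table up to the master and back down.

    Returns the selected master keys and, per part table, the selected rows.
    """
    # Step 1: the restricted part rows identify the affected master keys.
    affected = {row[0] for row in parts[seed_part] if predicate(row)}
    # Step 2: the master is restricted to those keys.
    selected_masters = masters & affected
    # Step 3: the master restriction flows back down to every sibling part,
    # so the whole compositional unit is selected, not just the seed rows.
    selected_parts = {
        name: [row for row in rows if row[0] in selected_masters]
        for name, rows in parts.items()
    }
    return selected_masters, selected_parts


selected_masters, selected_parts = cascade_from_part(
    "Website", lambda row: row[1] == "a.org"
)
```

Restricting `Website` to one URL selects master key `1`, and with it the `Email` row that belongs to the same master, even though the original restriction never mentioned `Email`.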

dimitri-yatsenko · 2026-03-10T21:09:08Z

src/explanation/whats-new-22.md

+
+**`cascade()` prepares a delete.** It takes a single restricted table expression, propagates the restriction downstream through all descendants, and **trims the diagram** to the resulting subgraph — ancestors and unrelated tables are removed entirely. Convergence uses OR: a descendant row is marked for deletion if *any* ancestor path reaches it, because if any reason exists to remove a row, it should be removed. `cascade()` is one-shot and is always followed by `preview()` or `delete()`.
+
+**`restrict()` selects a data subset.** It propagates a restriction downstream but **preserves the full diagram**, allowing `restrict()` to be called again from a different seed table. This makes it possible to build up multi-condition subsets incrementally — for example, restricting by species from one table and by date from another. Convergence uses AND: a descendant row is included only if *all* restricted ancestors match, because an export should contain only rows satisfying every condition. After chaining restrictions, use `prune()` to remove empty tables and `preview()` to inspect the result.


Also, describe how part tables are restricted when reached through a path that did not include its master.

Covered in the same paragraph — describes the upward propagation from part to master when the part is reached through a path that didn't include its master.
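
The difference between OR convergence (cascade) and AND convergence (restrict) quoted in this thread can be sketched with a toy child table that references two restricted ancestors. This is an illustration under assumptions, not the library's implementation: the `converge` function, the row layout, and the `subject`/`session` attribute names are all hypothetical.

```python
def converge(child_rows, selected_parent_keys, mode):
    """Select child rows given the selected keys of each restricted parent.

    child_rows: list of dicts holding one foreign-key value per parent name.
    selected_parent_keys: dict mapping parent name -> set of selected keys.
    mode: "or" (cascade-style) or "and" (restrict-style).
    """
    if mode not in ("or", "and"):
        raise ValueError(f"unknown mode: {mode!r}")
    combine = any if mode == "or" else all
    return [
        row
        for row in child_rows
        if combine(row[parent] in keys for parent, keys in selected_parent_keys.items())
    ]


rows = [
    {"id": 1, "subject": "m1", "session": "s1"},
    {"id": 2, "subject": "m1", "session": "s2"},
    {"id": 3, "subject": "m2", "session": "s1"},
    {"id": 4, "subject": "m2", "session": "s2"},
]
selected = {"subject": {"m1"}, "session": {"s1"}}

or_ids = [row["id"] for row in converge(rows, selected, "or")]
and_ids = [row["id"] for row in converge(rows, selected, "and")]
```

With OR, any matching path is enough, so rows 1, 2, and 3 are selected. With AND, only row 1 satisfies both conditions.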

dimitri-yatsenko · 2026-03-10T21:10:08Z

src/explanation/whats-new-22.md

+
+**`restrict()` selects a data subset.** It propagates a restriction downstream but **preserves the full diagram**, allowing `restrict()` to be called again from a different seed table. This makes it possible to build up multi-condition subsets incrementally — for example, restricting by species from one table and by date from another. Convergence uses AND: a descendant row is included only if *all* restricted ancestors match, because an export should contain only rows satisfying every condition. After chaining restrictions, use `prune()` to remove empty tables and `preview()` to inspect the result.
+
+The two modes are mutually exclusive on the same diagram. This prevents accidental mixing of incompatible semantics — a delete diagram should never be reused for subsetting, and vice versa.


Helpfully, DataJoint will raise an error when a user attempts to mix cascade and restrict, or attempts to cascade more than once.

Added: "DataJoint raises an error if you attempt to mix cascade() and restrict(), or if you call cascade() more than once."
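
One way to picture the guard described above is a small mode flag on the diagram object. The sketch below is a hypothetical stand-in, not DataJoint's `Diagram` class: `ToyDiagram`, its `_mode` attribute, and the use of `RuntimeError` are assumptions made for illustration, and both methods return `self` for brevity even though the PR describes `cascade()` as returning a trimmed diagram.

```python
class ToyDiagram:
    """Hypothetical stand-in that tracks which propagation mode is in use."""

    def __init__(self):
        self._mode = None  # None, "cascade", or "restrict"
        self.seeds = []

    def cascade(self, seed):
        # cascade() is one-shot and cannot follow restrict() or another cascade().
        if self._mode is not None:
            raise RuntimeError(
                f"cannot cascade: diagram is already in {self._mode!r} mode"
            )
        self._mode = "cascade"
        self.seeds.append(seed)
        return self

    def restrict(self, seed):
        # restrict() may be chained, but never on a cascaded diagram.
        if self._mode == "cascade":
            raise RuntimeError("cannot restrict: diagram is already in 'cascade' mode")
        self._mode = "restrict"
        self.seeds.append(seed)
        return self
```

Chaining `restrict()` calls succeeds, while `cascade()` after `restrict()`, `restrict()` after `cascade()`, and a second `cascade()` all raise.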

dimitri-yatsenko · 2026-03-10T21:11:34Z

src/how-to/delete-data.md

+!!! version-added "New in 2.2"
+    Diagram-level delete was added in DataJoint 2.2.
+
+For complex scenarios — previewing the blast radius, working across schemas, or understanding the dependency graph before deleting — use `dj.Diagram` to build and inspect the cascade before executing.


Can the same thing be accomplished with table.delete(dry_run=True)?

Yes — table.delete(dry_run=True) returns the same affected row counts without deleting anything. Added a note at the top of the section mentioning this simpler alternative before introducing the diagram-level API.

dimitri-yatsenko · 2026-03-10T21:16:12Z

src/reference/specs/diagram.md

+| `"ignore"` | Allow deleting parts without masters |
+| `"cascade"` | Propagate restriction upward from part to master, then re-propagate downstream to all sibling parts |
+
+With `"cascade"`, the restriction flows **upward** from a part table to its master: the restricted part rows identify which master rows are affected, those masters receive a restriction, and that restriction propagates back downstream through the normal cascade — deleting the entire compositional unit (master + all parts), not just the originally matched part rows.


If I understand this correctly, this may result in a restriction on the part table with a subquery in its where clause that references the same part table. This may produce an error in MySQL when attempting to delete from this table. Do we have a test that covers this part-to-master propagation in delete?

Yes, there are tests covering this. In test_cascading_delete.py:

test_delete_parts() — deletes a Part table (Website) with part_integrity="cascade", triggering upward propagation to its master (Profile)

test_delete_parts_complex() — cascades from A with part_integrity="cascade" through complex master/part relationships, verifying correct row counts

test_delete_parts_error() — verifies that part_integrity="enforce" raises DataJointError when parts would be deleted without their master

On the self-referencing subquery concern: the implementation uses (master_ft.proj() & child_ft.proj()).to_arrays() to materialize the restriction into a concrete tuple list before applying it to the master. This avoids a live subquery referencing the same table. The propagated_edges and visited_masters sets prevent the restriction from cycling back to the part table on subsequent passes. That said, this is worth monitoring — if the tuple list is very large, it could hit MySQL packet limits. We could add a targeted test for that edge case.
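
The materialization idea in the reply above can be illustrated outside DataJoint. The sketch below uses Python's built-in `sqlite3` module as a stand-in; SQLite does not share MySQL's restriction on deleting from a table that is also read in a subquery, so this only demonstrates the pattern (fetch keys first, then delete with literal values), not the error itself. The `profile` and `website` schemas are simplified assumptions loosely modeled on the tables named in the tests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE profile (profile_id INTEGER PRIMARY KEY);
    CREATE TABLE website (
        profile_id INTEGER REFERENCES profile(profile_id),
        url TEXT,
        PRIMARY KEY (profile_id, url)
    );
    INSERT INTO profile VALUES (1), (2), (3);
    INSERT INTO website VALUES (1, 'a.org'), (1, 'b.org'), (2, 'c.org');
    """
)

# Step 1: materialize the master keys selected by the part restriction.
keys = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT profile_id FROM website WHERE url = ?", ("a.org",)
    )
]

# Step 2: delete using literal key values, so the DELETE on `website`
# contains no subquery that reads from `website` itself.
if keys:
    placeholders = ",".join("?" * len(keys))
    with conn:  # commits on success, rolls back on error
        conn.execute(
            f"DELETE FROM website WHERE profile_id IN ({placeholders})", keys
        )
        conn.execute(
            f"DELETE FROM profile WHERE profile_id IN ({placeholders})", keys
        )

remaining_parts = conn.execute(
    "SELECT profile_id, url FROM website ORDER BY profile_id, url"
).fetchall()
remaining_masters = [
    row[0] for row in conn.execute("SELECT profile_id FROM profile ORDER BY profile_id")
]
```

The trade-off noted in the reply applies here too: the key list travels as statement parameters, so a very large list makes for a very large statement.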

dimitri-yatsenko · 2026-03-10T21:19:30Z

src/reference/specs/diagram.md

+diag.restrict(table_expr)
+```
+
+Select a subset of data for export or inspection. Starting from a restricted table expression, propagate the restriction downstream through all descendants using **AND** semantics — a descendant row is included only if *all* restricted ancestors match. The full diagram is preserved (ancestors, unrelated tables) so that `restrict()` can be called again from a different seed table, building up a multi-condition subset incrementally.


How does the restriction of parts restrict their masters when not reached through the master? Should part integrity be accounted in diagram restriction as well?

In cascade() mode: when a part table is reached through a path that did not include its master, the part_integrity setting controls the behavior:

"enforce" (default): the cascade proceeds, but delete() runs a post-check — if part rows were deleted without the master also being deleted, it rolls back and raises an error.

"cascade": the restriction propagates upward from the part to its master (using to_arrays() to materialize the join), then back downstream to all sibling parts.

"ignore": the part is deleted without checking the master.

For restrict() mode: part_integrity is not currently applied. restrict() is designed for subsetting/export, not delete, so the master-part integrity constraint is less relevant — you are selecting data, not removing it. If we wanted to support "include the master whenever any part matches" in restrict mode, that would be a feature addition. Worth discussing but probably not needed for the initial release.
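
The `"enforce"` post-check described above amounts to a set comparison after the deletes have run. The sketch below is a hypothetical illustration rather than DataJoint's code: `PartIntegrityError` and `check_part_integrity` are invented names, and skipping the check for `"cascade"` is an assumption based on the master having already been included by upward propagation.

```python
class PartIntegrityError(Exception):
    """Hypothetical error type standing in for the library's own exception."""


def check_part_integrity(deleted_part_master_keys, deleted_master_keys,
                         part_integrity="enforce"):
    """Raise if part rows were deleted while their master row survived.

    deleted_part_master_keys: master keys referenced by the deleted part rows.
    deleted_master_keys: keys of the master rows that were deleted.
    """
    if part_integrity in ("ignore", "cascade"):
        # "ignore" skips the check; "cascade" is assumed to have already
        # pulled the masters into the delete via upward propagation.
        return
    if part_integrity != "enforce":
        raise ValueError(f"unknown part_integrity: {part_integrity!r}")
    orphaned = set(deleted_part_master_keys) - set(deleted_master_keys)
    if orphaned:
        # In a real transaction this is the point where a rollback would occur.
        raise PartIntegrityError(
            f"part rows deleted without their master: {sorted(orphaned)}"
        )
```

For example, deleting part rows that belong to masters `1` and `2` while deleting only master `1` raises under `"enforce"` and passes silently under `"ignore"`.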

dimitri-yatsenko · 2026-03-10T21:21:44Z

src/reference/specs/diagram.md

+```python
+diag.delete(transaction=True, prompt=None, dry_run=False)
+```


Should delete be removed or hidden from diagram and only executed from table? Generally, delete should not be called unless cascade has been applied? Or does this also allow for unrestricted delete from all tables in the diagram? I guess this could go either way. What is the advantage of keeping the delete in dj.Diagram if it's always called with a single seed table?

Diagram.delete() requires cascade() first — it raises DataJointError("No cascade restrictions applied. Call cascade() first.") if called on an unrestricted diagram. So it cannot do an unrestricted delete of all tables.

The advantage of keeping delete() on Diagram rather than only on Table:

Multi-schema cascades: Table.delete() constructs a diagram from _from_table(self), which only includes the table and its descendants. If you need to cascade across schemas or from a specific subgraph, the diagram API lets you construct the graph first with dj.Diagram(schema1) + dj.Diagram(schema2), then cascade().

Preview workflow: The diagram API enables cascade() → preview() → delete() as a three-step pattern. Table.delete(dry_run=True) provides the same counts but the diagram also supports visualization of the cascade subgraph.

Future operations: restrict() → prune() → export() — the diagram is the foundation for data subsetting, which does not start from a single table.

That said, for the common case of "delete rows from one table", Table.delete() is simpler and delegates to Diagram internally. The diagram-level API is the power-user interface.

Done. delete() and drop() have been removed from Diagram's public API. All mutation logic (transaction management, SQL execution, prompts) now lives in Table.delete() and Table.drop(). Diagram remains purely a graph computation and inspection tool: cascade(), restrict(), preview(), prune().

The design docs (docs/design/restricted-diagram.md and docs/design/thread-safe-mode.md) have also been removed from datajoint-python — their content is captured in the datajoint-docs specs.

dimitri-yatsenko · 2026-03-10T21:28:38Z

src/reference/specs/diagram.md

+diag.prune()
+```
+
+Remove tables with zero matching rows from the diagram. Without prior restrictions, removes physically empty tables. With restrictions (`cascade()` or `restrict()`), removes tables where the restricted query yields zero rows.


We may want to highlight that removing a table from the diagram does only that: it removes the table from the diagram view and does not affect the tables themselves.

Good call. Updated the prune() description to: "Remove tables with zero matching rows from the diagram view. This only affects the diagram object — no tables or data are modified in the database."
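
The distinction between pruning the view and modifying data can be sketched with plain Python containers. This is not the `prune()` implementation: the `database` dict, the table names, and the standalone `prune` function are hypothetical, chosen only to show that the view shrinks while the underlying tables stay untouched.

```python
def prune(view, row_counts):
    """Return a new view that omits tables with zero matching rows."""
    return [table for table in view if row_counts.get(table, 0) > 0]


# Toy "database": table name -> rows matching the current restriction.
database = {"Subject": ["m1", "m2"], "Session": [], "Scan": ["scan1"]}

view = list(database)  # the diagram view initially lists every table
counts = {name: len(rows) for name, rows in database.items()}
pruned_view = prune(view, counts)
```

After pruning, `Session` is absent from `pruned_view` but still present in `database`, and the original `view` list is unchanged as well.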

Diagram is now an inspection-only tool. delete() and drop() have been moved to Table. Updated diagram spec, whats-new-22, and delete-data how-to to reflect this change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

dimitri-yatsenko changed the base branch from docs/v2.2-thread-safe to main March 6, 2026 20:56

dimitri-yatsenko force-pushed the docs/v2.2-restricted-diagram branch from 21a36c6 to 443090c Compare March 6, 2026 21:28

dimitri-yatsenko and others added 3 commits March 9, 2026 08:55

ttngu207 previously approved these changes Mar 9, 2026

dimitri-yatsenko dismissed ttngu207’s stale review via fbbafc2 March 9, 2026 21:06

dimitri-yatsenko and others added 2 commits March 9, 2026 16:07

dimitri-yatsenko commented Mar 10, 2026

docs: remove delete/drop from Diagram public API

187408b

