The original cascade delete worked by trial-and-error: attempt DELETE on the parent, catch the FK integrity error, parse the MySQL error message to discover which child table is blocking, then recursively delete from that child first.
This approach has several problems:
- MySQL 8 with limited privileges: Returns error 1217 (
ROW_IS_REFERENCED) instead of 1451 (ROW_IS_REFERENCED_2), which provides no table name. The cascade crashes (#1110). - PostgreSQL overhead: PostgreSQL aborts the entire transaction on any error. Each failed delete attempt requires
SAVEPOINT/ROLLBACK TO SAVEPOINTround-trips. - Fragile parsing: Different MySQL versions and privilege levels produce different error message formats.
drop() already uses graph-driven traversal — walking the dependency graph in reverse topological order, dropping leaves first. The same pattern applies to cascade delete, with the addition of restriction propagation through FK attribute mappings.
dj.Diagram provides set operators for specifying subsets of tables. Per-node restrictions complete the functionality for specifying cross-sections of data — enabling delete, export, backup, and sharing.
Single class Diagram(nx.DiGraph) with all operational methods always available. Only visualization methods (draw, make_dot, make_svg, make_png, make_image, make_mermaid, save, _repr_svg_) are gated on diagram_active.
Dependencies is the canonical store of the FK graph. Diagram copies from it and constructs derived views.
self._connection # Connection
self._cascade_restrictions # dict[str, list] — per-node OR restrictions (cascade mode)
self._restrict_conditions # dict[str, AndList] — per-node AND restrictions (restrict mode)
self._restriction_attrs # dict[str, set] — restriction attribute names per node
self._part_integrity # str — "enforce", "ignore", or "cascade" (set by cascade())A diagram operates in one of three states: unrestricted (initial), cascade, or restrict. The modes are mutually exclusive. cascade is applied once; restrict can be chained.
# cascade: applied once, OR at convergence, for delete
rd = dj.Diagram(schema).cascade(Session & 'subject_id=1')
# restrict: chainable, AND at convergence, for export
rd = dj.Diagram(schema).restrict(Session & cond).restrict(Stimulus & cond2)
# Mixing raises DataJointErrorA restriction applied to one table node propagates downstream through FK edges in topological order. Each downstream node accumulates a restriction derived from its restricted parent(s).
For edge Parent → Child with attr_map:
| Condition | Child restriction |
|---|---|
Non-aliased AND parent_attrs ⊆ child.primary_key |
Copy parent restriction directly |
Aliased FK (attr_map renames columns) |
parent_ft.proj(**{fk: pk for fk, pk in attr_map.items()}) |
Non-aliased AND parent_attrs ⊄ child.primary_key |
parent_ft.proj() |
Restrictions are applied via restrict() → make_condition(), ensuring AndList and QueryExpression objects are properly converted to SQL. Direct assignment to _restriction is never used, as where_clause() would produce invalid SQL from str(AndList) or str(QueryExpression).
A child node may have multiple restricted ancestors. The combination rule depends on the operator:
Session ──→ Recording ←── Stimulus
↓ ↓
subject=1 type="visual"
Recording receives two propagated restrictions: R1 from Session, R2 from Stimulus.
cascade — OR (union): A recording is deleted if tainted by any restricted parent. Correct for referential integrity: if the parent row is being deleted, all child rows referencing it must go. Implemented by passing the full restriction list to restrict(), which creates an OrList.
restrict — AND (intersection): A recording is included only if it satisfies all restricted ancestors. Correct for subsetting: only rows matching every condition are selected. Implemented by iterating restrictions and calling restrict() for each.
| DataJoint type | Python type | SQL meaning |
|---|---|---|
| OR-combined restrictions | list |
WHERE (r1) OR (r2) OR ... |
| AND-combined restrictions | AndList |
WHERE (r1) AND (r2) AND ... |
| No restriction | empty list or AndList() |
No WHERE clause (all rows) |
A child may reference the same parent through multiple FKs (e.g., source_mouse and target_mouse both referencing Mouse). These are represented as alias nodes in the dependency graph. Multiple FK paths from the same restricted parent always combine with OR — structural, not operation-dependent.
| Mode | Behavior |
|---|---|
"enforce" |
Data-driven post-check: raises only when rows were actually deleted from a Part without its master also being deleted. Avoids false positives when a Part appears in the cascade but has zero affected rows. |
"ignore" |
Allow deleting parts without masters |
"cascade" |
Propagate restriction upward from part to master, then re-propagate downstream |
If a child table lives in a schema not loaded into the dependency graph, the graph-driven delete won't know about it. The final parent delete_quick() fails with an FK error. Error-message parsing is retained as a diagnostic fallback to produce an actionable error: "activate schema X."
Apply cascade restriction and propagate downstream. Returns a new Diagram. One-shot — cannot be called twice or mixed with restrict().
- Verify no existing cascade or restrict restrictions
- Copy diagram, seed
_cascade_restrictions[root]withlist(table_expr.restriction) - Propagate via
_propagate_restrictions(root, mode="cascade", part_integrity=part_integrity)
Apply restrict condition and propagate downstream. Returns a new Diagram. Chainable — can be called multiple times. Cannot be mixed with cascade().
- Verify no existing cascade restrictions
- Copy diagram, seed/extend
_restrict_conditions[root]withtable_expr.restriction - Propagate via
_propagate_restrictions(root, mode="restrict")
Execute cascading delete. Requires cascade() first.
- If
dry_run: returnpreview()without modifying data - Get non-alias nodes with restrictions in topological order
- If
prompt: show preview (table name + row count for each) - Start transaction
- Delete in reverse topological order (leaves first) via
_restrict_freetable()+delete_quick() - On
IntegrityError: cancel transaction, parse FK error for actionable message about unloaded schemas - Post-check
part_integrity="enforce": if any part table had rows deleted but its master did not, cancel transaction and raise - Confirm/commit, return count from the root table
Drop all tables in nodes_to_show in reverse topological order. Pre-checks part_integrity structurally (tables, not rows). If dry_run, returns row counts without dropping.
Return {full_table_name: row_count} for each node with a restriction. Requires cascade() or restrict() first. Uses _restrict_freetable() to apply restrictions with correct OR/AND semantics.
Remove tables with zero matching rows. With restrictions, removes nodes where the restricted query yields zero rows. Without restrictions, removes physically empty tables. Idempotent and chainable.
Static helper. Applies restrictions to a FreeTable using restrict() for proper SQL generation.
- cascade mode: Passes the entire restriction list to
restrict(), creating an OrList (OR semantics). - restrict mode: Iterates restrictions, calling
restrict()for each (AND semantics).
Classmethod factory for Table.delete() and Table.drop(). Creates a Diagram containing table_expr and all its descendants.
def delete(self, transaction=True, prompt=None, part_integrity="enforce", dry_run=False):
diagram = Diagram._from_table(self)
diagram = diagram.cascade(self, part_integrity=part_integrity)
return diagram.delete(transaction=transaction, prompt=prompt, dry_run=dry_run)
def drop(self, prompt=None, part_integrity="enforce", dry_run=False):
if self.restriction:
raise DataJointError("A restricted Table cannot be dropped.")
diagram = Diagram._from_table(self)
diagram.drop(prompt=prompt, part_integrity=part_integrity, dry_run=dry_run)# cascade: OR propagation for delete
rd = dj.Diagram(schema).cascade(Session & 'subject_id=1')
rd.preview() # show affected tables and row counts
rd.delete() # downstream only, OR at convergence
# restrict: AND propagation for data subsetting
rd = (dj.Diagram(schema)
.restrict(Session & 'subject_id=1')
.restrict(Stimulus & 'type="visual"'))
rd.preview() # show selected tables and row counts
# prune: remove tables with zero matching rows
rd = (dj.Diagram(schema)
.restrict(Subject & {'species': 'mouse'})
.restrict(Session & 'session_date > "2024-01-01"')
.prune())
rd.preview() # only tables with matching rows
# dry_run: preview without executing
counts = (Session & 'subject_id=1').delete(dry_run=True)
# returns {full_table_name: affected_row_count}
# Table.delete() delegates to Diagram internally
(Session & 'subject_id=1').delete()| Error-driven | Graph-driven | |
|---|---|---|
| MySQL 8 + limited privileges | Crashes (#1110) | Works — no error parsing needed |
| PostgreSQL | Savepoint overhead per attempt | No errors triggered |
| Multiple FKs to same child | One-at-a-time via retry loop | All paths resolved upfront |
| part_integrity enforcement | Post-hoc check after delete | Data-driven post-check (no false positives) |
| Unloaded schemas | Crash with opaque error | Clear error: "activate schema X" |
| Reusability | Delete-only | Delete, drop, export, prune |
| Inspectability | Opaque recursive cascade | preview() / dry_run before executing |