Skip to content

Backend query remodel: Phase 1 (N+1 + write explosion) + Phase 2 (CourseAssignment)#56

Open
espadonne wants to merge 60 commits into
Lab-Lab-Lab:mainfrom
mfwolffe:backend-remodel-phase1
Open

Backend query remodel: Phase 1 (N+1 + write explosion) + Phase 2 (CourseAssignment)#56
espadonne wants to merge 60 commits into
Lab-Lab-Lab:mainfrom
mfwolffe:backend-remodel-phase1

Conversation

@espadonne

@espadonne espadonne commented Jun 27, 2026

Copy link
Copy Markdown

Backend query remodel. Phase 1 kills read-side N+1 fan-out and write-side row/query explosion on the hot teacher/grading/assign paths (additive indexes only). Phase 2 is the structural change: one course-level CourseAssignment replaces the per-student Assignment row explosion, with per-student instrument/part resolved at read time and persisted on Submission. The JSON API contract is preserved end-to-end — the frontend needs no changes. Directives: docs/remodel_campaign.md, docs/remodel_phase2_design.md.

Phase 1 — read-side wins (1a), each guarded by a query-count test verified to fail on the old code

Endpoint Before After
AssignmentViewSet.list (teacher) 484 q @ 20 students constant
grouped/telephone list (GroupSerializer O(M²)) 1009 q @ 15 members constant
AssignmentViewSet.list (student) 145 q @ 20 assignments constant
TeacherSubmissionViewSet.recent (grading) 501 q @ 20 students constant
dashboard CSV export 9000+ (capped) 2
enrollment / roster / piece-plan / activity / grade lists linear constant

Phase 1 — write-side wins (1b/1c)

Operation Before After
assign_one_piece_activity update_or_create/student, 142 q @ 20 bulk_create, constant
assign_telephone_fixed per-row create + Part re-derive bulk_create, constant
change_piece_instrument save()/assignment, 109 q @ 20 1 UPDATE

Plus: UserViewSet update scoping fix + its N+1 (#13); removed dead PiecePlan.assign; collapsed Part.for_activity double query; planned-order subquery annotation (#23/#24).

Phase 2 — structural remodel (Option B, fully dynamic)

Steps 1–6 are additive and backfilled (new tables dual-written alongside the old path): CourseAssignment + lightweight GroupAssignment models (16e0c19), dual-write from the assign helpers (a46d24b), backfill from existing Assignment rows (c57a671), and course_assignment/enrollment/instrument/part added to Submission + ActivityProgress with backfills (0781d99, 8b69a9d). Migration 0033's live-helper call neutralized first so a fresh migrate is safe (200a34d).

Step 7 — read-path flip (contract-sensitive). List/retrieve/submissions/activity-progress now resolve from CourseAssignment (the id is course_assignment.id), scoping nested routes by the requesting enrollment. Equivalence serializers pin field-for-field parity except id; the list precomputes per-CA part/submissions/group maps to stay O(1). Late joiners (no prior row) can now submit and track progress.

  • 299570f read-resolution serializer + response-equivalence tests
  • 40057e3 student list · 9d54429 student retrieve · ee32876 submission/activity-progress routes
  • 960c5b2 teacher list → one row per CourseAssignment (verified frontend getAssignedPieces only derives the distinct (piece, activity) set; all /assignments/ redux consumers are student-facing)
  • 1891829, cfccbe3 fixed two latent test-factory bugs surfaced en route (datetime-typed DateField; non-unique usernameUNIQUE(user, course) collisions)

Step 8 — Assignment fully removed. First made dead (neither read nor written), then dropped.

  • 7dc62f7 GroupSerializer.get_membersGroupAssignment · 3853b78 recent → submission native fields · 5ea3870 ActivityViewSetCourseAssignment
  • 0dc288e per-piece instrument override moved to CourseAssignment.instrument; resolve_instrument prefers it, change_piece_instrument sets it
  • e0b2718 assign helpers stop writing Assignment · 19022e0 unassign repointed · 6cb3e9c unique(course_assignment, enrollment) on ActivityProgress · f627690 dashboards rebuilt from CourseAssignment × enrollments
  • 78b7d43 dropped dead AssignmentViewSet actions + serializers · 53fffa7 dropped Assignment model + the assignment FK columns on Submission/ActivityProgress · db0f9c8 admin repoint · edd8a33 ActivityProgressSerializer exposes course_assignment/enrollment
  • b529b90, 9a870d8 test fixtures migrated off AssignmentFactory; obsolete transitional/backfill tests deleted

Migration note: assignments/0041 drops the unique_assignment constraint before the field removals, so SQLite's table-remake doesn't reference removed fields. Dropping Assignment rows is safe (fully backfilled into CourseAssignment); Submission/ActivityProgress rows keep all data — only the redundant FK column goes.

Tests

pytest teleband66 passed, 2 failed.

  • teleband/assignments (list/retrieve flip, ordering, query counts)
  • teleband/submissions (nested routes, phase-2 fields/progress, grading query counts)
  • teleband/courses (assign/telephone/unassign write counts, change-instrument)
  • teleband/dashboards (CSV export query bound)
  • full-chain migrate on a fresh DB; makemigrations --check clean
  • black --check (CI's unpinned latest) clean

The 2 failures (test_me, admin test_add) are pre-existing on a clean main (UserSerializer groups field + admin redirect), reproduced before any changes here.

espadonne added 16 commits June 27, 2026 17:08
Teacher/student assignment list issued queries linear (group path quadratic)
in roster size. Add the missing part-tree select_related/prefetch and
submissions__attachments to a shared queryset, and memoize
GroupSerializer.get_members per group in the serializer context. Query count
is now constant in roster/group size (was 484/1009 queries at 20 students/15
group members, verified by teleband/assignments/tests/test_query_counts.py).
The grading dashboard's latest-submission-per-student endpoint serialized a
deep assignment/enrollment/part tree over an unoptimized queryset, fanning out
per submitting student (501 queries at 20 students). Add matching
select_related/prefetch_related so the count is constant, guarded by
teleband/submissions/tests/test_query_counts.py.
…ets (Phase 1a)

Add the select_related/prefetch_related each list serializer walks so query
counts stay constant in list size: EnrollmentViewSet.list (course/owner,
instrument, role, user), PiecePlanViewSet (piece/composer, activities tree),
ActivityViewSet (activity_type/category/part_type), GradeViewSet (reverse
submission one-to-ones). Guarded by enrollment and activity query-count tests.
…inate (Phase 1a)

csv_view iterated submissions/attachments as unprefetched reverse relations and
wrote str(activity)/str(piece_plan) whose __str__ walk uncovered FKs -- 9000+
queries for the full export, now 2. AssignmentListView used prefetch_related for
forward FKs (extra query each) and had no pagination. Guard nullable piece_plan
in the CSV rows. Guarded by teleband/dashboards/tests/test_query_counts.py.
The update/partial_update queryset hardcoded user__username="admin", scoping
teacher user-edits to the literal admin user's courses instead of the requesting
teacher's, and built the course list with a per-row Python loop. Replace with a
single subquery over the requester's taught courses + distinct(). Behavior
change: teachers now correctly see students in their own courses. Guarded by
teleband/users/tests/test_user_scoping.py.
…e 1b Lab-Lab-Lab#15, Lab-Lab-Lab#20)

assign_one_piece_activity ran update_or_create per student (2 queries each,
silently swallowing the unique-constraint violation on re-assign). Now it fetches
already-assigned enrollment ids once and bulk_creates only the missing rows --
query count constant in roster size (was 142 at 20 students). Idempotent
re-assign preserved. NB: kept select_related to instrument only (not user): this
helper is invoked by the live data migration assignments/0033, where eagerly
selecting users.external_id breaks a fresh migrate. ATOMIC_REQUESTS=True already
makes the assign views transactional, so no extra wrapper needed.
…b-Lab-Lab#13 done; note ATOMIC_REQUESTS + migration-0033 landmine
@espadonne espadonne changed the title Phase 1a: eliminate read-side N+1 fan-out on hot endpoints Phase 1: read N+1 + write-explosion query remodel (1a + 1b) Jun 27, 2026
espadonne added 13 commits June 27, 2026 17:57
…Lab#18, Lab-Lab-Lab#21)

assign_telephone_fixed created an AssignmentGroup + Assignment per row, re-derived
Part inside the loop, and re-evaluated activities.all() per group (143 queries at
30 students). Now resolves activities/parts once and bulk_creates the groups and
assignments -- constant in roster size. select_related(instrument) on enrollments
(not user, for migration-0033 safety). Guarded by test_telephone_write_counts.py.
…ase 1b Lab-Lab-Lab#19)

Roster import did an Enrollment.get() + create() per uploaded user (2N queries).
Now it resolves existing enrollments in one query and bulk_creates the new ones.
User creation itself stays per-row (password hashing can't be bulked). Guarded by
test_roster_import.py (creation + idempotent re-upload, no duplicate enrollments).
…b-Lab-Lab#23, Lab-Lab-Lab#24)

list() ran a separate PlannedActivity query (with its Meta ordering join) and
rebuilt the order mapping in Python. Replace with a correlated Subquery
annotation on the queryset -- one fewer query and no Meta-ordering sort.
Guarded by test_list_ordering.py (response stays sorted by plan order).
assignments/0033 imported the live assign_piece_plan helper and live model
classes to seed demo data, making a fresh migrate depend on the current schema
and helper behavior -- any helper/model change (e.g. the Phase 2 CourseAssignment
remodel) would run new code against the frozen historical schema and break the
build (we hit a variant of this in 1b via select_related('user') ->
users.external_id). The demo seed is non-essential and nothing depends on it, so
add_demos is now a documented no-op with no app-code imports. Existing DBs are
unaffected; fresh migrates just skip the demo seeding.
Additive only -- no behavior change yet. CourseAssignment is one row per
(course, activity, piece) with unique constraint; GroupAssignment links a student
to a specific CourseAssignment within a group for telephone_fixed plans. Factories
+ constraint tests included.
…e 2 step 3)

assign_one_piece_activity and assign_telephone_fixed now also create the
course-level CourseAssignment rows (one per course*activity*piece) and, for
telephone_fixed, a GroupAssignment per member. Per-student Assignment rows are
still written so the existing read path keeps working until it is flipped.
Idempotent via update_or_create / ignore_conflicts.
…ws (Phase 2 step 4)

Data migration collapsing Assignment rows by (course, activity, piece). Derives
piece from part.piece when Assignment.piece is null (legacy rows since migration
0026), preserving the CourseAssignment.piece NOT NULL invariant. Grouped rows get
GroupAssignment links. Idempotent (ignore_conflicts); reverse is a noop.
espadonne added 28 commits June 27, 2026 18:57
Resolves the student list from CourseAssignment (id = course_assignment.id),
fixing late joiners and scoping telephone_fixed groups by enrollment. View
precomputes per-CA part/submissions/group maps so the path stays constant in
assignment count; existing query-count test updated to dual-write CAs.
Faker("user_name") repeats; with django_get_or_create a repeat reuses a user
who then collides on UNIQUE(user, course). Sequence-based username guarantees
uniqueness; explicit username= still dedupes.
…by enrollment

Nested route id is now a CourseAssignment id (matching the flipped student list);
submissions and activity-progress are keyed by (course_assignment, enrollment),
scoped to the requesting student and 404 on a foreign id. Late joiners (no
Assignment row) can now submit and track progress; the legacy assignment FK is
populated when present so teacher views keep working.
Course.start_date/end_date are DateFields; utcnow() is a datetime that Postgres
truncates on write but SQLite keeps, tripping DRF's DateField on serialization.
GET /assignments/{id}/ for a student resolves the id as a CourseAssignment scoped
to their enrollment and returns the legacy AssignmentSerializer shape (id =
course_assignment.id). Teachers keep the per-student retrieve. Response-equivalence
test pins field parity except id.
Teacher list now returns one row per CourseAssignment (every assigned
(piece, activity)) instead of one per student, collapsing A*S rows to A. The
frontend teacher view only derives the distinct (piece, activity) set, so
per-student fields (instrument/submissions/group) come back null/empty. Read
serializer handles a None enrollment; part resolution stays off the N+1 path via
a shared _part_resolver. Existing list tests dual-write CAs; new test pins the
per-CA cardinality and the getAssignedPieces fields.
Group membership and submission status now resolve from GroupAssignment +
Submission's (course_assignment, enrollment) keys instead of per-student
Assignment rows. Memoized + one submitted-pairs query keeps it constant in
group size; query-count test rebuilt to render a member's group, content test
added.
The recent grading view and its serializer now resolve from the submission's own
course_assignment/enrollment/instrument/part fields. SubmissionAssignmentSerializer
rebuilds the legacy assignment shape (frontend reads only enrollment.user.name);
group/submissions come from per-(course_assignment, enrollment) context maps so
the view stays constant in roster size. Tests dual-populate submissions + assert
the consumed field.
The course's assigned activities now derive from CourseAssignment instead of
per-student Assignment rows, so it stays correct once Assignment writes stop.
change_piece_instrument now sets CourseAssignment.instrument (course-level, per
piece); resolve_instrument prefers it over the enrollment/user fallback. Restores
the override on the flipped read path (step 7 had stopped reading Assignment.
instrument). Nullable, no backfill -- null resolves to the enrollment instrument.
assign_one_piece_activity / assign_telephone_fixed now create only
CourseAssignment (+ GroupAssignment for telephone); students are implicitly
assigned and resolved at read time. Assign endpoints return a count (the frontend
ignores the body and refetches). Write-count tests assert the new behavior.
One progress row per student per CourseAssignment (the Phase 2 key). NULLs stay
distinct so legacy un-backfilled rows don't collide.
…on writes

unassign now deletes the piece's CourseAssignments (GroupAssignments cascade;
submissions still PROTECT). SubmissionViewSet/ActivityProgressViewSet no longer
set the legacy assignment FK (no new Assignment rows exist). Adds unassign test.
The superuser per-student export is reconstructed faithfully: one row per
(student, CourseAssignment) -- every enrolled student implicitly assigned each
non-grouped CourseAssignment plus their telephone GroupAssignments -- with their
submissions, in a constant number of queries. Same columns/template shape.
AssignmentViewSet drops UpdateModelMixin, the notation action, and the
Assignment-based get_queryset/retrieve teacher path (none used by the frontend);
retrieve resolves any role's id as a CourseAssignment. Removes the now-unused
AssignmentSerializer/AssignmentViewSetSerializer/AssignmentInstrument/Notation
serializers and the obsolete legacy-equivalence + get_queryset tests.
@espadonne espadonne changed the title Phase 1: read N+1 + write-explosion query remodel (1a + 1b) Backend query remodel: Phase 1 (N+1 + write explosion) + Phase 2 (CourseAssignment) Jun 28, 2026
@espadonne espadonne marked this pull request as ready for review June 28, 2026 01:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant