Backend query remodel: Phase 1 (N+1 + write explosion) + Phase 2 (CourseAssignment)#56
Open
espadonne wants to merge 60 commits into
Open
Backend query remodel: Phase 1 (N+1 + write explosion) + Phase 2 (CourseAssignment)#56espadonne wants to merge 60 commits into
espadonne wants to merge 60 commits into
Conversation
Teacher/student assignment list issued queries linear (group path quadratic) in roster size. Add the missing part-tree select_related/prefetch and submissions__attachments to a shared queryset, and memoize GroupSerializer.get_members per group in the serializer context. Query count is now constant in roster/group size (was 484/1009 queries at 20 students/15 group members, verified by teleband/assignments/tests/test_query_counts.py).
The grading dashboard's latest-submission-per-student endpoint serialized a deep assignment/enrollment/part tree over an unoptimized queryset, fanning out per submitting student (501 queries at 20 students). Add matching select_related/prefetch_related so the count is constant, guarded by teleband/submissions/tests/test_query_counts.py.
…ets (Phase 1a) Add the select_related/prefetch_related each list serializer walks so query counts stay constant in list size: EnrollmentViewSet.list (course/owner, instrument, role, user), PiecePlanViewSet (piece/composer, activities tree), ActivityViewSet (activity_type/category/part_type), GradeViewSet (reverse submission one-to-ones). Guarded by enrollment and activity query-count tests.
…inate (Phase 1a) csv_view iterated submissions/attachments as unprefetched reverse relations and wrote str(activity)/str(piece_plan) whose __str__ walk uncovered FKs -- 9000+ queries for the full export, now 2. AssignmentListView used prefetch_related for forward FKs (extra query each) and had no pagination. Guard nullable piece_plan in the CSV rows. Guarded by teleband/dashboards/tests/test_query_counts.py.
…_queryset (Phase 1a Lab-Lab-Lab#14)
The update/partial_update queryset hardcoded user__username="admin", scoping teacher user-edits to the literal admin user's courses instead of the requesting teacher's, and built the course list with a per-row Python loop. Replace with a single subquery over the requester's taught courses + distinct(). Behavior change: teachers now correctly see students in their own courses. Guarded by teleband/users/tests/test_user_scoping.py.
…e 1b Lab-Lab-Lab#15, Lab-Lab-Lab#20) assign_one_piece_activity ran update_or_create per student (2 queries each, silently swallowing the unique-constraint violation on re-assign). Now it fetches already-assigned enrollment ids once and bulk_creates only the missing rows -- query count constant in roster size (was 142 at 20 students). Idempotent re-assign preserved. NB: kept select_related to instrument only (not user): this helper is invoked by the live data migration assignments/0033, where eagerly selecting users.external_id breaks a fresh migrate. ATOMIC_REQUESTS=True already makes the assign views transactional, so no extra wrapper needed.
…b-Lab-Lab#13 done; note ATOMIC_REQUESTS + migration-0033 landmine
…Lab#18, Lab-Lab-Lab#21) assign_telephone_fixed created an AssignmentGroup + Assignment per row, re-derived Part inside the loop, and re-evaluated activities.all() per group (143 queries at 30 students). Now resolves activities/parts once and bulk_creates the groups and assignments -- constant in roster size. select_related(instrument) on enrollments (not user, for migration-0033 safety). Guarded by test_telephone_write_counts.py.
…ase 1b Lab-Lab-Lab#19) Roster import did an Enrollment.get() + create() per uploaded user (2N queries). Now it resolves existing enrollments in one query and bulk_creates the new ones. User creation itself stays per-row (password hashing can't be bulked). Guarded by test_roster_import.py (creation + idempotent re-upload, no duplicate enrollments).
…b-Lab-Lab#23, Lab-Lab-Lab#24) list() ran a separate PlannedActivity query (with its Meta ordering join) and rebuilt the order mapping in Python. Replace with a correlated Subquery annotation on the queryset -- one fewer query and no Meta-ordering sort. Guarded by test_list_ordering.py (response stays sorted by plan order).
assignments/0033 imported the live assign_piece_plan helper and live model
classes to seed demo data, making a fresh migrate depend on the current schema
and helper behavior -- any helper/model change (e.g. the Phase 2 CourseAssignment
remodel) would run new code against the frozen historical schema and break the
build (we hit a variant of this in 1b via select_related('user') ->
users.external_id). The demo seed is non-essential and nothing depends on it, so
add_demos is now a documented no-op with no app-code imports. Existing DBs are
unaffected; fresh migrates just skip the demo seeding.
Additive only -- no behavior change yet. CourseAssignment is one row per (course, activity, piece) with unique constraint; GroupAssignment links a student to a specific CourseAssignment within a group for telephone_fixed plans. Factories + constraint tests included.
…e 2 step 3) assign_one_piece_activity and assign_telephone_fixed now also create the course-level CourseAssignment rows (one per course*activity*piece) and, for telephone_fixed, a GroupAssignment per member. Per-student Assignment rows are still written so the existing read path keeps working until it is flipped. Idempotent via update_or_create / ignore_conflicts.
…ws (Phase 2 step 4) Data migration collapsing Assignment rows by (course, activity, piece). Derives piece from part.piece when Assignment.piece is null (legacy rows since migration 0026), preserving the CourseAssignment.piece NOT NULL invariant. Grouped rows get GroupAssignment links. Idempotent (ignore_conflicts); reverse is a noop.
Resolves the student list from CourseAssignment (id = course_assignment.id), fixing late joiners and scoping telephone_fixed groups by enrollment. View precomputes per-CA part/submissions/group maps so the path stays constant in assignment count; existing query-count test updated to dual-write CAs.
Faker("user_name") repeats; with django_get_or_create a repeat reuses a user
who then collides on UNIQUE(user, course). Sequence-based username guarantees
uniqueness; explicit username= still dedupes.
…by enrollment Nested route id is now a CourseAssignment id (matching the flipped student list); submissions and activity-progress are keyed by (course_assignment, enrollment), scoped to the requesting student and 404 on a foreign id. Late joiners (no Assignment row) can now submit and track progress; the legacy assignment FK is populated when present so teacher views keep working.
Course.start_date/end_date are DateFields; utcnow() is a datetime that Postgres truncates on write but SQLite keeps, tripping DRF's DateField on serialization.
GET /assignments/{id}/ for a student resolves the id as a CourseAssignment scoped
to their enrollment and returns the legacy AssignmentSerializer shape (id =
course_assignment.id). Teachers keep the per-student retrieve. Response-equivalence
test pins field parity except id.
Teacher list now returns one row per CourseAssignment (every assigned (piece, activity)) instead of one per student, collapsing A*S rows to A. The frontend teacher view only derives the distinct (piece, activity) set, so per-student fields (instrument/submissions/group) come back null/empty. Read serializer handles a None enrollment; part resolution stays off the N+1 path via a shared _part_resolver. Existing list tests dual-write CAs; new test pins the per-CA cardinality and the getAssignedPieces fields.
Group membership and submission status now resolve from GroupAssignment + Submission's (course_assignment, enrollment) keys instead of per-student Assignment rows. Memoized + one submitted-pairs query keeps it constant in group size; query-count test rebuilt to render a member's group, content test added.
The recent grading view and its serializer now resolve from the submission's own course_assignment/enrollment/instrument/part fields. SubmissionAssignmentSerializer rebuilds the legacy assignment shape (frontend reads only enrollment.user.name); group/submissions come from per-(course_assignment, enrollment) context maps so the view stays constant in roster size. Tests dual-populate submissions + assert the consumed field.
The course's assigned activities now derive from CourseAssignment instead of per-student Assignment rows, so it stays correct once Assignment writes stop.
change_piece_instrument now sets CourseAssignment.instrument (course-level, per piece); resolve_instrument prefers it over the enrollment/user fallback. Restores the override on the flipped read path (step 7 had stopped reading Assignment. instrument). Nullable, no backfill -- null resolves to the enrollment instrument.
assign_one_piece_activity / assign_telephone_fixed now create only CourseAssignment (+ GroupAssignment for telephone); students are implicitly assigned and resolved at read time. Assign endpoints return a count (the frontend ignores the body and refetches). Write-count tests assert the new behavior.
One progress row per student per CourseAssignment (the Phase 2 key). NULLs stay distinct so legacy un-backfilled rows don't collide.
…on writes unassign now deletes the piece's CourseAssignments (GroupAssignments cascade; submissions still PROTECT). SubmissionViewSet/ActivityProgressViewSet no longer set the legacy assignment FK (no new Assignment rows exist). Adds unassign test.
The superuser per-student export is reconstructed faithfully: one row per (student, CourseAssignment) -- every enrolled student implicitly assigned each non-grouped CourseAssignment plus their telephone GroupAssignments -- with their submissions, in a constant number of queries. Same columns/template shape.
AssignmentViewSet drops UpdateModelMixin, the notation action, and the Assignment-based get_queryset/retrieve teacher path (none used by the frontend); retrieve resolves any role's id as a CourseAssignment. Removes the now-unused AssignmentSerializer/AssignmentViewSetSerializer/AssignmentInstrument/Notation serializers and the obsolete legacy-equivalence + get_queryset tests.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Backend query remodel. Phase 1 kills read-side N+1 fan-out and write-side row/query explosion on the hot teacher/grading/assign paths (additive indexes only). Phase 2 is the structural change: one course-level
CourseAssignmentreplaces the per-studentAssignmentrow explosion, with per-student instrument/part resolved at read time and persisted onSubmission. The JSON API contract is preserved end-to-end — the frontend needs no changes. Directives:docs/remodel_campaign.md,docs/remodel_phase2_design.md.Phase 1 — read-side wins (1a), each guarded by a query-count test verified to fail on the old code
AssignmentViewSet.list(teacher)GroupSerializerO(M²))AssignmentViewSet.list(student)TeacherSubmissionViewSet.recent(grading)Phase 1 — write-side wins (1b/1c)
assign_one_piece_activityassign_telephone_fixedchange_piece_instrumentPlus:
UserViewSetupdate scoping fix + its N+1 (#13); removed deadPiecePlan.assign; collapsedPart.for_activitydouble query; planned-order subquery annotation (#23/#24).Phase 2 — structural remodel (Option B, fully dynamic)
Steps 1–6 are additive and backfilled (new tables dual-written alongside the old path):
CourseAssignment+ lightweightGroupAssignmentmodels (16e0c19), dual-write from the assign helpers (a46d24b), backfill from existingAssignmentrows (c57a671), andcourse_assignment/enrollment/instrument/partadded toSubmission+ActivityProgresswith backfills (0781d99,8b69a9d). Migration0033's live-helper call neutralized first so a freshmigrateis safe (200a34d).Step 7 — read-path flip (contract-sensitive). List/retrieve/submissions/activity-progress now resolve from
CourseAssignment(theidiscourse_assignment.id), scoping nested routes by the requesting enrollment. Equivalence serializers pin field-for-field parity exceptid; the list precomputes per-CA part/submissions/group maps to stay O(1). Late joiners (no prior row) can now submit and track progress.299570fread-resolution serializer + response-equivalence tests40057e3student list ·9d54429student retrieve ·ee32876submission/activity-progress routes960c5b2teacher list → one row per CourseAssignment (verified frontendgetAssignedPiecesonly derives the distinct (piece, activity) set; all/assignments/redux consumers are student-facing)1891829,cfccbe3fixed two latent test-factory bugs surfaced en route (datetime-typedDateField; non-uniqueusername→UNIQUE(user, course)collisions)Step 8 — Assignment fully removed. First made dead (neither read nor written), then dropped.
7dc62f7GroupSerializer.get_members→GroupAssignment·3853b78recent→ submission native fields ·5ea3870ActivityViewSet→CourseAssignment0dc288eper-piece instrument override moved toCourseAssignment.instrument;resolve_instrumentprefers it,change_piece_instrumentsets ite0b2718assign helpers stop writingAssignment·19022e0unassignrepointed ·6cb3e9cunique(course_assignment, enrollment)onActivityProgress·f627690dashboards rebuilt fromCourseAssignment × enrollments78b7d43dropped deadAssignmentViewSetactions + serializers ·53fffa7droppedAssignmentmodel + theassignmentFK columns onSubmission/ActivityProgress·db0f9c8admin repoint ·edd8a33ActivityProgressSerializerexposescourse_assignment/enrollmentb529b90,9a870d8test fixtures migrated offAssignmentFactory; obsolete transitional/backfill tests deletedMigration note:
assignments/0041drops theunique_assignmentconstraint before the field removals, so SQLite's table-remake doesn't reference removed fields. DroppingAssignmentrows is safe (fully backfilled intoCourseAssignment);Submission/ActivityProgressrows keep all data — only the redundant FK column goes.Tests
pytest teleband→ 66 passed, 2 failed.teleband/assignments(list/retrieve flip, ordering, query counts)teleband/submissions(nested routes, phase-2 fields/progress, grading query counts)teleband/courses(assign/telephone/unassign write counts, change-instrument)teleband/dashboards(CSV export query bound)migrateon a fresh DB;makemigrations --checkcleanblack --check(CI's unpinned latest) cleanThe 2 failures (
test_me, admintest_add) are pre-existing on a cleanmain(UserSerializergroupsfield + admin redirect), reproduced before any changes here.