refactor(export): migrate from puppeteer to satori#4343
refactor(export): migrate from puppeteer to satori#4343darentanrw wants to merge 7 commits intonusmodifications:masterfrom
Conversation
|
@darentanrw is attempting to deploy a commit to the modsbot's projects Team on Vercel. A member of the Team first needs to authorize it. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #4343 +/- ##
==========================================
+ Coverage 54.52% 56.30% +1.77%
==========================================
Files 274 308 +34
Lines 6076 6962 +886
Branches 1455 1682 +227
==========================================
+ Hits 3313 3920 +607
- Misses 2763 3042 +279 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Sounds like a great idea. I'll be starting on Phase 2 of #4314 soon and we can tackle this extraction. |
|
Some thoughts:
Lastly, can we also explore if https://www.renoun.dev/screenshot works? I think it runs on the client and maybe we can use it for images and avoid a server round-trip. I'm torn between the export approaches. Decoupling the rendering has good benefits too, so I guess the question is how much the visual consistency between website and export matters. |
|
Thanks for the review! I've fixed the low hanging fruits that you commented on. I will take a further look into https://www.renoun.dev/screenshot, however a potential problem that I foresee is that it might not be easy to select only the timetable and module info as a PNG export. The current workflow with Puppeteer screenshots https://nusmods.vercel.app/timetable-only after injecting it with the timetable data. However, what is currently rendered to the user also includes the In this case of using Serverless functions for export, I believe that a cleaner approach to couple the data models is by shifting the export serverless handlers into the main website APIs (to Personally, I don't think that the visual consistency between website and export matters that much (See Visual Differences in original PR), and I think that we can make it more similar (except for fonts) - but ultimately this is a decision for the maintainers to make. |
Those are great points! I'm convinced this is a good approach. Like you said, we can also co-locate the exporting functions into the web app, which is a good idea for the future. Btw I've started extracting the types in #4346, depending on whose PR merges first, someone will have to resolve the conflicts, but it shouldn't be hard. |
Agree on this. I think this is fine. Ultimately, my biggest barrier is still the configuration/duplicated logic issues which have been lightly addressed for now.
I am agreeable to this since technically having 2 separate Vercel projects doesn't yield much benefit if that can ultimately reduce duplicated code. @ravern @leslieyip02 can weigh in on this too. |
Context
The current export service uses Puppeteer (via
puppeteer-coreand@sparticuz/chromium) to render timetable images and PDFs.This approach has several drawbacks, the biggest of which being slow performance due to cold starts and chromium load time. This has resulted in issues such as 504 Errors (#4290) where function time outs, and fixes to increase timeout to 15 seconds such as (#4255)
Implementation
Instead, I propose to replace Puppeteer with Satori + resvg-js for server-side timetable rendering:
ExportData plus modules are parsed into buildRenderableTimetable, which produces a RenderableTimetable view model.
That model is fed into TimetableImage in view.tsx, which is then rendered from JSX → SVG → PNG → PDF
Drawbacks and Limitations
Because Satori does not run a real browser, It only accepts JSX elements that are pure and stateless, and a subset of layout and styling (inline styles, no CSS classes, no full DOM). The NUSMods timetable uses SCSS, class-based styling, so it can’t be rendered directly by Satori.
view.tsxhence is a separate React tree that mirrors the timetable layout using only Satori-compatible inline styles (thanks Claude).However:
export/src/types.ts- ExportData, Module, SemTimetableConfig, etc. (fromwebsite/src/types/export.ts,types/timetables.ts,types/modules.ts)export/src/satori/theme.ts- Theme color palettes (fromwebsite/src/styles/constants.scss $nusmods-theme-colors)export/src/satori/render-model.ts— Time conversion, lesson arrangement, week formatting, color mapping (fromwebsite/src/utils/timify.ts,utils/timetables.ts,utils/colors.ts)This has potential future painful implication - as any change to the timetable UI (layout, themes, week formatting, etc.) must be reflected in both the website components and the export view.tsx / render-model.ts, increasing the chance of inconsistency.
Note: This was also done from my assumption of that you are deploying your monorepo separately onto different vercel instances - there are ways we can reduce this data duplication with packages and etc, but it will require a more significant rewrite?
Future work: Should v4 migration (#4314) be implemented and Next.js is chosen, this feature also can be implemented via Next.js ImageResponse as well.
Performance Improvements
Performance improvements of 60~70% 🥳!
Wrote a shell script, at
export/utils/endpoint-performance-test.shwhich ran 10 simulated timetables against both the old (Puppeteer) and new (Satori) endpoints. I deployed the new export service at https://nusmods-export.vercel.app, and latency was measured on an M3 MacBook Pro over Ethernet under typical usage.For more precise numbers, we should inspect Vercel function logs to compare cold starts and execution time between the two implementations.
Visual Differences
PDF Version
PDF Version
PDF Version
Misc
First PR to nusmods, any feedback will be appreciated!
I found many edge cases, but I'm sure that there are more that I have yet to consider, so would love your inputs and edits on this PR.
Thanks a lot! Especially to @ravern and @leslieyip02 for planting this idea in my head.