Cache remote actor avatars in media proxy cache #493
Move the media proxy's cache key, fetch validation, and cache read/write logic into a reusable module. This lets future ingestion-time prefetching write exactly the same entries the request-time proxy already serves. Assisted-by: Codex:gpt-5.5
Cache remote actor avatars during account persistence when MEDIA_PROXY is set to cache. The persisted account still stores the original remote URL, but the request-time proxy can now serve the cached body immediately after Hollo sees the actor. Document the prefetch behavior in the media proxy configuration docs and changelog. Assisted-by: Codex:gpt-5.5
Run remote actor avatar cache warming asynchronously after account persistence and make cache-hit checks inspect only cache metadata instead of reading the cached media body. Assisted-by: Codex:gpt-5.5
Route remote actor avatar cache warmups through a process-local scheduler that limits concurrent fetches and deduplicates queued or active cache keys. This keeps account persistence non-blocking without allowing bulk actor persistence to start unbounded media downloads. Assisted-by: Codex:gpt-5.5
@codex review
Code Review
This pull request implements a prefetching mechanism for remote actor avatars when the media proxy is configured in cache mode. This ensures that avatars are warmed in the local cache as soon as an actor is stored or refreshed, improving subsequent rendering performance. The implementation introduces a new proxy-cache.ts utility to manage cache operations, concurrency-limited prefetching, and SSRF-aware fetching, while refactoring the existing proxy logic to utilize these shared functions. Review feedback identified potential TypeError risks in the cache-reading logic where JSON.parse could be called on null values if metadata files are missing; adding explicit null checks was recommended to improve robustness.
Treat missing, null, or malformed proxy-cache sidecar metadata as an explicit cache miss before reading the cached media body. This keeps both full cache reads and existence checks on the same validation path. fedify-dev#493 (comment) fedify-dev#493 (comment) Assisted-by: Codex:gpt-5.5
/gemini review
@codex review
Code Review
This pull request implements prefetching for remote actor avatars when the MEDIA_PROXY environment variable is set to cache. It refactors the proxy caching logic into a new src/proxy-cache.ts module, which includes a background prefetch queue with concurrency control and deduplication. The changes also include comprehensive unit tests and updated documentation across several languages. I have no feedback to provide.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 32a8f99a86
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Limit remote actor avatar cache warmup to actors with an approved follow relationship to the local account by default. This keeps federation input from driving unsolicited cache writes while preserving warmup for locally relevant accounts. fedify-dev#493 (comment) Assisted-by: Codex:gpt-5.5
/gemini review
@codex review
Codex Review: Didn't find any major issues. You're on a roll.
Code Review
This pull request implements prefetching for remote actor avatars into the media proxy cache when a relationship exists with a local account. It refactors proxy caching into a dedicated module with a concurrency-limited queue and SSRF protection, supported by updated documentation and new tests. Feedback identifies a need for more robust error handling during metadata parsing to prevent potential crashes from corrupted JSON files.
Treat invalid JSON sidecars as cache misses inside the proxy cache metadata parser. This keeps the helper's invalid-input behavior local instead of relying on each caller's broader storage error handler. fedify-dev#493 (comment) Assisted-by: Codex:gpt-5.5
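A minimal sketch of the "invalid sidecar means cache miss" behavior described in the commits above. The function name parseCacheMetadata and the CacheMetadata shape (a single contentType field) are assumptions for illustration, not the actual code in src/proxy-cache.ts.

```typescript
// Hypothetical shape of the sidecar metadata; the real fields may differ.
interface CacheMetadata {
  contentType: string;
}

// Returns parsed metadata, or null when the sidecar is missing, null,
// not valid JSON, or not shaped as expected. Callers treat null as a miss.
function parseCacheMetadata(
  raw: string | null | undefined,
): CacheMetadata | null {
  if (raw == null) return null; // missing sidecar file
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // corrupted or truncated JSON
  }
  if (typeof value !== "object" || value === null) return null;
  const contentType = (value as Record<string, unknown>).contentType;
  if (typeof contentType !== "string") return null;
  return { contentType };
}
```

Keeping the try/catch inside the parser means that both full cache reads and existence checks get the same result for a bad sidecar, without depending on an outer storage error handler.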
/gemini review
@codex review
Code Review
This pull request implements remote actor avatar prefetching for the MEDIA_PROXY=cache mode. When a remote actor is persisted or updated, the system now checks for an approved follow relationship with a local account and, if found, schedules the actor's avatar to be prefetched into the proxy cache. The core logic is encapsulated in a new proxy-cache.ts module which handles SSRF-aware redirects, bounded concurrency for background prefetch jobs, and cache storage. The changes also include updated documentation in multiple languages, release notes, and comprehensive tests for the new prefetching and caching mechanisms. I have no feedback to provide.
Codex Review: Didn't find any major issues. Keep it up!
What changed
This adds best-effort caching for remote actor avatars when MEDIA_PROXY=cache is enabled. Hollo still stores the original remote avatar URL on the account, but when a remote actor is stored or refreshed, the avatar is scheduled for prefetch into the same proxy cache used by the media proxy route.
The proxy cache logic was factored out of src/proxy.ts into src/proxy-cache.ts so request-time caching and actor-persistence warmup share the same cache key, response validation, SSRF-aware redirect handling, size limit, content-type checks, and storage layout.
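To illustrate what shared response validation could look like, here is a rough sketch of a header-level check covering status, content type, and declared size. The names isCacheable, ResponseHead, MAX_CACHED_BYTES, and ALLOWED_TYPE_PREFIXES, as well as the specific limit and allowed types, are assumptions; the values used in src/proxy-cache.ts are not stated in this PR.

```typescript
// Hypothetical summary of the response headers the check looks at.
interface ResponseHead {
  status: number;
  contentType: string | null;
  contentLength: number | null;
}

// Assumed values for illustration only.
const MAX_CACHED_BYTES = 10 * 1024 * 1024;
const ALLOWED_TYPE_PREFIXES = ["image/", "video/", "audio/"];

function isCacheable(head: ResponseHead): boolean {
  if (head.status !== 200) return false;
  if (head.contentType === null) return false;
  // Strip parameters such as "; charset=..." before comparing.
  const semi = head.contentType.indexOf(";");
  const bare = semi === -1 ? head.contentType : head.contentType.slice(0, semi);
  const type = bare.trim().toLowerCase();
  if (!ALLOWED_TYPE_PREFIXES.some((prefix) => type.startsWith(prefix))) {
    return false;
  }
  // A declared length over the limit is rejected up front.
  if (head.contentLength !== null && head.contentLength > MAX_CACHED_BYTES) {
    return false;
  }
  return true;
}
```

A header check like this is only a first gate. Content-Length can be absent or wrong, so a size limit would also need to be enforced while reading the body.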
Avatar warmup is intentionally non-blocking. src/federation/account.ts schedules cache warmup after the account row is persisted, so slow or unavailable remote avatar servers do not delay inbox, search, follow, or refresh paths. The scheduler is process-local, bounded, and best-effort: it limits concurrent warmups, deduplicates queued or active cache keys, and drops excess work if the queue is full.
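The scheduler behavior described above (bounded concurrency, deduplication by cache key, dropping work when the queue is full) can be sketched roughly as follows. The class name WarmupScheduler, its constructor parameters, and the boolean return value are assumptions for illustration and are not taken from the PR's code.

```typescript
type WarmupJob = () => Promise<void>;

// Hypothetical process-local scheduler; limits are supplied by the caller.
class WarmupScheduler {
  private readonly active = new Set<string>();
  private readonly queued = new Map<string, WarmupJob>();
  private readonly maxConcurrent: number;
  private readonly maxQueued: number;

  constructor(maxConcurrent: number, maxQueued: number) {
    this.maxConcurrent = maxConcurrent;
    this.maxQueued = maxQueued;
  }

  // Returns true if the job was accepted, false if it was a duplicate
  // or was dropped because the queue is full. Never awaits the job.
  schedule(key: string, job: WarmupJob): boolean {
    if (this.active.has(key) || this.queued.has(key)) return false;
    if (this.active.size < this.maxConcurrent) {
      this.start(key, job);
      return true;
    }
    if (this.queued.size >= this.maxQueued) return false; // best-effort drop
    this.queued.set(key, job);
    return true;
  }

  private start(key: string, job: WarmupJob): void {
    this.active.add(key);
    Promise.resolve()
      .then(job)
      .catch(() => {}) // failures are ignored: warmup is best-effort
      .then(() => {
        this.active.delete(key);
        this.drain();
      });
  }

  // Starts queued jobs in insertion order while capacity is available.
  private drain(): void {
    while (this.active.size < this.maxConcurrent) {
      const next = this.queued.entries().next();
      if (next.done) return;
      const [key, job] = next.value;
      this.queued.delete(key);
      this.start(key, job);
    }
  }
}
```

Because schedule returns synchronously, a caller such as the account persistence path can hand off the avatar URL's cache key and continue without waiting on the remote media server.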
Why
Remote avatars were previously served through their original media URLs. That can make avatars slow to render when the remote media server is far away, and it can leave broken avatars when a remote actor changes their avatar without sending a working Update activity and later removes the old file.
With this change, once Hollo has seen a remote actor avatar in MEDIA_PROXY=cache mode, the proxy can keep serving the cached copy even if the upstream file later disappears.
Documentation
Updated the MEDIA_PROXY=cache documentation in docs/src/content/docs/install/env.mdx, docs/src/content/docs/ja/install/env.mdx, docs/src/content/docs/ko/install/env.mdx, docs/src/content/docs/zh-cn/install/env.mdx, and docs/src/content/docs/zh-tw/install/env.mdx.
Added a changelog entry in CHANGES.md.
Verification
pnpm vitest run src/proxy-cache.test.ts src/federation/account.persistence.test.ts src/proxy.test.ts
pnpm check && pnpm test