fix: images that fail content filter are added to failed directory by IzaakGough · Pull Request #2780 · firebase/extensions

IzaakGough · 2026-04-20T08:18:50Z

Fixes #2769 and #2762.

Refactors content-filter and adds a fix for the silent fail-open on Gemini 2.5 Flash safety refusals reported in #2769.

What changed

Bug 1 fix (🐛 [storage-resize-images] Content filter does not work #2769) — moderateImageOnce now recognises the genkit ValidationError shape Gemini 2.5 Flash produces when its input-side safety refuses on borderline imagery (empty content, no finishReason: "SAFETY" → parseSchema(null, ...) throws INVALID_ARGUMENT). Treated as an implicit block, returned directly without going through the retry queue (deterministic — retries just burn API calls).
Earlier on the branch — refactor of content-filter.ts, blocked-image routing to the failed-image path before placeholder substitution, placeholder swap on a copy of the original (so blocked originals are preserved), data-URL contentType fix.
Drops a leftover as any and clarifies the maxOutputTokens comment per Izaak's review on 🐛 [storage-resize-images] Blocked images by Vertex AI are not moved to the failed path due to bypassed catch block #2762.

Tests

2 new unit tests in __tests__/content-filter.test.ts:
- null-content ValidationError → blocks without retrying (asserts mockGenerate.toHaveBeenCalledTimes(1))
- non-null INVALID_ARGUMENT (real schema mismatch) → still propagates after retries
New env-gated live integration suite at __tests__/integration/content-filter.live.test.ts. Runs against real Vertex AI Gemini 2.5 Flash:
- safe text image + BLOCK_ONLY_HIGH → passes
- gun image + weapon-detection custom prompt → blocked
- safe image + weapon-detection custom prompt → not blocked (model doesn't over-trigger)
- 2 borderline-regression tests gated on LIVE_BORDERLINE_IMAGE_PATH (image not committed; supply your own)
Run via npm run test:live-content-filter --prefix storage-resize-images/functions (env vars: GCLOUD_PROJECT, optionally LIVE_BORDERLINE_IMAGE_PATH).
All 3 always-on live tests verified passing against izaak-testing (~9s).

Behaviour change / out of scope

The bug report also flags a "Bug 2" about the no-custom-prompt path ignoring the model's yes/no verdict. Not addressed in this PR — separate behaviour question with non-trivial blast radius for existing installs. Tracked for a future follow-up.
extension.yaml bumped to 0.3.4; CHANGELOG updated.

Test plan

npx tsc --noEmit -p tsconfig.json clean
npx jest __tests__/content-filter.test.ts — 10 pass (8 existing + 2 new)
npx jest __tests__/unit/generateResizedImageHandler.test.ts — 4 pass unchanged
npm run test:live-content-filter against izaak-testing — 3/3 pass
Bug 1 regression test against a real borderline image (set LIVE_BORDERLINE_IMAGE_PATH)
Manual repro of the original report: deploy + upload borderline image, confirm contentFilterBlocked log + placeholder substitution

gemini-code-assist

Code Review

This pull request refactors the image moderation and placeholder replacement logic, introducing a new placeholder module and updating the main handler. Several critical issues were identified: a regression prevents resized placeholders from being generated when an image is blocked, and the original blocked image is overwritten by the placeholder before it can be moved to the failed directory for review. Additionally, the content filter logic should be optimized by moving initialization and file reading outside the retry loop, and event metadata should correctly report the filter status.

…locked originals

…maxOutputTokens comment Type inference from HARM_CATEGORIES.map(...) already matches genkit's SafetySettingsSchema, so the cast was a leftover from an earlier refactor. Restore the maxOutputTokens comment Izaak flagged on PR #2762, reworded to explain why 1 token suffices for the default yes/no answer vs 100 for the JSON-schema custom-prompt path.

…ntent-filter blocks Bug: when Gemini 2.5 Flash's input-side safety refuses on a borderline image, the candidate comes back with empty content rather than finishReason="SAFETY". Genkit then runs assertValidSchema on the null output, which throws ValidationError with status "INVALID_ARGUMENT". The existing catch only recognises finishReason="blocked", so the validation error gets retried 3 times (deterministic, same result every time), then propagates as a generic filter error — the image silently passes the filter from the user's perspective. Fix: detect the genkit ValidationError shape (status=INVALID_ARGUMENT + originalMessage matching /Provided data:\s*\n+\s*null/) and treat it as an implicit block, returning false from moderateImageOnce without going through the retry queue. Also adds: - Two unit tests covering the new path: null-content → blocked (no retries), real schema mismatch → still propagates after retries. - An env-var-gated live integration suite under __tests__/integration/ with a Bug 1 regression test that requires LIVE_BORDERLINE_IMAGE_PATH to point at an image the developer/CI supplies (cannot ship borderline imagery in a public repo). - CHANGELOG entry covering the fix and the placeholder/data-URL fixes already on this branch; bump extension.yaml to 0.3.4. Reported by Izaak Gough (#2762).

… real Gemini behaviour Three small fixes after running the suite against a real Vertex project: - Switch the safe-image fixture from test-image.png (a tiny black square) to test-jpg.jpg (the word "test" rendered as text). The black-square fixture has no real content for Gemini to evaluate and was flaking the no-prompt sanity test. - Relax that sanity test from BLOCK_LOW_AND_ABOVE to BLOCK_ONLY_HIGH. The no-prompt path uses maxOutputTokens=1, which leaves the model little room to indicate "this is fine" — under the strictest threshold, even synthetic text can be flagged. The other live tests (which use a custom prompt + JSON schema + 100 tokens) keep BLOCK_LOW_AND_ABOVE and exercise the production-relevant threshold. - Bake NODE_OPTIONS=--experimental-vm-modules into the npm script so genkit's dynamic ESM imports load under Jest's CJS environment. Verified all 3 always-runnable tests pass against izaak-testing.

…hten tests after review - isNullContentSafetyRefusal: also require detail.errors to be a populated array. Previously only the message regex + status guarded against false-positives; the structural check pins the detector to genkit's ValidationError contract (status + detail shape) so unrelated INVALID_ARGUMENT errors don't accidentally read as content blocks. - content-filter.test.ts: switch hand-crafted error mocks to the real ValidationError class imported from @genkit-ai/core/schema. The test now fails loudly if genkit changes the error shape, instead of false- passing against a stale mock. - Bump the negative test (real schema mismatch) from maxAttempts=2 to 3, asserting toHaveBeenCalledTimes(3) — exercises the full retry path rather than just proving "retried once". - CHANGELOG: clarify the bug only affects installs that configure a custom filter prompt (the only path that sets output.schema and so the only path where genkit's parseSchema(null,...) ran). Verified: 14 unit + handler tests pass; 3/3 live tests pass against izaak-testing.

cabljac · 2026-05-08T09:43:25Z

/gemini review

gemini-code-assist

Code Review

This pull request updates the storage-resize-images extension to version 0.3.4, introducing several fixes for the content filtering logic. Key changes include handling Gemini safety refusals that previously caused silent failures or unnecessary retries, ensuring blocked images are routed correctly before placeholder substitution, and using the correct content type for moderation requests. Feedback was provided regarding a typo in the model name (gemini-2.5-flash should likely be gemini-1.5-flash) and a suggestion to use a more robust check for ValidationError by inspecting the data property instead of relying on a regex against the error message.

gemini-code-assist · 2026-05-08T09:45:30Z

+  const e = error as {
+    status?: string;
+    originalMessage?: string;
+    detail?: { errors?: unknown };
+  };
+  if (e.status !== "INVALID_ARGUMENT") return false;
+  const errors = e.detail?.errors;
+  if (!Array.isArray(errors) || errors.length === 0) return false;
+  return (
+    typeof e.originalMessage === "string" &&
+    /Provided data:\s*\n+\s*null/.test(e.originalMessage)
+  );


The current implementation relies on a regex check against the originalMessage string to identify safety refusals, which is brittle and may break if the underlying library updates its error message formatting. A more robust approach is to check the data property of the error object, which Genkit's ValidationError sets to null when the model returns empty content due to safety refusals.

const e = error as { status?: string; data?: unknown; detail?: { errors?: unknown }; }; if (e.status !== "INVALID_ARGUMENT") return false; return ( e.data === null && Array.isArray(e.detail?.errors) && e.detail.errors.length > 0 );

…on to cover empty-object case The previous fix only matched when Gemini's response yielded `null` data (parseSchema's most-documented failure mode). Verified live against a real borderline image, that's incomplete: Gemini 2.5 Flash also produces ValidationError with `data: {}` when input-side safety refuses and emits non-JSON or empty refusal text that genkit's extractJson() coerces to an empty object. Same root cause, different shape — and likely more variants in future model versions. Drop the message-format regex and rely on the structural signature alone (status=INVALID_ARGUMENT + populated detail.errors). We control both the prompt and the schema in this code path, so any ValidationError from `ai.generate` originates from the model's response; treating the whole class as an implicit block is correct because the alternative (propagate as filterErrored) silently fails open on borderline NSFW content, which is the original bug. - Rename `isNullContentSafetyRefusal` → `isModerationSchemaRefusal`. - Drop the `/Provided data:\s*\n+\s*null/` regex. - Add unit tests for: empty-object data variant, type-mismatch (also blocks under the new policy with rationale comment), and a sanity check that non-ValidationError errors still propagate via retries. - Live test file: derive contentType from file extension via a small guessContentType helper, so LIVE_BORDERLINE_IMAGE_PATH works with png/webp/gif/jpeg without forcing the developer to convert. Verified live against izaak-testing with a borderline-NSFW stock photo: regression test resolves to false in 1.87s (single API call, no retries), down from 7.8s with 2 wasted retries before the previous narrow detector.

gemini-code-assist Bot reviewed Apr 20, 2026

View reviewed changes

Comment thread storage-resize-images/functions/src/index.ts Outdated

Comment thread storage-resize-images/functions/src/index.ts

Comment thread storage-resize-images/functions/src/content-filter.ts Outdated

Comment thread storage-resize-images/functions/src/index.ts Outdated

IzaakGough requested a review from cabljac April 20, 2026 08:24

cabljac marked this pull request as ready for review April 21, 2026 15:40

IzaakGough mentioned this pull request Apr 21, 2026

🐛 [storage-resize-images] Blocked images by Vertex AI are not moved to the failed path due to bypassed catch block #2762

Open

cabljac reviewed Apr 27, 2026

View reviewed changes

Comment thread storage-resize-images/functions/src/content-filter.ts Outdated

cabljac reviewed Apr 27, 2026

View reviewed changes

Comment thread storage-resize-images/functions/src/content-filter.ts

cabljac reviewed Apr 27, 2026

View reviewed changes

Comment thread storage-resize-images/functions/src/content-filter.ts

cabljac requested a review from a team as a code owner May 8, 2026 08:52

cabljac marked this pull request as draft May 8, 2026 08:53

cabljac marked this pull request as ready for review May 8, 2026 08:54

IzaakGough and others added 11 commits May 8, 2026 10:21

fix: images that fail content filter are added to failed directory

ff84eac

refactor: clean up filtering logic 1

273743b

refactor: clean up filtering logic 2

0f78817

refactor: cleanup content-filter

47253fb

fix(storage-resize-images): resize placeholders without overwriting b…

8c38e2d

…locked originals

fix(storage-resize-images): reuse moderation setup and correct data URLs

040bd8c

tests: fix tests and use fs instead of fs-extra

23e30c0

cabljac force-pushed the fixcontentfilt branch from db4def8 to 5007635 Compare May 8, 2026 09:30

cabljac marked this pull request as draft May 8, 2026 09:31

gemini-code-assist Bot reviewed May 8, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: images that fail content filter are added to failed directory#2780

fix: images that fail content filter are added to failed directory#2780
IzaakGough wants to merge 12 commits intonextfrom
fixcontentfilt

IzaakGough commented Apr 20, 2026 •

edited by cabljac

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cabljac commented May 8, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

gemini-code-assist Bot May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

IzaakGough commented Apr 20, 2026 • edited by cabljac Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changed

Tests

Behaviour change / out of scope

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cabljac commented May 8, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

gemini-code-assist Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

IzaakGough commented Apr 20, 2026 •

edited by cabljac

Loading