
Google GenAI generateContentStream aggregation silently drops inlineData output parts (images, audio) #1690

@braintrust-bot

Summary

The Google GenAI streaming aggregation code drops inlineData parts from the response output. When a Gemini model generates images natively (responseModalities: ['IMAGE']) or returns audio or other binary data, and the call goes through generateContentStream, the inlineData parts in the streamed chunks are silently lost because the aggregation loop does not handle them.

Non-streaming generateContent calls are unaffected because the plugin logs the full raw response as output.

What instrumentation is missing

In js/src/instrumentation/plugins/google-genai-plugin.ts, the aggregateGenerateContentChunks function (lines 749–778) processes parts from streamed chunks:

for (const part of candidate.content.parts) {
  if (part.text !== undefined) {
    // handled ✓
  } else if (part.functionCall) {
    // handled ✓
  } else if (part.codeExecutionResult) {
    // handled ✓
  } else if (part.executableCode) {
    // handled ✓
  }
  // inlineData → falls through, silently dropped ✗
}

The vendored type GoogleGenAIPart in js/src/vendor-sdk-types/google-genai.ts already declares the inlineData field (line 53), but the aggregation code never handles it. Any inlineData part in a streamed chunk is silently excluded from the aggregated output span.
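One possible shape for the missing branch is sketched below. This is not the plugin's actual code: the Part and Chunk interfaces and the aggregateParts function are hypothetical, simplified stand-ins for the vendored types and for aggregateGenerateContentChunks. It also assumes each inlineData part arrives complete within a single chunk, and it only looks at the first candidate.

```typescript
// Hypothetical, simplified shapes. The real vendored types live in
// js/src/vendor-sdk-types/google-genai.ts and may differ.
interface InlineData {
  mimeType?: string;
  data?: string; // base64-encoded payload
}

interface Part {
  text?: string;
  inlineData?: InlineData;
}

interface Chunk {
  candidates?: { content?: { parts?: Part[] } }[];
}

// Merge streamed chunks into one ordered list of parts.
// Consecutive text parts are concatenated; inlineData parts are kept as-is.
function aggregateParts(chunks: Chunk[]): Part[] {
  const parts: Part[] = [];
  let pendingText = "";
  let hasPendingText = false;

  // Emit any buffered text as a single part so ordering is preserved.
  const flushText = (): void => {
    if (hasPendingText) {
      parts.push({ text: pendingText });
      pendingText = "";
      hasPendingText = false;
    }
  };

  for (const chunk of chunks) {
    // Assumption: only the first candidate is aggregated in this sketch.
    const chunkParts = chunk.candidates?.[0]?.content?.parts ?? [];
    for (const part of chunkParts) {
      if (part.text !== undefined) {
        pendingText += part.text;
        hasPendingText = true;
      } else if (part.inlineData) {
        // The branch the issue describes as missing: keep binary output.
        flushText();
        parts.push({ inlineData: part.inlineData });
      }
      // Other part kinds (functionCall, executableCode, ...) are omitted here.
    }
  }

  flushText();
  return parts;
}
```

Flushing buffered text before pushing an inlineData part keeps text and images in the order the model produced them. Whether the real fix should log the base64 payload directly or convert it to an attachment is a separate design decision that this sketch does not address.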

Impact

  • Native image generation via Gemini models (gemini-2.0-flash, etc.) with streaming produces spans where generated images are missing from the output
  • Braintrust docs state "Streaming responses are fully supported — Braintrust automatically collects streamed chunks and logs the complete response as a single span," but this is not the case for image/audio output
  • Users who stream generateContent calls with responseModalities: ['IMAGE', 'TEXT'] will see text in their spans but not the generated images
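To check whether a given trace is affected, one option is to keep a copy of the streamed chunks on the caller's side and count the inlineData parts they contain, then compare that count with what appears in the logged span. The helper below is a minimal sketch of that check. The StreamedPart and StreamedChunk shapes and the countInlineDataByMimeType name are hypothetical and are not part of the Braintrust SDK or @google/genai.

```typescript
// Hypothetical, simplified shapes for chunks collected from a stream.
interface StreamedPart {
  text?: string;
  inlineData?: { mimeType?: string; data?: string };
}

interface StreamedChunk {
  candidates?: { content?: { parts?: StreamedPart[] } }[];
}

// Count inlineData parts per MIME type across all chunks and candidates.
function countInlineDataByMimeType(
  chunks: StreamedChunk[],
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const chunk of chunks) {
    for (const candidate of chunk.candidates ?? []) {
      for (const part of candidate.content?.parts ?? []) {
        if (!part.inlineData) continue;
        // Parts without a declared MIME type are grouped under "unknown".
        const mimeType = part.inlineData.mimeType ?? "unknown";
        counts.set(mimeType, (counts.get(mimeType) ?? 0) + 1);
      }
    }
  }
  return counts;
}
```

If this returns a non-empty map for a streamed call while the corresponding span output contains only text, the span is missing the binary output described in this issue.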

Braintrust docs status

Unclear. The Braintrust docs at https://www.braintrust.dev/docs/instrument/wrap-providers list @google/genai as supported and claim full streaming support, but do not specifically address image output in streamed responses.

Upstream reference

  • Google GenAI native image generation: https://ai.google.dev/gemini-api/docs/image-generation
  • generateContent with responseModalities: ['IMAGE'] returns inlineData parts containing generated images
  • This is a stable feature available on Gemini 2.0 Flash and later models

Local files inspected

  • js/src/instrumentation/plugins/google-genai-plugin.ts (lines 749–778: aggregateGenerateContentChunks part processing loop)
  • js/src/vendor-sdk-types/google-genai.ts (line 53: inlineData field on GoogleGenAIPart)
  • js/src/wrappers/google-genai.ts (wrapper proxies generateContentStream to channel)
  • e2e/scenarios/google-genai-instrumentation/ (no test cases with image output in streamed responses)

Note

This is distinct from #1673 (models.generateImages() not instrumented), which covers the dedicated Imagen API. This issue is about the standard generateContent/generateContentStream API producing image output that gets lost specifically in the streaming aggregation path.

Metadata

Labels: bot-automation (Issues generated by an agent automation)
