
Google GenAI groundingMetadata not captured in streaming aggregation or span metadata #1700

@braintrust-bot


Summary

When Google Search grounding is enabled via tools: [{ googleSearch: {} }], the @google/genai SDK returns groundingMetadata on the response containing search citations, source URIs, and confidence scores. The current Google GenAI instrumentation plugin does not capture this metadata in either the streaming aggregation or the span metadata, so grounding details are silently lost in traced spans.

Non-streaming calls pass through the raw response object as output, so groundingMetadata is incidentally preserved there. It is never extracted into span metadata, where it would be queryable, and it is lost entirely in streaming calls.

What is missing

  • Streaming aggregation (js/src/instrumentation/plugins/google-genai-plugin.ts, aggregateGenerateContentChunks): Only extracts candidates, usageMetadata, text, functionCall, codeExecutionResult, executableCode, and thought from chunks. groundingMetadata is not accumulated or forwarded.
  • Metadata extraction (extractMetadata): Only captures model, config, and tools from the request params. Does not extract groundingMetadata from the response.
  • Vendor SDK types (js/src/vendor-sdk-types/google-genai.ts): GoogleGenAIGenerateContentResponse has no explicit groundingMetadata field (only a catch-all [key: string]: unknown).
  • E2E tests: No scenario uses googleSearch tool configuration or validates grounding metadata.
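One possible shape for the streaming fix is sketched below. It is not the plugin's actual code: the helper name collectGroundingMetadata and the local types are hypothetical, and it assumes groundingMetadata arrives on individual candidates within a chunk (typically a late one), so keeping the most recent value per candidate index is sufficient. Concatenating across chunks is deliberately avoided here because groundingSupports refer to groundingChunks by index.

```typescript
// Minimal local stand-ins for the SDK types (hypothetical, for illustration only).
interface GroundingMetadata {
  webSearchQueries?: string[];
  groundingChunks?: { web?: { uri?: string; title?: string } }[];
  [key: string]: unknown;
}

interface Candidate {
  index?: number;
  groundingMetadata?: GroundingMetadata;
  [key: string]: unknown;
}

interface Chunk {
  candidates?: Candidate[];
  [key: string]: unknown;
}

// Hypothetical helper: keep the latest groundingMetadata seen for each
// candidate index. Assumption: later chunks supersede earlier ones.
function collectGroundingMetadata(
  chunks: Chunk[],
): Record<number, GroundingMetadata> {
  const latest: Record<number, GroundingMetadata> = {};
  for (const chunk of chunks) {
    (chunk.candidates ?? []).forEach((candidate, position) => {
      if (candidate.groundingMetadata) {
        // Fall back to array position when the candidate has no explicit index.
        latest[candidate.index ?? position] = candidate.groundingMetadata;
      }
    });
  }
  return latest;
}

// Example: only the second chunk carries grounding details.
const exampleChunks: Chunk[] = [
  { candidates: [{ index: 0 }] },
  {
    candidates: [
      {
        index: 0,
        groundingMetadata: {
          webSearchQueries: ["braintrust sdk"],
          groundingChunks: [
            { web: { uri: "https://example.com", title: "Example" } },
          ],
        },
      },
    ],
  },
];

const collected = collectGroundingMetadata(exampleChunks);
```

The aggregated value could then be attached to the matching candidate in the aggregated output, alongside the fields aggregateGenerateContentChunks already collects.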

Upstream reference

  • Google AI Gemini grounding docs: https://ai.google.dev/gemini-api/docs/grounding
  • groundingMetadata response fields include:
    • searchEntryPoint — rendered content for the search widget
    • groundingChunks — array of { web: { uri, title } } source documents
    • webSearchQueries — the search queries the model issued
    • groundingSupports — text segments with confidence scores and chunk indices
  • Available on models like gemini-2.0-flash and gemini-2.5-pro when grounding is enabled.
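The fields above could be given an explicit type and flattened into span metadata roughly as follows. This is a sketch under assumptions: the nested field names (renderedContent, segment, groundingChunkIndices, confidenceScores) follow the upstream docs as I understand them and should be verified against the current SDK, and the helper summarizeGrounding and its output keys are hypothetical, not an existing Braintrust convention.

```typescript
// Explicit shape for groundingMetadata; nested names are assumptions
// based on the upstream grounding docs.
interface GroundingChunk {
  web?: { uri?: string; title?: string };
}

interface GroundingSupport {
  segment?: { startIndex?: number; endIndex?: number; text?: string };
  groundingChunkIndices?: number[];
  confidenceScores?: number[];
}

interface GroundingMetadata {
  searchEntryPoint?: { renderedContent?: string };
  groundingChunks?: GroundingChunk[];
  webSearchQueries?: string[];
  groundingSupports?: GroundingSupport[];
}

// Hypothetical flattened form that is easy to query in span metadata.
interface GroundingSummary {
  grounding_queries: string[];
  grounding_sources: { uri: string; title?: string }[];
  grounding_support_count: number;
}

// Hypothetical helper: returns undefined when no grounding data is present.
function summarizeGrounding(
  meta: GroundingMetadata | undefined,
): GroundingSummary | undefined {
  if (!meta) return undefined;
  const sources: { uri: string; title?: string }[] = [];
  for (const chunk of meta.groundingChunks ?? []) {
    const uri = chunk.web?.uri;
    // Skip chunks without a web source URI.
    if (uri) sources.push({ uri, title: chunk.web?.title });
  }
  return {
    grounding_queries: meta.webSearchQueries ?? [],
    grounding_sources: sources,
    grounding_support_count: (meta.groundingSupports ?? []).length,
  };
}

const summary = summarizeGrounding({
  webSearchQueries: ["who won euro 2024"],
  groundingChunks: [
    { web: { uri: "https://example.com/a", title: "A" } },
    {}, // a chunk with no web source is ignored
  ],
  groundingSupports: [
    {
      segment: { startIndex: 0, endIndex: 10, text: "Spain won." },
      groundingChunkIndices: [0],
      confidenceScores: [0.9],
    },
  ],
});
```

Keeping the summary small (queries, source URIs, a support count) leaves the full groundingMetadata in the span output while making the commonly filtered fields available as metadata.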

Braintrust docs status

The Braintrust Google GenAI integration page documents generateContent and generateContentStream but does not mention grounding metadata (not_found).

Precedent in this repo

The Python SDK has an equivalent open issue: braintrustdata/braintrust-sdk-python#153.

Local files inspected

  • js/src/instrumentation/plugins/google-genai-plugin.ts — streaming aggregation and metadata extraction
  • js/src/instrumentation/plugins/google-genai-channels.ts — channel definitions
  • js/src/vendor-sdk-types/google-genai.ts — response type definitions
  • js/src/wrappers/google-genai.ts — wrapper proxy
  • e2e/scenarios/google-genai-instrumentation/scenario.impl.mjs — e2e test scenarios


Labels: bot-automation (Issues generated by an agent automation)
