BigQuery Agent Analytics vs Cloud Logging: A Side-by-Side Comparison

TL;DR Executive Summary

BigQuery Agent Analytics Plugin = Intelligence Layer (analytics, AI, trends)
Cloud Trace/Logging = Observation Layer (debugging, visualization, live ops)

| Question You're Asking | Use This | Why |
| --- | --- | --- |
| "What's our token spend this week?" | BigQuery Plugin | Structured data, SQL aggregations |
| "Why is this call hanging?" | Cloud Trace | Gantt chart, span timeline |
| "Are our agent responses high quality?" | BigQuery Plugin | AI-powered evaluation at scale |
| "Show me the last 5 errors" | Cloud Logging | Live Tail, quick filtering |

Bottom Line: Use both. They're complementary, not competing.

  • BigQuery Plugin for business intelligence and AI-powered insights
  • Cloud Trace for operational debugging and latency analysis

Key Differentiator: The BigQuery Plugin provides a stable, agent-aware schema (a contract). Cloud Logging/Trace provides raw OTel data where field locations may change without notice.


This demo compares two approaches for logging and analyzing ADK agent telemetry:

  1. BigQuery Agent Analytics Plugin - Purpose-built for deep agent analytics + AI-powered insights
  2. Cloud Logging via OpenTelemetry - Standard observability integration

When to Use What

| Use Case | Best Tool | Why |
| --- | --- | --- |
| "Debug this hanging call right now" | Cloud Trace | Gantt chart visualization, Live Tail |
| "What's our average failure rate?" | BigQuery Plugin | SQL aggregations, historical analysis |
| "Evaluate agent response quality at scale" | BigQuery Plugin | Native AI integration with structured data |
| "Where did the 5 seconds go?" | Cloud Trace | Span timeline visualization |
| "How many tokens did we use this week?" | BigQuery Plugin | Structured token fields, easy aggregation |

Key Insight: Different Types of "Real-Time"

Both approaches have fast ingestion, but time-to-insight differs:

| Approach | Ingestion Latency | Best For |
| --- | --- | --- |
| BigQuery Plugin | 1-2 seconds | Analytical real-time: "Average failure rate of last 1,000 calls" |
| Cloud Logging/Trace | ~1 second | Operational real-time: "Live Tail" - watching one session's logs |

Choose based on your question:

  • "What's happening right now in this session?" → Cloud Logging/Trace
  • "What's the trend across all sessions?" → BigQuery Plugin

The Key Advantage: Schema Stability

The strongest argument for the BigQuery Plugin is schema stability:

| Aspect | BigQuery Plugin | Cloud Logging/Trace |
| --- | --- | --- |
| Schema | Agent-aware, documented contract | Raw OTel-to-JSON blobs |
| Tool name location | Always `content.tool` | Maybe `jsonPayload.message`, maybe `jsonPayload.attributes.tool_name` |
| Breaking changes | Versioned schema (v2) | Format may change without notice |
| Query reliability | Stable JSON paths | Regex parsing, fragile |

The BigQuery Plugin provides a contract. Cloud Logging provides raw data.
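To make the fragility concrete, here is a minimal Python sketch contrasting the two extraction styles. The sample payloads are hypothetical, but the access patterns match the table above: a fixed JSON path for the plugin, a regex over free text for raw logs.

```python
import json
import re

# Hypothetical samples: a plugin event row vs. a raw exported log entry.
plugin_event = {
    "event_type": "TOOL_COMPLETED",
    "content": json.dumps({"tool": "get_weather", "result": {"temp_c": 22}}),
}
raw_log = {"textPayload": "Executing tool: get_weather with args {'city': 'Tokyo'}"}

# Stable contract: the tool name is always at content.tool.
tool_from_plugin = json.loads(plugin_event["content"])["tool"]

# Fragile: a regex over free text breaks as soon as the log wording changes.
match = re.search(r"tool[:\s]+(\w+)", raw_log["textPayload"])
tool_from_log = match.group(1) if match else None

print(tool_from_plugin, tool_from_log)
```

Both lookups succeed on this sample, but only the first is guaranteed to keep working: rewording the log message to "Running get_weather…" silently returns `None` from the regex path.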


AI-Powered Analytics: Native vs High-Friction

BigQuery's integration with Vertex AI Gemini provides native, low-friction AI analytics. Cloud Logging can technically access AI via Log Analytics + BigQuery, but with significant friction:

| AI Capability | BigQuery Plugin | Cloud Logging | Notes |
| --- | --- | --- | --- |
| LLM-as-Judge Evaluation | ✅ Native | ⚠️ High friction | Requires Log Analytics + missing response content |
| Jailbreak Detection | ✅ Native | ⚠️ High friction | User messages often redacted in OTel |
| Tool Failure Root Cause | ✅ Native | ⚠️ Possible | Error text available, but no structured context |
| Sentiment Analysis | ✅ Native | ❌ Missing data | Response content not in OTel spans |
| Memory Extraction | ✅ Native | ❌ Missing data | Conversation content not captured |
| Anomaly Detection | ✅ Native | ⚠️ Basic alerting | Log-based metrics only |

Note on Log Analytics: Cloud Logging's Log Analytics feature creates BigQuery-linked datasets, enabling SQL queries on logs. However, this doesn't solve the missing data problem - tokens, full responses, and structured agent events are still not captured in OTel logs.

See bq_ai_powered_analytics.sql for complete examples.

Example: LLM-as-Judge for Response Quality

SELECT
  session_id,
  JSON_EXTRACT_SCALAR(content, '$.response') AS agent_response,
  AI.GENERATE_TEXT(
    MODEL `project.dataset.gemini_model`,
    CONCAT(
      'Evaluate this agent response (1-10) for: helpfulness, accuracy, clarity. ',
      'Return JSON. Response: ', JSON_EXTRACT_SCALAR(content, '$.response')
    )
  ).ml_generate_text_llm_result AS evaluation
FROM `project.dataset.agent_events_v2`
WHERE event_type = 'LLM_RESPONSE';

Example: Automated Jailbreak Detection

SELECT
  user_id,
  JSON_EXTRACT_SCALAR(content, '$.text') AS user_message,
  AI.GENERATE_TEXT(
    MODEL `project.dataset.gemini_model`,
    CONCAT(
      'Is this a jailbreak attempt? Return JSON: ',
      '{"is_jailbreak": boolean, "risk_level": "none|low|medium|high"}. ',
      'Message: ', JSON_EXTRACT_SCALAR(content, '$.text')
    )
  ).ml_generate_text_llm_result AS safety_check
FROM `project.dataset.agent_events_v2`
WHERE event_type = 'USER_MESSAGE_RECEIVED';

Example: Tool Failure Root Cause Analysis

SELECT
  JSON_EXTRACT_SCALAR(content, '$.tool') AS tool_name,
  error_message,
  AI.GENERATE_TEXT(
    MODEL `project.dataset.gemini_model`,
    CONCAT(
      'Analyze this tool failure. Return: root cause category, ',
      'specific cause, suggested fix, is_transient. ',
      'Tool: ', JSON_EXTRACT_SCALAR(content, '$.tool'),
      ' Error: ', error_message
    )
  ).ml_generate_text_llm_result AS root_cause
FROM `project.dataset.agent_events_v2`
WHERE event_type = 'TOOL_ERROR';

Example: Memory Extraction from Conversations

SELECT
  session_id,
  AI.GENERATE_TEXT(
    MODEL `project.dataset.gemini_model`,
    CONCAT(
      'Extract from this conversation: user_preferences, entities, ',
      'facts_learned, action_items. Return JSON. ',
      'Conversation: ', conversation_text
    )
  ).ml_generate_text_llm_result AS extracted_memory
FROM session_conversations;

Real Demo Results

The following results are from actual demo runs on project test-project-0728-467323:

Token Usage Analysis (BigQuery)

SELECT
  COUNT(*) as total_llm_calls,
  SUM(CAST(JSON_EXTRACT_SCALAR(content, "$.usage.total") AS INT64)) as total_tokens,
  ROUND(AVG(CAST(JSON_EXTRACT_SCALAR(content, "$.usage.total") AS INT64)), 1) as avg_tokens_per_call,
  SUM(CAST(JSON_EXTRACT_SCALAR(content, "$.usage.prompt") AS INT64)) as total_prompt_tokens,
  SUM(CAST(JSON_EXTRACT_SCALAR(content, "$.usage.completion") AS INT64)) as total_completion_tokens
FROM `test-project-0728-467323.agent_analytics_demo.agent_events_v2`
WHERE event_type = "LLM_RESPONSE";

Result:

+-----------------+--------------+---------------------+---------------------+-------------------------+
| total_llm_calls | total_tokens | avg_tokens_per_call | total_prompt_tokens | total_completion_tokens |
+-----------------+--------------+---------------------+---------------------+-------------------------+
|              16 |         7307 |               456.7 |                7010 |                     297 |
+-----------------+--------------+---------------------+---------------------+-------------------------+
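The aggregation in the SQL above can be mirrored in plain Python to show how little parsing the schema requires: pull `usage` out of `content`, then sum and average. The two rows below are hypothetical samples, not demo data.

```python
import json

# Hypothetical LLM_RESPONSE rows; a real run reads agent_events_v2 instead.
events = [
    {"event_type": "LLM_RESPONSE",
     "content": json.dumps({"usage": {"prompt": 556, "completion": 17, "total": 573}})},
    {"event_type": "LLM_RESPONSE",
     "content": json.dumps({"usage": {"prompt": 300, "completion": 40, "total": 340}})},
]

# Equivalent of JSON_EXTRACT_SCALAR(content, "$.usage.total") per row.
usages = [json.loads(e["content"])["usage"]
          for e in events if e["event_type"] == "LLM_RESPONSE"]
total_tokens = sum(u["total"] for u in usages)
avg_tokens = round(total_tokens / len(usages), 1)
print(total_tokens, avg_tokens)  # 913 456.5
```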

Tool Failure Rate (BigQuery)

SELECT
  JSON_EXTRACT_SCALAR(content, "$.tool") as tool_name,
  COUNTIF(event_type = "TOOL_COMPLETED") as successes,
  COUNTIF(event_type = "TOOL_ERROR") as failures,
  COUNT(*) as total_calls,
  ROUND(100.0 * COUNTIF(event_type = "TOOL_ERROR") / NULLIF(COUNT(*), 0), 1) as failure_rate_pct
FROM `test-project-0728-467323.agent_analytics_demo.agent_events_v2`
WHERE event_type IN ("TOOL_COMPLETED", "TOOL_ERROR")
GROUP BY tool_name
ORDER BY total_calls DESC;

Result:

+---------------------+-----------+----------+-------------+------------------+
|      tool_name      | successes | failures | total_calls | failure_rate_pct |
+---------------------+-----------+----------+-------------+------------------+
| get_weather         |         4 |        0 |           4 |              0.0 |
| calculate_trip_cost |         2 |        0 |           2 |              0.0 |
| flaky_api_call      |         2 |        0 |           2 |              0.0 |
| analyze_sentiment   |         2 |        0 |           2 |              0.0 |
+---------------------+-----------+----------+-------------+------------------+
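The COUNTIF/NULLIF logic in the failure-rate query reduces to a per-tool tally. A hedged Python sketch over hypothetical events (the table above comes from the real demo data, not this sample):

```python
from collections import defaultdict

# Hypothetical tool events using the same two event types as the demo table.
events = [
    {"tool": "get_weather", "event_type": "TOOL_COMPLETED"},
    {"tool": "get_weather", "event_type": "TOOL_COMPLETED"},
    {"tool": "flaky_api_call", "event_type": "TOOL_ERROR"},
    {"tool": "flaky_api_call", "event_type": "TOOL_COMPLETED"},
]

stats = defaultdict(lambda: {"successes": 0, "failures": 0})
for e in events:
    key = "successes" if e["event_type"] == "TOOL_COMPLETED" else "failures"
    stats[e["tool"]][key] += 1

# NULLIF(COUNT(*), 0) guards the SQL against division by zero; here the dict
# only ever contains tools that appeared, so the denominator is >= 1.
rates = {tool: round(100.0 * s["failures"] / (s["successes"] + s["failures"]), 1)
         for tool, s in stats.items()}
print(rates)  # {'get_weather': 0.0, 'flaky_api_call': 50.0}
```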

LLM Latency Analysis (BigQuery)

SELECT
  ROUND(AVG(CAST(JSON_EXTRACT_SCALAR(latency_ms, "$.total_ms") AS FLOAT64)), 0) as avg_latency_ms,
  ROUND(AVG(CAST(JSON_EXTRACT_SCALAR(latency_ms, "$.time_to_first_token_ms") AS FLOAT64)), 0) as avg_ttft_ms,
  MIN(CAST(JSON_EXTRACT_SCALAR(latency_ms, "$.total_ms") AS INT64)) as min_latency_ms,
  MAX(CAST(JSON_EXTRACT_SCALAR(latency_ms, "$.total_ms") AS INT64)) as max_latency_ms
FROM `test-project-0728-467323.agent_analytics_demo.agent_events_v2`
WHERE event_type = "LLM_RESPONSE";

Result:

+----------------+-------------+----------------+----------------+
| avg_latency_ms | avg_ttft_ms | min_latency_ms | max_latency_ms |
+----------------+-------------+----------------+----------------+
|          749.0 |       749.0 |            493 |           1476 |
+----------------+-------------+----------------+----------------+

Session Analytics (BigQuery)

Result:

+------------------+--------------+-----------+------------+---------------+--------------+
|    session_id    | total_events | llm_calls | tool_calls | user_messages | duration_sec |
+------------------+--------------+-----------+------------+---------------+--------------+
| bq-demo-f9403d20 |           46 |         8 |          5 |             4 |            6 |
| bq-demo-273c5890 |           46 |         8 |          5 |             4 |            5 |
+------------------+--------------+-----------+------------+---------------+--------------+

Structured Tool Results (BigQuery)

Sample tool execution data showing full structured content:

{
  "tool_name": "get_weather",
  "full_content": {
    "result": {"city": "Tokyo", "condition": "Partly cloudy", "humidity": 65, "temp_c": 22},
    "tool": "get_weather"
  }
}

{
  "tool_name": "calculate_trip_cost",
  "full_content": {
    "result": {
      "city": "Tokyo",
      "daily_breakdown": {"food": 70, "hotel": 150, "transport": 40},
      "days": 5,
      "hotel_class": "mid-range",
      "total_estimate_usd": 1300
    },
    "tool": "calculate_trip_cost"
  }
}

Cloud Logging Equivalent?

For the same analytics in Cloud Logging, you would need to:

  1. Token Usage: Not available - OTel spans don't capture token counts
  2. Tool Failure Rate: Parse log text manually, export to BigQuery
  3. Latency: Available in Cloud Trace spans, but not aggregatable via log queries
  4. Session Analytics: Requires log export + external processing
  5. AI-Powered Analysis: High friction - requires Log Analytics setup, missing structured data

Quick Start

Option 1: BigQuery Agent Analytics

# Set environment variables
export PROJECT_ID="test-project-0728-467323"
export BQ_DATASET_ID="agent_analytics_demo"
export BQ_TABLE_ID="agent_events_v2"

# Run the demo
python run_with_bq_analytics.py

Option 2: Cloud Logging

# Install OTel dependencies
pip install opentelemetry-exporter-gcp-logging \
            opentelemetry-exporter-gcp-monitoring

# Set environment and run
export PROJECT_ID="test-project-0728-467323"
./run_with_cloud_logging.sh

Feature Comparison

| Feature | BigQuery Analytics | Cloud Logging (OTel) |
| --- | --- | --- |
| Latency | Near real-time (1-2 sec) | Near real-time (~1 sec) |
| Token Usage Tracking | Full support (7,307 tokens tracked) | Not available |
| Tool Failure Rates | SQL ready (4 tools, 10 calls) | Manual log parsing |
| Latency Metrics | TTFT + total (avg 749ms) | Trace spans only |
| AI-Powered Analytics | Full Gemini integration | Not available |
| LLM-as-Judge | Native SQL support | High friction (missing response content) |
| Jailbreak Detection | Automated with AI | High friction (user msgs often redacted) |
| Root Cause Analysis | AI-powered | Manual inspection |
| Multi-Modal Content | Full support (images, GCS) | Very limited (~256KB) |
| Query Language | SQL + AI functions | Log query (filters only) |
| Distributed Tracing | Trace IDs for correlation | Winner: Gantt chart visualization |

AI-Powered Analytics Deep Dive

1. LLM-as-Judge: Customer-Facing Evaluation

Automatically evaluate every agent response for quality:

-- Score responses on helpfulness, accuracy, clarity
SELECT
  session_id,
  AI.GENERATE_TEXT(
    MODEL `project.dataset.gemini_model`,
    CONCAT('Rate 1-10: helpfulness, accuracy, clarity. Response: ', response)
  ) AS evaluation
FROM agent_responses;

Use Cases:

  • Quality assurance at scale
  • A/B testing agent prompts
  • Identifying underperforming responses
  • Customer satisfaction prediction

2. Jailbreak & Safety Detection

Scan all user inputs for potential attacks:

-- Detect jailbreak attempts with risk scoring
SELECT
  user_id,
  AI.GENERATE_TEXT(
    MODEL `project.dataset.gemini_model`,
    CONCAT('Jailbreak attempt? Risk level? Message: ', user_message)
  ) AS safety_analysis
FROM user_messages;

Use Cases:

  • Real-time security monitoring
  • User risk profiling
  • Compliance auditing
  • Attack pattern analysis

3. Tool Failure Root Cause Analysis

Let AI diagnose tool failures:

-- Automatic root cause categorization
SELECT
  tool_name,
  AI.GENERATE_TEXT(
    MODEL `project.dataset.gemini_model`,
    CONCAT('Root cause? Fix suggestion? Error: ', error_message)
  ) AS diagnosis
FROM tool_errors;

Use Cases:

  • Automated incident triage
  • Failure pattern identification
  • Proactive maintenance
  • SLA tracking

4. Sentiment Analysis at Scale

Monitor agent tone across all interactions:

-- Track sentiment trends
SELECT
  DATE(timestamp),
  COUNTIF(sentiment = 'POSITIVE') AS positive,
  COUNTIF(sentiment = 'NEGATIVE') AS negative
FROM (
  SELECT timestamp, AI.GENERATE_TEXT(...) AS sentiment
  FROM responses
)
GROUP BY DATE(timestamp);

Use Cases:

  • Customer experience monitoring
  • Agent performance tracking
  • Escalation prediction
  • Brand protection

5. Memory & Profile Extraction

Build user profiles from conversation history:

-- Extract preferences and entities
SELECT
  user_id,
  AI.GENERATE_TEXT(
    MODEL `project.dataset.gemini_model`,
    CONCAT('Extract: preferences, entities, facts. Conversation: ', history)
  ) AS user_profile
FROM conversation_histories;

Use Cases:

  • Personalization
  • Long-term memory systems
  • User segmentation
  • Recommendation engines

Detailed Analysis: Use Case Comparison

1. Token Usage Analysis

BigQuery Analytics: Excellent

SELECT
  CAST(JSON_EXTRACT_SCALAR(content, "$.usage.total") AS INT64) as total_tokens,
  CAST(JSON_EXTRACT_SCALAR(content, "$.usage.prompt") AS INT64) as prompt_tokens,
  CAST(JSON_EXTRACT_SCALAR(content, "$.usage.completion") AS INT64) as response_tokens
FROM `project.dataset.agent_events_v2`
WHERE event_type = "LLM_RESPONSE";

Demo Output: 16 LLM calls, 7,307 total tokens, 456.7 avg per call

Cloud Logging: Not Supported - Token usage isn't captured in OTel spans.

Winner: BigQuery Analytics


2. Tool Failure Rate Analysis

BigQuery Analytics: Excellent + AI-Powered Root Cause

-- Basic failure rate
SELECT tool_name, COUNTIF(event_type = "TOOL_ERROR") / COUNT(*) AS failure_rate
FROM events GROUP BY tool_name;

-- AI-powered root cause (EXCLUSIVE to BigQuery)
SELECT tool_name, AI.GENERATE_TEXT(...) AS root_cause
FROM tool_errors;

Cloud Logging: Manual parsing only, no AI analysis

Winner: BigQuery Analytics


3. Distributed Tracing

BigQuery Analytics: Trace IDs for correlation

SELECT bq.event_type, bq.trace_id, bq.span_id
FROM `project.dataset.agent_events_v2` bq
WHERE trace_id IS NOT NULL;

Cloud Logging: Excellent - Full Cloud Trace visualization

Winner: Cloud Logging (for visualization only)


4. Real-time Debugging

Both approaches are near real-time:

| Aspect | BigQuery | Cloud Logging |
| --- | --- | --- |
| Write latency | 1-2 seconds | ~1 second |
| Query latency | Sub-second | Sub-second |
| Live tail | Via streaming | gcloud logging tail |

Verdict: TIE - Both suitable for operational monitoring


Pros and Cons Summary

BigQuery Agent Analytics Plugin

Pros:

  • Near real-time (1-2 sec, configurable)
  • AI-powered analytics (Gemini integration)
  • Full token usage tracking
  • Time-to-first-token (TTFT) metrics
  • Structured tool event data
  • Multi-modal content support
  • Large content via GCS offloading
  • Powerful SQL + AI functions
  • Cost-effective at scale
  • ML-ready data

Cons:

  • Additional setup (BQ dataset, IAM)
  • No trace visualization (trace IDs only)
  • BQ storage costs for high-volume

Cloud Logging via OpenTelemetry

Pros:

  • Simple setup (--otel_to_cloud)
  • Excellent distributed tracing (Cloud Trace)
  • Native alerting integration
  • OTel ecosystem compatibility

Cons:

  • No AI-powered analytics
  • No structured token usage
  • No TTFT metrics
  • Limited multi-modal support
  • No SQL aggregations
  • No jailbreak detection
  • No automated root cause analysis

Recommendation Matrix

| Use Case | Recommended | Why |
| --- | --- | --- |
| Token cost tracking | BigQuery | Structured token counts |
| Tool reliability | BigQuery | SQL + AI root cause |
| LLM-as-Judge evaluation | BigQuery | Only option |
| Jailbreak detection | BigQuery | Only option |
| Automated root cause | BigQuery | Only option |
| Sentiment analysis | BigQuery | Only option |
| Memory extraction | BigQuery | Only option |
| Multi-modal analytics | BigQuery | Full content + GCS |
| Distributed tracing | Cloud Logging | Visual timeline |
| Quick prototyping | Cloud Logging | Zero config |
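For the token-cost-tracking use case, the structured `usage.prompt` and `usage.completion` fields convert directly into spend estimates. A minimal sketch; the per-token prices below are placeholders, not real Gemini pricing, so substitute your model's current rates:

```python
# PLACEHOLDER prices, hypothetical USD per 1K tokens - not actual Gemini rates.
PROMPT_PRICE_PER_1K = 0.000125
COMPLETION_PRICE_PER_1K = 0.000375

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Convert the plugin's usage.prompt / usage.completion counts to USD."""
    return (prompt_tokens / 1000 * PROMPT_PRICE_PER_1K
            + completion_tokens / 1000 * COMPLETION_PRICE_PER_1K)

# Demo totals from the token-usage query: 7,010 prompt + 297 completion tokens.
cost = estimate_cost(7010, 297)
print(f"${cost:.6f}")
```

The same arithmetic works as a one-line SQL expression over `agent_events_v2` once the prices are parameterized.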

Hybrid Approach (Best of Both)

For production systems, use both approaches together:

from google.adk.apps import App
from google.adk.plugins.bigquery_agent_analytics_plugin import (
    BigQueryAgentAnalyticsPlugin,
    BigQueryLoggerConfig,
)

bq_plugin = BigQueryAgentAnalyticsPlugin(
    project_id="test-project-0728-467323",
    dataset_id="agent_analytics_demo",
    config=BigQueryLoggerConfig(
        enabled=True,
        batch_size=1,  # Near real-time writes
    ),
)

app = App(
    name="production_agent",
    root_agent=agent,
    plugins=[bq_plugin],
)

# Run with both: BQ plugin + Cloud Logging
# adk web ./agent_dir --otel_to_cloud

This provides:

  • AI-powered analytics via BigQuery + Gemini
  • Near real-time data in both systems
  • Trace visualization via Cloud Logging
  • Deep analytics via BigQuery SQL

Can Cloud Logging Achieve the Same Results?

Question: Can I export Cloud Logging to BigQuery and get the same analytics?

Answer: Partially, with 10x the effort and ~20% of the capabilities.

Option 3: Cloud Logging Export to BigQuery

We provide setup_cloud_logging_export.sh to demonstrate this approach:

# Run the setup script (creates sink, IAM, views, model)
export PROJECT_ID="test-project-0728-467323"
./setup_cloud_logging_export.sh

Effort Comparison

| Aspect | BigQuery Plugin | Cloud Logging Export |
| --- | --- | --- |
| Setup Code | 10 lines Python | 150+ lines shell script |
| Setup Time | 5 minutes | 30+ minutes |
| IAM Config | Auto (plugin handles) | Manual (sink SA, Vertex AI) |
| Schema | Structured, stable | Unstructured, fragile |
| Maintenance | None | Ongoing (log format changes) |
| Data Latency | 1-2 seconds | 1-5 minutes |
| Query Complexity | Simple JSON paths | Regex + nested JSON parsing |

Setup Code Comparison

BigQuery Agent Analytics Plugin (10 lines):

from google.adk.plugins.bigquery_agent_analytics_plugin import (
    BigQueryAgentAnalyticsPlugin,
    BigQueryLoggerConfig,
)

plugin = BigQueryAgentAnalyticsPlugin(
    project_id="my-project",
    dataset_id="agent_analytics",
    config=BigQueryLoggerConfig(enabled=True),
)
app = App(name="my_agent", root_agent=agent, plugins=[plugin])

Cloud Logging Export (150+ lines of shell):

# Step 1: Create BigQuery dataset
bq mk --dataset project:cloud_logging_export

# Step 2: Create Log Sink
gcloud logging sinks create adk-to-bq \
  bigquery.googleapis.com/projects/xxx/datasets/xxx \
  --log-filter='logName="projects/xxx/logs/adk-otel"'

# Step 3: Get sink service account
SINK_SA=$(gcloud logging sinks describe adk-to-bq --format='value(writerIdentity)')

# Step 4: Grant IAM permissions
gcloud projects add-iam-policy-binding $PROJECT \
  --member="$SINK_SA" --role="roles/bigquery.dataEditor"

# Step 5: Create normalized view (fragile, needs maintenance)
bq query "CREATE VIEW normalized_events AS SELECT ... complex JSON parsing ..."

# Step 6: Create Gemini connection
bq mk --connection --connection_type=CLOUD_RESOURCE gemini_conn

# Step 7: Grant Vertex AI permissions
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$CONN_SA" --role="roles/aiplatform.user"

# Step 8: Create remote model
bq query "CREATE MODEL gemini_model REMOTE WITH CONNECTION ..."

Real Data Comparison (Actual Demo Output)

BigQuery Plugin - LLM Response Event (ACTUAL DATA):

{
  "timestamp": "2026-01-06 09:36:07",
  "event_type": "LLM_RESPONSE",
  "agent": "telemetry_demo_agent",
  "session_id": "bq-demo-f9403d20",
  "user_id": "demo-user",
  "invocation_id": "e-68cf3690-f3af-4028-9073-1d44d4d87d23",
  "trace_id": "e-78f9ab79-96fa-426c-9c32-c21eff08460a",
  "span_id": "b5f79274-f288-4d0e-b9d9-ca301b00534f",
  "content": {
    "response": "The sentiment is positive.",
    "usage": {"completion": 17, "prompt": 556, "total": 573}
  },
  "latency_ms": {"time_to_first_token_ms": 550, "total_ms": 550},
  "status": "OK"
}

Cloud Logging Export - Same Event (STRUCTURE FROM OTEL SPEC):

{
  "timestamp": "2026-01-06T09:36:07Z",
  "logName": "projects/xxx/logs/adk-otel",
  "severity": "INFO",
  "trace": "projects/xxx/traces/xxx",
  "spanId": "xxx",
  "textPayload": "LLM response completed",
  "resource": {"type": "global"}
}
// NO token usage, NO session_id, NO user_id, NO latency, NO response content

BigQuery Plugin - Tool Event (ACTUAL DATA):

{
  "event_type": "TOOL_COMPLETED",
  "content": {
    "tool": "analyze_sentiment",
    "result": {
      "sentiment": "positive",
      "word_count": 6,
      "positive_indicators": 1
    }
  },
  "latency_ms": {"total_ms": 0}
}

{
  "event_type": "TOOL_STARTING",
  "content": {
    "tool": "analyze_sentiment",
    "args": {"text": "This trip was amazing and wonderful!"}
  }
}

Cloud Logging Export - Same Event:

// Tool arguments: NOT CAPTURED
// Tool results: NOT CAPTURED
// Tool name: Must parse from text (fragile)

BigQuery Plugin - User Message (ACTUAL DATA):

{
  "event_type": "USER_MESSAGE_RECEIVED",
  "session_id": "bq-demo-f9403d20",
  "user_id": "demo-user",
  "content": {
    "text_summary": "Analyze the sentiment of: This trip was amazing!"
  }
}

Cloud Logging Export - Same Event:

// User message content: NOT CAPTURED (PII protection in OTel)
// Session ID: NOT CAPTURED
// User ID: NOT CAPTURED

Data Completeness Comparison

| Data Field | BigQuery Plugin | Cloud Logging Export |
| --- | --- | --- |
| Token usage (prompt/completion) | `{"completion":17,"prompt":556,"total":573}` | Not in OTel logs (only in Trace spans) |
| Tool name | `"tool": "analyze_sentiment"` | Parse from text (fragile schema) |
| Tool arguments | `"args": {"text": "..."}` | In Trace spans, may be truncated (128KB limit) |
| Tool results | `"result": {"sentiment": "positive"}` | In Trace spans, may be truncated |
| Session ID | `"session_id": "bq-demo-f9403d20"` | As conversation.id in Trace spans |
| User ID | `"user_id": "demo-user"` | Not captured in OTel |
| Invocation ID | `"invocation_id": "e-68cf..."` | Not captured in OTel |
| Latency (total_ms) | `"total_ms": 550` | In Trace spans (span duration) |
| Time-to-first-token | `"time_to_first_token_ms": 550` | Not captured in OTel |
| Response content | Full text captured | May be truncated (OTel attribute limits) |
| User message | `"text_summary": "..."` | Often redacted (Google's OTel instrumentation) |
| Event types | 10+ types (LLM_RESPONSE, TOOL_COMPLETED, etc.) | Severity only |
| Trace ID | Yes | Yes |
| Span ID | Yes | Yes |

Result: Cloud Logging export captures ~15-20% of the data that the BigQuery plugin captures.

Critical Architectural Limitation: OTel Spans vs Logs

IMPORTANT DISCOVERY: ADK emits telemetry data as OTel spans (traces), NOT as log records.

This has critical implications for the Cloud Logging approach:

┌─────────────────────────────────────────────────────────────────────────┐
│                    ADK Telemetry Architecture                           │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ADK Agent                                                              │
│      │                                                                  │
│      ▼                                                                  │
│  OTel Tracer (spans)  ──────►  Cloud Trace  ──────►  (Needs export)     │
│      │                              │                                   │
│      │                              └───► Telemetry API Required (403)  │
│      ▼                                                                  │
│  OTel Logger (logs)   ──────►  Cloud Logging  ──────►  Limited data     │
│      │                                                                  │
│      └──► ADK does NOT emit log records for agent events!               │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

What This Means:

  1. Cloud Logging Exporter (enable_cloud_logging=True) captures Python log records, NOT agent telemetry
  2. Cloud Trace Exporter (enable_cloud_tracing=True) captures agent spans BUT:
    • Requires Telemetry API enabled (403 error without it)
    • Goes to Cloud Trace, NOT Cloud Logging
    • Trace → BigQuery export is a separate, complex pipeline
  3. Disabling Cloud Trace = No agent telemetry captured at all

Demo Verification (Successful Run with Cloud Trace):

# Working configuration (requires Telemetry API enabled):
otel_hooks = get_gcp_exporters(
    enable_cloud_tracing=True,   # Requires Telemetry API
    enable_cloud_logging=True,   # Only captures Python logs, not spans
)
gcp_resource = get_gcp_resource(project_id=PROJECT_ID)
maybe_set_otel_providers([otel_hooks], otel_resource=gcp_resource)

What Cloud Trace Actually Captures (Verified):

Trace ID: 68cd2682f74e0142a76135faa95316c1

Span: call_llm
  gen_ai.usage.input_tokens: 284
  gen_ai.usage.output_tokens: 5
  gcp.vertex.agent.session_id: trace-capture-b219d27f

Span: execute_tool get_weather
  gen_ai.tool.name: get_weather
  gen_ai.tool.call.id: adk-3735b2d5-3b5b-4abd-987b-12351a194c31
  gcp.vertex.agent.tool_call_args: {"city": "Tokyo"}
  gcp.vertex.agent.tool_response: {"city": "Tokyo", "temp_c": 22, ...}

Cloud Trace captures more than expected:

  • ✅ Span names and hierarchy
  • ✅ Duration/latency (excellent for "where did the time go?" questions)
  • ✅ Token usage (input_tokens, output_tokens)
  • ✅ Tool name, arguments, and responses
  • ✅ Session ID (as conversation.id)
  • ✅ LLM requests and responses (full JSON)
  • ✅ Gantt chart visualization (Cloud Trace UI is excellent for debugging)

Important Constraint: OTel Attribute Size Limits

Cloud Trace and OTel have size limits that affect large payloads:

  • Total span size: ~128KB
  • Individual attributes: May be truncated if too large
  • Multi-modal content (images): Will NOT fit in span attributes

The BigQuery Plugin handles this by:

  • Using GCS for large payloads (automatic offloading)
  • Storing multi-modal content references
  • No truncation of response content
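The offloading decision itself is simple size gating: inline what fits the attribute budget, store a reference for what doesn't. A rough sketch of the pattern (not the plugin's actual implementation; `upload_to_gcs` is a hypothetical callable that returns a `gs://` URI):

```python
import json

SPAN_ATTRIBUTE_LIMIT = 128 * 1024  # approximate OTel/Cloud Trace budget, bytes

def store_payload(payload: dict, upload_to_gcs) -> dict:
    """Inline small payloads; offload oversized ones and keep a GCS reference."""
    raw = json.dumps(payload)
    if len(raw.encode("utf-8")) <= SPAN_ATTRIBUTE_LIMIT:
        return {"inline": raw}
    return {"gcs_uri": upload_to_gcs(raw)}

# A small tool result stays inline; a base64 image blows past the budget.
small = store_payload({"result": "ok"}, lambda raw: "gs://bucket/obj")
big = store_payload({"image_b64": "x" * 200_000}, lambda raw: "gs://bucket/obj")
print(sorted(small), sorted(big))  # ['inline'] ['gcs_uri']
```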

What's Still Missing vs BigQuery Plugin:

  • ❌ Structured event types (LLM_RESPONSE, TOOL_COMPLETED, etc.)
  • ❌ User ID tracking (not in OTel instrumentation)
  • ❌ Invocation ID tracking
  • ❌ Time-to-first-token (TTFT) metrics
  • ❌ SQL query interface (must use Trace API or build ETL)
  • ❌ Native AI analytics (requires ETL to BigQuery first)
  • ❌ Schema stability (span attribute names may change)

To get agent telemetry via Cloud Trace → BigQuery, you would need:

  1. Enable Telemetry API (telemetry.googleapis.com) ✓ Done
  2. Enable Cloud Trace export (enable_cloud_tracing=True) ✓ Done
  3. Configure GCP resource with project_id ✓ Done
  4. Create a Cloud Trace → BigQuery export pipeline (separate from log export)
  5. Parse span attributes from trace data (complex JSON structure)

This is even more complex than the log export approach documented above.
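To illustrate step 5, the core of such an ETL is flattening span attributes into queryable rows. A minimal sketch over a dict shaped like the verified span attributes shown earlier; the Trace API fetch and BigQuery load are omitted, and the attribute names are the ones observed in the demo:

```python
def span_to_row(span: dict) -> dict:
    """Flatten one trace span's attributes into a flat BigQuery-style row."""
    attrs = span.get("attributes", {})
    return {
        "span_name": span.get("name"),
        # Trace attribute values arrive as strings and need casting.
        "input_tokens": (int(attrs["gen_ai.usage.input_tokens"])
                         if "gen_ai.usage.input_tokens" in attrs else None),
        "output_tokens": (int(attrs["gen_ai.usage.output_tokens"])
                          if "gen_ai.usage.output_tokens" in attrs else None),
        "session_id": attrs.get("gcp.vertex.agent.session_id"),
        "tool_name": attrs.get("gen_ai.tool.name"),
    }

row = span_to_row({
    "name": "call_llm",
    "attributes": {
        "gen_ai.usage.input_tokens": "284",
        "gen_ai.usage.output_tokens": "5",
        "gcp.vertex.agent.session_id": "trace-capture-b219d27f",
    },
})
print(row["span_name"], row["input_tokens"], row["tool_name"])
```

Every attribute this function reads is an undocumented span detail that can change between ADK releases, which is exactly the schema-stability risk the plugin avoids.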

Additional Complexity: API & Setup Requirements

BigQuery Plugin - Just works with 10 lines of Python:

  • BigQuery API (usually already enabled)
  • No additional configuration needed

Cloud Trace via OTel - Requires multiple steps (actual setup performed):

# 1. Enable Telemetry API (required for Cloud Trace export)
gcloud services enable telemetry.googleapis.com --project=test-project-0728-467323

# 2. Configure OTel with GCP resource (required for project_id attribute)
gcp_resource = get_gcp_resource(project_id=PROJECT_ID)
maybe_set_otel_providers([otel_hooks], otel_resource=gcp_resource)

Full list of requirements:

  • Cloud Logging API enabled
  • Telemetry API enabled (telemetry.googleapis.com) - 403 error without this
  • OTel providers configured with GCP resource - 400 error without project_id
  • Environment variables set (OTEL_SERVICE_NAME, etc.)
  • Authentication and IAM permissions configured

Cloud Logging to BigQuery Export - Even more setup:

  • All of the above, plus:
  • Log Sink creation with correct filter
  • IAM binding for sink service account
  • BigQuery dataset and table management
  • View creation for data normalization (fragile, needs maintenance)
  • 1-5 minute export delay (vs 1-2 seconds for BQ plugin)

This demonstrates the operational complexity difference.

AI Analytics Capability Comparison

| AI Capability | BigQuery Plugin | Cloud Logging Export |
| --- | --- | --- |
| LLM-as-Judge | Full (has responses) | High friction (responses may be truncated) |
| Jailbreak Detection | Full (has user msgs) | High friction (msgs often redacted) |
| Root Cause Analysis | Full (has tool data) | Possible (error text available) |
| Sentiment Analysis | Full (has responses) | High friction (responses may be truncated) |
| Memory Extraction | Full (has content) | High friction (content often incomplete) |
| Anomaly Detection | Full metrics | Limited (counts only) |

Query Complexity Comparison

Token Usage - BigQuery Plugin (simple):

SELECT JSON_EXTRACT_SCALAR(content, '$.usage.total') AS tokens
FROM agent_events_v2 WHERE event_type = 'LLM_RESPONSE';

Token Usage - Cloud Logging Export:

-- Token data not in OTel logs (only in Trace spans, requires ETL to BigQuery)

Tool Failure Rate - BigQuery Plugin (simple):

SELECT tool_name,
  COUNTIF(event_type = 'TOOL_COMPLETED') AS success,
  COUNTIF(event_type = 'TOOL_ERROR') AS failure
FROM agent_events_v2 GROUP BY tool_name;

Tool Failure Rate - Cloud Logging Export (complex, incomplete):

-- Can only count errors, not successes
SELECT
  REGEXP_EXTRACT(textPayload, r'tool[:\s]+(\w+)') AS maybe_tool,  -- Fragile!
  COUNT(*) AS error_count
FROM exported_logs
WHERE severity = 'ERROR'
GROUP BY maybe_tool;
-- Missing: success count, actual tool name, failure rate calculation

When to Use Cloud Logging Export

Cloud Logging export to BigQuery is useful for:

  • Basic error monitoring and alerting
  • Log volume tracking
  • Compliance/audit log retention
  • Trace correlation (with Cloud Trace data)

It is NOT suitable for:

  • Token usage tracking (data not available)
  • Tool performance analytics (data not available)
  • AI-powered analysis (content not available)
  • Session analytics (data not available)

Bottom Line

| Approach | Setup Effort | Data Coverage | AI Capabilities |
| --- | --- | --- | --- |
| BigQuery Plugin | Low (5 min) | 100% | Full |
| Cloud Logging Export | High (30+ min) | ~20% | ~10% |

Recommendation: Use the BigQuery Agent Analytics Plugin for agent analytics. Use Cloud Logging for what it's designed for: operational monitoring and trace visualization.


Option 4: Cloud Trace → BigQuery (Full Pipeline)

Since Cloud Trace captures more data than Cloud Logging, we also provide a complete Cloud Trace to BigQuery export pipeline. This demonstrates the FULL effort required to achieve analytics comparable to the BigQuery plugin's.

### Setup Steps (Actually Executed)

```bash
# Step 1: Enable required APIs
gcloud services enable telemetry.googleapis.com --project=test-project-0728-467323
gcloud services enable cloudtrace.googleapis.com --project=test-project-0728-467323
gcloud services enable bigquery.googleapis.com --project=test-project-0728-467323

# Step 2: Run the setup script (creates dataset, table, model)
./setup_trace_to_bigquery.sh

# Step 3: Run the agent to generate traces
python run_cloud_logging_demo.py

# Step 4: Run the ETL to export traces to BigQuery (MANUAL!)
python export_traces_to_bigquery.py --trace-ids="68cd2682f74e0142a76135faa95316c1"

# Step 5: Query the exported data
bq query "SELECT * FROM cloud_trace_export.agent_traces"
```
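The core of the ETL script is a transformation from Cloud Trace's span representation into flat BigQuery rows. A simplified, hedged sketch of that step (the span dict shape and label keys below are illustrative; the real Cloud Trace API returns protobuf objects with a similar labels structure):

```python
def span_to_row(trace_id: str, span: dict) -> dict:
    """Flatten one trace span into a BigQuery row (illustrative shape)."""
    labels = span.get("labels", {})
    return {
        "trace_id": trace_id,
        "span_id": span.get("spanId"),
        "span_name": span.get("name"),
        "duration_ms": span.get("duration_ms"),
        # Agent-specific fields must be dug out of string labels by hand;
        # this is exactly the mapping the plugin maintains for you.
        "tool_name": labels.get("tool.name"),
        "input_tokens": int(labels["llm.input_tokens"]) if "llm.input_tokens" in labels else None,
        "output_tokens": int(labels["llm.output_tokens"]) if "llm.output_tokens" in labels else None,
    }

# Illustrative span, mirroring the trace exported above
span = {
    "spanId": "a1b2",
    "name": "call_llm",
    "duration_ms": 412.7,
    "labels": {"llm.input_tokens": "590", "llm.output_tokens": "28"},
}
row = span_to_row("68cd2682f74e0142a76135faa95316c1", span)
print(row["input_tokens"], row["output_tokens"])  # 590 28
```

Every one of these field mappings is code you own and must keep in sync with whatever attribute names the OTel instrumentation emits.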

### Actual Export Output

```
============================================================
Cloud Trace to BigQuery Export
============================================================
Project: test-project-0728-467323
Target:  cloud_trace_export.agent_traces
============================================================

NOTE: This manual ETL is required because Cloud Trace
      does NOT have native BigQuery export.

      The BigQuery Agent Analytics Plugin does this
      AUTOMATICALLY with zero additional code.

============================================================
Exporting 2 traces to BigQuery...
  [1/2] Fetching trace: 68cd2682f74e0142a76135faa95316c1
  [2/2] Fetching trace: ec0edc35e54e28a3891eb33f6da42a97
  Found 6 spans, loading to BigQuery...
  Successfully exported 6 spans to test-project-0728-467323.cloud_trace_export.agent_traces
```

### Query Results (Actual Data)

**Token Usage (Cloud Trace Export):**

```sql
SELECT
  COUNT(*) AS total_llm_calls,
  SUM(input_tokens) AS total_input_tokens,
  SUM(output_tokens) AS total_output_tokens
FROM `test-project-0728-467323.cloud_trace_export.agent_traces`
WHERE input_tokens IS NOT NULL;
```

```
+-----------------+--------------------+---------------------+
| total_llm_calls | total_input_tokens | total_output_tokens |
+-----------------+--------------------+---------------------+
|               2 |                590 |                  28 |
+-----------------+--------------------+---------------------+
```
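With token counts in a table, cost estimation is straightforward arithmetic. A hedged Python sketch using the counts above; the per-1K-token prices are placeholders, not real Gemini pricing:

```python
# Placeholder prices in USD per 1K tokens; substitute your model's actual rates
PRICE_PER_1K_INPUT = 0.000125
PRICE_PER_1K_OUTPUT = 0.000375

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate spend for one batch of LLM calls."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# Counts from the query above: 590 input tokens, 28 output tokens
cost = estimate_cost(590, 28)
print(f"${cost:.6f}")  # $0.000084
```

The same arithmetic can of course be pushed into the SQL itself once prices are stored in a lookup table.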

**Tool Details (Cloud Trace Export):**

```sql
SELECT tool_name, tool_args, tool_response, duration_ms
FROM `test-project-0728-467323.cloud_trace_export.agent_traces`
WHERE tool_name IS NOT NULL;
```

```
+-------------+-------------------+--------------------------------------------------+-------------+
|  tool_name  |     tool_args     |                  tool_response                   | duration_ms |
+-------------+-------------------+--------------------------------------------------+-------------+
| get_weather | {"city": "Tokyo"} | {"city": "Tokyo", "temp_c": 22, "condition":...} |       0.443 |
+-------------+-------------------+--------------------------------------------------+-------------+
```

### Complete Effort Comparison (All Three Approaches)

| Aspect | BigQuery Plugin | Cloud Trace Export | Cloud Logging Export |
|--------|-----------------|--------------------|----------------------|
| Setup Code | 10 lines Python | 150 lines shell + 200 lines Python ETL | 150 lines shell |
| Setup Time | 5 minutes | 45+ minutes | 30+ minutes |
| APIs Required | BigQuery (usually enabled) | Telemetry API + BigQuery | Logging API + BigQuery |
| Ongoing ETL | None (automatic) | Manual script (cron job) | Automatic (sink) |
| Data Latency | 1-2 seconds | Manual trigger | 1-5 minutes |
| Token Usage | ✅ Native fields | ✅ After ETL | ❌ Not captured |
| Tool Args/Results | ✅ Native fields | ✅ After ETL | ❌ Not captured |
| Session ID | ✅ Native field | ✅ After ETL | ❌ Not captured |
| User ID | ✅ Native field | ❌ Not available | ❌ Not captured |
| Event Types | ✅ 10+ types | ❌ Span names only | ❌ Severity only |
| AI Analytics | ✅ With Gemini | ✅ With Gemini | ⚠️ Limited (content missing) |
| Real-time | ✅ Yes | ❌ No (manual ETL) | ⚠️ Delayed |

### Files Created for Cloud Trace Export

| File | Lines | Description |
|------|-------|-------------|
| `setup_trace_to_bigquery.sh` | ~150 | Setup script for dataset, table, model |
| `export_traces_to_bigquery.py` | ~200 | ETL script to fetch and load traces |
| `trace_export_queries.sql` | ~150 | SQL queries for exported trace data |
| **Total** | **~500 lines** | Additional code just for export |

### The Fundamental Problem

Even with the Cloud Trace → BigQuery pipeline working:

1. **No native export:** Cloud Trace has NO built-in BigQuery export
2. **Manual ETL required:** `export_traces_to_bigquery.py` must be run periodically
3. **Not real-time:** Data only appears after the ETL runs
4. **Missing fields:** No `user_id`, no event types, no `invocation_id`
5. **Maintenance burden:** The ETL script must be scheduled and monitored

**The BigQuery Agent Analytics Plugin eliminates ALL of this complexity.**
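If you adopt the manual pipeline anyway, the ETL script has to be scheduled yourself, for example via cron. An illustrative crontab entry (the path is a placeholder, and note the caveat in the comment):

```
# crontab -e: run the trace export every 15 minutes.
# The demo script takes explicit --trace-ids, so a real scheduled job would
# also need logic to discover the trace IDs created since the last run.
*/15 * * * * cd /path/to/demo && python export_traces_to_bigquery.py >> /var/log/trace_export.log 2>&1
```

This cron job, its log rotation, and its failure alerting are all extra operational surface that the plugin's automatic streaming avoids.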


## Files in This Demo

| File | Description |
|------|-------------|
| **Agent & Runners** | |
| `agent.py` | Shared agent with 4 tools |
| `run_with_bq_analytics.py` | Python runner with BQ plugin |
| `run_cloud_logging_demo.py` | Python runner for Cloud Trace/Logging demo |
| **BigQuery Plugin** | |
| `bq_analysis_queries.sql` | 30+ SQL queries for BQ plugin data |
| `bq_ai_powered_analytics.sql` | AI-powered queries (Gemini) |
| **Cloud Trace Export** | |
| `setup_trace_to_bigquery.sh` | Setup script for trace export pipeline |
| `export_traces_to_bigquery.py` | ETL script to export traces to BigQuery |
| `trace_export_queries.sql` | SQL queries for exported trace data |
| **Cloud Logging Export** | |
| `setup_cloud_logging_export.sh` | Cloud Logging to BQ export setup |
| `cloud_logging_export_queries.sql` | Queries for exported logs (limited) |
| **Documentation** | |
| `README.md` | This comparison document |

## Conclusion

| Approach | Best For | Setup Effort | Data Coverage | AI Capabilities |
|----------|----------|--------------|---------------|-----------------|
| BigQuery Plugin | Analytical intelligence, AI insights | 10 lines Python | 100% | Native |
| Cloud Trace | Operational debugging, latency analysis | Enable API | ~80% | Via ETL only |
| Cloud Trace → BQ | Hybrid: viz + analytics | 500+ lines code | ~80% | Native (after ETL) |
| Cloud Logging → BQ | Basic error monitoring | 150 lines shell | ~20% | High friction |

**The Right Tool for the Right Job:**

- "Debug this hanging call" → Cloud Trace (Gantt charts are unbeatable)
- "What's our token spend trend?" → BigQuery Plugin (structured data, SQL)
- "Evaluate response quality at scale" → BigQuery Plugin (native AI)
- "Where did the 5 seconds go?" → Cloud Trace (span timeline)
- "Monitor agent performance" → BigQuery Plugin (aggregations, dashboards)

### Summary: Trade-offs Analysis

| Criterion | BigQuery Plugin | Cloud Trace | Winner | Notes |
|-----------|-----------------|-------------|--------|-------|
| Setup complexity | 10 lines Python | Enable APIs | BigQuery | Cloud Trace needs Telemetry API |
| Schema stability | Versioned contract | OTel attribute names | BigQuery | Key advantage for production |
| Token tracking | ✅ Native fields | ✅ In spans | Tie | Both capture; BQ is easier to query |
| Large payloads | ✅ GCS offload | ⚠️ 128KB limit | BigQuery | Critical for multi-modal |
| AI analytics | ✅ Native | ⚠️ After ETL | BigQuery | Friction vs native |
| Trace visualization | IDs only | ✅ Gantt charts | Cloud Trace | Cloud Trace wins here |
| Analytical queries | ✅ SQL | REST API | BigQuery | BigQuery for aggregations |
| Live debugging | Good | ✅ Excellent | Cloud Trace | Live Tail is powerful |
| User ID tracking | ✅ Native | ❌ Not captured | BigQuery | Important for user analytics |
| Maintenance | ✅ None | ✅ None | Tie | The ETL approach needs cron |

**Key Takeaways:**

1. **Schema stability is the killer feature.** The BigQuery Plugin provides an agent-aware, documented contract. Cloud Logging/Trace gives you raw OTel blobs where field locations may change without notice.
2. **"High friction" ≠ "impossible".** Cloud Logging can technically do AI analytics via Log Analytics + BigQuery, but the missing/truncated data makes it impractical for LLM-as-Judge or sentiment analysis.
3. **Cloud Trace wins for debugging.** The Gantt chart visualization is unmatched for answering "where did the latency come from?" Don't dismiss Cloud Trace entirely.
4. **OTel size limits matter.** The 128KB span limit means multi-modal content and large responses may be truncated. The BigQuery Plugin handles this with GCS offloading.
5. **Use both in production.** BigQuery Plugin for analytics + AI insights, Cloud Trace for operational debugging. They're complementary, not competing.