Event Callback Flow in Live Assistant

Architecture Overview

flowchart TB
    subgraph "Server Side (Python/FastAPI)"
        S1[Gemini Live API]
        S2[FastAPI WebSocket Handler]
        S3[runner.run_live Generator]

        S1 -->|Audio Events| S3
        S3 -->|event.model_dump_json| S2
        S2 -->|JSON over WebSocket| WS
    end

    subgraph "Client Side (React/TypeScript)"
        WS[WebSocket Connection]
        ADK[ADKStreamingClient]
        CB[onEvent Callback]
        UI[React Component]
        AP[AudioPlayer]

        WS -->|MessageEvent| ADK
        ADK -->|Parse JSON| ADK
        ADK -->|Call callback| CB
        CB -->|Process event| UI
        CB -->|Audio data| AP
    end

    style S1 fill:#e1f5ff
    style WS fill:#fff3e0
    style CB fill:#f3e5f5
    style AP fill:#e8f5e9

Detailed Flow: Step-by-Step

sequenceDiagram
    participant Server as FastAPI Server
    participant WS as WebSocket
    participant Client as ADKStreamingClient
    participant Callback as onEvent Callback
    participant React as React Component
    participant Player as AudioPlayer

    Note over React: 1. Component Mount (useEffect)
    React->>Client: new ADKStreamingClient(config)
    React->>Client: client.onEvent(callbackFunction)
    Note over Client: Store callback in<br/>this.onEventCallback

    Note over React: 2. User Clicks Connect
    React->>Client: client.connect()
    Client->>WS: new WebSocket(url)
    WS->>Server: WebSocket Connection

    Note over Server: 3. Server Sends Event
    Server->>Server: event.model_dump_json()
    Server->>WS: JSON text message

    Note over WS: 4. WebSocket Receives Message
    WS->>Client: ws.onmessage(MessageEvent)

    Note over Client: 5. Parse & Trigger Callback
    Client->>Client: JSON.parse(event.data)
    Client->>Callback: this.onEventCallback?.(parsedData)

    Note over Callback: 6. Callback Execution
    Callback->>React: setEvents(add to log)

    alt Event has audio content
        Callback->>Callback: Check event.content.parts
        Callback->>Callback: Find audio inlineData
        Callback->>Player: audioPlayer.playAudio(base64)
        Player->>Player: Decode & Queue Audio
    end

    alt Event has text content
        Callback->>React: setCurrentTranscript(text)
    end

    alt Event has transcription
        Callback->>React: setCurrentTranscript(transcription)
    end

Component Lifecycle

flowchart TD
    A[Component Mount] --> B[useEffect Runs]
    B --> C[Create AudioPlayer Instance]
    B --> D[Create ADKStreamingClient Instance]
    D --> E[Register onEvent Callback]

    E --> F{User Action?}

    F -->|Connect Button| G[client.connect]
    G --> H[WebSocket Opens]
    H --> I[WebSocket.onmessage Ready]

    F -->|Disconnect Button| J[client.disconnect]
    J --> K[Cleanup Resources]

    I --> L[Server Sends Events]
    L --> M[ws.onmessage Triggered]
    M --> N[Parse JSON]
    N --> O[Call Registered Callback]

    O --> P[Process Event in Component]
    P --> Q{Event Type?}

    Q -->|Audio| R[audioPlayer.playAudio]
    Q -->|Text| S[Update Transcript State]
    Q -->|Transcription| T[Update UI State]

    style A fill:#e1f5ff
    style E fill:#f3e5f5
    style O fill:#fff3e0
    style R fill:#e8f5e9

Callback Registration & Execution

flowchart LR
    subgraph "1. Registration Phase"
        R1[React Component]
        R2[client.onEvent]
        R3[Store in onEventCallback]

        R1 -->|Passes function| R2
        R2 -->|Saves reference| R3
    end

    subgraph "2. WebSocket Setup"
        W1[ws.onmessage = handler]
        W2[Handler calls onEventCallback]

        W1 --> W2
    end

    subgraph "3. Execution Phase"
        E1[Server sends data]
        E2[ws.onmessage fires]
        E3[JSON.parse data]
        E4[onEventCallback?. data]
        E5[React callback executes]

        E1 --> E2
        E2 --> E3
        E3 --> E4
        E4 --> E5
    end

    R3 -.Link.- W2
    W2 -.Same function.- E4

Data Flow: Audio Event Example

flowchart TD
    subgraph Server
        S1[Gemini Returns Audio]
        S2[Blob: data=bytes<br/>mime_type='audio/pcm;rate=24000']
        S3[model_dump_json]
        S4[Base64 Encode Bytes]
        S5["JSON: {content: {parts: [{inlineData: {data: 'DgASABAA...', mimeType: 'audio/pcm;rate=24000'}}]}}"]

        S1 --> S2
        S2 --> S3
        S3 --> S4
        S4 --> S5
    end

    subgraph WebSocket
        WS1[Send Text Message]
        WS2[Receive in Browser]
    end

    subgraph "Client: ADKStreamingClient"
        C1[ws.onmessage]
        C2[JSON.parse event.data]
        C3[this.onEventCallback data]
    end

    subgraph "Client: React Callback"
        CB1[Receives: event object]
        CB2[Check: event.content?.parts]
        CB3[Find: part.inlineData?.mimeType]
        CB4[Extract: part.inlineData.data]
        CB5[Call: audioPlayer.playAudio base64]
    end

    subgraph "AudioPlayer"
        AP1[base64ToInt16Array]
        AP2[int16ToFloat32]
        AP3[createBuffer]
        AP4[Queue & Play]
    end

    S5 --> WS1
    WS1 --> WS2
    WS2 --> C1
    C1 --> C2
    C2 --> C3
    C3 --> CB1
    CB1 --> CB2
    CB2 --> CB3
    CB3 --> CB4
    CB4 --> CB5
    CB5 --> AP1
    AP1 --> AP2
    AP2 --> AP3
    AP3 --> AP4

    style S5 fill:#e1f5ff
    style C3 fill:#f3e5f5
    style CB5 fill:#fff3e0
    style AP4 fill:#e8f5e9
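
The first two AudioPlayer stages in the diagram can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function names are taken from the diagram, but the bodies are assumptions, and the code assumes the payload is 16-bit little-endian PCM, which the source does not state explicitly.

```typescript
// Sketch only: assumes the payload is 16-bit little-endian PCM.
function base64ToInt16Array(base64: string): Int16Array {
  const binary = atob(base64); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  const view = new DataView(bytes.buffer);
  const samples = new Int16Array(Math.floor(bytes.length / 2));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getInt16(i * 2, true); // true = little-endian
  }
  return samples;
}

// Scale signed 16-bit samples into the [-1, 1) range used by Web Audio.
function int16ToFloat32(samples: Int16Array): Float32Array {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    out[i] = samples[i] / 32768;
  }
  return out;
}

const pcm = base64ToInt16Array("DgASABAA");
const floats = int16ToFloat32(pcm);
console.log(Array.from(pcm)); // three samples: 14, 18, 16
```

In a browser, the resulting Float32Array would typically be copied into a buffer from AudioContext.createBuffer(1, floats.length, 24000) using copyToChannel; the 24000 here comes from the rate in the mime type shown in the diagram.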

Code Mapping

1. Callback Registration (Line ~467 in page.tsx)

clientRef.current.onEvent((event: ADKEvent) => {
  // This function is stored in ADKStreamingClient.onEventCallback
  setEvents((prev) => [
    ...prev,
    { type: "received", data: event, time: new Date() },
  ]);

  if (event.content?.parts) {
    // Process audio, text, etc.
  }
});

What happens:

  • React passes a function to client.onEvent()
  • ADKStreamingClient stores it in this.onEventCallback
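
The registration step can be sketched in isolation. MiniClient below is a hypothetical stand-in for ADKStreamingClient that models only the callback slot; the deliver method is invented for illustration and takes the place of the WebSocket message path. It assumes a single callback slot, as the name onEventCallback suggests.

```typescript
// Hypothetical stand-in for ADKStreamingClient; only the callback slot is modeled.
type EventCallback<T> = (payload: T) => void;

class MiniClient<T> {
  private onEventCallback: EventCallback<T> | null = null;

  // Registration: keep a reference to the function, do not call it yet.
  onEvent(callback: EventCallback<T>): void {
    this.onEventCallback = callback;
  }

  // Stand-in for the point where a parsed message arrives.
  deliver(payload: T): void {
    this.onEventCallback?.(payload);
  }
}

const client = new MiniClient<{ id: number }>();
const seen: number[] = [];

client.deliver({ id: 0 }); // no callback registered yet: dropped
client.onEvent((e) => seen.push(e.id));
client.deliver({ id: 1 });
client.onEvent((e) => seen.push(e.id * 10)); // replaces the first callback
client.deliver({ id: 2 });

console.log(seen); // 1 and 20
```

With a single slot, a second onEvent call replaces the first callback rather than adding to it, and events that arrive before registration are lost.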

2. WebSocket Message Handler (Line ~315 in page.tsx)

this.ws.onmessage = (event: MessageEvent) => {
  try {
    const data = JSON.parse(event.data); // Parse JSON from server
    this.onEventCallback?.(data); // Call the registered callback!
  } catch (error) {
    console.error("Failed to parse message:", error);
    this.onErrorCallback?.(error as Error);
  }
};

What happens:

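  • The WebSocket receives a text message from the server
  • The handler parses the JSON string into a JavaScript object
  • It then calls the callback that was registered earlier via onEvent()
  • The ?. operator skips the call when no callback is registered; any error thrown inside the try block is passed to onErrorCallback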
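
Because the try block above wraps both JSON.parse and the callback, an exception thrown by the callback is also reported as "Failed to parse message". One possible variant, sketched here with a hypothetical handleRawMessage helper (not part of the source), limits the try block to parsing:

```typescript
type Handler = (data: unknown) => void;
type ErrorHandler = (error: Error) => void;

// Hypothetical helper: parse failures go to onError, callback errors propagate.
function handleRawMessage(raw: string, onEvent?: Handler, onError?: ErrorHandler): void {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (error) {
    onError?.(error instanceof Error ? error : new Error(String(error)));
    return;
  }
  onEvent?.(data); // outside the try: not mislabeled as a parse failure
}

const received: unknown[] = [];
const errorNames: string[] = [];
const recordEvent: Handler = (d) => received.push(d);
const recordError: ErrorHandler = (e) => errorNames.push(e.name);

handleRawMessage('{"partial": true}', recordEvent, recordError);
handleRawMessage("not json", recordEvent, recordError);
handleRawMessage('{"ignored": 1}'); // no callbacks registered: nothing happens

console.log(received.length, errorNames); // one event, one SyntaxError
```

Whether callback errors should propagate or be routed to a separate error path is a design choice; the sketch only shows how to keep the two failure modes apart.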

3. Callback Execution (Line ~470 in page.tsx)

if (event.content?.parts) {
  for (const part of event.content.parts) {
    // Check for audio (keys arrive in camelCase: inlineData, mimeType)
    if (part.inlineData?.mimeType?.startsWith("audio/") && speakerEnabled) {
      const audioData = part.inlineData.data;
      if (audioData) {
        audioPlayerRef.current?.playAudio(audioData); // Play audio!
      }
    }

    // Check for text
    if (part.text) {
      setCurrentTranscript((prev) => prev + part.text); // Update UI!
    }
  }
}

What happens:

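  • The callback receives the parsed event object
  • It loops over event.content.parts and checks each part for audio (inlineData) and text
  • Audio is passed to AudioPlayer only when speakerEnabled is true
  • Text is appended to the transcript state, which triggers a re-render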
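
The same part-by-part checks can be written as a pure function, which makes the logic testable without React or an audio device. This is a sketch: the interfaces are inferred from the property paths in this document, and extractParts is a hypothetical name.

```typescript
// Minimal shapes inferred from the property paths used in this document.
interface InlineData { mimeType?: string; data?: string }
interface Part { text?: string; inlineData?: InlineData }
interface PartsEvent { content?: { parts?: Part[] } }
interface Extracted { audioChunks: string[]; text: string }

function extractParts(ev: PartsEvent, speakerEnabled: boolean): Extracted {
  const out: Extracted = { audioChunks: [], text: "" };
  for (const part of ev.content?.parts ?? []) {
    const inline = part.inlineData;
    // Audio: only collected when the speaker is enabled and data is present.
    if (speakerEnabled && inline?.mimeType?.startsWith("audio/") && inline.data) {
      out.audioChunks.push(inline.data);
    }
    // Text: concatenated in arrival order.
    if (part.text) {
      out.text += part.text;
    }
  }
  return out;
}

const sample: PartsEvent = {
  content: {
    parts: [
      { inlineData: { mimeType: "audio/pcm;rate=24000", data: "DgASABAA" } },
      { text: "Hello" },
      { inlineData: { mimeType: "image/png", data: "AAAA" } },
      { text: ", world" },
    ],
  },
};

const withSpeaker = extractParts(sample, true);
const muted = extractParts(sample, false);
console.log(withSpeaker.audioChunks.length, withSpeaker.text); // 1 Hello, world
console.log(muted.audioChunks.length); // 0
```

The callback would then play each entry in audioChunks and append text to the transcript, keeping side effects at the edge.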

Key Concepts

Optional Chaining (?.)

this.onEventCallback?.(data); // Only calls if onEventCallback is not null/undefined

This is equivalent to:

if (this.onEventCallback !== null && this.onEventCallback !== undefined) {
  this.onEventCallback(data);
}
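
One detail the expanded form makes easy to miss: an optional call short-circuits, so the argument expressions are not evaluated when the callback is missing. A small sketch (all names here are illustrative):

```typescript
type Callback = (payload: string) => void;

let buildCount = 0;
function buildPayload(): string {
  buildCount += 1; // lets us observe whether the argument was evaluated
  return "payload";
}

function notify(callback: Callback | null | undefined): void {
  callback?.(buildPayload());
}

const calls: string[] = [];

notify(undefined);
notify(null);
const countAfterMissing = buildCount; // still 0: buildPayload never ran

notify((p) => calls.push(p));

console.log(countAfterMissing, buildCount, calls); // 0, 1, and one "payload" entry
```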

Closure & Reference

The callback function "closes over" the React component's scope:

  • It has access to setEvents, setCurrentTranscript, and audioPlayerRef
  • That access persists even though the function is invoked from inside ADKStreamingClient
  • This is how code running in the client can update the React component's state
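
The closure behavior can be shown without React. In this sketch, Emitter plays the role of ADKStreamingClient and createTranscriptView plays the role of the component; both names are hypothetical.

```typescript
type Listener = (text: string) => void;

// Knows nothing about transcripts; it only holds and invokes a function.
class Emitter {
  private listener: Listener | null = null;

  register(listener: Listener): void {
    this.listener = listener;
  }

  emit(text: string): void {
    this.listener?.(text);
  }
}

// Plays the role of the component: owns state and hands out a closure over it.
function createTranscriptView(emitter: Emitter): () => string {
  let transcript = "";
  emitter.register((text) => {
    transcript += text; // captured variable from the enclosing scope
  });
  return () => transcript;
}

const emitter = new Emitter();
const readTranscript = createTranscriptView(emitter);
emitter.emit("Hello");
emitter.emit(" there");
console.log(readTranscript()); // Hello there
```

In React, state setters are stable and the functional form (prev) => ... always receives the latest state, which is why the callback above can safely use setEvents and setCurrentTranscript. A plain value such as speakerEnabled, by contrast, is captured as of the render that created the callback, so it may be stale unless the callback is re-registered or the value is read through a ref.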

Event Flow Summary

  1. Setup: Component registers callback → Stored in client
  2. Connection: WebSocket connects → Ready to receive
  3. Message: Server sends JSON → WebSocket receives
  4. Parse: JSON string → JavaScript object
  5. Callback: Stored function is called with parsed data
  6. Process: Callback updates state, plays audio, etc.
  7. Render: React re-renders with new state

Types of Events Handled

| Event Type | Property Path | Action |
|---|---|---|
| Audio | event.content.parts[].inlineData.data | Decode base64 → Play audio |
| Text | event.content.parts[].text | Append to transcript |
| Input Transcription | event.inputTranscription.text | Log user's speech |
| Output Transcription | event.outputTranscription.text | Update transcript display |
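
The table can also be expressed as a type plus a routing function. This is a sketch under assumptions: LiveEvent covers only the four paths listed above (the real ADKEvent type may carry more fields), and the Action names are invented for illustration.

```typescript
interface Transcription { text?: string }
interface LiveEvent {
  content?: {
    parts?: Array<{ text?: string; inlineData?: { mimeType?: string; data?: string } }>;
  };
  inputTranscription?: Transcription;
  outputTranscription?: Transcription;
}

// One variant per table row.
type Action =
  | { kind: "playAudio"; base64: string }
  | { kind: "appendTranscript"; text: string }
  | { kind: "logUserSpeech"; text: string }
  | { kind: "showTranscript"; text: string };

function toActions(ev: LiveEvent): Action[] {
  const actions: Action[] = [];
  for (const part of ev.content?.parts ?? []) {
    const data = part.inlineData?.data;
    if (data) actions.push({ kind: "playAudio", base64: data });
    if (part.text) actions.push({ kind: "appendTranscript", text: part.text });
  }
  const spoken = ev.inputTranscription?.text;
  if (spoken) actions.push({ kind: "logUserSpeech", text: spoken });
  const reply = ev.outputTranscription?.text;
  if (reply) actions.push({ kind: "showTranscript", text: reply });
  return actions;
}

const parsed: LiveEvent = JSON.parse(
  '{"inputTranscription":{"text":"hi"},"content":{"parts":[{"text":"Hello"}]}}'
);
const kinds = toActions(parsed).map((a) => a.kind);
console.log(kinds); // appendTranscript, then logUserSpeech
```

A single event can carry several of these fields at once, so the function returns a list rather than a single action.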