feat(google): add RealtimeSession.sendText() for realtime text input #1724
Open
tianqiwuben wants to merge 1 commit into
generateReply() throws on gemini-3.1 live models because they don't support
mid-session client content updates (midSessionChatCtxUpdate is false), leaving
no public way to inject a text turn programmatically. Add a sendText() method
that delivers the text via the Live API's realtime input
(send_realtime_input({ text })), which the model treats as a completed user
turn and responds to. Works across all live models. Under manual activity
detection the text is wrapped in activityStart/activityEnd so it forms a
complete turn, unless an activity is already open via startUserActivity().
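The event sequence described in this commit message can be sketched roughly as below. This is an illustration only, not the PR's code: `RealtimeInput`, `SketchSessionState`, and `buildSendTextEvents` are hypothetical names, and the event shapes are simplified assumptions. The pending-tool guard reflects the PR description's statement that sendText() honors the same guard as pushAudio.

```typescript
// Hypothetical, simplified event shapes; the plugin's real types differ.
type RealtimeInput =
  | { activityStart: Record<string, never> }
  | { activityEnd: Record<string, never> }
  | { text: string };

// Hypothetical snapshot of the session state that the decision depends on.
interface SketchSessionState {
  manualActivityDetection: boolean; // caller drives activity boundaries itself
  activityOpen: boolean; // true once startUserActivity() has opened an activity
  pendingBlockingTools: boolean; // guard that, per the PR, pushAudio also honors
}

// Returns the realtime-input events that a sendText(text) call would emit.
function buildSendTextEvents(state: SketchSessionState, text: string): RealtimeInput[] {
  // No-op while blocking tool calls are pending.
  if (state.pendingBlockingTools) return [];

  const textEvent: RealtimeInput = { text };

  // Under manual activity detection, wrap the text so it forms a complete
  // turn, unless the caller already opened an activity.
  if (state.manualActivityDetection && !state.activityOpen) {
    return [{ activityStart: {} }, textEvent, { activityEnd: {} }];
  }

  // Default path: a single text event.
  return [textEvent];
}
```

The three branches correspond to the three cases the PR's tests are said to cover: the default single-event path, the pending-blocking-tools no-op, and the manual-activity-detection wrapping.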
🦋 Changeset detected
Latest commit: 1981237
The changes in this PR will be included in the next version bump. This PR includes changesets to release 34 packages.
Description

Adds a public sendText(text) method to the Google plugin's realtime RealtimeSession, so callers can inject a text turn into a live Gemini session.

This fills a gap on gemini-3.1 live models: generateReply() throws on them because they don't support mid-session client content updates (midSessionChatCtxUpdate is false), and sendClientEvent is private, so there is currently no public way to programmatically send a text turn to a 3.1 live session.

sendText() delivers the text via the Live API's realtime input (send_realtime_input({ text })), which the model treats as a completed user turn and responds to. It works across all live models, not just 3.1.

Motivating use case: server-driven nudges (e.g. "the caller has been silent, check if they're still there") on a gemini-3.1-flash-live-preview voice agent, where generateReply() is unavailable.

Changes Made

- plugins/google/src/realtime/realtime_api.ts: add RealtimeSession.sendText(text: string). It honors the same pending-tool guard as pushAudio, and under manual activity detection wraps the text in activityStart/activityEnd so it forms a complete turn (unless an activity is already open via startUserActivity()).
- plugins/google/src/realtime/realtime_api.test.ts: tests for the default single-event path, the pending-blocking-tools no-op, and the manual-activity-detection wrapping.
- .changeset/: minor bump for @livekit/agents-plugin-google.

Pre-Review Checklist

- turbo run build (incl. tsc typecheck), plugin lint, and vitest pass locally

Testing

- vitest run plugins/google/src/realtime/realtime_api.test.ts → 10 passed
- restaurant_agent.ts / realtime_agent.ts: not exercised (additive method, no change to existing paths)

Additional Notes

Lint shows 3 pre-existing @typescript-eslint/no-explicit-any warnings in realtime_api.ts (lines ~1347/1376/1384) that are unrelated to this change. api:check also fails on a clean checkout (missing release tags across existing exports), so it's not affected by this PR.

Alternative considered: making generateReply() route through realtime input on 3.1. That would fix the standard AgentSession.generateReply({ userInput }) path, but instructions (a model-role steer) doesn't map cleanly onto user-side realtime input, so a dedicated sendText() is the smaller, clearer change. Happy to follow up on generateReply if preferred.
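For the server-driven nudge use case named in the Description, caller-side code might look something like the sketch below. It is a hedged illustration, not part of this PR: `SilenceNudger` and `TextCapableSession` are hypothetical names, the `void` return type of `sendText` is an assumption, and time is passed in explicitly so the logic stays deterministic.

```typescript
// Minimal surface this sketch needs. The PR describes sendText(text: string);
// the void return type here is an assumption.
interface TextCapableSession {
  sendText(text: string): void;
}

// Hypothetical helper: sends one text nudge after a period of user silence.
class SilenceNudger {
  private readonly session: TextCapableSession;
  private readonly silenceMs: number;
  private readonly prompt: string;
  private lastActivityMs: number;
  private nudged = false;

  constructor(session: TextCapableSession, silenceMs: number, prompt: string, nowMs: number) {
    this.session = session;
    this.silenceMs = silenceMs;
    this.prompt = prompt;
    this.lastActivityMs = nowMs;
  }

  // Call whenever the user is known to have spoken; re-arms the nudge.
  noteUserActivity(nowMs: number): void {
    this.lastActivityMs = nowMs;
    this.nudged = false;
  }

  // Call periodically; sends at most one nudge per silent stretch.
  // Returns true if a nudge was sent on this call.
  check(nowMs: number): boolean {
    if (this.nudged || nowMs - this.lastActivityMs < this.silenceMs) {
      return false;
    }
    this.session.sendText(this.prompt);
    this.nudged = true;
    return true;
  }
}
```

In an agent, `check()` could be driven from a timer such as `setInterval(() => nudger.check(Date.now()), 1000)`, with `noteUserActivity()` called from whatever event signals that the caller spoke. Keeping the nudge on `sendText()` rather than `generateReply()` matches the trade-off stated above: the text arrives as a user-side turn instead of a model-role instruction.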