Releases: AgoraIO/agora-agents-python
v2.1.1
v2.1.0
Added
- Turn detection language — AgentKit now manages Agora interaction language through `turn_detection.language`, validates it against the supported BCP-47 language list, and sends the default `en-US` when no language is provided.
- Provider parameter parity — ASR, LLM, MLLM, TTS, and avatar wrappers expose typed provider parameters plus passthrough fields where the generated core supports additional properties.
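
The validate-then-default behavior for `turn_detection.language` can be sketched roughly as follows. This is an illustration only, not the SDK's implementation: `SUPPORTED_LANGUAGES` is a hypothetical three-entry subset and `resolve_turn_detection_language` is a made-up helper name.

```python
from typing import Optional

# Hypothetical subset; the real supported-language list lives in the SDK.
SUPPORTED_LANGUAGES = frozenset({"en-US", "es-ES", "ja-JP"})
DEFAULT_LANGUAGE = "en-US"


def resolve_turn_detection_language(language: Optional[str] = None) -> str:
    """Return the value to emit as turn_detection.language."""
    if language is None:
        # No language supplied: fall back to the documented default.
        return DEFAULT_LANGUAGE
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported turn detection language: {language!r}")
    return language
```

Rejecting unknown tags early keeps an invalid language from reaching the request payload.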
Changed
- Generated core refresh — Regenerated core types from the v2.1 API schema.
- Deepgram TTS passthrough — `DeepgramTTS` now uses `additional_params` for passthrough fields and flattens them into `tts.params`; the removed nested `params.params` shape is no longer documented or emitted.
- OpenAI TTS — Docs and tests now reflect the generated core shape, including `instructions` and `speed` under `tts.params`.
- TTS provider docs — Updated TTS provider reference tables to match implemented wrapper fields and generated core params.
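
As a rough picture of what "flattening" passthrough fields means, the sketch below merges `additional_params` into the same dict as the typed fields instead of nesting them under a second `params` key. The helper name, the field set, and the choice to let typed fields win on a key clash are assumptions for illustration, not the wrapper's confirmed behavior.

```python
from typing import Any, Dict, Optional


def build_tts_params(
    model: str,
    sample_rate: Optional[int] = None,
    additional_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a flat tts.params dict."""
    # Passthrough fields go in first so typed fields win on a clash (assumption).
    params: Dict[str, Any] = dict(additional_params or {})
    params["model"] = model
    if sample_rate is not None:
        params["sample_rate"] = sample_rate
    return params
```

For example, `build_tts_params("example-model", 24000, {"encoding": "linear16"})` yields one flat dict with no nested `params` entry.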
Fixed
- Managed-provider validation — AgentKit validation now distinguishes preset-backed providers from BYOK providers so required provider fields are only required when credentials are caller-supplied.
- Language placement — Provider-specific STT language values remain under `asr.params`, while Agora interaction language is emitted separately as `turn_detection.language`.
v2.0.0
Added
- Type aliases — `AsrConfig` (= `SttConfig`), `is_avatar_token_managed`, think type aliases (`ThinkOnListeningAction`, etc.), and think value constants.
- `XaiGrok` — New MLLM wrapper for xAI Grok (`mllm.vendor: "xai"`), including Realtime API URL, voice, language, sample rate, modalities, messages, and MLLM turn detection support.
- `GenericAvatar` — New generic avatar wrapper (`vendor: "generic"`) for custom avatar providers.
- Avatar token enrichment — `AgentSession.start()` now fills missing generic avatar `agora_appid` and `agora_channel` from the session and generates missing avatar `agora_token` values for HeyGen, LiveAvatar, and Generic avatars using each avatar's `agora_uid`.
- Turn pagination — `AgentSession.get_turns()` and `AsyncAgentSession.get_turns()` now accept `page_index` and `page_size`. New `get_all_turns()` helpers fetch and combine all pages.
- Greeting interruption control — LLM vendor `greeting_configs` now accepts the typed `LlmGreetingConfigs` shape, including v2.7 `interruptable`.
- Type alias parity — Added public aliases for v2.7 generated types such as `LlmConfig`, `TtsConfig`, `SttConfig`, `MllmConfig`, `AvatarConfig`, `AgentConfigUpdate`, `ConversationTurns`, `ConversationHistory`, `SessionInfo`, `Labels`, `SpeakPriority`, and `FillerWordsContentSelectionRule`.
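
The "fetch and combine all pages" idea behind `get_all_turns()` can be sketched with a generic page loop. This is not the SDK's code: `fetch_page` is a hypothetical callable, and the response shape (`turns` plus a `pagination` dict with `is_last_page`) and the 1-based page index are assumptions.

```python
from typing import Any, Callable, Dict, List


def collect_all_turns(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    page_size: int = 50,
) -> List[Any]:
    """Fetch pages one by one and return every turn in order."""
    turns: List[Any] = []
    page_index = 1  # assumed 1-based; adjust to the real API
    while True:
        page = fetch_page(page_index, page_size)
        pagination = page.get("pagination")
        if not pagination or "is_last_page" not in pagination:
            # Without metadata there is no safe way to know when to stop.
            raise RuntimeError("pagination metadata missing from response")
        turns.extend(page.get("turns", []))
        if pagination["is_last_page"]:
            return turns
        page_index += 1
```

Failing loudly on missing metadata avoids both silent truncation and an endless loop.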
Changed
- ConvoAI token options — `generate_convo_ai_token()` now accepts an integer `uid` and handles the internal token string conversion for users, agents, and avatars.
- Avatar token generation — Removed the dedicated `generate_avatar_rtc_token()` wrapper; avatar RTC tokens use the existing ConvoAI token helper.
- Avatar token gating — Session enrichment uses `is_avatar_token_managed` (vendor-only); UID checks remain in session logic.
- `XaiGrok` is the primary xAI MLLM class — Matches the product name (xAI Grok) and the TypeScript/Go SDKs.
- Package version — Bumped to `v2.0.0` to match the Fern-generated SDK headers.
- PyPI distribution rename — The published package name is now `agora-agents` (formerly `agora-agent-server-sdk`). The Python import path remains `agora_agent`.
- RTM data channel default — When `advanced_features.enable_rtm=True`, AgentKit now defaults `parameters.data_channel` to `"rtm"` unless the caller explicitly sets a data channel.
- Agent-level LLM overrides — In the standard ASR + LLM + TTS pipeline, agent-level `greeting`, `failure_message`, and `max_history` now override vendor defaults, matching the TypeScript SDK. In MLLM mode, agent-level `greeting` and `failure_message` fill only missing fields.
- MLLM core alignment — MLLM wrappers no longer expose or emit unsupported `predefined_tools` or `max_history` fields because they are not present in the generated v2.7 core `mllm` type.
- MLLM without TTS — MLLM sessions no longer require separate TTS, STT, or LLM vendor configuration.
- Avatar pipeline support — Avatar vendors are now explicitly limited to the cascading ASR + LLM + TTS pipeline. Combining `with_avatar()` with `with_mllm()` is rejected at `Agent.to_properties()` and `AgentSession.start()` (matching the TypeScript SDK), with a disabled avatar (`enable=False`) still permitted alongside MLLM.
- VertexAI parity — `VertexAI.to_config()` now spreads `additional_params` first so explicit `model`, `project_id`, `location`, and `adc_credentials_string` fields always win, matching the TypeScript and Gemini Live wrappers.
- Pagination guard parity — `AgentSession.get_all_turns()` and `AsyncAgentSession.get_all_turns()` now raise `RuntimeError` if the server's pagination metadata is missing (`page_index` / `total_pages` / `is_last_page`) or if the next page does not advance, matching the TypeScript SDK.
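
The two merge rules described under "Agent-level LLM overrides" differ in a way that is easy to miss, so here is a minimal sketch of them. The function name and the plain-dict representation are hypothetical; only the override versus fill-only-missing semantics come from the notes above.

```python
from typing import Any, Dict


def apply_agent_overrides(
    vendor_config: Dict[str, Any],
    agent_values: Dict[str, Any],
    mllm_mode: bool,
) -> Dict[str, Any]:
    """Merge agent-level values into a vendor config (illustrative only)."""
    merged = dict(vendor_config)
    if mllm_mode:
        # MLLM mode: agent-level values only fill fields the vendor left unset.
        for key in ("greeting", "failure_message"):
            value = agent_values.get(key)
            if value is not None and merged.get(key) is None:
                merged[key] = value
    else:
        # Cascading ASR + LLM + TTS pipeline: agent-level values win.
        for key in ("greeting", "failure_message", "max_history"):
            value = agent_values.get(key)
            if value is not None:
                merged[key] = value
    return merged
```

Note that `max_history` is skipped entirely in the MLLM branch, in line with the MLLM core alignment entry.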
Migration notes
- PyPI package rename — Install `agora-agents` instead of `agora-agent-server-sdk` (`pip install agora-agents` or `poetry add agora-agents`). The import path is unchanged (`from agora_agent import ...`). The legacy PyPI distribution name remains available as a compatibility shim that re-exports the public API from `agora-agents`.
- Deprecated aliases — Use `LiveAvatarAvatar` instead of `HeyGenAvatar`, `is_avatar_token_managed` instead of `is_rtc_avatar`, and `ThinkOn*` / `ThinkResponse` instead of `AgentThinkRequestOn*` / `AgentThinkResponse`.
- `think()` default — The server default for `on_listening_action` changed from `inject` to `interrupt` in API v2.7. Pass `on_listening_action="inject"` explicitly to preserve the old behavior.
- Turn analytics pagination — Sessions with more than 50 turns must request additional pages via `get_turns(page_index=..., page_size=...)` or use `get_all_turns()`.
- Error reasons — API v2.7 adds status codes `401`, `429`, and `500`; `InvalidRequest` is split into `InvalidRequestBody`, `MissingRequiredField`, and `InvalidFieldValue`, with new reasons such as `ServiceNotEnabled`, `AccountSuspended`, and `ResourceAllocationFailed`.
- Event `112` — Webhook event `112 turns finished` can be used as an alternative batch delivery path for post-session turn data.
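
Callers that previously matched on the single `InvalidRequest` reason may want to accept the split reasons as well. A minimal sketch, assuming the reason arrives as a plain string (how the SDK actually surfaces it is not shown here, and `is_invalid_request` is a hypothetical helper):

```python
LEGACY_INVALID_REQUEST = "InvalidRequest"

# The three reasons that replace InvalidRequest in API v2.7.
SPLIT_INVALID_REQUEST_REASONS = frozenset(
    {"InvalidRequestBody", "MissingRequiredField", "InvalidFieldValue"}
)


def is_invalid_request(reason: str) -> bool:
    """True for the legacy reason and for any of its v2.7 replacements."""
    return reason == LEGACY_INVALID_REQUEST or reason in SPLIT_INVALID_REQUEST_REASONS
```

Accepting both forms keeps error handling working against older and newer API versions during a migration.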
v1.4.1
Fixed
- Release workflow — Publish to PyPI with the `PYPI_API_TOKEN` secret.
v1.4.0
Added
- `DeepgramTTS` — New TTS vendor wrapper for Deepgram (Beta). Accepts `api_key`, `model`, `base_url`, `sample_rate`, `params`, and `skip_patterns`.
- `Agent.with_tools(enabled=True)` — Dedicated builder method to enable MCP tool invocation (`advanced_features.enable_tools`). Replaces the raw `with_advanced_features(AdvancedFeatures(enable_tools=True))` call.
- LLM vendors: `headers` field — All four LLM vendors (`OpenAI`, `AzureOpenAI`, `Anthropic`, `Gemini`) now accept an optional `headers: Dict[str, str]` parameter. Use this to pass custom HTTP headers to the LLM provider (e.g., tenant identifiers, routing headers).
- `AgentSession.think()` / `AsyncAgentSession.think()` — Send a custom instruction to a running agent through the `agent_management` API.
- `Agent.with_interruption()` — Configure the new top-level `interruption` object for unified interruption control.
- MLLM turn detection — `OpenAIRealtime`, `GeminiLive`, and `VertexAI` now accept `turn_detection`, which maps to `mllm.turn_detection` and overrides top-level turn detection for MLLM sessions.
- `audio_scenario` AgentKit support — `SessionParams` and AgentKit request construction now expose the top-level `parameters.audio_scenario` field.
- MLLM vendor parity — `GeminiLive` is documented and exposed as the direct Google Gemini Live API wrapper.
Fixed
- MiniMax TTS preset stripping — When a MiniMax reseller preset is inferred (`minimax_speech_2_6_turbo` or `minimax_speech_2_8_turbo`), the `group_id` and `url` fields are now correctly stripped from `tts.params` alongside `key` and `model`. Previously they were forwarded to the API, causing request failures.
- MLLM enable flag — `Agent.with_mllm()` now sets `mllm.enable = True` and removes the deprecated `advanced_features.enable_mllm` flag from generated requests.
- MLLM wrapper shape — MLLM vendors no longer emit removed fields such as `style`; docs and tests now reflect the v2.6 MLLM contract.
- Preset-backed OpenAI TTS — `OpenAITTS` no longer requires `api_key` when a reseller preset supplies credentials server-side.
- AgentKit parity coverage — Added regression coverage for interruption, MLLM turn detection, Deepgram TTS, LLM headers, and deprecated MLLM flag cleanup.
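
The MiniMax preset fix amounts to dropping four caller-side fields once a preset supplies them server-side. A minimal sketch of that filtering, using a hypothetical helper name and plain dicts rather than the wrapper's real internals:

```python
from typing import Any, Dict, Optional

# Fields the notes above say are stripped when a reseller preset is inferred.
PRESET_MANAGED_FIELDS = ("key", "model", "group_id", "url")


def strip_preset_managed_fields(
    params: Dict[str, Any], preset: Optional[str]
) -> Dict[str, Any]:
    """Return tts.params without preset-managed fields (illustrative only)."""
    if preset is None:
        # No preset inferred: the caller supplies everything, keep it all.
        return dict(params)
    return {k: v for k, v in params.items() if k not in PRESET_MANAGED_FIELDS}
```

Fields such as `voice_setting` pass through untouched in both cases.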
v1.3.2
v1.3.0
Added
- `AgentSession` — Added `get_turns()` for turn analytics in both sync and async sessions.
- `Agent` / `AgentSession` — Added session-level `preset` and `pipeline_id` support, including preset normalization and automatic inference for supported reseller-backed models.
- `AgentKit` — Added preset constants and helper utilities for discoverable preset usage.
- `AgentKit` — Added missing public vendor surface for `GeminiLive`, `LiveAvatarAvatar`, and `AnamAvatar`.
- Tests — Added AgentKit parity and vendor regression coverage for presets, session behavior, and wrapper mappings.
Changed
- `OpenAI` / `OpenAITTS` / `MiniMaxTTS` — Relaxed no-key preset paths so reseller-backed usage can be expressed without forcing credential fields.
- `GeminiLive` — Aligned wrapper output with the Agora low-level MLLM contract and kept `messages` at the top level.
- `Avatar` wrappers — Updated avatar handling for `LiveAvatar` and `Anam`, including sample-rate validation behavior.
Fixed
- `AgentKit` MLLM — Removed unsupported wrapper-only fields so the Python surface stays aligned with the generated Agora API contract.
- `pydantic_utilities` — Updated Pydantic compatibility handling for Python 3.14-safe operation.
- Mypy/test packaging — Added explicit test package markers to avoid duplicate module resolution during type checking.
v1.2.0
Fixed
- `AresSTT` — Removed redundant `language` key from the `params` dict. Language is now emitted only at the top level. `params` is only included when `additional_params` is provided.
- `OpenAIRealtime` / `VertexAI` (MLLM) — Agent-level `greeting`, `failure_message`, and `max_history` overrides are now correctly applied when the agent is in MLLM mode. Previously these values were silently dropped.
- `VertexAI` (MLLM) — `messages` is now correctly placed inside `params` (required by the Gemini Live API). Previously it was emitted at the top level and silently ignored.
Changed
- `OpenAITTS` — Renamed constructor parameter `key` → `api_key` to match the Agora server API expectation. ⚠️ Breaking change.
- `CartesiaTTS` — Renamed constructor parameter `key` → `api_key`. Voice is now serialized as `{"mode": "id", "id": "<voice_id>"}` instead of a flat `voice_id` string. ⚠️ Breaking change.
- `HeyGenAvatar` — Removed legacy fields `avatar_name`, `voice_id`, `language`, `version`. Added `agora_token`, `avatar_id`, `enable`, `disable_idle_timeout`, `activity_idle_timeout`. The config now includes a top-level `enable` field (defaults to `true`). ⚠️ Breaking change.
Added
- `OpenAITTS` — New optional parameters: `response_format` (str, e.g. `"pcm"`) and `speed` (float).
- `CartesiaTTS` — `voice_id` user-facing field is preserved; voice is serialized to the required nested object format automatically.
- `RimeTTS` — New optional parameters: `lang` (str), `sampling_rate` (int, serialized as `samplingRate`), `speed_alpha` (float, serialized as `speedAlpha`).
- `OpenAIRealtime` — New optional parameters: `predefined_tools` (List[str]), `failure_message` (str), `max_history` (int).
- `VertexAI` (MLLM) — New optional parameters: `predefined_tools` (List[str]), `failure_message` (str), `max_history` (int).
- `HeyGenAvatar` — New fields: `agora_token` (str, optional), `avatar_id` (str, optional), `enable` (bool, optional, default `True`), `disable_idle_timeout` (bool, optional), `activity_idle_timeout` (int, optional).
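
The two serialization rules mentioned in this release (Cartesia's nested voice object and Rime's camelCase keys) can be sketched as below. The function names are hypothetical and the real wrappers may structure this differently; only the output shapes come from the notes above.

```python
from typing import Any, Dict, Optional

# Python keyword name -> serialized key, as listed in the notes above.
RIME_SERIALIZED_NAMES = {
    "lang": "lang",
    "sampling_rate": "samplingRate",
    "speed_alpha": "speedAlpha",
}


def serialize_cartesia_voice(voice_id: str) -> Dict[str, str]:
    """Wrap a flat voice ID in the nested object format."""
    return {"mode": "id", "id": voice_id}


def serialize_rime_options(
    lang: Optional[str] = None,
    sampling_rate: Optional[int] = None,
    speed_alpha: Optional[float] = None,
) -> Dict[str, Any]:
    """Rename snake_case options to their serialized keys, skipping unset ones."""
    options = {"lang": lang, "sampling_rate": sampling_rate, "speed_alpha": speed_alpha}
    return {
        RIME_SERIALIZED_NAMES[name]: value
        for name, value in options.items()
        if value is not None
    }
```

Keeping the user-facing names snake_case and translating at serialization time lets the Python surface stay idiomatic while matching the wire format.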
1.1.0
[v1.1.0] — 2026-03-17
Added
- `MurfTTS` vendor
Fixed
- `MiniMaxTTS`: added required `group_id`, `url`, and correctly nested `voice_setting.voice_id` (previously missing, which forced users to bypass the SDK entirely)
- `SarvamTTS`: corrected schema to `key` + `speaker` + `target_language_code` (was incorrectly using `api_key`, `voice_id`, `model`)
- All LLM vendors: added `max_history` field for conversation history caching
- `AzureOpenAI` LLM: added `params` escape hatch for passing arbitrary API parameters
- `Anthropic` LLM: added `url` for custom endpoints and `params` escape hatch
- `Gemini` LLM: added `url` for custom endpoints and `params` escape hatch; named model params (`temperature`, `top_p`, `top_k`, `max_output_tokens`) now take precedence over `params` dict
- `SpeechmaticsSTT`, `SarvamSTT`: added optional `model` field
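
The precedence rule for `Gemini` (named model params win over the `params` escape hatch) can be illustrated with a small merge. This is a sketch under assumptions: `build_gemini_params` is a hypothetical helper, and `seed` in the usage note is just an example passthrough key.

```python
from typing import Any, Dict, Optional


def build_gemini_params(
    params: Optional[Dict[str, Any]] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
    max_output_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Merge the escape-hatch dict with named params; named params win."""
    merged: Dict[str, Any] = dict(params or {})
    named = {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "max_output_tokens": max_output_tokens,
    }
    # Applied last, so an explicitly named value overrides the same key in params.
    merged.update({key: value for key, value in named.items() if value is not None})
    return merged
```

With `params={"temperature": 0.2, "seed": 7}` and `temperature=0.9`, the result keeps `seed` and uses `0.9` for `temperature`.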
1.0.0
[v1.0.0] — 2026-03-11
Initial stable release of the Agora Agent Server SDK for Python.
Added
- `Agent` builder with fluent API (`.with_llm()`, `.with_tts()`, `.with_stt()`, `.with_mllm()`, `.with_avatar()`)
- `AgentSession` and `AsyncAgentSession` for synchronous and async session lifecycle management
- Automatic token generation — pass `app_id` + `app_certificate` and tokens are handled internally
- Token utilities: `generate_rtc_token`, `generate_convo_ai_token`, `expires_in_hours`, `expires_in_minutes`
- Turn detection configuration via `TurnDetectionConfig` with nested `StartOfSpeechConfig` and `EndOfSpeechConfig`
- SAL (Selective Attention Locking) via `SalConfig` with `SalMode`
- Filler words support: `FillerWordsConfig`, `FillerWordsTrigger`, `FillerWordsContent`
- Session parameters: `SessionParams`, `SilenceConfig`, `FarewellConfig`, `ParametersDataChannel`
- Geofencing via `GeofenceConfig`
- Advanced features (MLLM mode) via `AdvancedFeatures`
- Type-safe constants: `DataChannel`, `SilenceActionValues`, `SalModeValues`, `GeofenceArea`, `FillerWordsSelectionRule`, `TurnDetectionTypeValues`
- Vendor integrations:
  - LLM: `OpenAI`, `AzureOpenAI`, `Anthropic`, `Gemini`, `VertexAI`
  - MLLM: `OpenAIRealtime`
  - TTS: `ElevenLabsTTS`, `MicrosoftTTS`, `OpenAITTS`, `CartesiaTTS`, `GoogleTTS`, `AmazonTTS`, `HumeAITTS`, `RimeTTS`, `FishAudioTTS`, `MiniMaxTTS`, `SarvamTTS`
  - STT: `DeepgramSTT`, `MicrosoftSTT`, `OpenAISTT`, `GoogleSTT`, `AmazonSTT`, `AssemblyAISTT`, `AresSTT`, `SarvamSTT`, `SpeechmaticsSTT`
  - Avatar: `HeyGenAvatar`, `AkoolAvatar`