Note: This repo was cloned from the Neural Maze's Ava the WhatsApp Agent Course repo. I added Telegram support because WhatsApp does not let you set up an account without a dedicated phone number and a registered business, while Telegram's BotFather is much easier to work with. See the original course repo for more details on the course.
Ava is a LangGraph-based multimodal agent deployed on GCP Cloud Run. It supports text, image, and audio for both input and output. It also has long-term memory (Qdrant vector DB) and short-term memory (async SQLite, as supported by LangGraph).
While I chose to deploy Ava on Telegram, the code also has interfaces for Chainlit and WhatsApp. You could, of course, write your own adapter code for any other messaging service.
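As a rough illustration of what adapter code for another messaging service might look like, here is a minimal sketch. The names `IncomingMessage`, `ChannelAdapter`, `EchoChannel`, and `handle` are hypothetical and are not taken from this repo; the real interfaces may be organized differently.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass
class IncomingMessage:
    """Channel-neutral message (hypothetical shape, not the repo's actual type)."""
    channel: str
    user_id: str
    text: str


class ChannelAdapter(Protocol):
    """Hypothetical interface that each messaging service would implement."""
    name: str

    def parse(self, payload: dict) -> IncomingMessage: ...
    def send(self, user_id: str, text: str) -> None: ...


class EchoChannel:
    """Toy adapter that stores replies in memory instead of calling a real API."""
    name = "echo"

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def parse(self, payload: dict) -> IncomingMessage:
        # Translate the service-specific payload into the neutral message type.
        return IncomingMessage(self.name, str(payload["from"]), payload["text"])

    def send(self, user_id: str, text: str) -> None:
        self.sent.append((user_id, text))


def handle(
    adapter: ChannelAdapter,
    payload: dict,
    agent: Callable[[IncomingMessage], str],
) -> str:
    """Parse an incoming payload, ask the agent for a reply, and send it back."""
    message = adapter.parse(payload)
    reply = agent(message)
    adapter.send(message.user_id, reply)
    return reply
```

With this shape, supporting a new service would mostly mean writing one class with `parse` and `send`; the `agent` callable stands in for the LangGraph graph invocation.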
Ava uses the following models, served through hosted inference providers:
- Text Model: "llama-3.3-70b-versatile"
- Small Text Model: "llama-3.1-8b-instant"
- Speech to Text (STT): "whisper-large-v3-turbo"
- Image to Text (ITT): "meta-llama/llama-4-scout-17b-16e-instruct"
- Text to Speech (TTS): "eleven_flash_v2_5"
- Text to Image (TTI): "black-forest-labs/FLUX.1-krea-dev"
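One way to keep these model names in a single place is a small settings object. The sketch below is an assumption about how such a config could look, not the repo's actual settings module; the class name, field names, and environment variable names (for example `TTS_MODEL`) are hypothetical.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping


@dataclass(frozen=True)
class ModelSettings:
    """Default model names, copied from the list above."""
    text_model: str = "llama-3.3-70b-versatile"
    small_text_model: str = "llama-3.1-8b-instant"
    stt_model: str = "whisper-large-v3-turbo"
    itt_model: str = "meta-llama/llama-4-scout-17b-16e-instruct"
    tts_model: str = "eleven_flash_v2_5"
    tti_model: str = "black-forest-labs/FLUX.1-krea-dev"


def load_settings(env: Mapping[str, str] = os.environ) -> ModelSettings:
    """Build settings, letting upper-cased field names in `env` override defaults."""
    overrides = {
        f.name: env[f.name.upper()]
        for f in fields(ModelSettings)
        if f.name.upper() in env
    }
    return ModelSettings(**overrides)
```

Swapping a model then only requires setting one environment variable rather than editing the code that calls it.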
Apart from tinkering with the agent's prompts and graph, an ambitious direction is to connect the bot to other apps via MCP servers, making Ava more of an assistant.
Another possible direction is to make Ava an omnichannel bot. That is, you could continue the same conversation from any channel or device where Ava is present.
Apart from Chainlit, WhatsApp, and Telegram, Discord and Slack are potential candidates.
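The omnichannel idea could be approached by mapping each channel account to one shared conversation ID. LangGraph checkpointers key short-term memory by a `thread_id` passed in the run config, so two channels that resolve to the same `thread_id` would see the same conversation state. The sketch below assumes an in-memory registry; `IdentityRegistry`, `link`, and `graph_config` are hypothetical names, and a real deployment would need persistent storage and a verified account-linking flow.

```python
import uuid


class IdentityRegistry:
    """Maps (channel, user_id) pairs to a shared conversation thread ID."""

    def __init__(self) -> None:
        self._threads: dict[tuple[str, str], str] = {}

    def thread_for(self, channel: str, user_id: object) -> str:
        """Return the thread ID for an account, creating one on first contact."""
        key = (channel, str(user_id))
        if key not in self._threads:
            self._threads[key] = uuid.uuid4().hex
        return self._threads[key]

    def link(self, known: tuple[str, object], new: tuple[str, object]) -> str:
        """Attach a new channel account to the thread of an already known one."""
        thread_id = self.thread_for(*known)
        self._threads[(new[0], str(new[1]))] = thread_id
        return thread_id


def graph_config(thread_id: str) -> dict:
    """Run config in the shape LangGraph checkpointers use to select a thread."""
    return {"configurable": {"thread_id": thread_id}}
```

For example, after `registry.link(("telegram", 1), ("discord", "abc"))`, messages from either account would be run with the same `graph_config(...)` and so continue the same conversation.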