MUTON-Android

MUTON-Android is the Android client for MUTON, a real-time multimodal dialogue assistance system for hearing-impaired users. The app captures camera frames and microphone audio, streams them to the MUTON backend, and displays subtitles, visual emotion cues, and multimodal summaries in a mobile interface.

Overview

The Android app is the user-facing part of MUTON. The backend handles speech-to-text (STT), face and audio processing, and Qwen2.5-Omni-based summary generation. This client handles real-time capture and request synchronization, and presents the results in a practical conversation flow.

In Graduation Project 2, the app was refined for a more stable live demo. It now discovers the active backend address dynamically, keeps the UI compatible with backend model changes, and routes conversation record summaries through the server so API keys are not stored inside the APK.

App Features

  • Camera frame capture for visual context
  • Microphone audio capture for streaming STT
  • Real-time requests to the MUTON backend
  • Subtitle display from finalized speech utterances
  • Visual emotion display from face-frame analysis
  • Multimodal summary display from backend fusion analysis
  • Conversation record screens and server-side record summary requests
  • Dynamic backend URL loading from the main MUTON repository

Installation

Open the project in Android Studio.

Recommended project configuration:

  • compileSdk: 35
  • minSdk: 28
  • targetSdk: 35
  • package namespace: com.example.myapplication
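
As a rough guide, these settings would sit in the app module's Gradle Kotlin DSL file roughly as follows. This is a fragment under the assumption that the module uses the standard Android Gradle Plugin `android { }` block; plugin and dependency declarations are omitted, and the actual build file in the repository may differ.

```kotlin
// app/build.gradle.kts (fragment only; plugins and dependencies omitted)
android {
    namespace = "com.example.myapplication"
    compileSdk = 35

    defaultConfig {
        minSdk = 28
        targetSdk = 35
    }
}
```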

Build the app module from Android Studio, or use Gradle from the project root:

./gradlew :app:assembleDebug

On Windows PowerShell:

.\gradlew.bat :app:assembleDebug

Backend Connection

The app reads the active backend URL from the main MUTON repository:

https://raw.githubusercontent.com/Ai-pre/MUTON/server_main/backend_url.json

The backend URL file points to the current Cloudflare Tunnel address. Because the app looks the address up at runtime, it keeps working when the tunnel address changes between demos.
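
The lookup could be sketched as below. This is not the app's actual code: the JSON field name `url` is an assumption (the README does not show the file's schema), and `parseBackendUrl` and `fetchBackendUrl` are hypothetical helper names. A regex stands in for a real JSON parser to keep the sketch dependency-free.

```kotlin
import java.net.HttpURLConnection
import java.net.URL

// Location of the published backend address, as given in this README.
val BACKEND_URL_FILE =
    "https://raw.githubusercontent.com/Ai-pre/MUTON/server_main/backend_url.json"

// Assumption: the file looks like {"url": "https://..."}; the real key name may differ.
val URL_FIELD = Regex("\"url\"\\s*:\\s*\"([^\"]+)\"")

/** Extracts the backend base URL from the JSON text, or null if the field is absent. */
fun parseBackendUrl(json: String): String? =
    URL_FIELD.find(json)?.groupValues?.get(1)?.trimEnd('/')

/** Downloads the JSON file and returns the parsed base URL. Call this off the main thread. */
fun fetchBackendUrl(source: String = BACKEND_URL_FILE): String? {
    val connection = URL(source).openConnection() as HttpURLConnection
    connection.connectTimeout = 5_000
    connection.readTimeout = 5_000
    return try {
        val body = connection.inputStream.bufferedReader().use { it.readText() }
        parseBackendUrl(body)
    } finally {
        connection.disconnect()
    }
}
```

Keeping the parsing separate from the network call makes the parsing testable without a live tunnel.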

At runtime, the app calls these backend endpoints:

  • Android sends audio chunks to /process_audio_chunk.
  • Android sends camera frames to /process_video_chunk.
  • Android requests multimodal summaries from /get_fusion_analysis.
  • Android requests conversation record summaries from /summarize_conversation_record.
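
One way to keep these routes in a single place is a small enum plus a URL builder, sketched below. The paths come from the list above; the names `MutonEndpoint` and `endpointUrl` are hypothetical, and the HTTP methods and payload formats are not covered because this README does not specify them.

```kotlin
/** Backend routes listed in this README. */
enum class MutonEndpoint(val path: String) {
    AUDIO_CHUNK("/process_audio_chunk"),
    VIDEO_CHUNK("/process_video_chunk"),
    FUSION_ANALYSIS("/get_fusion_analysis"),
    RECORD_SUMMARY("/summarize_conversation_record")
}

/** Joins the base URL and route without producing a double slash. */
fun endpointUrl(baseUrl: String, endpoint: MutonEndpoint): String =
    baseUrl.trimEnd('/') + endpoint.path
```

The base URL would be the value read from backend_url.json, so a tunnel change only affects the first argument.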

Running On Device

Before launching the app, make sure the backend server is running and the current Cloudflare Tunnel URL has been published to backend_url.json.

The device or emulator must allow:

  • Camera access
  • Microphone access
  • Network access
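
On Android, camera and microphone access map to the `android.permission.CAMERA` and `android.permission.RECORD_AUDIO` runtime permissions, while `android.permission.INTERNET` is granted at install time once declared in the manifest. The sketch below shows only the bookkeeping: given the set of permissions already granted, it returns the ones still to request. `missingPermissions` is a hypothetical helper. In an Activity, the granted set would typically come from `ContextCompat.checkSelfPermission`, and how the app actually handles this is not described in this README.

```kotlin
// Real Android permission names. CAMERA and RECORD_AUDIO need a runtime grant;
// INTERNET is omitted because it is granted at install time.
val RUNTIME_PERMISSIONS = listOf(
    "android.permission.CAMERA",
    "android.permission.RECORD_AUDIO"
)

/** Returns the runtime permissions that still have to be requested from the user. */
fun missingPermissions(granted: Set<String>): List<String> =
    RUNTIME_PERMISSIONS.filterNot { it in granted }
```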

For the live demo, a physical Android device is recommended because camera, microphone, and network timing are closer to the intended use case.

Screens

[Screenshot: MUTON Android main screen]

[Screenshot: MUTON Android live screen]

Related Repository

The backend runtime, API implementation, model training scripts, dataset processing, and wiki documentation are maintained in the main MUTON repository (Ai-pre/MUTON on GitHub).

Project Structure

MUTON-Android/
  app/
    src/main/java/com/example/myapplication/
      MainActivity.kt
      OpenAiSummaryService.kt
      ConversationRecordStore.kt
      RecordDetailActivity.kt
      HomeActivity.kt
      SettingsActivity.kt
    src/main/res/
  gradle/
  build.gradle.kts
  settings.gradle.kts
