Commit 55aab02

New design all-in-one-screen and readme updated!
1 parent 01d1e01 commit 55aab02

9 files changed

Lines changed: 508 additions & 267 deletions

README.md

Lines changed: 97 additions & 21 deletions
@@ -1,62 +1,138 @@
1-
# Speech to Sign Language Translator
1+
# SoundSigns: Speech to Sign Language Translator
22

33
## Overview
44

5-
A real-time speech-to-sign language translation application that converts spoken words into ISL (International Sign Language) gloss and provides a visual interface.
5+
SoundSigns is a comprehensive web-based application that translates spoken English into International Sign Language (ISL) in real time. The system captures speech, converts it to text, translates it into ISL gloss, and displays the translation through a 3D animated avatar using pre-rendered video clips.
66

77
## Features
88

9-
- Live audio transcription
10-
- Real-time ISL gloss translation
11-
- Responsive and modern UI design
12-
- Microphone recording controls
9+
- **Real-time Speech Recognition**: Browser-based speech-to-text conversion using Web Speech API
10+
- **ISL Gloss Translation**: Converts English text to International Sign Language gloss using ChatGPT API
11+
- **3D Avatar Animation**: Visual sign language representation through pre-rendered MP4 video clips
12+
- **Video Assembly**: Seamless concatenation of individual sign videos into coherent sentences
13+
- **Interactive Interface**: Clean, responsive UI with microphone controls and video playback
14+
- **Multi-format Support**: Covers alphabet letters (A-Z), numbers (0-9), and common vocabulary
15+
- **Download Functionality**: Save translated sign language videos for offline use
16+
- **Cross-browser Compatibility**: Works on modern browsers supporting Web Speech API
17+
18+
## Architecture
19+
20+
The application follows a modular three-tier architecture:
21+
22+
- **Frontend**: React.js with Tailwind CSS handling user interaction and video processing
23+
- **Backend**: Flask server managing API communications and text-to-gloss conversion
24+
- **Dataset**: Curated collection of ~150 pre-rendered ISL sign videos
1325

1426
## Prerequisites
1527

1628
- Python 3.8+
1729
- Node.js 14+
1830
- OpenAI API Key
31+
- Modern web browser with Web Speech API support (Chrome, Edge recommended)
1932

20-
## Backend Setup
33+
## Installation
2134

22-
1. Install Python dependencies:
35+
### Backend Setup
2336

37+
1. Install Python dependencies:
2438
```bash
2539
pip install sounddevice numpy openai flask flask-cors python-dotenv
2640
```
2741

28-
2. Create a .env file inside the backend/ folder and add your OpenAI API key:
29-
42+
2. Create a `.env` file in the `backend/` directory:
3043
```bash
3144
OPENAI_API_KEY=your_openai_key_here
3245
```
3346

34-
## Frontend Setup
47+
**Security Note**: Never commit the `.env` file to version control.
3548

36-
1. Initialize:
49+
### Frontend Setup
3750

51+
1. Navigate to the frontend directory and install dependencies:
3852
```bash
3953
cd frontend
4054
npm install
4155
```
4256

4357
## Running the Application
4458

45-
1. Run the development server in first terminal:
46-
59+
1. Start the frontend development server:
4760
```bash
61+
cd frontend
4862
npm run dev
4963
```
5064

51-
2. Start the Python backend from root in another terminal:
52-
65+
2. In a separate terminal, start the backend server from the project root:
5366
```bash
54-
cd ..
5567
py backend/transcription.py
5668
```
5769

58-
## Technologies Used:
70+
3. Access the application at `http://localhost:3000` (or the port specified by your dev server)
71+
72+
## Usage
73+
74+
1. **Voice Input**: Click the microphone button and speak clearly in English
75+
2. **Transcription**: View the real-time speech-to-text conversion
76+
3. **Translation**: See the ISL gloss translation displayed
77+
4. **Video Playback**: Watch the 3D avatar perform the signed translation
78+
5. **Controls**: Use play, replay, and download buttons to control video playback
79+
80+
## Project Structure
81+
82+
```
83+
project-root/
84+
├── backend/
85+
│ ├── .env # Environment variables (not in version control)
86+
│ └── transcription.py # Flask server and API logic
87+
├── frontend/
88+
│ ├── src/
89+
│ │ ├── components/ # React components
90+
│ │ └── App.jsx # Main application file
91+
│ └── assets/
92+
│ └── videos/ # Pre-rendered sign language videos
93+
│ ├── letters/ # A-Z alphabet signs
94+
│ ├── numbers/ # 0-9 numerical signs
95+
│ └── words/ # Common vocabulary signs
96+
```
97+
98+
## Technologies Used
99+
100+
- **Frontend**: React.js, Tailwind CSS, Web Speech API
101+
- **Backend**: Python, Flask, Flask-CORS
102+
- **Translation**: OpenAI GPT-3.5-turbo API
103+
- **Video Processing**: Browser-based video concatenation
104+
- **Dataset**: Pre-rendered MP4 videos with 3D ISL avatar
105+
106+
## System Requirements
107+
108+
- **Browser**: Chrome, Edge, or other browsers with Web Speech API support
109+
- **Microphone**: Required for speech input
110+
- **Internet Connection**: Required for OpenAI API access
111+
112+
## Known Limitations
113+
114+
- Limited vocabulary dataset (~150 signs)
115+
- Words not in dataset are finger-spelled letter by letter
116+
- Translation accuracy depends on ChatGPT's ISL gloss generation
117+
- Requires a quiet environment for optimal speech recognition
118+
- System latency of 3-5 seconds for complete translation process
119+
120+
## Contributing
121+
122+
This project was developed as a capstone project by Ahmad Ataba and Waseem Saleem under the supervision of Dr. Reuven Cohen at Braude College.
123+
124+
## Dataset Attribution
125+
126+
The sign language video dataset is sourced from the open-source "Text-Speech to Sign Language Generator" project by JS-Coderr (2024), available on GitHub.
127+
128+
## License
129+
130+
This project uses open-source components and datasets. Please refer to individual component licenses for specific terms.
131+
132+
## Support
133+
134+
For technical issues or questions about the application, please refer to the project documentation or contact the development team.
135+
136+
---
59137

60-
- Frontend: React, Tailwind CSS
61-
- Backend: Python, Flask
62-
- Translation: OpenAI GPT-4
138+
**Note**: This application is designed for educational and accessibility purposes. For critical communication needs, professional sign language interpretation is recommended.
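
The updated README states that words missing from the dataset are finger-spelled letter by letter, and that clips are grouped under `letters/`, `numbers/`, and `words/`. The commit does not show how that lookup is done, so the following is only a minimal sketch of one way it could work. The `KNOWN_WORDS` set, the function names, and the `<TOKEN>.mp4` file naming are all assumptions made for illustration, not taken from the repository.

```javascript
// Hypothetical vocabulary; the real dataset is described as ~150 signs.
const KNOWN_WORDS = new Set(["HELLO", "THANK", "YOU", "NAME"]);

// Map one gloss token to an ordered list of clip paths (assumed naming scheme).
function clipsForToken(token) {
  const t = token.toUpperCase();
  if (KNOWN_WORDS.has(t)) {
    return [`videos/words/${t}.mp4`];
  }
  // Fallback: finger-spell the token one character at a time.
  const clips = [];
  for (const ch of t) {
    if (ch >= "A" && ch <= "Z") {
      clips.push(`videos/letters/${ch}.mp4`);
    } else if (ch >= "0" && ch <= "9") {
      clips.push(`videos/numbers/${ch}.mp4`);
    }
    // Any other character (punctuation, symbols) is skipped.
  }
  return clips;
}

// Split a gloss string on whitespace and flatten the per-token clip lists.
function clipsForGloss(gloss) {
  return gloss.trim().split(/\s+/).filter(Boolean).flatMap(clipsForToken);
}
```

Under these assumptions, `clipsForGloss("HELLO BOB")` yields one word clip followed by three letter clips, which is the behavior the Known Limitations section describes.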

frontend/src/App.jsx

Lines changed: 49 additions & 34 deletions
@@ -1,56 +1,71 @@
1-
import { useSpeechRecognition } from './hooks/useSpeechRecognition.js';
2-
import { ThemeProvider } from './contexts/ThemeContext.jsx';
3-
import Header from './components/Header.jsx';
4-
import BackgroundElements from './components/BackgroundElements.jsx';
5-
import MicrophoneSection from './components/MicrophoneSection.jsx';
6-
import LiveTranscriptSection from './components/LiveTranscriptSection.jsx';
7-
import ISLGlossSection from './components/ISLGlossSection.jsx';
8-
import ISLVideoSection from './components/ISLVideoSection.jsx';
9-
import FeatureHighlights from './components/FeatureHighlights.jsx';
10-
import Footer from './components/Footer.jsx';
1+
import { useSpeechRecognition } from "./hooks/useSpeechRecognition.js";
2+
import { ThemeProvider } from "./contexts/ThemeContext.jsx";
3+
import Header from "./components/Header.jsx";
4+
import BackgroundElements from "./components/BackgroundElements.jsx";
5+
import LiveTranscriptSection from "./components/LiveTranscriptSection.jsx";
6+
import ISLGlossSection from "./components/ISLGlossSection.jsx";
7+
import ISLVideoSection from "./components/ISLVideoSection.jsx";
8+
// FeatureHighlights now integrated into Footer
9+
import Footer from "./components/Footer.jsx";
1110

1211
function AppContent() {
13-
const {
14-
transcript,
15-
isl,
16-
isRecording,
17-
isProcessing,
18-
toggleRecording
19-
} = useSpeechRecognition();
12+
const { transcript, isl, isRecording, isProcessing, toggleRecording } =
13+
useSpeechRecognition();
2014

2115
return (
22-
<div className="min-h-screen bg-gradient-to-br from-blue-50 via-indigo-25 to-sky-50 dark:from-slate-900 dark:via-slate-800 dark:to-slate-900 font-inter transition-colors duration-300">
16+
<div className="min-h-screen xl:h-screen xl:overflow-hidden bg-gradient-to-br from-blue-50 via-indigo-25 to-sky-50 dark:from-slate-900 dark:via-slate-800 dark:to-slate-900 font-inter transition-colors duration-300 flex flex-col">
2317
<BackgroundElements />
24-
<Header />
2518

26-
{/* Main Content */}
27-
<main className="relative container mx-auto px-4 sm:px-6 py-6 sm:py-8">
28-
<MicrophoneSection
19+
{/* Fixed Header - minimal height with small gap */}
20+
<div className="flex-shrink-0 mb-2">
21+
<Header
2922
isRecording={isRecording}
30-
isProcessing={isProcessing}
3123
toggleRecording={toggleRecording}
24+
isProcessing={isProcessing}
3225
/>
26+
</div>
3327

34-
{/* Live Results Section */}
35-
<div className="max-w-6xl mx-auto mb-6 sm:mb-8">
36-
<div className="grid grid-cols-1 xl:grid-cols-2 gap-4 sm:gap-6">
37-
{/* Left Column: Live Transcript and ISL Gloss */}
38-
<div className="space-y-4 sm:space-y-6">
28+
{/* Main Content - takes remaining space and fits between header and footer */}
29+
<main className="flex-1 relative container mx-auto px-2 xl:overflow-hidden min-h-0">
30+
{/* Main Content Grid - responsive width */}
31+
<div className="xl:h-full h-full mx-auto px-1 max-w-7xl xl:max-w-none xl:w-full">
32+
{/* Mobile: Stack all sections vertically - scrollable */}
33+
<div className="xl:hidden flex flex-col space-y-4 overflow-y-auto h-full py-2">
34+
<div className="min-h-[300px] flex-shrink-0">
35+
<ISLVideoSection isl={isl} transcript={transcript} />
36+
</div>
37+
<div className="min-h-[200px] flex-shrink-0">
3938
<LiveTranscriptSection transcript={transcript} />
39+
</div>
40+
<div className="min-h-[200px] flex-shrink-0">
4041
<ISLGlossSection isl={isl} />
4142
</div>
42-
43-
{/* Right Column: ISL Video Translation */}
44-
<div className="xl:col-span-1">
43+
</div>
44+
45+
{/* Desktop: 50/50 split layout that fits in viewport */}
46+
<div className="hidden xl:grid xl:grid-cols-2 gap-4 xl:h-full">
47+
{/* Desktop: Left Half - ISL Video Translation - Full 50% */}
48+
<div className="h-full w-full">
4549
<ISLVideoSection isl={isl} transcript={transcript} />
4650
</div>
51+
52+
{/* Desktop: Right Half - Live Transcript and ISL Gloss stacked - Full 50% */}
53+
<div className="flex flex-col xl:h-full w-full gap-4">
54+
<div className="flex-1 min-h-0 w-full">
55+
<LiveTranscriptSection transcript={transcript} />
56+
</div>
57+
<div className="flex-1 min-h-0 w-full">
58+
<ISLGlossSection isl={isl} />
59+
</div>
60+
</div>
4761
</div>
4862
</div>
49-
50-
<FeatureHighlights />
5163
</main>
5264

53-
<Footer />
65+
{/* Compact Footer with small gap */}
66+
<div className="flex-shrink-0 mt-2">
67+
<Footer />
68+
</div>
5469
</div>
5570
);
5671
}
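
`ISLVideoSection` receives `isl` and `transcript`, but its internals are not part of this hunk. The README describes "browser-based video concatenation"; as a rough illustration only, the sketch below uses a simpler substitute technique: playing clips back-to-back on a single element by swapping `src` on each `ended` event rather than truly concatenating files. The function name `playSequence` and its return value are hypothetical.

```javascript
// Play clips back-to-back on one <video>-like element.
// `video` must provide: src, play(), addEventListener(), removeEventListener().
function playSequence(video, clips) {
  return new Promise((resolve, reject) => {
    if (clips.length === 0) {
      resolve(0);
      return;
    }

    let index = 0;

    const playCurrent = () => {
      video.src = clips[index];
      const result = video.play();
      // In browsers play() returns a promise that may reject (e.g. autoplay policy).
      if (result && typeof result.catch === "function") {
        result.catch((err) => {
          video.removeEventListener("ended", onEnded);
          reject(err);
        });
      }
    };

    const onEnded = () => {
      index += 1;
      if (index < clips.length) {
        playCurrent();
      } else {
        video.removeEventListener("ended", onEnded);
        resolve(index); // number of clips played
      }
    };

    video.addEventListener("ended", onEnded);
    playCurrent();
  });
}
```

In a React component this would typically be called with `videoRef.current` and the list of clip paths. Swapping `src` can leave a short gap between clips, so a production version might preload the next clip in a second element.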