A High-Performance Python Proxy Server
Converts the Google AI Studio web interface into an OpenAI-compatible API
🔄 Multi-Worker Concurrency • 🖼️ Imagen 3 Image Generation • 🎨 Nano Banana Image Generation • 🎬 Veo 2 Video Generation • 🎤 Gemini 2.5 TTS Speech Synthesis
- OpenAI Compatible API: Fully compatible with the OpenAI-format /v1/chat/completions endpoint
- Multi-Worker Concurrency: Supports concurrent processing across multiple accounts for improved throughput and stability
- TTS Speech Generation: Supports Gemini 2.5 TTS models for single/multi-speaker audio generation
- Image Generation: Supports Imagen 3 and Gemini 2.5 Flash (Nano Banana) image generation
- Video Generation: Supports Veo 2 video generation, including image-to-video
- Smart Model Switching: Dynamically switch models in AI Studio via the model field
- Anti-Fingerprint Detection: Uses the Camoufox browser to reduce detection risk
- GUI Launcher: Feature-rich web launcher for simplified configuration and management
- Modular Architecture: Clear module separation design for easy maintenance
- Modern Toolchain: uv dependency management + full type support
- Python: 3.12 (recommended)
- Dependency Management: uv
- Operating System: Windows, macOS, Linux
- Memory: 2GB+ available memory recommended
- Network: Stable internet connection to access Google AI Studio
git clone https://github.com/Mag1cFall/AIStudio2API.git
cd AIStudio2API

Then double-click setup.bat (Windows) to run it, or run the script from a terminal as shown here. The script automatically completes all installation steps.

Windows (PowerShell):
.\setup.bat

Linux / macOS:
chmod +x setup.sh
./setup.sh

Alternatively, install manually. First install uv.

Windows (PowerShell):
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

macOS / Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

Expected output:
PS C:\Users\2\Desktop\AIStudio2API> powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
Downloading uv 0.9.11 (x86_64-pc-windows-msvc)
Installing to C:\Users\2\.local\bin
uv.exe
uvx.exe
uvw.exe
everything's installed!
To add C:\Users\2\.local\bin to your PATH, either restart your shell or run:
set Path=C:\Users\2\.local\bin;%Path% (cmd)
$env:Path = "C:\Users\2\.local\bin;$env:Path" (powershell)
Add the install directory shown in your own output to your PATH environment variable.
git clone https://github.com/Mag1cFall/AIStudio2API.git
cd AIStudio2API
uv sync
uv run camoufox fetch
uv run playwright install firefox

Note: The Camoufox browser (approximately 600MB) is downloaded automatically during installation. It is the core component for anti-fingerprint detection, so the first installation may take some time.
- Start the GUI:
  uv run python src/app_launcher.py
- Configure Proxy (recommended):
  - Check "Enable Browser Proxy" in the GUI
  - Enter your proxy address (e.g., http://127.0.0.1:7890)
- Start Headed Mode for Authentication:
  - Click "Start Headed Mode (New Terminal)"
  - Type N in the terminal to create a new authentication file
  - The browser will automatically open and navigate to AI Studio
  - Manually log in to your Google account
  - Ensure you're on the AI Studio homepage
  - Press Enter in the terminal to save authentication info
- After Authentication:
  - Authentication info will be saved automatically
  - You can close the headed mode browser and terminal
After authentication is saved, you can use headless mode:
- Start the GUI:
  uv run python src/app_launcher.py
- Click "Start Headless Mode" or "Virtual Display Mode"
- The API service will run in the background on the default port 2048
start_cmd.bat: Direct command-line startup.
start_webui.bat: Starts the web interface and redirects to it automatically; if it does not open, visit http://127.0.0.1:9000.
Wait for ℹ️ INFO | --- Queue Worker Started --- to appear before using the API.
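A client script may want to confirm that something is listening on the API port before sending requests. The sketch below polls the default port 2048 over plain TCP. `wait_for_port` is a hypothetical helper, not part of this project, and an open port only shows that the server process is up, so the `Queue Worker Started` log line remains the authoritative readiness signal.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it checks only that the port accepts connections,
    not that the queue worker has finished starting.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 2048)` blocks for up to a minute and reports whether the port became reachable.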
After starting the service, use the OpenAI-compatible API:
curl -X POST http://localhost:2048/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-2.5-pro",
"messages": [
{"role": "user", "content": "Hello, world!"}
]
}'

Using Cherry Studio as an example:
- Open Cherry Studio settings
- Add a new model in the "Connection" section:
  - API Host: http://127.0.0.1:2048/v1/
  - Model Name: gemini-2.5-pro (or other AI Studio supported models)
  - API Key: Leave empty or enter any character like 123
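The chat completion request shown earlier with curl can also be sent from Python using only the standard library. This is a minimal sketch: `build_chat_request` and `send_chat` are illustrative names, and the 300-second timeout is an assumption rather than a documented limit.

```python
import json
import urllib.request

BASE_URL = "http://localhost:2048"  # default API port from this README


def build_chat_request(prompt: str, model: str = "gemini-2.5-pro") -> urllib.request.Request:
    """Build a POST request equivalent to the curl chat completion example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(prompt: str, model: str = "gemini-2.5-pro", timeout: float = 300.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_chat_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the service running, `send_chat("Hello, world!")` returns the parsed response. Assuming it follows the standard OpenAI chat completion shape, the reply text would be under `["choices"][0]["message"]["content"]`.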
Supports Gemini 2.5 Flash/Pro TTS models for single-speaker or multi-speaker audio generation:
curl -X POST http://localhost:2048/generate-speech \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-2.5-flash-preview-tts",
"contents": "Hello, this is a test.",
"generationConfig": {
"responseModalities": ["AUDIO"],
"speechConfig": {
"voiceConfig": {
"prebuiltVoiceConfig": {"voiceName": "Kore"}
}
}
}
}'

Multi-speaker:
curl -X POST http://localhost:2048/generate-speech \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-2.5-flash-preview-tts",
"contents": "Joe: How are you?\nJane: I am fine, thanks!",
"generationConfig": {
"responseModalities": ["AUDIO"],
"speechConfig": {
"multiSpeakerVoiceConfig": {
"speakerVoiceConfigs": [
{"speaker": "Joe", "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": "Kore"}}},
{"speaker": "Jane", "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": "Puck"}}}
]
}
}
}
}'

Available Voices: Zephyr, Puck, Charon, Kore, Fenrir, Leda, Orus, Aoede, Callirrhoe, Autonoe, Enceladus, Iapetus, and 18 more voices.
Endpoints:
- POST /generate-speech
- POST /v1beta/models/{model}:generateContent (compatible with the official API)
Response Format: Audio data is returned as Base64-encoded WAV in candidates[0].content.parts[0].inlineData.data.
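To turn the response into a playable file, the Base64 string at the path given above has to be decoded and written to disk. The following sketch builds the single-speaker request body and decodes the result. The function names and the 300-second timeout are assumptions for illustration, and the response is assumed to be a JSON object with exactly the structure described above.

```python
import base64
import json
import urllib.request


def build_tts_payload(text: str, voice: str = "Kore",
                      model: str = "gemini-2.5-flash-preview-tts") -> dict:
    """Single-speaker request body, mirroring the curl example."""
    return {
        "model": model,
        "contents": text,
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice}}
            },
        },
    }


def extract_audio(response: dict) -> bytes:
    """Decode the audio at candidates[0].content.parts[0].inlineData.data."""
    encoded = response["candidates"][0]["content"]["parts"][0]["inlineData"]["data"]
    return base64.b64decode(encoded)


def synthesize(text: str, out_path: str, base_url: str = "http://localhost:2048") -> None:
    """POST to /generate-speech and write the decoded WAV bytes to out_path."""
    request = urllib.request.Request(
        f"{base_url}/generate-speech",
        data=json.dumps(build_tts_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        audio = extract_audio(json.load(response))
    with open(out_path, "wb") as handle:
        handle.write(audio)
```

With the service running, `synthesize("Hello, this is a test.", "hello.wav")` would save the generated speech next to the script.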
curl -X POST http://localhost:2048/generate-image \
-H "Content-Type: application/json" \
-d '{
"prompt": "A beautiful sunset over mountains",
"model": "imagen-3.0-generate-002",
"number_of_images": 1,
"aspect_ratio": "16:9"
}'
Endpoint: POST /generate-image

curl -X POST http://localhost:2048/generate-video \
-H "Content-Type: application/json" \
-d '{
"prompt": "A drone flying over a forest",
"model": "veo-2.0-generate-001",
"aspect_ratio": "16:9",
"duration_seconds": 5
}'
Endpoint: POST /generate-video

curl -X POST http://localhost:2048/nano/generate \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-2.5-flash-image",
"contents": [{"parts": [{"text": "A cute cat wearing a tiny hat"}]}]
}'
Endpoint: POST /nano/generate
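The three media endpoints take different request bodies, so a small Python wrapper can keep them straight. This sketch only mirrors the curl examples. The helper names are hypothetical, the 600-second timeout is a guess for slow generations, and the response is returned as parsed JSON without interpretation because its schema is not described here (that it is JSON at all is an assumption).

```python
import json
import urllib.request

BASE_URL = "http://localhost:2048"  # default API port from this README


def image_request(prompt: str, aspect_ratio: str = "16:9",
                  number_of_images: int = 1) -> tuple[str, dict]:
    """Path and body for Imagen 3, mirroring the curl example."""
    return "/generate-image", {
        "prompt": prompt,
        "model": "imagen-3.0-generate-002",
        "number_of_images": number_of_images,
        "aspect_ratio": aspect_ratio,
    }


def video_request(prompt: str, aspect_ratio: str = "16:9",
                  duration_seconds: int = 5) -> tuple[str, dict]:
    """Path and body for Veo 2, mirroring the curl example."""
    return "/generate-video", {
        "prompt": prompt,
        "model": "veo-2.0-generate-001",
        "aspect_ratio": aspect_ratio,
        "duration_seconds": duration_seconds,
    }


def nano_request(prompt: str) -> tuple[str, dict]:
    """Path and body for Nano Banana, mirroring the curl example."""
    return "/nano/generate", {
        "model": "gemini-2.5-flash-image",
        "contents": [{"parts": [{"text": prompt}]}],
    }


def post_json(path: str, body: dict, timeout: float = 600.0) -> dict:
    """POST a JSON body to the proxy and return the parsed response (assumed JSON)."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `post_json(*video_request("A drone flying over a forest"))` sends the same request as the Veo 2 curl example.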
Detailed Documentation: See Media Generation Guide
AIStudio2API/
├── src/ # Source code directory
│ ├── app_launcher.py # GUI launcher
│ ├── launch_camoufox.py # Command-line launcher
│ ├── server.py # Main server
│ ├── manager/ # WebUI manager package
│ ├── api/ # API processing modules
│ ├── browser/ # Browser automation modules
│ ├── config/ # Configuration management
│ ├── models/ # Data models
│ ├── tts/ # TTS Speech Generation modules
│ ├── media/ # Media Generation modules (Imagen/Veo/Nano)
│ ├── proxy/ # Streaming proxy
│ ├── worker/ # Multi-Worker management module
│ ├── gateway.py # Multi-Worker load balancing gateway
│ └── static/ # Static resources
├── data/ # Runtime data directory
│ ├── auth_profiles/ # Authentication files
│ ├── certs/ # Certificate files
│ └── key.txt # API keys
├── camoufox/ # Camoufox scripts
├── docker/ # Docker configuration
├── docs/ # Detailed documentation
├── logs/ # Log files
├── start_webui.bat # WebUI startup script
├── start_cmd.bat # Command-line startup script
├── setup.bat # Windows installation script
└── setup.sh # Linux/macOS installation script
Copy and edit the environment configuration file:
cp .env.example .env
# Edit the .env file for custom configuration

- FastAPI Service: Default port 2048
- Camoufox Debug: Default port 40222
- Streaming Proxy: Default port 3120
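If the service fails to start, one quick check is whether the three default ports listed above are already taken. The sketch below tries to bind each port locally. `port_is_free` and `busy_ports` are hypothetical helpers, and the port numbers are the defaults, so adjust them if your .env overrides them.

```python
import socket

# Default ports from this README; custom .env values are not read here.
DEFAULT_PORTS = {
    "FastAPI Service": 2048,
    "Camoufox Debug": 40222,
    "Streaming Proxy": 3120,
}


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Bind fails when the port is in use or otherwise unavailable.
            return False
        return True


def busy_ports(ports: dict[str, int] = DEFAULT_PORTS) -> dict[str, int]:
    """Return the subset of named ports that cannot be bound right now."""
    return {name: port for name, port in ports.items() if not port_is_free(port)}
```

Run it before starting the service: an empty result from `busy_ports()` means all three defaults could be bound at that moment.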
AI Studio can be accessed through a proxy:
- Enable "Browser Proxy" in the GUI
- Enter the proxy address (e.g., http://127.0.0.1:7890)
- Click the "Test" button to verify the proxy connection
- Authentication files are stored in the data/auth_profiles/ directory
- Multiple authentication files can be saved and switched between
- Manage them through the "Manage Auth Files" feature in the GUI
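Outside the GUI, the saved authentication files can be inspected with a few lines of Python. This sketch lists every file under data/auth_profiles/, newest first. `list_auth_files` is a hypothetical helper, and because the layout and file extension inside that directory are not described here, it walks the directory recursively and makes no assumption about file names.

```python
from pathlib import Path


def list_auth_files(root: str = "data/auth_profiles") -> list[Path]:
    """Return all files under root, most recently modified first.

    Returns an empty list if the directory does not exist yet,
    for example before the first headed-mode login.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    files = [path for path in base.rglob("*") if path.is_file()]
    return sorted(files, key=lambda path: path.stat().st_mtime, reverse=True)
```

Sorting by modification time puts the most recently saved profile first, which is usually the one least likely to have expired.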
- Installation Guide
- Environment Configuration
- Authentication Setup
- API Usage Guide
- Multi-Worker Concurrency Mode
- Troubleshooting
This project uses the Camoufox browser to avoid being detected as an automation script. Camoufox is based on Firefox and disguises device fingerprints by modifying the browser's underlying implementation.
- Client-Managed History: The proxy does not support in-UI editing; clients must maintain and send the full chat history
- Parameter Support: Supports the temperature, max_output_tokens, top_p, and stop parameters
- Authentication Expiry: Authentication files may expire, after which re-authentication is required
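Because history is client-managed, each request has to carry every earlier turn. The sketch below shows one way a client might keep that history and attach the supported parameters. `Conversation` is a hypothetical class, and placing the parameters as top-level fields of the request body is an assumption based on the OpenAI request format, not something confirmed here.

```python
from dataclasses import dataclass, field

# The four parameters listed as supported in this README.
SUPPORTED_PARAMS = {"temperature", "max_output_tokens", "top_p", "stop"}


@dataclass
class Conversation:
    """Client-side chat history that is resent in full on every request."""
    model: str = "gemini-2.5-pro"
    messages: list[dict] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        """Append one turn (user or assistant) to the history."""
        self.messages.append({"role": role, "content": content})

    def payload(self, **params) -> dict:
        """Build a request body with the whole history plus supported parameters."""
        unknown = set(params) - SUPPORTED_PARAMS
        if unknown:
            raise ValueError(f"unsupported parameters: {sorted(unknown)}")
        # Assumption: parameters sit at the top level of the body.
        return {"model": self.model, "messages": list(self.messages), **params}
```

After each reply, the client calls `add("assistant", reply_text)` so the next `payload(...)` includes it.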
If you see Port 30XX (host 0.0.0.0) is currently in use on startup but can't find the occupying process in Task Manager, this is usually caused by Windows Hyper-V/WSL2/Docker NAT service randomly reserving port ranges.
⚠️ All commands below must be run in Administrator PowerShell or CMD
netsh interface ipv4 show excludedportrange protocol=tcp

If your Worker ports (e.g., 3001-3008) fall within one of the Start Port to End Port ranges shown, this is the issue.

net stop winnat
net start winnat

After the restart, run the excludedportrange check above again. The port ranges usually change and release the ports you need.
While the ports are free, permanently mark commonly used development ports as administrator-reserved to prevent Windows from occupying them again:

netsh int ipv4 add excludedportrange protocol=tcp startport=3000 numberofports=20 store=persistent

On success, entries marked with * appear in the list, indicating permanent protection.
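Comparing Worker ports against the excluded ranges by eye is error-prone, so the check can be scripted. The sketch below parses the text printed by the excludedportrange query and reports which ports fall inside a reserved range. The helper names are hypothetical, and `SAMPLE` is an illustrative excerpt in the usual layout of that command, not output captured from a real machine.

```python
import re
import subprocess


def parse_excluded_ranges(netsh_output: str) -> list[tuple[int, int]]:
    """Extract (start, end) pairs from lines that begin with two integers."""
    ranges = []
    for line in netsh_output.splitlines():
        match = re.match(r"^\s*(\d+)\s+(\d+)\b", line)
        if match:
            ranges.append((int(match.group(1)), int(match.group(2))))
    return ranges


def blocked_ports(ports, ranges: list[tuple[int, int]]) -> list[int]:
    """Return the ports that lie inside any excluded range (inclusive)."""
    return [p for p in ports if any(start <= p <= end for start, end in ranges)]


def read_excluded_ranges() -> list[tuple[int, int]]:
    """Run the netsh query (Windows only) and parse its output."""
    result = subprocess.run(
        ["netsh", "interface", "ipv4", "show", "excludedportrange", "protocol=tcp"],
        capture_output=True, text=True, check=True,
    )
    return parse_excluded_ranges(result.stdout)


# Illustrative sample only; real ranges differ per machine.
SAMPLE = """
Protocol tcp Port Exclusion Ranges

Start Port    End Port
----------    --------
      2969        3068
     50000       50059     *

* - Administered port exclusions.
"""
```

On Windows, `blocked_ports(range(3001, 3009), read_excluded_ranges())` lists the Worker ports that are currently reserved; an empty list means none of them are.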
For more troubleshooting solutions, see Troubleshooting Guide.
Issues and Pull Requests are welcome!
- ✅ TTS Support: Adapted the gemini-2.5-flash/pro-preview-tts speech generation models
- ✅ Media Generation: Supports Imagen 3, Veo 2, Nano Banana image/video generation
- Unified Click Logic: Extract the _safe_click method to a global operations.py and unify click operations across all controllers
- Documentation: Update and optimize documentation in the docs/ directory
- One-Click Deployment: Provide fully automated install and launch scripts for Windows/Linux/macOS
- Docker Support: Provide standard Dockerfile and Docker Compose orchestration files
- Go Refactoring: Migrate core proxy service to Go for improved concurrency and reduced resource usage
- CI/CD Pipeline: Establish GitHub Actions automated testing and build release process
- Unit Testing: Increase test coverage for core modules (especially browser automation)
- ✅ Multi-Worker Load Balancing: Support multi-Google account rotation pool for higher concurrency limits