Sokuji supports multiple AI providers for real-time speech translation. Each provider offers different capabilities, models, and pricing structures to suit various use cases.
Setup Instructions
To use a provider, obtain an API key from that provider's website and enter it in Sokuji's settings panel. Authentication methods and rate limits vary by provider.
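As a rough illustration of the setup step, the sketch below models a per-provider API key entry and checks it for obvious mistakes before use. The names `ProviderId`, `ProviderSettings`, and `validateSettings` are hypothetical and are not taken from Sokuji's code; the real settings panel may store and validate keys differently, and some providers may need more than a single key.

```typescript
// Hypothetical settings shape for illustration only;
// Sokuji's actual settings schema may differ.
type ProviderId = "openai" | "gemini" | "palabraai" | "cometapi";

interface ProviderSettings {
  provider: ProviderId;
  apiKey: string;
}

// Returns a list of problems; an empty list means the entry looks usable.
// This only catches local mistakes. It does not contact the provider,
// so a key that passes here can still be rejected or rate limited.
function validateSettings(settings: ProviderSettings): string[] {
  const problems: string[] = [];
  const trimmed = settings.apiKey.trim();

  if (trimmed.length === 0) {
    problems.push(`Missing API key for ${settings.provider}`);
  } else if (trimmed !== settings.apiKey) {
    // A stray space from copy and paste is a common cause of auth failures.
    problems.push("API key has leading or trailing whitespace");
  }

  return problems;
}
```

A caller could run `validateSettings({ provider: "openai", apiKey: pastedKey })` when the user saves the settings and show any returned messages next to the input field.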
OpenAI
Real-time Audio API
GPT-4o Realtime Preview models
8 premium voice options (Alloy, Echo, Shimmer, etc.)
Advanced turn detection modes
Built-in noise reduction
60+ languages supported
Template mode for custom prompts
OpenAI Platform
Setup Tutorial
Google Gemini
Gemini Live API
Gemini 2.0 Flash Live models
30 unique voice personalities
Automatic turn detection
35+ languages with regional variants
Built-in transcription
High token limits (8192)
Google AI Studio
Setup Tutorial
PalabraAI
WebRTC Translation Service
Real-time WebRTC translation
60+ source languages
40+ target languages
Low latency streaming
Automatic audio processing
Specialized for live translation
Palabra.ai Website
Setup Tutorial
CometAPI
OpenAI-Compatible API
OpenAI-compatible provider with the same features
OpenAI Realtime API compatibility
Same voice and model options as OpenAI
Alternative pricing structure
Full feature parity
Drop-in replacement for OpenAI
CometAPI Website
Setup Tutorial
Choosing a Provider
OpenAI: Best for high-quality voice synthesis and advanced features
Gemini: Great for multilingual support and automatic processing
PalabraAI: Optimized for real-time translation with minimal latency
CometAPI: Cost-effective alternative to OpenAI with identical functionality
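The guidance above can be written down as a simple lookup from a user's main priority to a suggested provider. This is only a sketch of the recommendations on this page: the `Priority` labels and the `recommendProvider` helper are made up for the example and are not part of Sokuji.

```typescript
// Priorities paraphrased from the recommendations above (hypothetical labels).
type Priority = "voice-quality" | "multilingual" | "low-latency" | "cost";
type ProviderName = "OpenAI" | "Gemini" | "PalabraAI" | "CometAPI";

// One suggested provider per priority, mirroring the list above.
const recommendation: Record<Priority, ProviderName> = {
  "voice-quality": "OpenAI",
  "multilingual": "Gemini",
  "low-latency": "PalabraAI",
  "cost": "CometAPI",
};

// Returns the provider this page suggests for the given priority.
function recommendProvider(priority: Priority): ProviderName {
  return recommendation[priority];
}
```

In practice more than one priority usually matters, so a lookup like this is a starting point for comparison, not a final choice.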
Need Help?
For setup guides, troubleshooting, and provider comparisons, visit our GitHub repository or check the provider-specific documentation links above.
© 2025 Kizuna AI Lab. All rights reserved.