Supported AI Providers

Sokuji supports multiple AI providers for real-time speech translation. Each provider offers different capabilities, models, and pricing structures to suit various use cases.

Setup Instructions

To use a provider, obtain an API key from its website and enter it in Sokuji's settings panel. Authentication methods and rate limits vary by provider.
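As a rough illustration of the setup step, the sketch below models one settings entry per provider and checks that every credential field has been filled in before a session starts. The names `ProviderId`, `ProviderSettings`, and `hasUsableCredentials` are hypothetical and are not taken from Sokuji's source; the real settings panel may store credentials differently.

```typescript
// Hypothetical identifiers for the four providers described in this document.
type ProviderId = "openai" | "gemini" | "palabraai" | "cometapi";

// Hypothetical shape of one settings entry; Sokuji's real schema may differ.
interface ProviderSettings {
  provider: ProviderId;
  // Credential fields keyed by name, since authentication differs per provider.
  credentials: Record<string, string>;
}

// Returns true when at least one credential is present and none is blank.
// This only guards against empty input; it does not verify anything with the provider.
function hasUsableCredentials(settings: ProviderSettings): boolean {
  const values = Object.values(settings.credentials);
  return values.length > 0 && values.every((v) => v.trim().length > 0);
}

const example: ProviderSettings = {
  provider: "openai",
  credentials: { apiKey: "  " },
};
console.log(hasUsableCredentials(example)); // false: a whitespace-only key is rejected
```

A check like this can only catch missing input. Whether a key is actually valid, and which rate limits apply, is decided by the provider when the first request is made.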

OpenAI

Real-time Audio API

GPT-4o Realtime Preview models
8 premium voice options (Alloy, Echo, Shimmer, etc.)
Advanced turn detection modes
Built-in noise reduction
60+ languages supported
Template mode for custom prompts

OpenAI Platform | Setup Tutorial

Google Gemini

Gemini Live API

Gemini 2.0 Flash Live models
30 unique voice personalities
Automatic turn detection
35+ languages with regional variants
Built-in transcription
High token limit (8,192 tokens)

Google AI Studio | Setup Tutorial

PalabraAI

WebRTC Translation Service

Real-time WebRTC translation
60+ source languages
40+ target languages
Low latency streaming
Automatic audio processing
Specialized for live translation

Palabra.ai Website | Setup Tutorial

CometAPI

OpenAI-Compatible API

An alternative provider that exposes the OpenAI Realtime API with the same features.

OpenAI Realtime API compatibility
Same voice and model options as OpenAI
Alternative pricing structure
Full feature parity
Drop-in replacement for OpenAI

CometAPI Website | Setup Tutorial
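To make "drop-in replacement" concrete, the sketch below shows the usual pattern for OpenAI-compatible services: the model stays the same and only the base URL and API key change. The `RealtimeEndpoint` type and `realtimeUrl` helper are hypothetical, the host `wss://compatible.example` is a placeholder rather than CometAPI's real address, and the sketch assumes the compatible provider mirrors OpenAI's `/v1/realtime` path layout.

```typescript
// Hypothetical connection details for an OpenAI-style realtime endpoint.
interface RealtimeEndpoint {
  baseUrl: string;
  apiKey: string;
  model: string;
}

// Builds the WebSocket URL using the path and query shape of OpenAI's Realtime API.
function realtimeUrl(endpoint: RealtimeEndpoint): string {
  return `${endpoint.baseUrl}/v1/realtime?model=${encodeURIComponent(endpoint.model)}`;
}

const openai: RealtimeEndpoint = {
  baseUrl: "wss://api.openai.com",
  apiKey: "OPENAI_KEY_PLACEHOLDER",
  model: "gpt-4o-realtime-preview",
};

// Assumption: only the host and key differ. Replace the placeholder host
// with the base URL given in the compatible provider's own documentation.
const compatible: RealtimeEndpoint = {
  ...openai,
  baseUrl: "wss://compatible.example",
  apiKey: "COMPATIBLE_KEY_PLACEHOLDER",
};

console.log(realtimeUrl(openai));
console.log(realtimeUrl(compatible));
```

Keeping the endpoint details in a single object means the rest of the client code does not need to know which of the two providers is in use.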

Choosing a Provider

OpenAI: Best for high-quality voice synthesis and advanced features

Gemini: Great for multilingual support and automatic processing

PalabraAI: Optimized for real-time translation with minimal latency

CometAPI: Cost-effective alternative to OpenAI with identical functionality
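The guidance above can be condensed into a small lookup, sketched here for illustration. The `Priority` labels and the `recommendProvider` function are hypothetical names that simply restate the four recommendations; they are not part of Sokuji.

```typescript
// Hypothetical labels for what a user cares about most.
type Priority = "voice-quality" | "multilingual" | "low-latency" | "cost";

// One entry per recommendation in the list above.
const recommendation: Record<Priority, string> = {
  "voice-quality": "OpenAI",
  "multilingual": "Gemini",
  "low-latency": "PalabraAI",
  "cost": "CometAPI",
};

// Returns the provider this document suggests for the given priority.
function recommendProvider(priority: Priority): string {
  return recommendation[priority];
}

console.log(recommendProvider("low-latency")); // PalabraAI
```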

Need Help?

For setup guides, troubleshooting, and provider comparisons, visit our GitHub repository or check the provider-specific documentation links above.