Voice capabilities
| Feature | Technology | Status |
|---|---|---|
| Speech-to-text | Azure Speech, Google Cloud STT, Deepgram, OpenAI Whisper, Groq Whisper, Faster Whisper (local) | Available |
| Text-to-speech | Qwen3-TTS via Kokoro pipeline (local) | Available |
| Voice Agent Mode | Gemini Live API (real-time bidirectional) | Available |
| Speaker verification | Voice biometrics | In development |
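For self-hosted deployments, the STT/TTS providers from the table above are typically chosen in configuration. The snippet below is an illustrative sketch only; the key names and file layout are assumptions, not kombify's actual schema:

```yaml
# Hypothetical voice configuration (key names are illustrative)
voice:
  stt:
    provider: faster-whisper   # local; alternatives: azure, google, deepgram, openai, groq
    model: small               # larger models trade speed for accuracy
  tts:
    provider: qwen3-tts        # local, served via the Kokoro pipeline
```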
Enabling voice
Voice interaction is available in the mobile app and in web chat. Enable it under AI Settings > Voice.
Mobile app
The mobile app is designed voice-first. Tap the microphone button to start speaking, or enable always-listening mode for hands-free operation.
Web chat

In the web chat at chat.kombify.io, click the microphone icon next to the text input field.
Use cases
- Hands-free monitoring — “What is the status of my servers?”
- Quick commands — “Restart the Traefik container on server-1”
- Troubleshooting — “Why is my NAS running slow?”
- Smart home control — “Turn off the office lights” (via Smart Home Companion)
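Commands like the ones above are ultimately matched against handlers after transcription. A minimal sketch of that idea, using only the standard library; the phrases, patterns, and handler names are hypothetical and not kombify's actual routing API:

```python
# Sketch: route transcribed voice commands to handlers via regex matching.
# All command phrases and handler names here are illustrative.
import re
from typing import Callable, Optional

# Registry of (compiled pattern, handler) pairs, checked in order.
ROUTES: list[tuple[re.Pattern, Callable[[re.Match], str]]] = []

def route(pattern: str):
    """Register a handler for transcripts matching the given regex."""
    def register(fn):
        ROUTES.append((re.compile(pattern, re.IGNORECASE), fn))
        return fn
    return register

@route(r"status of my servers")
def server_status(match: re.Match) -> str:
    return "checking server status"

@route(r"restart the (?P<name>[\w-]+) container on (?P<host>[\w-]+)")
def restart_container(match: re.Match) -> str:
    return f"restarting {match['name']} on {match['host']}"

def dispatch(transcript: str) -> Optional[str]:
    """Run the first handler whose pattern matches the transcript."""
    for pattern, handler in ROUTES:
        match = pattern.search(transcript)
        if match:
            return handler(match)
    return None  # no route matched; fall back to the conversational model
```

A real deployment would hand unmatched transcripts to the Companion's language model rather than returning `None`, but the match-then-dispatch shape is the same.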
Voice processing happens locally in self-hosted mode. In SaaS mode, audio is processed securely and not stored after transcription. With Faster Whisper (STT) and Qwen3-TTS, you can run a fully local voice pipeline with no cloud dependencies.
Further reading
- Companions: Configure which Companion responds to voice commands
- Mobile app: Set up the mobile app for voice-first interaction
- SpeechKit: Technical details on the STT/TTS framework powering voice
