How to use Whisper as a Speech Detector in Ozeki Voice Keyboard

This page links to step-by-step guides for setting up the Whisper Speech Detector with Ozeki Voice Keyboard on three platforms: Windows, Windows with WSL, and Ubuntu Linux.

How to set up Whisper Speech Detector on Windows

This tutorial covers installing FFmpeg via winget, creating a Python 3.11 Conda environment, and running the Whisper server with agent-cli. You will configure Ozeki Voice Keyboard to use the local server for speech-to-text transcription by entering the API URL and model name in the Voice settings.

How to set up Whisper Speech Detector on Windows using WSL

This guide walks through installing WSL with Ubuntu, updating packages, and installing the required dependencies, including Python, FFmpeg, and the NVIDIA CUDA toolkit. You will create a Python virtual environment, install agent-cli with the faster-whisper backend, and start the Whisper server. Finally, you will configure Ozeki Voice Keyboard to connect to the WSL-hosted Whisper server for offline voice transcription.
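
Before entering the API URL in Ozeki Voice Keyboard, it can help to confirm that the Whisper server running inside WSL accepts connections from the Windows side. The following is a minimal sketch using only the Python standard library. The function name `server_reachable` and the URL `http://localhost:8000` are illustrative assumptions, not values taken from the guide; substitute the address your own server reports on startup.

```python
import socket
from urllib.parse import urlsplit


def server_reachable(api_url, timeout=2.0):
    """Return True if a TCP connection to the host and port in api_url succeeds."""
    parts = urlsplit(api_url)
    if parts.hostname is None:
        raise ValueError(f"not a valid http(s) URL: {api_url!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unresolvable host: treat all as "not reachable".
        return False


if __name__ == "__main__":
    # Hypothetical address; replace with the URL of your own Whisper server.
    url = "http://localhost:8000"
    print(url, "reachable" if server_reachable(url) else "not reachable")
```

This only checks that something is listening on the port. It does not verify that the listener is a Whisper server or that the model has finished loading.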

How to set up Whisper Speech Detector on Ubuntu Linux

This guide shows how to create a Python 3.12 Conda environment on Ubuntu, install vLLM with audio support, and start the Whisper server. The server exposes an OpenAI-compatible endpoint that is accessible over the network. You will configure Ozeki Voice Keyboard to send audio to the Ubuntu machine for transcription, which offloads speech processing to a dedicated Linux machine on your network.
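
To illustrate what an OpenAI-compatible transcription call looks like, here is a hedged sketch that builds a multipart `POST` to the `/v1/audio/transcriptions` path used by the OpenAI audio API. The helper names, the address `http://192.168.0.10:8000`, the model name `openai/whisper-large-v3`, and the file `sample.wav` are assumptions chosen for illustration. This is not necessarily how Ozeki Voice Keyboard itself sends audio.

```python
import json
import os
import urllib.request
import uuid


def build_transcription_request(base_url, model, audio_bytes, filename="sample.wav"):
    """Build (but do not send) a multipart POST for an OpenAI-style transcription endpoint."""
    boundary = uuid.uuid4().hex
    # Form field carrying the model name.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode("utf-8")
    # Form field carrying the raw audio file.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = model_part + file_header + audio_bytes + closing
    url = base_url.rstrip("/") + "/v1/audio/transcriptions"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def transcribe(base_url, model, wav_path):
    """Send an audio file to the server and return the transcribed text."""
    with open(wav_path, "rb") as f:
        request = build_transcription_request(
            base_url, model, f.read(), filename=os.path.basename(wav_path)
        )
    with urllib.request.urlopen(request, timeout=60) as response:
        # The default JSON response format carries the transcript under "text".
        return json.loads(response.read())["text"]


if __name__ == "__main__":
    # Hypothetical server address, model name, and audio file.
    print(transcribe("http://192.168.0.10:8000", "openai/whisper-large-v3", "sample.wav"))
```

Running a request like this from another machine on the network is a quick way to confirm the server is reachable and the model name is accepted before changing the Voice settings.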
