Local Whisper AI Integration: Example Setup
This guide provides a practical demonstration of setting up an on-premises Whisper AI instance on the same server as your PBX. The example uses Ubuntu 24.10, with the service endpoint located at http://127.0.0.1:8000/transcribe/.
Key Notes
- The `whisper_api.py` script includes embedded username and password settings for HTTP Basic authentication; change these before deploying.
- The example uses the `tiny` model for transcription. You can switch to the `large` model, but it requires more VRAM. Refer to the OpenAI Whisper GitHub repository for details.
Step 1: Update System and Install Dependencies
Run the following commands to update your system and install the required dependencies:
```shell
sudo apt update && sudo apt upgrade -y
sudo apt install ffmpeg git python3-pip python3.12-venv -y
```
Step 2: Set Up a Python Virtual Environment
Create and activate a Python virtual environment:
```shell
python3 -m venv whisper-env
source whisper-env/bin/activate
```
Step 3: Install Whisper and Required Libraries
Install Whisper and the necessary Python libraries:
```shell
pip install openai-whisper
pip install fastapi uvicorn
pip install python-multipart
```
Step 4: Create the whisper_api.py Script
Create a file named `whisper_api.py` and add the following code. Update the `USERNAME` and `PASSWORD` variables as needed:
```python
from fastapi import FastAPI, UploadFile, File, Depends, HTTPException, status
from fastapi.security import HTTPBasic, HTTPBasicCredentials
import whisper
import shutil
import os

app = FastAPI()
security = HTTPBasic()
model = whisper.load_model("tiny")  # Change to "small", "medium", or "large" if needed

# Define username and password
USERNAME = "user"
PASSWORD = "password"

# Authentication function
def authenticate(credentials: HTTPBasicCredentials = Depends(security)):
    if credentials.username != USERNAME or credentials.password != PASSWORD:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid credentials",
            headers={"WWW-Authenticate": "Basic"},
        )
    return credentials.username

@app.post("/transcribe/")
async def transcribe_audio(file: UploadFile = File(...), user: str = Depends(authenticate)):
    # Save the upload to disk, transcribe it, then remove the temporary copy
    with open(file.filename, "wb") as buffer:
        shutil.copyfileobj(file.file, buffer)
    try:
        result = model.transcribe(file.filename)
    finally:
        os.remove(file.filename)
    return {"user": user, "transcription": result["text"]}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8000)
```
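The `authenticate` function above relies on HTTP Basic authentication, where the client sends the username and password base64-encoded in an `Authorization` header. A quick standard-library sketch of the header a client would send, using the example credentials:

```python
import base64

# Credentials matching the defaults in whisper_api.py
USERNAME = "user"
PASSWORD = "password"

# HTTP Basic auth: base64-encode "username:password"
token = base64.b64encode(f"{USERNAME}:{PASSWORD}".encode()).decode()
header = f"Authorization: Basic {token}"
print(header)  # Authorization: Basic dXNlcjpwYXNzd29yZA==
```

Tools like `curl -u user:password` build this header for you automatically.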
Step 5: Run the API Service in the Background
Start the API service in the background using nohup:
```shell
nohup python whisper_api.py > whisper.log 2>&1 &
```
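To check whether the background service is still running (and to find its PID so you can stop it later), a `pgrep` lookup is enough; the `echo` fallback just makes the command succeed either way:

```shell
# List the PIDs of any running whisper_api.py processes
pgrep -f whisper_api.py || echo "whisper_api.py is not running"
# To stop the service later, pass the PID to kill: kill <PID>
```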
Step 6: Monitor Logs
To monitor the logs, use the following command:
```shell
cat whisper.log
```

Expected output looks similar to this (the FP16 warning is normal when running on CPU):

```
nohup: ignoring input
INFO:     Started server process [1442923]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
/root/whisper-env/lib/python3.12/site-packages/whisper/transcribe.py:126: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
INFO:     127.0.0.1:49828 - "POST /transcribe/ HTTP/1.1" 200 OK
```
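The `POST /transcribe/` request in the log above can be reproduced with any HTTP client (e.g. `curl -u user:password -F "file=@sample.wav" http://127.0.0.1:8000/transcribe/`). As a standard-library sketch, the snippet below builds the multipart body by hand and sends it with `urllib`; the URL, credentials, and `sample.wav` filename are placeholders, and the request only succeeds while the service from Step 5 is running:

```python
import base64
import io
import urllib.error
import urllib.request
import uuid

def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body and its Content-Type header value."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        (
            f'Content-Disposition: form-data; name="{field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
    )
    body.write(data)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"

def transcribe(url: str, user: str, password: str, filename: str, data: bytes) -> str:
    """POST an audio file to the /transcribe/ endpoint with Basic auth."""
    body, content_type = build_multipart("file", filename, data)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": content_type,
            "Authorization": f"Basic {token}",
        },
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode()
    except urllib.error.URLError as exc:
        # Service not running, wrong credentials, etc.
        return f"request failed: {exc}"
```

In practice a library such as `requests` makes this shorter, but the stdlib version keeps the example dependency-free.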