Text to speech

Generate natural speech from text with any TTS model on the network. The endpoint mirrors OpenAI's speech API, so the official OpenAI SDKs and curl work once they point at https://api.inference.club/v1.

POST /v1/audio/speech

Requests route only to services a provider declared as type: tts. The response is the raw audio (just like OpenAI), and a copy is stored on inference.club so it shows up in your history.

Request

application/json:

Field            Required  Description
model            yes       A TTS model id from GET /v1/models.
input            yes       The text to synthesize.
voice            no        A voice name (see Voices). Defaults to the provider's default.
response_format  no        wav (default) or opus.
language         no        Language hint, e.g. en-US (the model is multilingual).
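
As a rough illustration of the request fields above, the sketch below builds the JSON body and leaves out any optional field that was not set, so the provider's defaults apply. The helper name build_speech_payload is hypothetical, and the client-side checks for the two required fields are an assumption added for convenience, not something the API requires.

```python
import json


def build_speech_payload(model, text, voice=None, response_format=None, language=None):
    """Return the JSON body for POST /v1/audio/speech as a dict.

    Hypothetical helper: optional fields that are None are omitted so the
    server-side defaults (provider voice, wav output) stay in effect.
    """
    if not model:
        raise ValueError("model is required")
    if not text:
        raise ValueError("input is required")

    payload = {"model": model, "input": text}
    optional = {
        "voice": voice,
        "response_format": response_format,
        "language": language,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


body = build_speech_payload(
    "magpie-tts-multilingual",
    "Hello from inference club",
    voice="Magpie-Multilingual.EN-US.Mia",
)
print(json.dumps(body))
```

The resulting dict can be serialized with json.dumps and sent as the request body with Content-Type: application/json.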

curl

curl https://api.inference.club/v1/audio/speech \
  -H "Authorization: Bearer $INFERENCE_CLUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "magpie-tts-multilingual", "input": "Hello from inference club", "voice": "Magpie-Multilingual.EN-US.Mia" }' \
  --output speech.wav

Python (openai SDK)

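import os
from openai import OpenAI

# Point the official SDK at inference.club; the key is read from the same
# environment variable the curl example uses.
client = OpenAI(base_url="https://api.inference.club/v1", api_key=os.environ["INFERENCE_CLUB_API_KEY"])

# Stream the audio to disk rather than buffering it all in memory.
with client.audio.speech.with_streaming_response.create(
    model="magpie-tts-multilingual",
    voice="Magpie-Multilingual.EN-US.Mia",
    input="Hello from inference club",
) as response:
    response.stream_to_file("speech.wav")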

Response

The response body is the raw audio bytes, with Content-Type: audio/wav (or audio/ogg for Opus). Usage is metered by the duration of the generated audio.
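
Because the body is raw audio, a client that saves it usually wants a file extension matching the Content-Type header. The sketch below maps the two content types named above to extensions. The function name extension_for, the choice of .ogg for Opus output, and the .bin fallback for anything else are assumptions made for illustration.

```python
# Content types named in this section, mapped to assumed file extensions.
EXTENSIONS = {
    "audio/wav": ".wav",
    "audio/ogg": ".ogg",  # Opus audio; the .ogg extension is an assumption
}


def extension_for(content_type):
    """Pick a file extension from a Content-Type header value.

    Parameters such as "; codecs=opus" are ignored. Unknown types fall back
    to ".bin" (an arbitrary choice for this sketch).
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    return EXTENSIONS.get(media_type, ".bin")


print("speech" + extension_for("audio/wav"))
```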

Voices

Voices are model-specific. List what a model offers:

GET /v1/audio/voices?model=<model-id>
{ "voices": ["Magpie-Multilingual.EN-US.Mia", "Magpie-Multilingual.EN-US.Jason", "…"] }

(/v1/audio/voices is an inference.club extension, not part of OpenAI's API.) The in-dashboard Speech playground populates its voice dropdown from this endpoint.
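
Since this endpoint is not part of OpenAI's API, the OpenAI SDK has no method for it. One option is a plain HTTP call, sketched below with the standard library only. The helper names are hypothetical, and the sketch assumes the endpoint accepts the same Bearer token as /v1/audio/speech and returns JSON shaped like the sample above.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.inference.club/v1"  # base URL used in the examples above


def voices_url(model_id):
    """Build the GET /v1/audio/voices URL for one model."""
    return f"{BASE_URL}/audio/voices?{urlencode({'model': model_id})}"


def parse_voices(body):
    """Extract the voice names from a response shaped like the sample above."""
    return list(json.loads(body).get("voices", []))


def fetch_voices(model_id, api_key):
    """Call the endpoint. Assumes the same Bearer auth as /v1/audio/speech."""
    request = urllib.request.Request(
        voices_url(model_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_voices(response.read())
```

Calling fetch_voices("magpie-tts-multilingual", api_key) would then return the list of voice names to pass as the voice field.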

Notes

  • Formats: the reference provider (NVIDIA Riva / Magpie) returns WAV natively, and Opus is also offered. mp3, aac, and flac are not transcoded; a request for any of them returns WAV.
  • The speed parameter and any other OpenAI parameters the provider does not support are ignored.
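
The format fallback described in these notes can be expressed as a small lookup, which helps a client name its output file correctly before the response arrives. This is a sketch of the documented behavior only. The function name effective_format is hypothetical, and raising an error for formats the notes do not mention is a choice made here, since the server's behavior for those is not documented.

```python
NATIVE_FORMATS = {"wav", "opus"}          # offered by the endpoint
FALLBACK_TO_WAV = {"mp3", "aac", "flac"}  # not transcoded; WAV is returned


def effective_format(requested=None):
    """Return the format the response is expected to use.

    None means response_format was omitted, so the default (wav) applies.
    """
    if requested is None:
        return "wav"
    name = requested.lower()
    if name in NATIVE_FORMATS:
        return name
    if name in FALLBACK_TO_WAV:
        return "wav"
    # Behavior for other values is not documented, so this sketch refuses to guess.
    raise ValueError(f"behavior for response_format={requested!r} is not documented")


print(effective_format("mp3"))
```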

Errors

type               When                                             HTTP
missing_input      No input text                                    400
request_too_large  Input text over the limit                        413
no_provider        No online TTS provider serves the model for you  404
upstream_error     The provider's speech server failed              502
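
A client can use the error type to decide whether a retry makes sense. The sketch below is illustrative only: it assumes the error body follows OpenAI's shape, {"error": {"type": "..."}}, which this page does not confirm, and it treats only upstream_error as retryable, which is a judgment call rather than documented guidance. The names error_type and should_retry are hypothetical.

```python
import json

# Error types and HTTP statuses from the table above.
ERROR_STATUS = {
    "missing_input": 400,
    "request_too_large": 413,
    "no_provider": 404,
    "upstream_error": 502,
}

# Assumption: only a provider-side failure is worth retrying as-is.
RETRYABLE = {"upstream_error"}


def error_type(body):
    """Extract the error type, assuming an OpenAI-style {"error": {"type": ...}} body."""
    try:
        data = json.loads(body)
    except (ValueError, TypeError):
        return None
    error = data.get("error") if isinstance(data, dict) else None
    return error.get("type") if isinstance(error, dict) else None


def should_retry(body):
    """Return True when the error type suggests the same request may succeed later."""
    return error_type(body) in RETRYABLE


print(should_retry('{"error": {"type": "upstream_error"}}'))
```

Errors such as missing_input and request_too_large describe the request itself, so resending the same payload would not be expected to help.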