# Speech (Text-to-Speech) API Usage Guide

Generate lifelike speech audio from text using Radient's Speech API. Ideal for assistants, IVR systems, accessibility features, and any application that needs high-quality audio output on demand.

This guide shows how to call the endpoint, control the voice, format, and speed, and save the resulting audio.
## Core Concepts

The Speech API uses the `/v1/speech` endpoint and returns an audio file stream.

Key features:

- Multiple voices and quality levels (models `tts-1`, `tts-1-hd`, `gpt-4o-mini-tts`).
- Choice of audio format (`mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`) and playback speed.
- Provider-aware (currently OpenAI-compatible).
## Text-to-Speech

Convert text to speech and save the resulting audio file locally.

Endpoint: `POST /v1/speech`

- Consumes: `application/json`
- Produces: `audio/mpeg` (when `response_format` is `mp3`; the content type varies with the chosen format)
Example Request (Python using `requests`):
```python
import requests

RADIENT_API_KEY = "YOUR_RADIENT_API_KEY"
RADIENT_BASE_URL = "https://api.radient.com/v1"  # Or your specific Radient API endpoint

headers = {
    "Authorization": f"Bearer {RADIENT_API_KEY}",
    "Content-Type": "application/json",
    # "Accept": "audio/mpeg"  # Optional; server will stream audio back
}

payload = {
    "input": "Hello! This is a test of Radient's text-to-speech service.",
    "model": "tts-1",          # or "tts-1-hd", "gpt-4o-mini-tts"
    "voice": "alloy",          # "alloy", "ash", "ballad", "coral", "echo", "fable", "onyx", "nova", "sage", "shimmer", "verse"
    "response_format": "mp3",  # "mp3", "opus", "aac", "flac", "wav", "pcm"
    "speed": 1.0,              # between 0.25 and 4.0
    "provider": "openai",      # optional; currently "openai"
}

response = requests.post(f"{RADIENT_BASE_URL}/speech", headers=headers, json=payload, stream=True)

if response.status_code == 200:
    # Save the streamed audio to a file
    output_filename = "speech_output.mp3"  # match response_format
    with open(output_filename, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                f.write(chunk)
    print(f"Saved audio to {output_filename}")
else:
    # Non-200 responses return JSON error details
    try:
        print("Error:", response.status_code, response.json())
    except Exception:
        print("Error:", response.status_code, response.text)
```
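The example above hard-codes a `.mp3` filename. If you let callers pick a `response_format`, a small helper can keep the file extension in sync with it. This is a hypothetical convenience sketch, not part of the Radient API; the MIME types listed for formats other than `mp3` are conventional assumptions, not values confirmed by this guide.

```python
# Hypothetical helper (not part of the Radient API): map a response_format
# value to a file extension and the MIME type the stream is commonly served
# with. The MIME values other than audio/mpeg are assumptions based on
# common conventions.
FORMAT_INFO = {
    "mp3":  (".mp3",  "audio/mpeg"),
    "opus": (".opus", "audio/opus"),
    "aac":  (".aac",  "audio/aac"),
    "flac": (".flac", "audio/flac"),
    "wav":  (".wav",  "audio/wav"),
    "pcm":  (".pcm",  "application/octet-stream"),
}

def output_path(stem: str, response_format: str) -> str:
    """Build an output filename whose extension matches response_format."""
    ext, _mime = FORMAT_INFO[response_format]
    return stem + ext
```

With this in place, `output_path("speech_output", payload["response_format"])` replaces the hard-coded filename.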
### Request Body (JSON)

| Field | Type | Description | Required | Allowed / Range |
|---|---|---|---|---|
| `input` | string | The text to synthesize into speech. | Yes | 1–4096 chars |
| `model` | string | TTS model to use. | Yes | `tts-1`, `tts-1-hd`, `gpt-4o-mini-tts` |
| `voice` | string | Voice preset. | Yes | `alloy`, `ash`, `ballad`, `coral`, `echo`, `fable`, `onyx`, `nova`, `sage`, `shimmer`, `verse` |
| `response_format` | string | Output audio format. | No | `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm` |
| `speed` | number | Playback speed multiplier. | No | 0.25–4.0 |
| `provider` | string | Underlying provider. | No | `openai` |
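Catching constraint violations client-side avoids a round trip that would end in a 400. The following is a minimal validation sketch of the constraints in the table above; the server remains authoritative, and the function name is ours, not part of any SDK.

```python
# Hypothetical client-side validator for the /v1/speech request body,
# mirroring the constraints documented in the table above.
ALLOWED_MODELS = {"tts-1", "tts-1-hd", "gpt-4o-mini-tts"}
ALLOWED_VOICES = {"alloy", "ash", "ballad", "coral", "echo", "fable",
                  "onyx", "nova", "sage", "shimmer", "verse"}
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}

def validate_speech_payload(payload: dict) -> list:
    """Return a list of problems found; an empty list means the payload
    passes these client-side checks."""
    problems = []
    text = payload.get("input", "")
    if not (1 <= len(text) <= 4096):
        problems.append("input must be 1-4096 characters")
    if payload.get("model") not in ALLOWED_MODELS:
        problems.append("model must be one of: " + ", ".join(sorted(ALLOWED_MODELS)))
    if payload.get("voice") not in ALLOWED_VOICES:
        problems.append("unknown voice")
    if "response_format" in payload and payload["response_format"] not in ALLOWED_FORMATS:
        problems.append("unsupported response_format")
    speed = payload.get("speed", 1.0)
    if not (0.25 <= speed <= 4.0):
        problems.append("speed must be between 0.25 and 4.0")
    return problems
```

Call it before `requests.post` and surface any problems to the user instead of sending the request.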
### Responses

- 200 OK: Audio stream (file). The content type corresponds to the chosen `response_format` (e.g., `audio/mpeg` for `mp3`).
- 400 Bad Request: Invalid or missing parameters (JSON body with error details).
- 500 Internal Server Error: Unexpected server issue.
## Tips and Best Practices

- Use `tts-1-hd` for higher fidelity when latency is less critical; use `tts-1` for lower-latency needs.
- Select `mp3` for broad compatibility; consider `opus` for very low bitrates at high quality in supported clients.
- Keep `input` under ~4k characters per request; chunk longer text and combine the outputs client-side.
- Cache or reuse generated clips when the same text is requested frequently to reduce latency and cost.
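The chunking tip above can be sketched as follows. This is a minimal splitter of our own devising, not a Radient utility: it breaks at sentence boundaries so each chunk synthesizes naturally, and it assumes no single sentence exceeds the limit (real-world text may need a proper sentence segmenter).

```python
import re

def chunk_text(text: str, limit: int = 4096) -> list:
    """Split text into chunks of at most `limit` characters, breaking at
    sentence boundaries. Assumes no single sentence exceeds `limit`."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Start a new chunk when appending would exceed the limit.
        if current and len(current) + 1 + len(sentence) > limit:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}" if current else sentence
    if current:
        chunks.append(current)
    return chunks
```

Send each chunk as a separate `/v1/speech` request, then join the resulting audio client-side (for example with an audio tool such as ffmpeg).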
For a full list of API endpoints and schemas, see the main API Reference.