Transcription API Usage Guide

Radient's Transcription API converts audio into text. It is suited to meeting notes, call analytics, captioning, and voice-enabled applications.

This guide covers uploading audio, setting optional parameters, and processing the JSON response.

Core Concepts

The Transcription API uses the /v1/transcriptions endpoint and accepts multipart/form-data with the audio file and optional fields. It returns a JSON response with the transcribed text and metadata.

Key features:

Provider-aware: requests are served by an underlying provider (OpenAI in this guide).
Available models: whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe.
Optional control over model, language, temperature, and response format.
Returns structured JSON including text, status, and optional duration.

Transcribe an Audio File

Upload an audio file and receive the transcription as JSON.

Endpoint: POST /v1/transcriptions

Consumes: multipart/form-data
Produces: application/json

Example Request (Python using requests):

import requests
import os

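# Prefer reading the key from the environment over hard-coding it.
# The variable name RADIENT_API_KEY is an assumption; use whatever your deployment defines.
RADIENT_API_KEY = os.environ.get("RADIENT_API_KEY", "YOUR_RADIENT_API_KEY")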
RADIENT_BASE_URL = "https://api.radient.com/v1"  # Or your specific Radient API endpoint

headers = {
    "Authorization": f"Bearer {RADIENT_API_KEY}"
}

# Local path to your audio file (e.g., .mp3, .wav, .m4a)
audio_path = "sample_audio.mp3"
if not os.path.exists(audio_path):
    raise FileNotFoundError(f"Audio file not found: {audio_path}")

# Optional form-data fields (uncomment as needed)
data = {
    # "model": "whisper-1",  # or "gpt-4o-transcribe", "gpt-4o-mini-transcribe"
    # "prompt": "Domain context to guide transcription",
    # "response_format": "json",  # e.g., "text", "json"
    # "temperature": "0.2",       # 0–2 (send as string for form-data)
    # "language": "en",           # ISO-639-1 code
    # "provider": "openai"
}

# Open the file in a context manager so the handle is closed
# even if the request raises (timeout, connection error, etc.).
with open(audio_path, "rb") as audio_file:
    response = requests.post(
        f"{RADIENT_BASE_URL}/transcriptions",
        headers=headers,
        files={"file": audio_file},  # "file" is the only required field
        data=data,
        timeout=120,
    )

if response.status_code == 200:
    result = response.json()
    print("Transcription Status:", result.get("status"))
    print("Provider:", result.get("provider"))
    if result.get("duration") is not None:
        print("Duration (s):", result.get("duration"))
    print("\nTranscribed Text:\n", result.get("text", ""))
else:
    try:
        print("Error:", response.status_code, response.json())
    except ValueError:  # body was not valid JSON
        print("Error:", response.status_code, response.text)

Form Data Parameters

Field	Type	In	Description	Required	Notes/Allowed Values
`file`	file	form-data	Audio file to transcribe.	Yes	Common types: mp3, wav, m4a, etc.
`model`	string	form-data	Model to use.	No	`whisper-1`, `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`
`prompt`	string	form-data	Optional text prompt to guide transcription.	No	Useful for domain vocabulary.
`response_format`	string	form-data	Response format (e.g., `text`, `json`).	No	Default is JSON if unspecified.
`temperature`	number	form-data	Sampling temperature (0–2).	No	Stringify when sending form-data.
`language`	string	form-data	Language of the audio (ISO-639-1).	No	e.g., `en`, `fr`, `es`.
`provider`	string	form-data	Underlying provider (e.g., `openai`).	No	Provider support may vary.
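
Multipart form fields are sent as strings, so optional values such as temperature need to be stringified and unset options left out. The following is a minimal sketch of a helper that does this. The function name build_form_fields is hypothetical (it is not part of any Radient SDK), and the model and temperature checks simply mirror the table above rather than any server-side validation.

```python
from typing import Optional

# Model names copied from the parameter table above.
ALLOWED_MODELS = {"whisper-1", "gpt-4o-transcribe", "gpt-4o-mini-transcribe"}


def build_form_fields(
    model: Optional[str] = None,
    prompt: Optional[str] = None,
    response_format: Optional[str] = None,
    temperature: Optional[float] = None,
    language: Optional[str] = None,
    provider: Optional[str] = None,
) -> dict:
    """Hypothetical helper: build the `data` dict passed to requests.post()."""
    if model is not None and model not in ALLOWED_MODELS:
        raise ValueError(f"Unsupported model: {model}")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    candidates = {
        "model": model,
        "prompt": prompt,
        "response_format": response_format,
        "temperature": temperature,
        "language": language,
        "provider": provider,
    }
    # Drop unset options and stringify the rest for form-data.
    return {key: str(value) for key, value in candidates.items() if value is not None}
```

The returned dict can be passed as the data argument of the request shown earlier, for example build_form_fields(model="whisper-1", language="en").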

Response Body (200 OK - TranscriptionResponse)

Field	Type	Description
`text`	string	The transcribed text.
`status`	string	Transcription status (e.g., `completed`, `failed`).
`provider`	string	Provider used (e.g., `openai`).
`duration`	number	Duration in seconds, if available.
`error`	string	Error message (only present if the transcription failed).

Example Response (JSON):

{
  "status": "completed",
  "provider": "openai",
  "duration": 42.7,
  "text": "Welcome to the Radient documentation. This is a demo recording."
}
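
To work with the response in typed form rather than as a raw dict, the decoded JSON can be mapped onto a small data class. This is a sketch only: TranscriptionResult and parse_transcription are hypothetical names, and the field list is taken from the response table above, with duration and error treated as optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranscriptionResult:
    """Hypothetical container mirroring the TranscriptionResponse fields."""
    text: str
    status: str
    provider: Optional[str] = None
    duration: Optional[float] = None
    error: Optional[str] = None


def parse_transcription(payload: dict) -> TranscriptionResult:
    """Map a decoded JSON body onto TranscriptionResult, tolerating missing optional fields."""
    return TranscriptionResult(
        text=payload.get("text", ""),
        status=payload.get("status", "unknown"),
        provider=payload.get("provider"),
        duration=payload.get("duration"),
        error=payload.get("error"),
    )


# Usage with the example response above.
sample = {
    "status": "completed",
    "provider": "openai",
    "duration": 42.7,
    "text": "Welcome to the Radient documentation. This is a demo recording.",
}
result = parse_transcription(sample)
print(result.status, result.duration)
```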

Error Handling

400 Bad Request: Missing or invalid parameters (e.g., no file).
500 Internal Server Error: Unexpected server issue.

Check the JSON body for an error field to diagnose failures.
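
The error-handling advice above can be sketched as a small function that turns a status code and raw body into a readable message. The name describe_failure is hypothetical. The sketch assumes the error field, when present, sits at the top level of the JSON body, and it falls back to the raw text when the body is not JSON.

```python
import json


def describe_failure(status_code: int, body_text: str) -> str:
    """Return a readable message for a non-200 response."""
    try:
        body = json.loads(body_text)
    except ValueError:  # body was not valid JSON
        body = None

    # Prefer the documented `error` field; otherwise fall back to the raw body.
    detail = body.get("error") if isinstance(body, dict) else None
    if not detail:
        detail = body_text.strip() or "no details provided"

    if status_code == 400:
        kind = "Bad request"
    elif status_code >= 500:
        kind = "Server error"
    else:
        kind = "Unexpected status"
    return f"{kind} ({status_code}): {detail}"
```

With the requests example earlier, this could be called as describe_failure(response.status_code, response.text) in the non-200 branch.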

Tips and Best Practices

Set the `language` field when the audio's language is known; this can improve accuracy and speed.
Use the `prompt` field to bias transcription towards domain-specific terms or names.
Keep audio reasonably clean (reduce background noise) for best results.
For long recordings, consider chunking audio client-side and merging transcripts.
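
The chunking tip above can be sketched as two pure functions: one that plans chunk boundaries in seconds and one that joins the per-chunk transcripts. Both names (chunk_spans, merge_transcripts) and the default chunk length and overlap are illustrative assumptions, not limits documented for this API. Cutting the audio itself requires an audio tool or library of your choice and is not shown. The merge is a plain join, so words that fall inside an overlap may appear twice.

```python
from typing import List, Tuple


def chunk_spans(
    total_seconds: float,
    chunk_seconds: float = 600.0,   # assumed chunk length, not an API limit
    overlap_seconds: float = 2.0,   # small overlap so words at a cut are not lost
) -> List[Tuple[float, float]]:
    """Plan (start, end) spans in seconds that cover the whole recording."""
    if overlap_seconds < 0 or chunk_seconds <= overlap_seconds:
        raise ValueError("chunk_seconds must be greater than a non-negative overlap_seconds")

    total = float(total_seconds)
    spans = []
    start = 0.0
    while start < total:
        end = min(start + chunk_seconds, total)
        spans.append((start, end))
        if end >= total:
            break
        # Step back by the overlap so consecutive chunks share a short region.
        start = end - overlap_seconds
    return spans


def merge_transcripts(parts: List[str]) -> str:
    """Join per-chunk transcripts in order, skipping empty chunks."""
    return " ".join(part.strip() for part in parts if part.strip())
```

Each span would be cut from the source audio, uploaded as its own request, and the resulting text values passed to merge_transcripts in order.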

For a full list of API endpoints and schemas, see the main API Reference.