A voice is the speaker identity used to synthesize audio. Every text-to-speech request targets a specific voice via its voice_id.

Sources

Voices come from three sources:
  • Public catalog: curated voices that ship with Breeze. List them with GET /v1/voices.
  • Designed voices: created from a text prompt with POST /v1/voice-previews/design and finalized with POST /v1/voice-previews/{generated_voice_id}/save.
  • Cloned voices: created from audio samples with POST /v1/voice-previews/clone, previewed via GET /v1/voice-previews/{generated_voice_id}/stream, and finalized with POST /v1/voice-previews/{generated_voice_id}/save.
Each voice also carries a voice_type: voice_type="default" marks voices from the official Breeze catalog, and voice_type="personal" marks voices a user has saved. Each voice belongs to one of three categories: premade, generated, or cloned.
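As a rough sketch, the call sequence for each source can be written down as data. The endpoint paths are the ones listed above. The function name, the source labels ("public", "designed", "cloned"), and the example ID are hypothetical, and how the API returns generated_voice_id is not specified here, so the caller supplies it as a placeholder.

```python
def voice_source_steps(source, generated_voice_id=None):
    """Return the ordered (method, path) calls for one voice source.

    Paths come from the documentation; the source labels are hypothetical
    names used only for this illustration. No request is sent.
    """
    if source == "public":
        return [("GET", "/v1/voices")]
    if source not in ("designed", "cloned"):
        raise ValueError(f"unknown voice source: {source!r}")
    if not generated_voice_id:
        raise ValueError("generated_voice_id is required for designed and cloned voices")

    preview = f"/v1/voice-previews/{generated_voice_id}"
    if source == "designed":
        return [
            ("POST", "/v1/voice-previews/design"),
            ("POST", f"{preview}/save"),
        ]
    # Cloned voices add a preview stream step before saving.
    return [
        ("POST", "/v1/voice-previews/clone"),
        ("GET", f"{preview}/stream"),
        ("POST", f"{preview}/save"),
    ]


# "gv_123" is a made-up placeholder ID, not a real value.
for method, path in voice_source_steps("cloned", "gv_123"):
    print(method, path)
```

Keeping the steps as plain tuples leaves the choice of HTTP client, base URL, and authentication open, since none of those are described in this section.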

Voice settings

Each voice stores default voice_settings. To override them for a single call, pass voice_settings in the request body.
  • guidance_scale adjusts how strongly generation follows the prompt and reference voice. Accepted values range from 1.0 to 10.0.
Update a voice’s persisted defaults with PATCH /v1/voices/{voice_id}/settings.
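The sketch below shows one way a client might check guidance_scale and combine stored defaults with a per-call override before sending anything. It rests on several assumptions: the 1.0 to 10.0 range is treated as inclusive, the override is merged key by key rather than replacing the stored settings wholesale, and the PATCH body is the settings object itself. All function names are hypothetical.

```python
GUIDANCE_SCALE_MIN = 1.0
GUIDANCE_SCALE_MAX = 10.0


def validate_guidance_scale(value):
    """Return value as a float, or raise if it is outside 1.0 to 10.0.

    Treating both bounds as inclusive is an assumption.
    """
    value = float(value)
    if not (GUIDANCE_SCALE_MIN <= value <= GUIDANCE_SCALE_MAX):
        raise ValueError(
            f"guidance_scale must be between {GUIDANCE_SCALE_MIN} and "
            f"{GUIDANCE_SCALE_MAX}, got {value}"
        )
    return value


def effective_settings(stored_defaults, override=None):
    """Merge a per-call override onto a voice's stored defaults.

    Key-by-key merging is an assumption about how overrides apply.
    """
    merged = dict(stored_defaults)
    merged.update(override or {})
    if "guidance_scale" in merged:
        merged["guidance_scale"] = validate_guidance_scale(merged["guidance_scale"])
    return merged


def settings_update_request(voice_id, settings):
    """Describe the PATCH call that persists new defaults (nothing is sent).

    The path is from the documentation; the body shape is assumed.
    """
    return ("PATCH", f"/v1/voices/{voice_id}/settings", effective_settings(settings))
```

Validating on the client only catches obvious mistakes early; the server's own validation remains the authority on what values are accepted.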

Browsing voices

Use the Voice Library in Breeze Studio to browse the public catalog and preview samples.

Voice workflows

Text to speech

Use any saved or public voice_id to generate one-shot, async, or streaming audio.

Voice Design

Create a new voice from a text description and sample script.

Voice Clone

Create a reusable voice from a consented audio sample.

Voice Library

Browse voices visually, preview samples, and copy voice IDs from Breeze Studio.