The output_format query parameter selects the audio encoding. Non-streaming text-to-speech requests accept mp3, wav, flac, pcm, aac, and opus. Streaming requests support only mp3 and pcm.
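As a minimal sketch of the rule above, the helper below checks a requested format against the request mode before building a query string. The two format lists come from this page; the base URL, the /stream path, and the function name build_tts_url are hypothetical placeholders, not the real API.

```python
from urllib.parse import urlencode

# Format lists as stated on this page.
NON_STREAMING_FORMATS = {"mp3", "wav", "flac", "pcm", "aac", "opus"}
STREAMING_FORMATS = {"mp3", "pcm"}

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://api.example.com/v1/text-to-speech"


def build_tts_url(output_format: str = "mp3", streaming: bool = False) -> str:
    """Return a request URL, rejecting formats the chosen mode does not accept."""
    allowed = STREAMING_FORMATS if streaming else NON_STREAMING_FORMATS
    if output_format not in allowed:
        mode = "streaming" if streaming else "non-streaming"
        raise ValueError(
            f"{output_format!r} is not supported for {mode} requests; "
            f"choose one of {sorted(allowed)}"
        )
    # The "/stream" suffix is an assumption about how a streaming route might look.
    path = f"{BASE_URL}/stream" if streaming else BASE_URL
    return f"{path}?{urlencode({'output_format': output_format})}"
```

Validating on the client side like this surfaces an unsupported combination, such as opus on a streaming request, before any network call is made.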

Supported encodings

Value   Format    Notes
mp3     MP3       Default. Best for general playback.
wav     WAV       Uncompressed, lossless.
flac    FLAC      Lossless compression.
pcm     Raw PCM   Linear PCM for audio pipelines.
aac     AAC       Efficient lossy codec.
opus    Opus      Non-streaming only.
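Raw PCM carries no header, so most players cannot open it directly. The sketch below wraps raw PCM bytes in a WAV container using Python's standard wave module. The sample rate, channel count, and sample width are assumptions chosen for illustration; this page does not state the actual PCM parameters, so substitute the values that apply to your output.

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 24000,  # assumed; not specified on this page
    channels: int = 1,         # assumed mono
    sample_width: int = 2,     # assumed 16-bit samples (2 bytes each)
) -> bytes:
    """Wrap headerless linear PCM in a WAV container and return the WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

If the assumed parameters do not match the real stream, the file will still open but play at the wrong speed or as noise, which is a quick way to spot a mismatch.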

Choosing a format

  • End-user playback in a browser or app: prefer mp3.
  • Real-time applications: use the streaming endpoint with mp3 or pcm.
  • Archival or post-processing: use wav or flac for lossless output.

Use formats with

SDK quickstart

Generate and save your first MP3 with the Python or TypeScript SDK.

Text to speech

Pass output_format on sync, async, and streaming generation requests.

Streaming

Use streaming-compatible formats for lower-latency playback.

CLI text to speech

Save generated audio from the command line while prototyping voices and formats.