Google has unveiled Gemini 3.1 Flash TTS.

Google’s latest text-to-speech model is designed to produce more natural, expressive and customizable AI-generated voices.
Gemini 3.1 Flash TTS improves overall speech quality, producing voices that sound more lifelike than those of previous versions.
It also gives users more control over how voices are delivered, including tone, pacing and style.
One of the standout features is the addition of ‘audio tags.’ These let users adjust how the AI speaks by adding instructions directly in the text, for example to change emotion, speed or accent.
Gemini 3.1 Flash TTS also supports over 70 languages and can handle multi-speaker conversations.
The model is rolling out in preview for developers through the Gemini API and Google AI Studio, for enterprises via Vertex AI and for everyday users through Google Vids.
To address concerns around AI-generated content, all audio created using the model includes a hidden SynthID watermark that identifies it as AI-generated.

