Google Cloud has launched Gemini 3.1 Flash TTS, a next-generation text-to-speech model designed for high-quality, expressive, and controllable AI audio. The model supports over 70 languages and offers more than 200 audio tags that let developers adjust tone, pacing, and emotion through natural-language prompts.
It also includes more than 30 prebuilt voices and supports detailed customization of accents and speaking styles.
Available through Google AI Studio and Vertex AI, the model is built for scalable enterprise use cases such as accessibility tools, audiobooks, gaming, and customer interactions. Additionally, SynthID watermarking helps identify AI-generated audio, improving transparency and trust.
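
To make the idea of prompt-level control concrete, here is a minimal Python sketch of how a developer might assemble a prompt that pairs a natural-language style direction with tagged lines. It only builds a string and does not call the model. The helper name build_tts_prompt, the square-bracket tag syntax, and the "whispers" tag are assumptions for illustration, not confirmed details of the Gemini 3.1 Flash TTS interface.

```python
from typing import Optional, Sequence, Tuple

# One transcript segment: (audio tag or None, text to speak).
Segment = Tuple[Optional[str], str]


def build_tts_prompt(style: str, segments: Sequence[Segment]) -> str:
    """Compose a style direction followed by a tagged transcript.

    ASSUMPTION: tags are written inline as "[tag]" before the text they
    affect. This syntax is hypothetical and used only for illustration.
    """
    lines = []
    for tag, text in segments:
        text = text.strip()
        if not text:
            continue  # skip empty segments rather than emit a bare tag
        lines.append(f"[{tag.strip()}] {text}" if tag else text)
    # Style direction first, then a blank line, then the transcript.
    return f"{style.strip()}\n\n" + "\n".join(lines)


prompt = build_tts_prompt(
    "Read this as a calm audiobook narrator with a slow, even pace.",
    [
        (None, "The door creaked open."),
        ("whispers", "Is anyone there?"),  # hypothetical tag name
    ],
)
print(prompt)
```

Keeping the style direction separate from the per-line tags mirrors the two kinds of control described above: overall tone and pacing set in plain language, and local changes in delivery marked on individual lines.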





