Models
August 29, 2025

Microsoft announces MAI-Voice-1, its first speech generation model

MAI-Voice-1 can currently produce one minute of natural, expressive voice in under a second on a single GPU.

Microsoft describes MAI-Voice-1 as a highly expressive and natural speech generation model, designed for efficiency and scale.

According to Microsoft, it can generate a full minute of humanlike audio in less than one second on a single GPU, which is more than 60 times faster than real time. The model is now live in Copilot Daily and Podcasts, where it powers conversations, narration, and storytelling.

Users can also experiment hands-on in Copilot Labs, exploring new ways to create immersive voice experiences. MAI-Voice-1 marks a breakthrough in speed, realism, and accessibility for next-generation AI applications.

