Models
August 29, 2025

Microsoft announces MAI-Voice-1, its first speech generation model

MAI-Voice-1 can currently produce one minute of natural, expressive voice in under a second on a single GPU.

MAI-Voice-1 is billed as a highly expressive and natural AI voice generation model, designed for efficiency and scale.

Because it can generate a full minute of humanlike audio in less than one second on a single GPU, it runs well ahead of real-time playback. The model is now live in Copilot Daily and Podcasts, where it handles conversations, narration, and storytelling.
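To put the headline figure in perspective, the short sketch below works out the minimum speedup over real-time playback implied by "one minute of audio in under a second". The function name and constants are illustrative assumptions, not part of any Microsoft API, and the one-second value is only the upper bound stated in the announcement, not a measured latency.

```python
def speedup_over_real_time(audio_seconds: float, compute_seconds: float) -> float:
    """Return seconds of audio produced per second of compute (higher is faster)."""
    if compute_seconds <= 0:
        raise ValueError("compute_seconds must be positive")
    return audio_seconds / compute_seconds


# Quoted figure: 60 s of audio in under 1 s on a single GPU.
# 1.0 s is the stated upper bound on compute time, so the result is a lower bound.
AUDIO_SECONDS = 60.0
COMPUTE_SECONDS_UPPER_BOUND = 1.0

min_speedup = speedup_over_real_time(AUDIO_SECONDS, COMPUTE_SECONDS_UPPER_BOUND)
print(f"At least {min_speedup:.0f}x faster than real time")
```

Any actual generation time below one second would push the ratio above this lower bound.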

Users can also try the model hands-on in Copilot Labs and explore new ways to create voice experiences. For next-generation AI applications, MAI-Voice-1 marks a notable step forward in speed, realism, and accessibility.
