Google DeepMind Rolls Out Gemini 3.1 Flash Text-To-Speech Model With Customisable Audio Tags

It enables the user to easily direct vocal style, delivery, and pace through text commands, Google said.

Google DeepMind has rolled out its latest text-to-speech AI model, Gemini 3.1 Flash TTS, which enables the user to “easily direct vocal style, delivery, and pace through text commands”, according to a post from the company on social media platform ‘X’ on Wednesday.

As shown in the video, the AI model provides advanced options for the voice it generates, including inflections and tones such as “enthusiastic”, “positive surprise”, “informative” and more.

Users can also pick from various accents within a language. English alone offers a range of options, such as an American ‘Valley’ or ‘Southern’ accent, British variations such as ‘Brixton’ and ‘RP’, and ‘Transatlantic’, among many others.
