Google with an AI Update

29 Kislev 5786

Google AI has announced a major update to its Gemini 2.5 audio models, with improvements to speech-to-speech translation, text-to-speech capabilities, and native audio handling.

The company’s Gemini models now feature live speech-to-speech translation in a beta version of the Google Translate app. This update allows users to conduct real-time conversations across multiple languages, including French, German, Korean, Mandarin, and Sinhala, with translations preserving the nuances and tone of human speech. Demonstration videos show multi-speaker interactions maintaining clarity, pacing, and emotional inflection.

In addition, the Gemini 2.5 Flash and 2.5 Pro text-to-speech (TTS) models have been improved for professional and consumer applications. Enhancements include stronger adherence to style prompts, context-aware pacing, and consistent character voices across multi-speaker dialogues. These updates aim to deliver more natural and dynamic voice outputs for content creation, virtual assistants, and customer support tools.

The Gemini 2.5 Flash Native Audio update also introduces upgrades to handle complex workflows and user instructions, enabling more natural, sustained conversations with voice agents. Users can now engage in multi-turn dialogues with better comprehension of context and intent.

Google AI emphasizes that these updates reflect ongoing efforts to expand Gemini’s utility for both everyday users and enterprise applications, offering real-time translation, lifelike TTS, and intelligent audio interactions across languages and scenarios.