OpenAI Releases Sneak Peek of its Synthetic ‘Voice Engine’

01/04/2024
Graphic of earth with the word 'AI'

OpenAI has unveiled early findings from a limited-scale preview of its latest creation, Voice Engine. This model is engineered to craft bespoke, natural-sounding voices through text input and a 15-second audio sample.

Originally developed in late 2022, Voice Engine has been instrumental in powering preset voices within OpenAI’s text-to-speech API and various offerings like ChatGPT Voice and Read Aloud. However, the company is proceeding with caution, as it acknowledges the potential risks associated with the misuse of synthetic voices.

In a recent blog post, OpenAI emphasised its commitment to promoting responsible technology usage. It aims to initiate conversations on this matter, drawing insights from ongoing dialogues and the results of preliminary tests.

“We recognise that generating speech resembling human voices poses significant risks, particularly in an election year,” the blog states. “We are collaborating with U.S. and international partners across various sectors to ensure that their feedback shapes our development process.”

To explore Voice Engine’s potential applications, OpenAI conducted private trials with select partners. These trials informed the company’s strategies, safety measures, and perspectives on how Voice Engine could be positively utilised across different sectors. Some envisioned applications include:

  • Assisting non-readers and children with reading comprehension through emotive voices representing diverse speakers.
  • Facilitating the seamless translation of content, such as videos and podcasts, for global audiences.
  • Improving essential service delivery in remote regions.
  • Supporting individuals with speech impairments.
  • Aiding patients in regaining their voice, particularly those affected by speech-related conditions.

Continuing its streak of technological innovations, OpenAI recently showcased a series of short films leveraging its generative AI video production model, Sora. These films garnered attention for their realistic depictions of mammoths and cityscapes, with studios already embracing the technology for rapid idea implementation.

Creators highlight Sora’s ability to generate both realistic and surreal content swiftly, heralding a new era of abstract expressionism.

Latest Articles

The Latest Stories

Psychology and physiology centre stage at city STEM festival
UK to unveil new cyber skills initiative at Global Cyber Summit
Scottish Coffee Roaster Modernises with Robotics and Eco-Friendly Upgrade
72% of UK Firms Struggle with Skills Gaps, Turning to AI and Upskilling Amid £275M Productivity Loss