OpenAI Releases Sneak Peek of its Synthetic ‘Voice Engine’

01/04/2024
Graphic of earth with the word 'AI'

OpenAI has unveiled early findings from a limited-scale preview of its latest creation, Voice Engine. This model is engineered to craft bespoke, natural-sounding voices through text input and a 15-second audio sample.

Originally developed in late 2022, Voice Engine has been instrumental in powering preset voices within OpenAI’s text-to-speech API and various offerings like ChatGPT Voice and Read Aloud. However, the company is proceeding with caution, as it acknowledges the potential risks associated with the misuse of synthetic voices.

In a recent blog post, OpenAI emphasised its commitment to promoting responsible technology usage. It aims to initiate conversations on this matter, drawing insights from ongoing dialogues and the results of preliminary tests.

“We recognise that generating speech resembling human voices poses significant risks, particularly in an election year,” the blog states. “We are collaborating with U.S. and international partners across various sectors to ensure that their feedback shapes our development process.”

To explore Voice Engine’s potential applications, OpenAI conducted private trials with select partners. These trials informed the company’s strategies, safety measures, and perspectives on how Voice Engine could be positively utilised across different sectors. Some envisioned applications include:

  • Assisting non-readers and children with reading comprehension through emotive voices representing diverse speakers.
  • Facilitating the seamless translation of content, such as videos and podcasts, for global audiences.
  • Improving essential service delivery in remote regions.
  • Supporting individuals with speech impairments.
  • Aiding patients in regaining their voice, particularly those affected by speech-related conditions.

Continuing its streak of technological innovations, OpenAI recently showcased a series of short films leveraging its generative AI video production model, Sora. These films garnered attention for their realistic depictions of mammoths and cityscapes, with studios already embracing the technology for rapid idea implementation.

Creators highlight Sora’s ability to generate both realistic and surreal content swiftly, heralding a new era of abstract expressionism.

Latest Articles

The Latest Stories

New VR tech could help gamers experience ‘ludicrous speed’ without motion sickness
D2ZERO builds leadership capability
Boost in small business hiring offset by slow sales growth 
1 in 5 organisations have had company data exposed by an employee using AI tools such as ChatGPT