OpenAI Develops Voice Cloning Tool, Unavailable for Public Use…for Now
Voice Engine is powered by a generative AI model that, according to Harris, has been operating under the radar for some time. The same model underpins the voice and “read aloud” features in ChatGPT, OpenAI’s AI-powered chatbot, as well as the preset voices offered in OpenAI’s text-to-speech API. Spotify has also been using the model since early September to dub podcasts into other languages for high-profile hosts like Lex Fridman.
Harris was guarded about the source of the model’s training data, suggesting that it is a sensitive topic. He disclosed only that the Voice Engine model was trained on a combination of licensed and publicly available data.
Generative AI models like Voice Engine are typically trained on a vast number of examples, in this case speech recordings, often collected from public sources and datasets online. Many companies view the specifics of their training data as a competitive advantage and so prefer to keep them confidential. Disclosing details about training data could also invite intellectual property-related legal challenges, which is another reason for secrecy.
OpenAI, the organization behind Voice Engine and ChatGPT, has faced legal action over allegations of intellectual property infringement. Some creators and rights holders have accused OpenAI of training its AI models on copyrighted content without proper credit or compensation. OpenAI has entered into licensing agreements with certain content providers, like Shutterstock and Axel Springer, and offers options for webmasters and artists to opt out of having their content included in the datasets used to train its image-generating models. However, no similar opt-out mechanism exists for other OpenAI products. In a statement to the U.K.’s House of Lords, OpenAI argued that it is challenging to develop effective AI models without incorporating copyrighted material, and cited fair use, the legal doctrine that protects transformative uses of copyrighted works, as a shield for model training.