Whenever deepfakes make the news, it’s almost always for the latest terrifying way bad actors have found to spawn hoaxes or cyberbully people using the AI-powered technology. However, the media industry has found some more practical (and less sinister) applications, such as using face swaps to craft more realistic visual effects, syncing actors’ mouths with dialogue in dubbed films, and, now, automating voice work.
Veritone, the company behind aiWare, which it bills as the world’s first operating system for artificial intelligence, launched a new platform this week called Marvel.ai that lets content creators, celebrities, and others generate audio deepfakes of their voices to license as they see fit.
Veritone’s pitch is that media companies and personalities can churn out monetized audio content and generate revenue without ever setting foot in a studio because, after all, time is money. Built on its aiWare operating system, Marvel.ai produces synthetic voice clips that the company claims sound like the real thing for radio spots, audiobooks, voice-overs, and localised content, among other examples.
“With complete control over their voice and its usage, any influencer, personality, or celebrity can quite literally be in multiple places at once,” the company said in a press release Friday. “This would open the door to a new level of scale that was not humanly possible before, allowing them to increase the number of projects, sponsorships, and endorsements they can do in any given year.”
The process, also known as voice cloning, uses artificial intelligence and machine learning algorithms to replicate someone’s voice based on a series of audio samples. The Marvel.ai platform itself functions as both a marketplace where customers can submit requests to use a particular voice model for auto-generated content and a self-service tool that resembles your more traditional text-to-speech reader, letting users pick from a catalogue of pre-generated voices to create a customised audio clip.
In an interview with the Verge, Veritone president Ryan Steelberg said the company also offers another service, a “managed white-glove approach,” where customers can submit voice clips to train Veritone’s systems and produce their own voice clone to license.
Speaking with the outlet, Steelberg explained that while Veritone bills itself as an AI developer first and foremost, it also depends on old-school advertising and content licensing for a large chunk of its revenue. Its advertising subsidiary, Veritone One, is heavily invested in the podcasting industry and places more than 75,000 “ad integrations” with influencers every month, he said.
“It’s mostly native integrations, like product placements,” Steelberg told the Verge. “It’s getting the talent to voice sponsorships and commercials. That’s extremely effective but very expensive and time-consuming.”
This advertising expertise, along with technological advances in speech synthesis in recent years, motivated the company to build a better solution. Of course, whether Veritone’s platform takes off depends largely on how convincing its voice clones are. Steelberg shared a few examples of Marvel.ai’s finished product with the Verge, and the clone sounds human enough to me. The tone is a little uncanny in a way I can’t quite put my finger on. However, I’ll be honest, it’s hard to tell if that’s just my brain convincing me it sounds off because I know it’s from a robot and not a person.
While I know I mentioned before that the media industry is finding less sinister applications of deepfake tech, there is one aspect of Marvel.ai that gives me some serious pause. Whoever owns the legal rights to a voice (i.e., not just the person behind the voice) can use the platform to create whatever audio message they please. And that raises many of the same privacy red flags and potential for misuse that have plagued the rise of deepfake videos. Marvel.ai can even resurrect the voices of the dead using archived recordings to train its AI systems, Steelberg told the Verge.
“Whoever has the copyright to those voices, we will work with them to bring them to the marketplace,” he said. “That will be up to the rightsholder and what they feel is appropriate, but hypothetically, yes, you could have Walter Cronkite reading the nightly news again.”
From a technological standpoint, that’s obviously impressive. From a moral standpoint, it’s creepy as hell. I really don’t need to hear Cronkite and other deceased big names reanimated through AI, especially since you know it’s only a matter of time before brands start doing shit like using Princess Diana to endorse Oreos or David Bowie to convince you to subscribe to Spotify Premium. I mean, have we learned nothing from the Prince hologram fiasco?