Listen to an AI voice actor try to flirt with you - Slickmagnet
In its latest update, the synthetic-voice software from start-up Sonantic adds subtle traits and attitudes such as shyness, flirting, joking, and boasting.
When John Flynn and Zeena Qureshi talk about Sonantic, the start-up they founded in December 2018, they usually point to the impact that CGI (computer-generated imagery) has had on cinema, and on entertainment in general, over the last three decades.
What CGI has done for visuals is what Sonantic’s technology is doing for audio, explains Qureshi. The technology is software that uses artificial intelligence to generate voices that sound eerily human and can be used in a movie or a video game without the listener detecting their artificial origin. The founders also call it “Photoshop for the voice.”
For most people, the experience of interacting with a synthetic voice does not go beyond the clearly artificial voices of assistants such as Alexa or Siri. However, very young companies such as Descript or Sonantic are making significant progress in audio deepfakes.
The latest advance comes from Sonantic, which timed it to the recent Valentine’s Day. This week the company updated its technology with the ability to express a range of more subtle traits and emotions. As an example, it published a video in which an AI-generated voice, played over footage of an actress, speaks to the viewer and flirts with them before revealing what it is.
“We chose love as the theme, but the goal of our research is to see if we can model subtle emotions. The most obvious emotions are somewhat easier to capture,” Flynn, CTO of Sonantic, told The Verge.
The company’s technology generates the artificial voice from models built by processing original human voices. The software is a text-to-speech tool: you enter the dialogue you want the AI voice to speak and specify aspects such as the speaker’s mood and manner of delivery, as well as emotions such as anger, fear, sadness, happiness, and joy.
This week’s update adds traits and attitudes such as shyness, flirting, joking, and bragging, offering a deeper level of customization than can be found in competing software like Descript. Less than a year ago, for example, they were the first to add believable shouting to their text-to-speech tool.
Sonantic believes that, to create a synthetic voice that can express subtleties such as banter or flirtation, it is vital to incorporate non-verbal sounds such as breathing or chuckling. Its software also lets you adjust the tone and intensity of the speech.
“I think that’s the main difference: our ability to direct, control, edit and sculpt a performance,” says Flynn. “Our clients are primarily triple-A video game studios and entertainment studios, and we’re expanding into other industries. We entered a partnership with Mercedes (to personalize their in-car digital assistant) earlier this year.”
Although focused on entertainment, the company considers the voice market vast, with many use cases, from advertising and call centers to robots and audiobooks.
The Sonantic platform has two sides. On the one hand, there is the technology it provides to game studios, which saves them the enormous amounts of time that go into recording the voices of real actors.
On the other hand, they work with professional actors to create voice models. “Every time an actor’s synthetic voice is used, they receive a share of the profits without having to do the work themselves,” Qureshi explains on the company’s blog.
The company also offers voice cloning services, something actor Val Kilmer has taken advantage of since last summer. Kilmer overcame throat cancer in 2015, which left him with speech difficulties, but he now has a model of his voice that he can use as he wishes in his professional projects.
“We believe that the use of technology to augment the voices of actors will be the new normal in five years. For studios, the software offers infinite possibilities for creators and is cheaper and faster.
For actors, it offers passive income, voice projection, and multiple opportunities. It is important to us that both parties benefit from this revolution in audio technology,” Qureshi concludes.