Most of us have experienced the voice of artificial intelligence (AI) through personal assistants such as Siri or Alexa. Their flat intonation and mechanical delivery may give you the impression that it is easy to tell AI-generated voices apart from real people. But now, scientists say the average listener can no longer tell the difference between real human voices and "deepfake" voices.
In a new study published in the journal PLOS One on September 24, researchers showed that when people hear human voices alongside AI-generated versions of the same voices, they cannot accurately identify which is real and which is fake.
"AI-generated voices are all around us now. We've all spoken to Alexa or Siri, or had our calls handled by an automated customer service system," Lavan said. "Those things don't sound quite like real human voices, but it was only a matter of time before AI technology began to produce naturalistic, human-sounding speech."
The study found that while generic voices created from scratch were not judged to be particularly realistic, voice clones trained on recordings of real people (deepfake audio) sounded just as convincingly real as their genuine counterparts.
The scientists played study participants samples of 80 different voices (40 AI-generated and 40 real human voices) and asked them to label which ones they thought were real. On average, only 41% of the from-scratch AI voices were misclassified as human, suggesting that in most cases listeners could tell them apart from real people.
However, the AI voices cloned from humans were misclassified as human a majority (58%) of the time. With only slightly more of the real human voices (62%) being correctly classified as human, the researchers concluded that there was no statistically significant difference in listeners' ability to tell real people's voices apart from their deepfake clones.
The results have potentially profound implications for ethics, copyright and security, Lavan said. If criminals use AI to clone a person's voice, it becomes much easier to bypass voice-authentication protocols at banks or to trick that person's loved ones into transferring money.
We have already seen incidents like this unfold. For example, on July 9, Sharon Brightwell was tricked out of $15,000. Brightwell heard what she thought was her daughter crying on a phone call, saying she had been in an accident and needed money for legal representation to keep her out of jail. "No one could have convinced me that it wasn't her," Brightwell said of the realistic AI fabrication at the time.
Realistic AI voices could also be used to fabricate statements and interviews from politicians and celebrities. Fake audio could be used to discredit an individual, spark unrest, or sow social division and conflict. Con artists recently built an AI clone of the voice of Queensland Premier Steven Miles, for example, using his profile to try to persuade people to invest in a Bitcoin scam.
The researchers emphasized that the voice clones used in the study were not particularly sophisticated. They created them with commercially available software and trained them on just four minutes of recordings of human speech.
"The process required minimal expertise, just a few minutes of audio recordings and little money," Lavan said in a statement. "It shows how accessible and sophisticated AI voice technology has become."
Deepfakes offer plenty of opportunities for malicious actors, but it's not all bad news. The ability to generate AI voices at scale may also open up more positive opportunities. "There might be applications in improving accessibility, education and communication, where high-quality synthetic voices can enhance the user experience," Lavan said.