Voice Deepfakes

Context:

Recently, several users of the social media platform 4chan used a “speech synthesis” and “voice cloning” service provider to make voice deepfakes of celebrities such as Emma Watson, Joe Rogan, and Ben Shapiro.

Image Source: The Hindu

About Voice Deepfakes:

A voice deepfake is a synthetic voice that closely mimics a real person’s voice.
The voice can accurately replicate tonality, accents, cadence, and other unique characteristics of the target person.
People use AI and robust computing power to generate such voice clones or synthetic voices.

How are Voice Deepfakes Created?

Creating deepfakes requires high-end computers with powerful graphics cards, often leveraging cloud computing power.
Powerful computing hardware can accelerate the process of rendering, which can take hours, days, or even weeks, depending on the process.
Besides specialised tools and software, generating deepfakes needs training data to be fed to AI models.
This data is often original recordings of the target person’s voice.
AI can use this data to render an authentic-sounding voice, which can then be used to say anything.

Threats arising from the use of voice deepfakes:

Attackers are using such technology to defraud users, steal their identities, and engage in various other illegal activities, such as phone scams and posting fake videos on social media platforms.

Ways to detect voice deepfakes:

Detecting voice deepfakes requires highly advanced technologies, software, and hardware to break down speech patterns, background noise, and other elements.
Cybersecurity tools do not yet offer foolproof ways to detect audio deepfakes.
Research labs are using watermarks and blockchain technologies to detect deepfakes, but the technology designed to outsmart deepfake detectors is constantly evolving.
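
The watermarking idea can be pictured with a minimal sketch in Python. It is an illustration under stated assumptions, not the method used by any particular lab or product: the function names, the integer key, the test tone, and the strength value are all hypothetical. The sketch adds a faint keyed pseudo-random pattern to audio samples and later checks whether a clip still correlates with that pattern.

```python
import math
import random


def make_tone(seconds=3.0, rate=16000, freq=440.0, amplitude=0.3):
    """Stand-in for real audio: a plain sine tone as a list of float samples."""
    n = int(seconds * rate)
    return [amplitude * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


def keyed_sequence(key, n):
    """Pseudo-random +1/-1 pattern that only the key holder can regenerate."""
    rng = random.Random(key)
    return [1.0 if rng.random() < 0.5 else -1.0 for _ in range(n)]


def embed_watermark(samples, key, strength=0.02):
    """Add the keyed pattern to the audio at low amplitude (assumed value)."""
    pattern = keyed_sequence(key, len(samples))
    return [s + strength * p for s, p in zip(samples, pattern)]


def watermark_score(samples, key):
    """Average correlation between the audio and the keyed pattern."""
    pattern = keyed_sequence(key, len(samples))
    return sum(s * p for s, p in zip(samples, pattern)) / len(samples)


def is_watermarked(samples, key, strength=0.02):
    """Flag the clip if the correlation exceeds half the embedding strength."""
    return watermark_score(samples, key) > strength / 2


if __name__ == "__main__":
    original = make_tone()
    marked = embed_watermark(original, key=1234)
    print("original flagged:", is_watermarked(original, key=1234))
    print("marked flagged:", is_watermarked(marked, key=1234))
```

In this simplified form, a clip that carries the pattern scores close to the embedding strength, while unmarked audio or a check with the wrong key scores close to zero. Real systems must also survive compression, resampling, and deliberate removal attempts, which this sketch does not address.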

News Source: The Hindu