Voice deepfakes: how scammers fake voices

Modern voice synthesis technologies have reached a remarkable level of realism. Just a few seconds of recorded audio are now enough to create a convincing digital copy of a person’s voice. The ease with which realistic voice clones can be produced demands new approaches to protecting personal data. Let’s figure out how these systems work and what new security challenges they create.
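
To get a sense of how low the barrier has become, here is a minimal sketch using the open-source Coqui TTS library and its XTTS v2 voice-cloning model. The file names and sample phrase are illustrative assumptions, not material from any real incident; the point is simply how little input such tools require.

```python
# Minimal voice-cloning sketch with the open-source Coqui TTS library (XTTS v2 model).
# File names and the sample sentence below are placeholders for illustration only.
from TTS.api import TTS

# Load a multilingual voice-cloning model; the weights are downloaded on first use.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# A short reference clip (a few seconds of the target speaker) is all the model needs
# to generate arbitrary speech in that person's voice.
tts.tts_to_file(
    text="Hello, this is a synthesized test phrase.",
    speaker_wav="reference_clip.wav",   # a few seconds of the speaker's recorded voice
    language="en",
    file_path="cloned_voice_demo.wav",
)
```

The specific library matters less than the workflow it illustrates: one short sample in, arbitrary speech in that voice out. That is exactly what makes the scams described below possible.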

Experts warn that malicious manipulation of audio, which is technically easier than creating deepfake videos, will only grow in volume. Modern speech synthesis systems blur the line between reality and fakery and create unprecedented security threats, so it is important to understand how these technologies are developing and how attackers can abuse them.

Recently, the “FakeBoss” scheme has become increasingly popular among scammers: attackers create a fake account for a company executive, usually on Telegram, then call employees with a synthesized voice while posing as that executive. Attackers can also use AI-generated voices of company managers to authorize the transfer of funds. Small and medium-sized businesses, where many processes depend on the owner personally, are the most vulnerable to such attacks.

For example, in 2020, fraudsters cloned the voice of a company director in the UAE and used it to rob a bank. To make the story more convincing, they also sent emails on behalf of the company’s management and lawyers to the bank’s branch manager. As a result, he believed the company was closing a major deal, followed the instructions of the deepfake “director” who called him, and made a series of large transfers totaling $35 million.

One of the most high-profile cases of deepfake theft occurred in Hong Kong, where scammers used publicly available footage from social media to fake the voices and likenesses of a major company’s CFO and several employees. They then staged a group video call with a member of the company’s finance staff and convinced him to transfer $25 million to the accounts they specified, causing the company huge losses. This case shows that voice is no longer reliable proof of identity.

But what if voice deepfakes were used not just to steal money, but for blackmail? Imagine criminals creating a fake audio recording of you saying something compromising: a confession to a crime, insults, or even threats. They then call you and demand money to keep the recording from leaking online. This is no longer theoretical. Similar schemes have begun to appear in different countries, and victims often cannot even prove the recording was faked. Police and courts do not yet have reliable tools for identifying voice deepfakes, which means attackers can act with near impunity.

Such a case occurred in 2023 in Arizona (USA). An anonymous caller told a woman that he had kidnapped her 15-year-old daughter and demanded a ransom. The woman said she clearly heard her daughter crying, screaming, and pleading in the background, but the caller refused to put her on the phone. Fortunately, before paying anything, the victim managed to confirm that her daughter was safe and had never been kidnapped. She immediately reported the call to the police, who identified it as a common scam.

Voice deepfakes can also be used for corporate espionage: scammers clone employees’ voices to gain access to sensitive information. For example, attackers could generate a recording of the “CEO” asking the IT department to issue a new password or change security settings. If the fake is convincing enough, employees won’t even realize they’re being duped.

What makes voice deepfakes especially hard to combat is how easily they scale. One of the most dangerous scenarios is their use in phone scams. For example, you get a call “from the bank” saying that your account is blocked and that to unblock it you need to “authenticate” by voice. In reality, you are simply being asked to pronounce certain phrases, which are then recorded, used to clone your voice, and turned into fraud committed with your data.

Voice deepfakes are a real threat, but there is no need to panic, let alone rush to comply with a caller’s demands. Remember: not every suspicious call is a deepfake, but every one of them deserves verification. Read the following materials to learn how to protect yourself from audio fakes.