During the National Day holiday, a prank AI dubbing of Lei Jun suddenly went viral on the internet, sparking numerous controversies. AI-generated fake audio and video content seems to have become an unclosable Pandora's box. Prior to this, the Deepfake scandal in South Korea had already garnered global attention. According to reports, the incident was initially disclosed by several South Korean women through social media platforms, who shared their experiences of having their faces swapped in inappropriate videos using Deepfake technology. Subsequently, more and more victims and incidents came to light.
According to a list circulating on social media, over a hundred primary and secondary schools, as well as international schools in South Korea, have been affected by Deepfake. South Korean officials have suggested that the number of victims could reach 220,000.
You might be wondering: Why has Deepfake, which has been universally condemned for years, become increasingly prevalent?
It's important to note that when Deepfake first emerged in 2017, mainstream social media platforms moved within days to ban it outright. Since then, many countries and regions have enacted legislation prohibiting the use of Deepfake and other AI face-swapping technologies to infringe upon others' portrait rights and privacy.
Is it because Deepfake technology is too advanced and difficult to defend against that it remains a persistent threat?
Contrary to popular belief, Deepfake is difficult to ban precisely because it is so simple: it can cause significant damage with minimal technical resources.
These resources are readily available in today's internet environment.
We don't need to delve into human nature or motives to understand why someone would create and spread AI face-swapped videos. As long as the cost of breaking the law is low, such behavior will inevitably persist. As ordinary citizens, we may not be able to discuss how to combat Deepfake from a legislative or enforcement perspective. What we can do, however, is to examine the realities of Deepfake and consider ways to increase the difficulty of breaking the law, thereby reducing the space for such behavior to thrive.
Deepfake has been labeled by many media outlets as "humanity's most evil AI technology." While this description has some validity, it also objectively demonizes and mystifies Deepfake, associating it with keywords like hackers, ransomware, and the dark web in the minds of those who are unfamiliar with it.
In fact, the danger and destructiveness of Deepfake lie precisely in its simplicity. It requires virtually no technical expertise, and all the necessary assistance can be easily obtained in the open internet environment.
Consider: the 220,000 victims in South Korea could not possibly all have been targeted by a handful of tech experts. When ordinary people can commit malicious acts anytime, anywhere, at essentially no cost, that malice becomes truly difficult to contain.
To explain this further, we must first understand the specific process of Deepfake. Generally speaking, using Deepfake for AI face-swapping involves the following steps:
1. Prepare Deepfake-related software or find an online AI development platform with similar capabilities.
2. Prepare the video for face replacement and segment it into individual frames.
3. Detect and extract the faces to be replaced from the frames, a step commonly referred to as 'cutting' or 'extracting' faces.
4. Overlay the prepared images and proceed with model training. For those without a technical background, pre-trained models are often required to assist in the training process.
5. Complete the training and generate the video.
From this process, we can conclude that conducting a harmful Deepfake operation requires at most four things: AI face-swapping software, pre-trained models, the target video, and photos of the victim.
The ease of obtaining these things is the core reason why Deepfake persists despite numerous bans and spreads even more widely.
Let's take a closer look, step by step, at where the 'crime tools' for Deepfake come from. The purpose is not to spread this knowledge but to highlight the opportunities and vulnerabilities the internet environment leaves open to Deepfake perpetrators. Without closing these gaps, moral appeals and technical identification of AI face-swapped videos alone will not deter malicious actors.
Firstly, AI face-swapping requires photos of the victim. According to discussions on relevant technical forums, the earliest versions of Deepfake needed around 50 high-resolution photos from multiple angles to achieve a relatively natural synthesis. After several years of iteration, only about 20 photos are now needed.
For anyone who regularly shares photos on social media, having 20 photos scraped is alarmingly easy.
Coupled with an easily accessible inappropriate video, a disastrous situation can ensue.
After obtaining the victim's photos, perpetrators need to find software capable of Deepfake. The creator of Deepfake initially shared the software and tutorials on Reddit but was quickly banned by the platform. In response, the author released the software's code on GitHub for free download and use.
By 2022, the software had been upgraded to DeepFaceLab 3.0, which could be downloaded from the author's GitHub page and was shared through numerous QQ groups and cloud storage links on the Chinese internet.
Worse still, the DeepFaceLab packages available through these channels come with detailed Chinese instructions and operation guides. For tedious tasks like face cutting, there are even specialized tools to accelerate the process.
The only potential obstacle for perpetrators is that AI face-swapping still requires a decent graphics card for training acceleration, but mid-to-high-end gaming graphics cards are more than sufficient.
The combination of free illegal software and zero technical difficulty makes Deepfake truly terrifying.
Up to this point, if someone possesses AI technical capabilities, they already have all the prerequisites for conducting a harmful Deepfake operation. However, for those less familiar with AI technology, there is one crucial requirement: obtaining pre-trained models.
Pre-trained models are a fundamental mechanism in AI development. Because many AI models rely on similar underlying training tasks, developers pre-train common components once and reuse them across tasks. In AI face-swapping, the training process is hard to master, so models trained by novices often show unnatural face placement and severe frame drops; pre-trained models are used to improve accuracy while cutting training time.
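To make that general mechanism concrete, here is a minimal, generic transfer-learning sketch in PyTorch. It has nothing to do with face-swapping: a backbone pre-trained on a large public dataset is frozen and only a small task-specific head is trained, which is why pre-trained weights cut both the skill and the time a task requires. The model, dummy data, and class count are placeholders chosen for illustration.

```python
# Generic transfer-learning sketch: reuse a pre-trained backbone, train only a new head.
import torch
import torch.nn as nn
from torchvision import models

# Load a backbone whose weights were already learned on a large generic dataset.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained layers so fine-tuning only updates the new head.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with one sized for the new task (2 classes here, arbitrary).
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch.
images = torch.randn(8, 3, 224, 224)   # placeholder inputs
labels = torch.randint(0, 2, (8,))     # placeholder labels
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```

Starting from weights that already encode general visual features is what lets a non-expert reach usable results in hours rather than weeks, which is exactly the gap the 'AI Elixirs' discussed below are sold to fill.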
Logically, it should be difficult for non-technical individuals to obtain pre-trained models for Deepfake, right?
Not exactly. By browsing various e-commerce platforms and second-hand marketplaces, one can easily find pre-trained models specifically for Deepfake, often referred to as 'AI Elixirs.'
On a major second-hand marketplace, searching for keywords like 'AI Elixir,' 'AI Pill,' 'AI Base Elixir,' 'DFL Elixir,' or 'DFL Magic Pill' will reveal numerous listings for Deepfake pre-trained models, often available for just a few yuan.
Due to a lack of platform oversight, such sales have persisted, forming an 'industrial chain' that fuels malicious acts. Even the DeepFaceLab software mentioned earlier is often available on second-hand marketplaces, complete with operational tutorials and guides.
Another reason this 'AI Pill industrial chain' is difficult to eradicate is that once Deepfake users become proficient, they can create their own pre-trained models, a process colloquially known as 'alchemy.' The 'pills' they refine are then sold on second-hand platforms, continually expanding this gray industry.
By summarizing the sources supporting each step of a Deepfake perpetrator's operation, we aim to identify the crucial areas where efforts should be focused to eliminate its harm.
Current discussions on this topic often fall into a misconception: advocating for cutting-edge, high-cost methods to combat Deepfake, such as using AI technology to identify whether a video is AI-generated.
While AI algorithms can indeed be crucial in detecting AI-generated videos, especially in fraud detection, they are less effective in crimes like the South Korean Deepfake incident, where the goal is to disseminate inappropriate videos.
For forgers and disseminators, the authenticity of the video is often not a concern. Relying on sophisticated AI technology to detect Deepfake is akin to using a tiny technical sandbag to stem a deluge of human malice, merely addressing symptoms rather than the root cause.
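For readers curious what such detection looks like in practice, below is a hedged sketch of the frame-level approach: sample frames from a video, score each with an already trained real-versus-synthetic classifier, and average the scores. The `detector` model here is an assumption for illustration, not a real library API, and production forensic systems are considerably more involved.

```python
# Sketch of frame-level scoring for a suspected AI-generated video.
# Assumes `detector` is a trained torch module that outputs two logits (real, synthetic).
import cv2
import torch

def score_video(path: str, detector: torch.nn.Module, every_n: int = 30) -> float:
    """Return the mean 'synthetic' probability over sampled frames."""
    cap = cv2.VideoCapture(path)
    scores, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            # Convert the BGR frame to the normalized RGB tensor the detector expects.
            rgb = cv2.cvtColor(cv2.resize(frame, (224, 224)), cv2.COLOR_BGR2RGB)
            x = torch.from_numpy(rgb).permute(2, 0, 1).float().unsqueeze(0) / 255.0
            with torch.no_grad():
                prob_fake = torch.softmax(detector(x), dim=1)[0, 1].item()
            scores.append(prob_fake)
        idx += 1
    cap.release()
    return sum(scores) / len(scores) if scores else 0.0
```

Even a perfect version of this pipeline only tells you a video is fake after it exists and has spread, which is precisely the limitation the paragraph above describes.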
A more effective approach might be to address the root cause by discouraging malicious actors from attempting Deepfake in the first place.
Firstly, we should exercise caution when sharing photos on social media. While it's unfair to burden victims with the responsibility of preventing crime, in today's uncertain online environment, protecting one's personal privacy is becoming increasingly important.
Secondly, stricter bans on related software are needed to curb its spread. Especially with the rise of AIGC, numerous new platforms are offering AI development functionality to ordinary developers, and we must be vigilant that similar features do not become new AI security vulnerabilities.
Furthermore, e-commerce and second-hand marketplaces require stricter oversight to block the sale of Deepfake software and pre-trained models. Cutting off the profit chain is often crucial in prohibiting illegal activities.
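As a small illustration of the kind of screening this implies, the sketch below flags marketplace listings whose titles contain the slang terms mentioned earlier in this article so they can be routed to human review. The term list and the simple substring matching are simplified assumptions; a real platform would need far more robust moderation.

```python
# Minimal listing-screening sketch: flag titles containing known Deepfake-related slang.
FLAGGED_TERMS = ["ai elixir", "ai pill", "ai base elixir", "dfl elixir",
                 "dfl magic pill", "deepfacelab", "deepfake"]

def should_flag(listing_title: str) -> bool:
    """Return True if a listing title contains a flagged term (case-insensitive)."""
    title = listing_title.lower()
    return any(term in title for term in FLAGGED_TERMS)

# Example: route suspicious listings to manual review.
listings = ["Second-hand GPU, lightly used", "DFL Magic Pill pretrained model, 5 yuan"]
for title in listings:
    if should_flag(title):
        print(f"Flag for review: {title}")
```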
Our battle against Deepfake has only just begun. Demystifying and understanding it is the first step in resisting malice.