Artificial intelligence developed by a Samsung lab in Russia can fabricate video from a single image, including a painting. Even the Mona Lisa can be faked.
Imagine someone creating a deepfake video of you simply by stealing your Facebook profile pic. Luckily, the bad guys don’t have their hands on that tech yet.
But Samsung has figured out how to make it happen.
Software for creating deepfakes, fabricated clips that make people appear to do or say things they never did, usually requires big data sets of images to create a realistic forgery. Now Samsung has developed a new artificial intelligence system that can generate a fake clip from as little as one photo.
The technology, of course, can be used for fun, like bringing a classic portrait to life. The Mona Lisa, whose enigmatic smile is animated in three different videos to demonstrate the new technology, exists solely as a single still image. A Samsung artificial intelligence lab in Russia developed the technology, which was detailed in a paper earlier this week.
Here’s the downside: These kinds of techniques and their rapid development also create risks of misinformation, election tampering, and fraud, according to Hany Farid, a Dartmouth researcher who specializes in media forensics to root out deepfakes.
“Following the trend of the past year, this and related techniques require less and less data and are generating more and more sophisticated and compelling content,” Farid said. Even though Samsung’s process can create visual glitches, “these results are another step in the evolution of techniques… leading to the creation of multimedia content that will eventually be indistinguishable from the real thing.”
Like Photoshop for video on steroids, deepfake software produces forgeries by using machine learning to convincingly fabricate a moving, speaking human. Though computer manipulation of video has existed for decades, deepfake systems have made doctored clips not only easier to create but also harder to detect. Think of them as photo-realistic digital puppets.
Lots of deepfakes, like the one animating the Mona Lisa, are harmless fun. The technology has made possible an entire genre of memes, including one in which Nicolas Cage’s face is placed into movies and TV shows he wasn’t in. But deepfake technology can also be insidious, such as when it’s used to graft an unsuspecting person’s face into explicit adult movies, a technique sometimes used in revenge porn.
In the paper, Samsung’s AI lab dubbed its creations “realistic neural talking heads.” The term “talking heads” refers to the genre of video the system can create; it’s similar to those video boxes of pundits you see on TV news. The word “neural” is a nod to neural networks, a type of machine learning that mimics the human brain.
The researchers saw their breakthrough being used in a host of applications, including video games, film, and TV. “Such ability has practical applications for telepresence, including videoconferencing and multi-player games, as well as special effects industry,” they wrote.
The paper was accompanied by a video showing off the team’s creations, which also happened to be scored with a disconcertingly chill-vibes soundtrack.
Usually, a synthesized talking head requires you to train an artificial intelligence system on a large data set of images of a single person. Because so many photos of an individual were needed, deepfake targets have usually been public figures, such as celebrities and politicians.
The Samsung system uses a trick that seems inspired by Alexander Graham Bell’s famous quote, “Before anything else, preparation is the key to success.” The system starts with a lengthy “meta-learning stage” in which it watches lots of videos to learn how human faces move. It then applies what it’s learned to a single still or a small handful of pics to produce a reasonably realistic video clip.
Unlike a true deepfake video, the results from a single image or a small number of images fudge fine details. For example, a fake of Marilyn Monroe in the Samsung lab’s demo video missed the icon’s famous mole, according to Siwei Lyu, a computer science professor at the University at Albany in New York who specializes in media forensics and machine learning. Working from so few images also means the synthesized videos tend to retain some semblance of whoever played the role of the digital puppet. That’s why each of the moving Mona Lisa faces looks like a slightly different person.
Generally, a deepfake system aims at eliminating those visual hiccups. That requires meaningful amounts of training data from both the input video and the target person.
The few-shot or one-shot aspect of this approach is useful, Lyu said, because it means a large network can be trained on a large number of videos, which is the part that takes a long time. This kind of system can then quickly adapt to a new target person using only a few images without extensive retraining, he said. “This saves time in concept and makes the model generalizable.”
The rapid advancement of artificial intelligence means that any time a researcher shares a breakthrough in deepfake creation, bad actors can begin scraping together their own jury-rigged tools to mimic it. Samsung’s advances are likely to find their way into more people’s hands before long.
The glitches in the fake videos made with Samsung’s new approach may be clear and obvious. But they’ll be cold comfort to anybody who ends up in a deepfake generated from that one smiling photo posted to Facebook.
Joan E. Solsman