DeepFake is an application that uses convolutional neural nets to generate a map of a person's face, then places that map over an actor in a video. This technology can create amazing mashups where the star of a movie is played by someone else. The media has continually tried to equate Elon Musk with Iron Man to increase clicks to their articles, so I decided that having Elon Musk actually play Iron Man would be a perfect starting point for experimenting with this technology.
Below: Elon Musk confessing that he is, in fact, Iron Man.
The first step in creating the digital representations is to generate a massive data set. In this case, I took hours of TV interviews with Elon Musk and Robert Downey Jr. and separated them into individual frames. After cutting out the clips of the interviewer, I used DeepFake to crop out the faces. Unfortunately, the software had trouble cropping out facial hair, which is why the above video stars a bizarre mishmash of both characters. The cropped faces are then used as training data for DeepFake's convolutional neural network, which learns to map the faces onto each other. Finally, the trained network is run on each frame of the target video, and the frames with the spliced faces are recompiled into a video. To that video file you can re-sync the original audio or add a new track.
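The frame-splitting and recompiling steps described above happen outside the neural network, so they are easy to sketch. The snippet below is only an illustration under assumptions: it assumes ffmpeg is the tool doing the work (I am not claiming this is what DeepFake does internally), and the function names, file names, and the default frame rate of 24 are all made up for the example. It only builds the command lines; nothing is executed.

```python
from typing import List


def split_frames_cmd(video: str, frames_dir: str) -> List[str]:
    """Build an ffmpeg command that writes each frame of `video` as a numbered PNG.

    Hypothetical helper: `frames_dir` is assumed to exist already.
    """
    return ["ffmpeg", "-i", video, f"{frames_dir}/%06d.png"]


def rebuild_video_cmd(frames_dir: str, original: str, output: str, fps: int = 24) -> List[str]:
    """Build an ffmpeg command that turns swapped frames back into a video
    and copies in the audio track from the original clip.

    `fps` is an assumption and should match the frame rate of the source clip,
    otherwise the audio will drift out of sync.
    """
    return [
        "ffmpeg",
        "-framerate", str(fps),
        "-i", f"{frames_dir}/%06d.png",  # input 0: the face-swapped frames
        "-i", original,                   # input 1: original clip, used for audio only
        "-map", "0:v:0",                  # take video from the frame sequence
        "-map", "1:a:0",                  # take audio from the original clip
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",            # widely compatible pixel format
        "-shortest",                      # stop when the shorter stream ends
        output,
    ]


if __name__ == "__main__":
    # Print the commands rather than running them.
    print(" ".join(split_frames_cmd("interview.mp4", "frames")))
    print(" ".join(rebuild_video_cmd("swapped", "iron_man_clip.mp4", "elon_iron_man.mp4")))
```

To actually run one of these, you could pass the returned list to `subprocess.run(cmd, check=True)`. The face cropping, training, and per-frame inference in between are handled by the DeepFake application itself and are not shown here.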
Below: Progress of the CNN in mapping Elon's face onto Robert Downey Jr.'s, and vice versa.
Overall, the technology works pretty well and is packaged in an extremely user-friendly application. A powerful GPU is still a must, as is a good data set for training, ideally with the faces in well-lit environments. In my case, the algorithm trained for 3 days straight on my GTX 980 Ti before I decided to stop training and run inference.
Below: Suiting up after crashing his McLaren F1 on Sand Hill Road
Although the lips and eyes match up properly with the audio, the voice still belongs to Robert Downey Jr. As Elon is unlikely to act out these words himself, another network would be necessary to generate speech in Elon's voice. Lyrebird is one such option; it gained some fame by successfully emulating President Trump's voice. Faked speech matched with a faked face could produce a highly convincing forgery.
I have compiled the two clips above (and a third bonus clip) below.
Note: The internet being what it is, many bad actors have used this technology to create "non-consensual pornography" wherein famous actors are spliced onto porn stars. At the same time, other communities have formed with bizarre purposes, such as having Nicolas Cage star in every role in every movie. Like any technology, this software can be used for good, evil, or just for fun.