We’ve all seen the incredible videos of Bill Hader morphing into Tom Cruise, or Steve Buscemi suddenly emerging from Jennifer Lawrence’s face. For anyone with an eye on the world of CG animation and VFX, the potential that deepfake technology holds for filmmakers and creatives is pretty exciting. Scary, yes, but exciting.
But this machine learning technology also raises a lot of questions. How does it work? What are its limitations? What are the ethical implications? And, perhaps most pressing for those working in advertising: is it a magic bullet that will allow us to create content more quickly and more cheaply – or is it a bit more nuanced than that?
To that end, the Oscar-winning international creative studio Framestore has pulled together an in-depth exploration of deepfakes for Lions Live on June 26th. They’ll be breaking down how the technology works and how it can and could be applied. To promote the session, they created a pair of cheeky trailers – a fluffy, blustering endorsement from UK PM Boris Johnson and an accusation of ‘fake news’ from US President Donald Trump.
LBB’s Laura Swinton spoke to Framestore’s co-founder and CCO Mike McGee and ECD William Bartlett to find out how the trailers were made, the practicalities of deepfake and why AI can only enhance rather than replace true creativity.
LBB> Why did you want to tackle deepfakes in your Cannes talk this year?
As you can imagine, once the pandemic began to impact film and TV production, we were inundated with requests from clients looking for creative solutions that would allow them to create content without shooting. At the same time we were getting almost daily press requests for our team to answer questions on deepfakes: how they work and how else the same technology can be used.
So, for Lions Live we decided to entertain, educate and inspire the Lions audience and our clients by producing some content of our own that would show how deepfake works and what the impact of the technology will be in the future.
LBB> For your trailer, what was the starting point?
As Lions Live has an international audience we wanted to deepfake some really well known figures who would be instantly recognisable. It was important that everyone could see the fidelity of the images and decide how convincing they are, so we ‘cast’ two very well known leaders as our talent. Obviously we also wanted this to be entertaining by itself, not a dry technical exercise, and both characters have a certain ‘comedy’ of their own.
For the illusion to be convincing it’s not just the images that need to be perfect. Voice is equally important - so we cast a very talented impressionist, Lewis Macleod, who also happened to share certain physical characteristics with our subjects.
We also felt it was important to explore a more serious point about deepfakes. It’s obvious from the content that our film is not real, but we wanted to highlight how possible it already is to re-create very public figures in a way that is almost indistinguishable from reality. That ability is potentially open to abuse, and there is a conversation about ethics that should surround that.
LBB> I understand that there was a lot of serendipity when it came to casting and working with your actor in terms of his availability and possession of a green screen. Can you tell me about how you found him, where he was and how you worked with him?
Lewis happens to be a friend of a colleague, and also someone we’d been keen to work with for a while - we just hadn’t yet found the right project to collaborate on. This was the perfect opportunity.
When we contacted him he was in lockdown in Scotland, but still very busy doing voice records in his own studio. He was excited to explore the possibilities of delivering a performance remotely, using a very homemade filming set up.
Lewis had a green screen, an iPhone that could film in 4K and a laptop with Zoom - so we could remote into the improvised studio setup.
He also had a fantastic ability to ad-lib using the familiar expressions and body language of our deepfake subjects - which was very important. Just as in almost every other area of visual effects, technology can only ever aid performance - it doesn’t create it.
LBB> What did you learn from the process of directing specifically for deepfake? What insights came up that you think will help in future work?
William Bartlett directed Lewis’ performance to create a teaser trailer for both Boris and Donald and a two minute sketch to open the actual presentation.
Setting aside technical considerations, performance is key to making a convincing deepfake. Being able to view the performance almost in the abstract - knowing that visually it would be processed beyond recognition while still remaining what drove the final output - was essential to getting the result we wanted.
Certain body gestures, head movements and facial poses from Lewis really enhanced the believability and character of both Boris and Trump - and Lewis’ ability to improv around the script, adding in characteristic verbal tics and tropes had a similar effect.
One key insight from the process is that once you have trained a network to create an effect, it will continue to improve the more it is ‘trained’. This suggests that one could maintain a continuously updated ‘Trump’ or ‘Boris’ and know they could be applied with improving results each time they were used.
LBB> In terms of VFX supervision, what considerations and preparation did you need to incorporate into the shoot to make the footage easier to combine with deepfake technology? Did you do anything beyond what you would usually do with a VFX-heavy project, or was it pretty similar?
We were very specific with Lewis about the framing, lighting and what we needed from a VFX perspective. The additional considerations from a deepfake point of view were to ensure that Lewis’ character wigs did not obscure the performance and that his hands didn’t cross his face during each take - to keep things simple.
LBB> In my very rudimentary understanding, deepfakes rely on Generative Adversarial Networks, using two competing networks learning from a data set – in this case images of Boris and Trump, I’d assume. Would you be able to explain roughly how it worked?
What’s fascinating about this technology is that it is founded on an almost entirely different methodology for generating visual effects.
For the full story, you’ll need to watch our presentation...
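For readers who can’t wait for the presentation, the ‘two competing networks’ idea in the question can be sketched in miniature. The toy script below is purely illustrative - it has nothing to do with Framestore’s actual tooling, and it trades faces for simple one-dimensional numbers - but the adversarial loop is the real thing: a generator learns to produce samples that a discriminator can no longer distinguish from the ‘real’ data.

```python
import numpy as np

# Toy 1-D GAN: a generator learns to mimic "real" data drawn from N(4, 0.5).
# In an actual deepfake the data would be thousands of images of the subject
# and both networks would be deep; here each "network" is a single affine map.
rng = np.random.default_rng(0)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def real_batch(n):
    # Stand-in for "images of the subject": samples from N(4, 0.5).
    return rng.normal(4.0, 0.5, n)

w, b = 1.0, 0.0   # generator: x = w*z + b, with noise z ~ N(0, 1)
a, c = 0.1, 0.0   # discriminator: D(x) = sigmoid(a*x + c), "probability real"

lr, steps, batch = 0.05, 2000, 64
for _ in range(steps):
    z = rng.normal(0.0, 1.0, batch)
    xr, xf = real_batch(batch), w * z + b

    # Discriminator step: push D(real) towards 1 and D(fake) towards 0.
    sr, sf = sigmoid(a * xr + c), sigmoid(a * xf + c)
    a -= lr * (np.mean(-(1 - sr) * xr) + np.mean(sf * xf))
    c -= lr * (np.mean(-(1 - sr)) + np.mean(sf))

    # Generator step: push D(fake) towards 1 (non-saturating loss).
    sf = sigmoid(a * xf + c)
    grad_x = -(1 - sf) * a          # gradient of -log D(x) w.r.t. x
    w -= lr * np.mean(grad_x * z)
    b -= lr * np.mean(grad_x)

fake_mean = float(np.mean(w * rng.normal(0.0, 1.0, 10000) + b))
print(f"generator output mean after training: ~{fake_mean:.2f} (target 4.0)")
```

Starting from a generator whose output is centred on zero, the adversarial pressure drags its output distribution towards the real data’s mean of 4.0 - the same dynamic, at vastly greater scale, that teaches a deepfake network to render a convincing face.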
LBB> Fair enough! What were the challenges that came up when making Boris and how did you overcome them?
Boris and Trump both have hairstyles that can best be described as... characterful. A lot of extra VFX work went into ‘assisting’ the styling in post. Ideally, with the help of a trained hairdresser on set, we would have improved the look and integration of the wigs at the shoot itself. The end result would have been achieved a lot quicker if we’d only had to apply a face replacement rather than face and hair.
LBB> What other projects has Framestore done with deepfake tech?
Like we said at the beginning, we’ve had many requests to explore using this technology. We can’t tell you where we are using it yet, but some deepfakes will pop up in the not-too-distant future - and you may (or may not) realise that’s what they are.
LBB> The talk is called Demystifying Deepfakes – what do you think are the most common misconceptions about the technology?
A lot of people think that you can take a bit of footage of Boris Johnson and make him say something different. This is not how it works. What you can do is take a face and apply it like digital makeup to another performance. The ‘animation’ is driven by your new performance, and so to create something believable the impersonation needs to be very good.
Another misconception is that it’s bedroom tech and anyone can do it to a minimum standard (you can google any number of examples to see the results). To do it well you need a talented cast, a good script, decent lighting, and the ability to tweak all the additional elements that make a really convincing fake.
LBB> The most common application of deepfake tech that we’ve seen publicly is creating digital versions of celebrities, but what are the other applications for it for brands and filmmakers?
Other applications of the same technique include de-ageing and ageing a cast. Alternatively, any actor could now speak any language for worldwide campaigns. There will be numerous uses for deepfake in reusing old footage (provided there is enough material to train the network). But brands will also be able to specifically design and own the face of their campaign, or create local faces in regional campaigns. Digital or virtual newsreaders will now be possible, as will faces for Alexa or Siri - or just for your own personal PC.
LBB> Will deepfake technology democratise creativity?
Technology is only ever a tool. Machine Learning is a powerful tool but still one that needs to be harnessed, trained and combined with strong creative ideas to get the most from its potential. Like most software and hardware, as it becomes more mainstream and affordable it will find its way into the home studios of artists and technologists and enable them to create customised digital experiences for any platform.
In the worlds of TV, film and advertising we have a technique and technology very much in its infancy but one that offers creatives an exciting new range of storytelling possibilities. At Framestore we continue to research and develop AI and machine learning, looking to adapt and integrate them with our existing facial capture, performance capture, creature animation and fast rendering tool sets.
LBB> How will this technology change the creative and production process?
As our talk explains, deepfake is just one application of neural networks, AI and Machine Learning. The power of computers that can teach themselves will open up a world of creative possibilities.
Right now, being able to do the same things faster means more time for creative iteration. The step change occurs when artists and creators are not just using the technology to execute something they have already decided on, but when Machine Learning is able to create and refine images according to creative parameters; to make an image ‘more sinister’ or a face ‘more friendly’. At that point working with a computer will become more akin to working with any other creative collaborator, like an art director or a designer.
The change to existing working practices promises to be so game changing that we will not just ‘do what we do quicker’ but will fundamentally change ‘the way we do what we do’.
Demystifying Deepfakes will take place on June 26th as part of Cannes Lions Live, at 5.10pm (BST) / 12.10pm (EDT).