My interest is piqued when someone tells me that the most significant disruption in the film industry in 100 years is happening. And it's all thanks to generative AI.
Generative AI is a sub-field of machine learning, most prominently used in text-to-speech and text-to-design software. For example, ChatGPT writes text like a human, unless you're trying to write accurately about economics and finance. DALL-E creates art from text.
But generative AI is also transforming film and video. For example, in Berlin, Colossyan helps companies make corporate videos. It uses AI and machine learning to generate real-life actors, creating hyper-realistic synthetic videos in minutes from an inserted script.
And now it's hit Hollywood.
Scott Mann is the co-CEO and founder of the neural network lab Flawless. He is a seasoned Hollywood Director and Producer, working with A-list talent on films such as "Heist", "Final Score" and "The Tournament", directing actors like Robert De Niro, Pierce Brosnan, Robert Carlyle, Kate Bosworth, Jeffrey Dean Morgan, and Bruce Willis.
As a director, Mann saw a problem with the visuals of films dubbed into other languages:
I had written a script with the actors, directed it, edited it, adjusted it and been so careful with not just the words, but the nuance of the sound of a word or the inflexion in what's implied or the detail of expression. And then to see it ruined or almost like a caricature with all the wrong dialogue – it was awful.
In response, his company developed in-house proprietary software called TrueSync. Through a process Flawless calls "vubbing", it makes an actor's mouth movements match the dubbed dialogue when a film is translated into another language.
Generative AI is applied to video to "create something that didn't exist before using a lot of underlying ground truth training data of the original to make it as genuine as the original."
Mann describes this as a way to make speech authentic and capture the actor's nuance: "It's going from what's within and beneath a performance and capturing something action-centric to generate something new."
The company currently works with studios in post-production editing. Time Magazine rated the software one of the best inventions of 2021.
F-bombs are the enemy of film censors
The company also found a second use case for its tech: removing swear words from film footage. Take Mann's film "Fall", where the protagonists climb vertigo-inducing heights and find themselves stranded with no way down.
If there's ever a movie that deserves a few F-bombs, it's Fall. You can bet the audience would be shouting more than "darn".
The studio wanted it released with a PG-13 rating, but it had too many swear words. Shooting a film is, according to Mann, "very difficult, cumbersome, and expensive. It's a process of iteration from the captured footage that involves extensive editing to try and make the best version out of it, which may require reshooting, which adds even more delays and costs."
And this would be doubly hard given most of Fall was shot high in the air.
Thankfully, Flawless saved the day, and TrueSync was able to remove the necessary amount of foul language quickly.
But are there ethical issues about changing what and how someone speaks in the first place or generating new dialogue altogether?
What are the ethics around AI in filmmaking?
Ethics and AI in filmmaking don't always go together. The documentary "Roadrunner" showcased the life of Anthony Bourdain, now deceased. It applied AI to audio previously recorded for other projects to generate dialogue that narrates the film.
This raised ethical issues for film critics, some of whom felt that audiences should have been warned in advance, giving them the option to opt out, especially since Bourdain could not consent.
Flawless works closely with the guilds and unions representing actors, directors, and writers. Mann shared, "I think it's always been a case of involving them as early as possible to ensure those protections are in place."
Mann notes that the film industry already has frameworks built around issues such as film rights, permissions, compensation and consent. "We're working on updating these to accommodate emerging technologies."
Will AI put people out of a job?
What about AI disrupting traditional industries and replacing people with computers? I'd be thrilled to watch a film in Deutsch that doesn't use the same voice-over artist for multiple roles, but Mann explained that dubbing isn't going away anytime soon.
"Synthetic audio isn't at the stage where it has the nuance to do it better than a human. And that's going to be the case for a while."
He notes that "we've all been victims of disruption in the past" and advises anyone worried about being replaced to "dive in and learn".
"This is creating a huge amount of creative opportunities, and embracing some changes will help many people sooner rather than later." Like digital editing and digital cameras, the film industry is forever evolving.
A future of superpower tools for filmmakers
Mann believes that the next five years will see more filmmaking tools that allow film technicians to bring things formerly restricted to film sets to the post-production environment:
These co-pilot tools are going to be superpowers in filmmakers' hands, and it's going to allow us to get the most from artists, actors and filmmaking. From my view, I would shoot a film differently. It would probably be shorter because I know what I can do on the other side.
While Flawless is currently designing software for high-level use, like an IMAX cinematic experience, there's scope in the future to push it into different sectors, including B2C and YouTube.
According to Mann, lowering production costs will not only improve filmmaking but also increase quality and expand audiences:
It's going to get right to the heart of why we make movies – to communicate across cultures and languages.
And if you like watching movies, that's a good reason to get behind generative AI.
Lead image: Tima Miroshnichenko.