Artificial intelligence is increasingly finding its way into the entertainment industry: the technology has been used to generate fake Drake songs, and Hollywood writers have gone on strike in part over how little it is regulated.
As it spreads, AI is now reaching into the past. No, AI hasn’t become so powerful that it’s capable of time travel, but it can be used to bring a person’s voice back from the dead, including John Lennon’s.
Recreating John Lennon’s Voice Using AI
In an interview with BBC Radio 4, The Beatles’ Paul McCartney explained that he’d been able to use AI to isolate John Lennon’s voice from an unreleased demo track that Lennon’s widow, Yoko Ono, gave him after Lennon’s death in 1980.
The BBC speculates that the unreleased track is a 1978 composition titled “Now And Then,” because McCartney has expressed interest in finishing it in previous years. In a 2012 BBC Four documentary, for example, McCartney described the track as:
“still lingering around.”
He added:
“So I’m going to nick in with Jeff and do it. Finish it, one of these days.”
Thanks to artificial intelligence, it seems that day has come, and the track is likely to be released at some point this year.
On using AI to clean up the track, McCartney said:
“We had John’s voice and a piano, and he could separate them with AI. They tell the machine, ‘That’s the voice. This is a guitar. Lose the guitar.’”
He continued:
“So when we came to make what will be the last Beatles’ record, it was a demo that John had [and] we were able to take John’s voice and get it pure through this AI.”
Source: https://www.bbc.co.uk/news/entertainment-arts-65881813
What Is AI Voice Cloning?
Paul McCartney’s informal explanation of how he and his producer used AI to extract John Lennon’s voice from a “ropey” recording leaves much to be desired. However, it does touch on a key point: a voice and the instruments around it can be separated even though they were recorded together.
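To make the idea of separation concrete, here is a minimal Python sketch. It is not the method McCartney’s team used, which hasn’t been published in detail, and it involves no machine learning. It simply mixes two pure tones, a low one standing in for a voice and a high one standing in for an instrument, then recovers the low one by silencing every frequency above a cutoff. The sample rate, tone frequencies, cutoff, and function names are all illustrative assumptions.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (toy value, assumed)
NUM_SAMPLES = 400   # 0.05 s of audio, giving 20 Hz per frequency bin


def tone(freq_hz, amplitude):
    """Return a pure sine tone as a list of samples."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
        for n in range(NUM_SAMPLES)
    ]


def dft(signal):
    """Naive discrete Fourier transform (slow, but dependency-free)."""
    size = len(signal)
    return [
        sum(signal[n] * cmath.exp(-2j * math.pi * k * n / size) for n in range(size))
        for k in range(size)
    ]


def inverse_dft(spectrum):
    """Naive inverse transform; returns the real part of each sample."""
    size = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / size) for k in range(size)) / size).real
        for n in range(size)
    ]


def keep_below(mixture, cutoff_hz):
    """Zero every frequency bin at or above cutoff_hz, then rebuild the audio."""
    size = len(mixture)
    spectrum = dft(mixture)
    masked = []
    for k, value in enumerate(spectrum):
        # Bins above size/2 mirror the lower half, so fold them back.
        freq_hz = min(k, size - k) * SAMPLE_RATE / size
        masked.append(value if freq_hz < cutoff_hz else 0j)
    return inverse_dft(masked)


voice = tone(200, 1.0)   # stand-in for the vocal
piano = tone(1000, 0.5)  # stand-in for the instrument
mixture = [v + p for v, p in zip(voice, piano)]

recovered = keep_below(mixture, cutoff_hz=500)
max_error = max(abs(r - v) for r, v in zip(recovered, voice))
print(f"largest difference from the original voice: {max_error:.2e}")
```

A fixed cutoff only works here because the two tones sit in different frequency ranges. Real voices and instruments overlap heavily, which is why AI separation tools generally learn a much more flexible mask from example recordings.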
There are many ways to do this using AI, and without speaking with McCartney’s audio engineer, it’s impossible to know which method they chose. What he describes is closest to audio source separation, a technique closely related to AI voice cloning.
AI voice cloning technology generates human-like speech – or in this case, singing – from written text by training an AI model on a dataset of recorded human voices. In other words, the model listens to many recorded examples of a human voice and learns to mimic how it sounds.
How Does AI Voice Cloning Work?
So, how exactly might someone clone a voice using AI and reproduce it for use in music and entertainment? Here’s a step-by-step explanation.
Data Collection: A large dataset of recorded human voices – or in this case, one person's voice – is collected to reflect a diverse mix of styles, accents, ranges, and emotions.
Preprocessing: The audio data is then processed to extract key features, such as intonation, pitch, and duration. This step turns the raw audio into a representation that the AI models used later can learn from more easily.
Model Training: A deep learning AI model, typically a neural network like those behind ChatGPT and other generative tools, is then trained on the preprocessed audio data. This means that it works to identify patterns, relationships, and distinct characteristics of the voice(s) in the audio data.
Text-to-Speech: Once the model is trained, the synthesis phase begins. During this phase, the model takes written text as input and processes it to generate speech. The text may also undergo linguistic analysis and phonetic mapping to ensure accurate pronunciation.
Voice Generation: The trained model leverages the knowledge acquired during training to generate the corresponding speech output. It applies learned speech patterns, intonations, and other features to produce human-like speech. Various parameters, such as pitch, speed, and emotional tone, can be adjusted to customize the generated voice.
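As a rough illustration of the preprocessing step above, the Python sketch below pulls two simple features, pitch and duration, out of a short audio clip. It is a toy under stated assumptions: the “recording” is a synthetic 200 Hz tone rather than a real voice, the pitch is estimated with basic autocorrelation (one classic approach, not necessarily what any particular voice cloning platform uses), and the function names and sample rate are made up for the example.

```python
import math

SAMPLE_RATE = 8000  # samples per second (toy value, assumed)


def estimate_pitch_hz(samples, min_hz=50, max_hz=500):
    """Estimate pitch from the lag at which the signal best matches itself."""
    min_lag = SAMPLE_RATE // max_hz  # shortest period considered
    max_lag = SAMPLE_RATE // min_hz  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation: multiply the clip by a delayed copy of itself.
        score = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag


def duration_seconds(samples):
    """Length of the clip in seconds."""
    return len(samples) / SAMPLE_RATE


# Synthetic stand-in for a sung note: 800 samples of a 200 Hz sine wave.
note = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(800)]

features = {
    "pitch_hz": estimate_pitch_hz(note),
    "duration_s": duration_seconds(note),
}
print(features)
```

In a real pipeline, features like these would be computed over many short windows of each recording and passed on to the model training step.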
It’s important to note that the steps involved will vary depending on the application of the AI-generated voice and the AI model that’s used. There are also many easily accessible voice cloning platforms available to help ease the process, including Murf.ai, Speechify, and Respeecher.
A Positive Example of Deepfakes
AI voice cloning is an example of deepfake technology. We’ve already covered deepfake technology in a previous article, so we won’t go into detail here. What you should know, however, is that deepfake technology gives people the ability to accurately recreate the likenesses of real humans using AI.
Deepfake is often treated as a dirty word, carrying negative connotations because of the harmful ways the technology is frequently used. Yet Sir Paul McCartney using AI to recreate John Lennon’s voice is an excellent example of a positive use of deepfake technology for entertainment purposes.
As McCartney notes in his BBC Radio 4 interview:
"[AI is] kind of scary but exciting, because it's the future. We'll just have to see where that leads."
AI Business Report is brought to you by Californian development agency, Idea Maker.