ZOO Digital
ZOO Digital works at the forefront of technology for dubbing speech in films and TV, but needed a helping hand to make its next move.
ZOO Digital specialises in subtitling and dubbing film dialogue from one language into another, and has worked with many well-known names in the TV and film industry.
The global media company, set up in Sheffield in 2001, pioneered the use of cloud technology for dubbing, so actors could record voices remotely without needing expensive recording studios. But it knew that maintaining its competitive edge in the fast-moving world of film production meant cracking another major problem.
ZOO’s AI Research Manager, Chris Bayliss, said: “It can be very difficult to find professional voice artists with availability, especially in certain territories and languages. When dubbing content that involves children, finding a child voice actor in those territories is even harder.
“We wanted to know if we could recruit an adult voice actor to read the parts, and then have their voice transformed into the voice of a particular child.”
Tapping into the University’s AI expertise
ZOO works with a global network of more than 5,000 freelance translators and dubbing artists. It knew that changing an adult voice into a child’s voice was not just about changing pitch but also about adjusting the style of delivery. After several previous successful projects involving speech and AI, the company decided to seek help from the University’s Speech and Hearing Research Group.
“Initially we took our lead from them because we didn’t have any internal resource. Like most AI projects, we knew it would require a lot of training data – so lots of examples of speech from both adults and children – what we call parallel data, where you have one adult reading a script but several children reading exactly the same script.
“We don’t see this as replacing real actors. We are looking to generate a specific voice, so there's a real actor who has recorded their voice and provided it as a target specifically for this purpose.
“Our aim was not just to transform the voice, but also the performance, because this process is intended for TV dramas and the like. It can’t just be like a newsreader speaking in flat tones,” Chris said.
Finding children to produce the training data for the University collaboration was a challenge, so ZOO staff turned to their own families for help with the research exercise.
“We ended up with a number of individual child speakers and around 10 hours of recorded speech in total.”
University of Sheffield Head of Speech and Hearing Research Professor Thomas Hain said: “Voice conversion is a long-standing topic in our speech community and so we've worked on some of that technology before, yes, but not specifically for children. That’s why it was an interesting project.
“This project was all about machine learning. So, you need data that says: ‘Okay, here's an example of an adult speaker. Here's an example of a child speaker, same audio’. The machine then has to ‘learn’ the parameters that provide that mapping between the two.
“If you do it with reasonable quality you will end up with something like the fingerprint of a child’s voice.”
ZOO’s 18-month Knowledge Transfer Partnership (KTP) with the University ended in autumn 2023 and has left the company with a showcase of pre-canned examples of adult/child voice transformations to share with TV and film studios. The company ultimately aims to develop a live system where an adult voice is fed into software and a child’s voice comes out of the other end. It expects its new competitive edge to generate a £1 million profit within three years.
Professor Hain said: “From my perspective, the KTP worked better than expected and the outcomes were really very good. Given the constraints and timescales under which we were working, the quality of the generated speech was so much better than we could have hoped for. But for me this is all part of a journey where you try to basically upscale companies to use really modern technologies.
“The world of AI progresses very fast. And the very big computer companies that have a lot of money have an enormous advantage. One of the satisfactions for me is helping smaller companies to compete with that.”