In addition to audio description, VIDEO TO VOICE is actively involved in other fields surrounding accessibility for digital media. To break down language barriers, Frazier includes a neural machine translation service. With lower costs and fewer delays, this option is a popular choice for clients wanting to create video content in multiple languages.
Jokes about online translation tools are so last decade. When the first real-time machine translation services emerged in the early 2000s, they were ridiculed for producing questionable results. At best amusing, and at worst wildly offensive, translations created this way would often be too literal and stray from the intended meaning of the original text.
But as technology has improved in recent years, there have been noticeably fewer “translation fail” memes doing the rounds on social media. The latest developments in machine translation involve neural networks. These are a subset of machine learning techniques that try to mimic how the human brain works during the translation process.
Intrigued? Read on to find out everything you need to know about the language service industry, where providers are having trouble, and how machine translation makes localization easy.
It's simple – we all want access to content and information that is naturally communicated in our own language.
From a business standpoint, effective communication helps companies improve their brand image, boost customer satisfaction, and increase staff productivity.
As well as keeping existing customers happy, translated content can help companies expand into other markets. This is backed up by findings in the 2020 Common Sense Advisory report “Can't Read, Won't Buy”.
The global language services industry was valued at $49.6 billion in 2019. The market is forecast to reach $77 billion by 2024, spurred on by increasing demand for video content in particular.
By 2022, 82% of internet traffic will be video content, a 15-fold increase in the number of videos from 2017. Over 4 in 5 businesses now use video as part of their marketing campaigns.
With more video content comes the need for more translation and localization. But as production increases, an ever smaller fraction of videos is being localized.
Language service providers are not equipped to keep up with the workload.
This is mainly due to time and cost factors.
The traditional workflow usually follows this structure:
The translation itself is usually the most time-consuming step. If there are tight deadlines, potential clients may choose to forgo the translation altogether.
Money is another important consideration.
With average translator rates varying between $0.07 and $0.15 per word, potential clients may not have the budget to get the translation done.
If audio needs to be translated, the client also has to consider voice artist and studio costs for the new recording.
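To put those rates in perspective, here is a minimal sketch of the arithmetic. The per-word rates are the figures quoted above; the function name and the 1,500-word script are illustrative assumptions, and voice artist and studio fees are deliberately left out.

```python
from __future__ import annotations

# Per-word rates quoted above (USD); everything else here is illustrative.
LOW_RATE_USD = 0.07
HIGH_RATE_USD = 0.15


def translation_cost_range(word_count: int, target_languages: int = 1) -> tuple[float, float]:
    """Return the (low, high) human translation cost in USD.

    Covers the translation step only: voice artist and studio
    costs for a new recording are not included.
    """
    if word_count < 0 or target_languages < 0:
        raise ValueError("word_count and target_languages must be non-negative")
    total_words = word_count * target_languages
    return (
        round(total_words * LOW_RATE_USD, 2),
        round(total_words * HIGH_RATE_USD, 2),
    )


if __name__ == "__main__":
    # Hypothetical 1,500-word video script translated into three languages.
    low, high = translation_cost_range(1500, target_languages=3)
    print(f"Estimated translation cost: ${low:,.2f} to ${high:,.2f}")
```

Even for a short script, the bill grows linearly with every additional target language, which is why budget is often the deciding factor.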
This is why language service providers need to embrace machine translation to meet demand.
Machine translation automatically converts a source text from one language into a target text in another language.
Machine translation saves the language service provider time and money, making it more likely that potential clients will decide to translate their content.
With machine translation, the target text is ready in seconds. This also eliminates the costs involved at the actual translation stage.
The process is simple. After the translation is generated, a post-editor looks over the text.
Post-editing is the stage where human linguists make any necessary changes to the translation for the finished product.
The earliest machine translation technologies were developed using a rule-based system. This means the tool would translate sentences word for word based on dictionaries and set grammar rules.
Unfortunately, the rule-based approach often failed to take into account the context and overall meaning of the text. The resulting errors are what get lampooned in the “translation fail” memes doing the rounds on clickbait sites.
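A toy sketch makes the weakness concrete. The dictionary and sentence below are invented for illustration and do not represent any real rule-based engine, which would also apply grammar rules. The point is only that a word-for-word lookup cannot see context.

```python
# Tiny hypothetical German-to-English dictionary: one fixed translation per word.
DICTIONARY = {
    "ich": "I",
    "verstehe": "understand",
    "nur": "only",
    "bahnhof": "train station",
}


def translate_word_for_word(sentence: str) -> str:
    """Translate each word independently, ignoring context and idioms."""
    words = sentence.lower().rstrip(".!?").split()
    # Words missing from the dictionary are passed through unchanged.
    return " ".join(DICTIONARY.get(word, word) for word in words)


if __name__ == "__main__":
    # German idiom meaning roughly "It's all Greek to me".
    print(translate_word_for_word("Ich verstehe nur Bahnhof."))
    # prints: I understand only train station
```

Every individual word is translated correctly, yet the idiom's meaning is lost entirely.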
The rule-based system has since given way to more refined models such as statistical machine translation and neural machine translation.
Both statistical machine translation (SMT) and neural machine translation (NMT) are AI-based and use corpora.
Corpora are collections of texts that have been written by professionals in one language and translated into others, so that the versions can be compared side by side.
For example, European Parliament documents are translated into the official languages of the EU member states. These parallel texts are a good example of the material that goes into corpora.
SMT matches up equivalent words and expressions in texts, while NMT learns from them using neural networks.
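To give a feel for the "matching up" idea behind SMT, here is a deliberately simplified sketch. It counts how often words appear together in aligned sentence pairs and picks, for each source word, the target word with the highest Dice score. The three-sentence corpus is made up, and real SMT systems rely on far larger corpora and more sophisticated alignment models, so treat this as an intuition aid rather than a description of any production system.

```python
from collections import Counter
from itertools import product

# Made-up parallel corpus: (English, German) sentence pairs.
PARALLEL_CORPUS = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a house", "ein haus"),
]


def best_matches(corpus):
    """Map each source word to the target word it co-occurs with most consistently."""
    source_counts, target_counts, pair_counts = Counter(), Counter(), Counter()
    for source_sentence, target_sentence in corpus:
        source_words = set(source_sentence.split())
        target_words = set(target_sentence.split())
        source_counts.update(source_words)
        target_counts.update(target_words)
        # Every source word is a candidate match for every target word in the pair.
        pair_counts.update(product(source_words, target_words))

    matches = {}
    for source_word in source_counts:
        def dice(target_word, source_word=source_word):
            # Dice score: 2 * joint count / (source count + target count).
            together = pair_counts[(source_word, target_word)]
            return 2 * together / (source_counts[source_word] + target_counts[target_word])

        matches[source_word] = max(target_counts, key=dice)
    return matches


if __name__ == "__main__":
    for source_word, target_word in sorted(best_matches(PARALLEL_CORPUS).items()):
        print(f"{source_word} -> {target_word}")
```

With this corpus, "house" pairs with "haus" because the two always appear together, while "das" also shows up without "house" and so scores lower.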
Neural networks are a subset of machine learning that attempts to replicate how the brain works when translating between two languages.
NMT accounts for the context in which a word is used, instead of just translating each word on its own.
For example, the technology recognizes whether the text uses a formal register or slang. If a user makes corrections, the system can also learn from them for future translations.
As a result, these sophisticated systems achieve high-quality results and have since become the industry standard.
The following providers currently dominate the market:
DeepL recently expanded its services to support 23 languages with the promise of more on the way.
Amazon Translate supports 54 languages and dialects, resulting in 2,804 language pairs.
Microsoft Translator has 90 languages and dialects available on the platform.
In terms of quantity, Google Translate comes out on top with 109 languages to choose from.
It is difficult to determine which provider delivers the best results.
One key factor is the quality of the training data available for a particular language pair. Training data is a large dataset that is used to teach machine learning models.
In-house translators at phrase.com conducted a blind test to evaluate target texts generated by machine translation providers. Each pair had a different first preference:
Language pair | 1st preference | 2nd preference | 3rd preference | 4th preference |
---|---|---|---|---|
English to French | Microsoft | DeepL | Amazon | |
English to German | DeepL | Microsoft | Amazon | |
English to Russian | Amazon | Microsoft | DeepL | |
As the results show, it is important to shop around and read evaluations to find the best provider for your language pair.
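If you route jobs to different engines depending on the language pair, the ranking above can be captured in a small lookup table. This is only a sketch: the rankings are copied from the blind test, while the function name, the language codes, and the fallback behaviour are assumptions, and no provider API is called.

```python
# Rankings copied from the blind-test table above (best first).
PROVIDER_RANKING = {
    ("en", "fr"): ["Microsoft", "DeepL", "Amazon"],
    ("en", "de"): ["DeepL", "Microsoft", "Amazon"],
    ("en", "ru"): ["Amazon", "Microsoft", "DeepL"],
}


def preferred_provider(source, target, available=None):
    """Return the best-ranked provider for a language pair.

    `available` optionally restricts the choice to providers you have
    access to. Returns None if the pair was not part of the test.
    """
    ranking = PROVIDER_RANKING.get((source, target), [])
    for provider in ranking:
        if available is None or provider in available:
            return provider
    return None


if __name__ == "__main__":
    print(preferred_provider("en", "de"))
    print(preferred_provider("en", "ru", available={"DeepL", "Microsoft"}))
```

Returning None for untested pairs keeps the lookup honest: it signals that you need your own evaluation rather than silently defaulting to one engine.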
Neural machine translation technology has been integrated into Frazier.
This option is particularly popular with clients who need to produce multilingual versions of their videos.
For example, Swiss companies often provide content in the four national languages: German, French, Italian, and Romansh.
On Frazier, users can generate a version of the video in another language in seconds. A second user can then be added to the project to perform any post-editing on the translated text.
Once the translation is approved, human-like synthetic voices read out the translated text. The new version of the audio is then mixed with the original soundtrack and mastered to professional broadcasting standards.
The neural approach has transformed machine translation's reputation. With demand for video growing at an exponential rate, neural machine translation, combined with post-editing, is a reliable solution for localizing content quickly and at low cost. Language service providers now need tools such as Frazier to keep up and increase output.