
Study evaluates the quality of AI literary translations by comparing them with human translations

Credit: Thai et al.

Recent advances in the field of machine learning (ML) have significantly improved the quality of automatic translation tools. Currently, these tools are mainly used to translate basic sentences, as well as short texts or informal documents.

Literary texts, such as novels or short stories, are still almost exclusively translated by professional translators, who are experienced in capturing the abstract and complex meanings of a text and conveying them in another language. While several studies have investigated the potential of computational models for translating literary texts, findings in this area are still limited.

Researchers at UMass Amherst recently carried out a study exploring the quality of machine-generated literary text translations by comparing them with human translations of the same texts. Their findings, pre-published on arXiv, highlight some of the shortcomings of existing computational models for translating foreign texts into English.

“Machine translation (MT) has the potential to complement the work of human translators by improving both training procedures and their overall efficiency,” Katherine Thai and her colleagues write in their paper. “Literary translation is less constrained than more traditional MT settings, since translators must balance meaning equivalence, readability, and critical interpretability in the target language. This property, along with the complex discourse-level context present in literary texts, also makes literary MT more challenging to model and evaluate computationally.”

The main goal of recent work by Thai and colleagues is to better understand the ways in which modern MT tools still fail to translate literary texts when compared to human translations. They hope that this will help identify specific areas that developers should focus on to improve the performance of these models.

“We collect a dataset (PAR3) of non-English language novels in the public domain, each aligned at the paragraph level with both human and automatic English translations,” Thai and her colleagues explain in their paper.

PAR3, the new dataset compiled by the researchers for their study, contains 121,000 passages extracted from 118 novels originally written in languages other than English. For each of these passages, the dataset includes a number of different human translations, as well as a translation generated by Google Translate.
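To make the paragraph-level alignment concrete, the sketch below shows what one such record might look like in Python. The class name, fields, and example strings are illustrative assumptions rather than PAR3's actual schema, which is documented in the project's GitHub repository.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical record layout for one PAR3 entry. The field names are
# illustrative only, not the dataset's actual schema; see the PAR3
# GitHub repository for the real file format.
@dataclass
class AlignedParagraph:
    source: str                    # paragraph in the original language
    human_translations: List[str]  # one or more professional translations
    machine_translation: str       # Google Translate output

# Made-up example entry (French source, invented reference strings).
entry = AlignedParagraph(
    source="À la fin tu es las de ce monde ancien",
    human_translations=[
        "In the end you are weary of this ancient world",
        "You are tired at last of this old world",
    ],
    machine_translation="At the end you are tired of this ancient world",
)
print(len(entry.human_translations), "human reference(s)")
```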

The researchers compared the quality of the human translations of these literary passages with those generated by Google Translate, using metrics commonly applied to evaluate MT tools. In parallel, they asked professional human translators which translation they preferred, and prompted them to identify problems with their least preferred one.

“Using PAR3, we found that expert literary translators prefer reference human translations over machine-translated passages at a rate of 84%, while state-of-the-art automatic MT metrics do not correlate with those preferences,” Thai and her colleagues write in their paper. “The experts note that MT outputs contain not only mistranslations, but also discourse-disrupting errors and stylistic inconsistencies.”

Essentially, the findings gathered by Thai and her colleagues suggest that existing metrics for evaluating MT (e.g., BLEU, BLEURT, and BLONDE) may not be particularly effective for literary translation, as their scores did not align with the preferences of professional translators. Notably, the feedback gathered from translators also allowed the researchers to identify specific problems with the translations generated by Google Translate.
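For context, a metric such as BLEU essentially rewards n-gram overlap between a machine translation and one or more human references, which helps explain why it can diverge from translators' judgments of discourse and style. Below is a minimal sketch of such scoring using the sacrebleu library; the example sentences are made up and are not data from the study.

```python
# Minimal sketch of automatic MT scoring with BLEU, using the
# sacrebleu library (pip install sacrebleu). The sentences are
# made-up examples, not data from the study.
import sacrebleu

machine_outputs = ["At the end you are tired of this ancient world"]
# One reference stream; each stream holds one reference per segment.
human_references = [["In the end you are weary of this ancient world"]]

bleu = sacrebleu.corpus_bleu(machine_outputs, human_references)
print(f"BLEU = {bleu.score:.1f}")
```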

Using the feedback of the human experts as a guide, the team eventually created an automatic post-editing model based on GPT-3, a deep learning model introduced by researchers at OpenAI. They found that professional human translators preferred the literary translations produced by this model over the original machine translations at a rate of 69%.
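A post-editor of this kind can be approximated by prompting a large language model to rewrite the raw MT output. The sketch below shows one way to do this with the 2022-era OpenAI completions API; the prompt wording and model name are assumptions, not the authors' actual setup.

```python
# Minimal sketch of LLM-based automatic post-editing, in the spirit of
# the GPT-3 post-editor described in the paper. The prompt wording and
# model name are assumptions, not the authors' exact setup; this uses
# the 2022-era OpenAI completions API (openai<1.0).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def post_edit(source: str, draft_translation: str) -> str:
    """Ask the model to repair discourse and style errors in an MT draft."""
    prompt = (
        "Below is a paragraph from a novel and a draft English translation.\n"
        "Rewrite the draft so that it reads fluently in English while "
        "preserving the meaning of the original.\n\n"
        f"Original: {source}\n"
        f"Draft translation: {draft_translation}\n"
        "Improved translation:"
    )
    response = openai.Completion.create(
        model="text-davinci-002",  # assumed model; the paper's choice may differ
        prompt=prompt,
        max_tokens=256,
        temperature=0.3,
    )
    return response["choices"][0]["text"].strip()
```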

In the future, the findings of this study may inform new research exploring the use of MT tools to translate literary texts. In addition, the PAR3 dataset compiled by Thai and her colleagues, now publicly available on GitHub, can be used by other teams to train or evaluate their language models.

“Overall, our work uncovers new challenges to progress in literary MT, and we hope that the public release of PAR3 encourages researchers to tackle them,” the researchers concluded in their paper.

More information:
Katherine Thai et al, Exploring Document-Level Literary Machine Translation with Parallel Passages from World Literature, arXiv (2022). DOI: 10.48550/arxiv.2210.14250

Journal information:
arXiv


© 2022 Science X Network

Citation: Study evaluates the quality of AI literary translations by comparing them with human translations (2022, November 8) retrieved 9 November 2022 from https://techxplore.com/news/2022-11-quality-ai-literary-human.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.
