Machine translation - Is AI really up to it?

MDÜ article from the BDÜ (German Translators' Association)


Generative artificial intelligence is currently shaking things up, including the way small and medium-sized enterprises (SMEs) deal with text and translation. However, vendors' promises of "fast, free and as good as a human" often do not stand up to scrutiny. Anyone who wants to achieve something with (foreign-language) texts and words should take a look behind the scenes, because the time and cost savings are often only apparent - and they frequently come with considerable risks, up to and including reputational damage or liability.

While the development of neural networks and deep learning led to an immense increase in the performance of machine translation almost 10 years ago, the release of ChatGPT in November 2022 and its development since then have catapulted everyone who works with text and foreign languages into an unprecedented dimension: Quickly translate the website into Japanese, test the AI-translated version of the new product campaign developed for Germany on the Brazilian market, or deliver the video lecture in Arabic in no time at all, lip-synched of course - all with just a few prompts, at the touch of a button, and virtually free of charge... How fascinating can the new (marketing) world be?

Very, no question. But unfortunately, the same applies here: The devil is often in the details, and fascination quickly gives way to disillusionment when the beautiful website in Japanese somehow fails to attract anyone, the campaign test in the Brazilian market elicits only laughter, and the video presentation leads to a shitstorm on social media.

But why? It all sounds so good (at least the parts you understand). And translation is just swapping words, isn't it - so why doesn't it work out as hoped in the end?

These are questions whose answers are particularly relevant to SMEs, especially smaller ones. As panel surveys have shown, smaller companies (which generally do not have their own language departments) are eager to experiment with the new possibilities of AI for translation - especially with a view to rapid availability and cost savings. However, the potential dangers of unchecked machine translation are often overlooked. For smaller companies in particular, mistakes arising from unintentional over-reliance on - supposedly - perfect AI translation can quickly threaten their very existence.

A look into the engine room of machine translation (see "Background") reveals where these dangers and challenges lurk: when the machine translates, it does not do what a human does when translating. Instead, it strings together words or "tokens" in a purely mathematical, algorithmic process, choosing whatever is most probable according to the vast collection of data it was trained on.

Whether this probability corresponds to what the author actually wanted to express, or even to the truth, is simply irrelevant to the machine.
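To make this principle tangible, here is a deliberately tiny, purely illustrative Python sketch - with invented probabilities, nothing like a real translation engine - of a system that always appends the statistically most probable next "token". Whether the continuation matches what the author meant, or whether a small word such as "not" survives, is decided by likelihood alone:

# Toy sketch with invented probabilities - not any vendor's real model.
# A system like this always appends the statistically most probable next
# token; whether the result is true or intended never enters the process.

TOY_PROBABILITIES = {
    # last token -> possible next tokens with made-up probabilities
    "The": {"lawnmower": 0.40, "translation": 0.35, "court": 0.25},
    "lawnmower": {"should": 0.60, "must": 0.40},
    "should": {"be": 0.80, "not": 0.20},  # the small "not" simply loses out
}

def continue_sentence(start: str, steps: int = 3) -> str:
    tokens = start.split()
    for _ in range(steps):
        candidates = TOY_PROBABILITIES.get(tokens[-1])
        if not candidates:
            break
        # always take the most probable continuation
        tokens.append(max(candidates, key=candidates.get))
    return " ".join(tokens)

print(continue_sentence("The lawnmower"))  # -> "The lawnmower should be"

Real systems work with billions of parameters instead of a three-entry table, but the underlying mechanism - likelihood, not understanding - is the same.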


Background: Machine translation and its evolution

What do humans do differently - and better - when translating?

Contrary to popular belief, translation is not just a matter of transferring individual words and then putting them into the correct verb form, and so on. In fact, translation is not about the words themselves, but about what is said with them. And that is a big difference.

"Real" translation is more about transferring the content and meaning of a written text into another language as accurately as possible, taking into account the communicative intent and effect as well as the addressees.

The fundamental question is what the text to be translated actually says. In order to determine this, its content must first be fully understood. MT systems and LLMs do not do this - qualified translators do, which is why they tend to specialise in certain subject areas.

What's more: "No one reads a text as closely as the translator", as the saying goes. As a result, professional human translators often catch inconsistencies in the content (e.g. when something in a sentence has been mixed up by multiple copy/paste operations) or factual errors (e.g. a misplaced period instead of a comma in a decimal number, or a number with too many digits) while reading and processing the text. Where AI simply glosses over such things, professional translators ask questions - an additional upstream quality check, so to speak.

Finally, when translating texts, human translators also consider for whom or for what purpose the text is being translated - taking into account any applicable DIN or ISO standards, as well as text type conventions: The linguistic style and structure of a press release in Germany may follow different rules or conventions than in other language areas or cultures, and there may even be requirements (by law or other regulation) that must be met. In one case, non-compliance may simply result in a "slightly unusual" style; in the other, it may have consequences that could lead to liability. The same applies to cultural aspects, which may even require the translation to deviate from the source text in order to achieve its goal and fulfil its purpose.

Even if these aspects are being worked on intensively behind the scenes at ChatGPT and co.: with suitably qualified, professional translators, companies are currently on the safe side. Quite apart from the fact that humans are far more creative with language than artificial intelligence - after all, AI only ever outputs the result of complex stochastic calculations.


Risks and side effects

Even if a text produced by a machine or a generative AI reads well and fluently, there is no guarantee that it is free of errors, that a linguistic nuance deliberately placed in the source text has been recognised and carried over - or that nothing has been "hallucinated" into the translation (see "Background"). A small but perhaps not unimportant "not" can get lost in translation - and the meaning of the sentence is suddenly reversed. It is painful when an instruction manual then says that you should (not) run over your foot with a running lawnmower. That sounds amusing as an example, but it can have serious consequences in reality.

The situation is similar with generated text. ChatGPT & Co. are still known for simply making things up and presenting them as fact. This so-called hallucination can range from an invented CV to a plausible-looking but non-existent court case reference number. Why? Because the invented information is ultimately the most probable result given the model's training data at the time. In the same way, when you type text directly into a translation tool such as DeepL, the machine will sometimes simply finish a sentence on its own - even though you haven't typed its ending yet.

The point is: the more fluent the text or translation sounds, the more easily these typical "machine errors" stay hidden. Proofreading is therefore a must, and it requires not only a high level of concentration but also subject-matter and linguistic expertise. In-house staff who have a "pretty good" command of the language are not really the best choice for this task: qualified and experienced translators know (based on their linguistic skills, but also on their knowledge of what the machine can and cannot deliver) where the machine likes to make mistakes - and they find those mistakes more reliably and quickly.


Quality of training data

How well an AI solves text and translation tasks in principle also depends crucially on the quality of the data material used to train the systems. If this material contains errors or is incomplete, it will lead to poorer translations. The challenge is to collect enough high-quality and diverse data to improve the accuracy of the content generated by these models.

But it's not just about errors or incompleteness. What happens if the training data used by LLMs contains biased or discriminatory patterns? Or if the material is subject to government control? The models may then reproduce these patterns in their translations and generated texts. The dangers are obvious: prejudices may be unconsciously reinforced, or even discriminatory statements may creep into the translation that were not present in the source text. Transparency about the training data is therefore essential to ensure that the output of the models - in the form of texts and translations - is fair and unbiased.


Complex legal situation - and lurking (data protection) traps

Last but not least, there are still many legal issues surrounding the use of machine translation and AI-generated texts, particularly with regard to copyright, data protection and liability.

With regard to copyright, the first issue is whether texts created or translated by AI can be protected at all. Under German law, the level of creativity required for protection is often lacking; the situation may be different in other legal systems. Another set of questions is currently attracting increasing interest: does using data to train the freely available models already constitute a copyright infringement? Several lawsuits on this question are pending in the US; how the situation will develop in Germany remains to be seen.

However, companies are likely to be far more concerned about data protection issues - regardless of whether AI systems are used for translation or for writing. The General Data Protection Regulation (GDPR), which applies throughout Europe, sets out clear rules for handling personal data. GDPR-compliant data processing is certainly possible with AI tools, but it requires some effort: if necessary, personal data must be removed from texts before they are passed to the AI. Beyond the GDPR, when using artificial intelligence it is also important to ensure that data is protected against industrial espionage via targeted hacking or prompt injection.
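As a purely illustrative sketch of what "removing personal data beforehand" can mean in practice, the following Python snippet masks two obvious kinds of identifier (email addresses, phone numbers) before a text would be handed to any external AI or translation service. The patterns and the send_to_translation_api() call are assumptions for illustration only; a real GDPR workflow has to cover far more (names, addresses, customer IDs) and should be reviewed by a data protection expert.

import re

# Very rough patterns for two obvious kinds of personal data.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d /()-]{6,}\d")

def redact_personal_data(text: str) -> str:
    """Replace obvious personal identifiers with neutral placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

original = "Please contact Maria Example at maria@example.com or +49 171 2345678."
safe_text = redact_personal_data(original)
print(safe_text)  # "Please contact Maria Example at [EMAIL] or [PHONE]."

# Only the redacted text would then be passed on, e.g.:
# translated = send_to_translation_api(safe_text)  # hypothetical call

Note that the person's name is still in the text - which is exactly why such automated masking can only ever be one building block, not a substitute for a proper data protection process.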

However, the bigger risk for many companies - especially smaller ones - is not the AI itself or a possible external attack, but the people in the workplace. With computers, tablets and smartphones everywhere, it is simply very tempting to quickly run the text of an email through the free version of DeepL - or even to upload an entire document there. Whether that document contains sensitive data or information - how would you even know, if you are having it translated precisely because you don't understand the source language? And who is aware that these free systems usually use the input to improve their models - and can thereby make it available to the general public?

If you take this aspect further, you quickly arrive at a question that can threaten the very existence of smaller companies in particular: who is liable if a machine translation causes damage? While this is relatively easy to assess in the case of "human data leaks", it is far less clear in the case of errors in unchecked translations, even those produced by "authorised" in-house AI models: is it the operators of the AI systems, the users or the developers?

Regardless of the fact that this question is currently occupying hordes of specialist lawyers, when it comes to the translation of liability-relevant, sensitive or critical content there can only be one answer: AI can only ever act as a supplier here; a review of the results by competent, appropriately trained people - ideally qualified specialist translators - is a must and of existential importance to businesses.


When does AI translation make sense?

There are, of course, scenarios where machine translation is possible and useful - even without human review. For example, machine translation could be used to pre-translate public tender documents in order to decide whether the specifications and award conditions justify more intensive work (and possibly a professional translation). Machine translation is also an option for small-scale communication that carries no economic or interpersonal risk - especially if you are in direct contact with the other person and can identify and correct any misunderstanding from their response.


So what should you bear in mind when using ChatGPT & Co. for translation?

First of all, it is important to be sensitive to privacy issues: what data (i.e. texts) can be entrusted without hesitation to the respective (third-party) systems? If in doubt, opt for a closed subscription or contract that guarantees that the data entered will not be used to train the public systems.

As a general rule, unverified machine-translated (or even created) texts should be marked as such - with a note that they may contain errors.

The more important a text or translation is, the more care should be taken to ensure that machine output does not reach the outside world without being checked and, if necessary, reworked by a competent specialist. Ideally, this person should be a qualified translator - because, as professionals, they have long since incorporated machine translation systems into their ever-expanding toolkit, are aware of the weaknesses of current AI systems, and know what to look out for in order to use them in accordance with regulations and laws, so that the systems can bring their advantages to bear in the interests of their clients. And they often have a range of other services in their portfolio that can turn a pure translation into a real value-added service for the company: from creating and maintaining company-specific terminology glossaries or databases to SEO optimisation of translations (or texts in the source language). As with many things, this is best discussed in detail in a face-to-face meeting (where the personal relationship of trust can develop, which is particularly valued by small and medium-sized enterprises).

