My rage against the machine

Updated: Jul 9, 2018

I was about to start working on my first machine translation post-editing project in the field of IT, but the client did not specify any requirements in terms of UI items and did not provide glossaries. I was, however, given a set of instructions that confused me.

They basically said that I should not spend more than 4-5 seconds evaluating whether a machine-translated segment could be easily and quickly reused. I should not make preferential changes to correct what is already good, and I should refrain from making unnecessary fixes. I should reuse as much as possible in order to increase productivity.

OK then. Dear machine, let's see what you got.

"Allow users to log in with their own credentials for all web applications" came out in Croatian as something that says "To allow user to stump in with their to own certificates for all weaving of applications".

Gotta translate this from scratch.

Your account has been locked out for 5 minutes, please try again later = Your account is locked outer side for 5 minutes, to gratify to try again subsequent

Gotta translate this from scratch.

We can’t seem to find the page you are looking for = We hypocrisy to seem to find the page you are looking for

I know that "cant" is hypocritical and sanctimonious talk, but dear machine, how did you get "hypocrisy" out of "can't", with an apostrophe?

Are we playing the Shave My Legs For Free game?


It seems like the connection is slow or something went wrong while loading the data = Internet to seem like connection is sluggish or something he left topsy turvy short time span bunker data

Obviously, there were segments that had to be translated from scratch. Other than that, there was one segment in Cyrillic - perhaps the machine confused Croatian with Serbian? The pronoun "it" was consistently translated as "Internet", which was quite a puzzle. Another thing that this machine was consistently bad at was capitalization. Capitalization rules are quite different in English and Croatian, but something made this machine think it can think, so it did not capitalize what had to be capitalized and vice versa.

Did you know that "part(s)" can be translated in a number of ways in Croatian, and that none of them uses parentheses? For example: dio ili dijelovi, dijela ili dijelova, dijelu ili dijelovima, dio ili dijelove, etc. Croatian has 7 grammatical cases, whereas English has three, completely unrelated to the ones in Croatian. Come on, machine, I dare you: show me your algorithm for that!
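To make the "part(s)" problem a bit more concrete, here is a minimal sketch in Python of what even a naive lookup would have to do. It is only an illustration, not how any real machine translation engine works. The four pairs of forms are the ones listed above; the case labels, the `PART_FORMS` table and the `expand_part_s` function are my own hypothetical names, and the remaining cases are left out on purpose.

```python
# Forms of "dio" (part) taken from the examples in this post.
# The case labels are an annotation added for this sketch, and only
# four of the seven Croatian cases are covered here.
PART_FORMS = {
    "nominative": ("dio", "dijelovi"),
    "genitive": ("dijela", "dijelova"),
    "dative": ("dijelu", "dijelovima"),
    "accusative": ("dio", "dijelove"),
}


def expand_part_s(case: str) -> str:
    """Return a Croatian rendering of "part(s)" for the given case.

    Croatian spells out both numbers ("X ili Y", i.e. "X or Y")
    instead of using the parenthesised plural marker.
    """
    singular, plural = PART_FORMS[case]
    return f"{singular} ili {plural}"


if __name__ == "__main__":
    for case in PART_FORMS:
        print(f"{case}: {expand_part_s(case)}")
```

The catch is that the function needs to be told the case, and the case depends on the role the word plays in the rest of the sentence. That is exactly the part a word-by-word approach does not know.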

As for the UI elements, it was up to me, so I decided to fully translate them. I used the MS-approved terminology, and I can only hope that the readers are using the localized version of the software to which this documentation relates.

What made me smile was that this machine even produced some typos, which made me like it a little better. Maybe it's more similar to humans than I thought?

Final thoughts

I still don't see where this is going. I did spend more time post-editing unusable machine translation output than I would translating this text from scratch. Some segments made me laugh, while others frustrated me because they reminded me of editing substandard translations for peanuts.

I understand that there is potential in this field and that a whole new generation of machine-translation post-editors may be formed. What worries me is the impact this may have on language in general.

Since machines translate words instead of ideas, machine translation only works when the source text is perfect. And not only that: it also has to be written as clearly and as simply as possible. In the era of globalization, will we modify the way we write only to be able to translate using machines? When you input a complex sentence filled with idioms or phrasal verbs into Google Translate and you don't get the expected result, you become aware that you need to simplify your sentence. Lose the phrasal verbs, lose the idioms, lose any possible ambiguity, and you get the perfect simplified source for your perfect cheap translation.

