Saturday, 16 January 2010

English - the superlanguage

My first programming language (ND80) saved all identifiers by reference in order to save RAM, which was scarce. Using the swap instruction, it was possible to replace any identifier with another, so basically, the entire programming language could be translated to Danish, my native language. Sounds ridiculous, right? It was. Later, Microsoft did the same: Excel functions were translated to Danish, and even VB programming was Danish-ified, COM APIs were localized etc. This caused a huge number of problems: it made support difficult, it made it difficult to find help on the internet, and localized APIs meant that some apps did not work with MS programs that were localized to other languages. Of course there were workarounds and solutions for most of the "problems", but the problems were real and sometimes caused real havoc.

One of the 5 Danish regional administrations just introduced ODF as the standard format for document interchange between MS Office 2003, MS Office 2007 and OpenOffice, because this solves problems like date format problems (ddmmyy in some, localized ddmmåå in others). It will not solve them fully, because if you have an expression in a spreadsheet where 'ddmmåå' is part of it, it may not work in a non-Danish spreadsheet at all, no matter how you save it. The easy solution was to do everything in English, using U.S. notation (decimal dot instead of decimal comma) etc. I guess everybody has now realized that this is the way to go for source code, APIs, XML files etc.
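
To make the "do everything in English notation" idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the helper names (parse_danish_number, to_neutral, from_neutral) and the semicolon-separated record format are hypothetical and not taken from Excel, ODF or any other product mentioned above. The point is that localized notation is handled only at the user interface, while everything that is stored or exchanged uses a decimal dot and an ISO 8601 date.

```python
from datetime import date
from typing import Tuple


def parse_danish_number(text: str) -> float:
    """Parse user input in Danish notation: '.' groups thousands, ',' is the decimal separator."""
    return float(text.replace(".", "").replace(",", "."))


def to_neutral(value: float, day: date) -> str:
    """Serialize for storage/interchange: decimal dot and ISO 8601 date (hypothetical record format)."""
    return f"{value!r};{day.isoformat()}"


def from_neutral(record: str) -> Tuple[float, date]:
    """Read the neutral record back; this works the same on any machine, whatever its locale."""
    number_text, date_text = record.split(";")
    return float(number_text), date.fromisoformat(date_text)


if __name__ == "__main__":
    amount = parse_danish_number("1.234,5")        # what a Danish user typed
    record = to_neutral(amount, date(2010, 1, 16))  # what gets saved
    print(record)
    print(from_neutral(record))
    # Feeding the Danish text straight to a dot-based parser silently misreads it:
    print(float("1.234"))
```

The last line shows why the problem is nasty: nothing fails, the number is just wrong by a factor of a thousand.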

However, in recent years, the evolution of the internet has expanded this problem. Humans are increasingly interfacing directly with software, specifying parameters. The most common interface is the search engine. How do you easily explain to a 6-year-old how the angle of Earth's rotation axis creates summer and winter? YouTube, of course. But don't use Danish words for your search, as it will probably not yield a single good result. So, even though my daughter can write on a computer, she still cannot use YouTube. She doesn't know English. I encounter this problem many times per week.

The problem is not limited to searching. Many electronic devices are not localized, a lot of software is not localized, and what language do you use on Facebook if your friends don't all understand Danish? Wikipedia is another good example: By far the biggest Wikipedia uses the English language, and it is 3 times bigger than number 2: German. Wikipedia has become a significant provider of information, and you simply need to know English to use it.

In order to understand all implications of international contracts, English is the language of choice. The EU has made a guide for European English, which defines a terminology that may not always match that of any English-speaking country, and many terminologies are translated from English-language originals. English has become the new Latin.

Google Translate tries to solve some of this. However, when I read Chinese web pages in Danish, using Google Translate, it is obvious that the text was translated to English before it was translated to Danish. There can be many reasons for this, but it surely helps comprehensibility when I use Chinese->English instead of Chinese->Danish. Anyway, Google Translate cannot solve all problems. It's merely a patch.

Only 1-2 decades ago, you would have looked at the size of countries, measured by population count and economic size, in order to find out what language to learn. Today, English is much larger than the sum of English-speaking countries.

The latest statistics indicate that languages other than English are currently losing popularity in schools in Denmark. That's a problem: Most people in the world don't know English well. If you want to target those people, you need to localize. Even when you meet a person who seems to speak and understand English well, you need to realize that this sometimes requires the full brainpower of that person. In other words, if you ask this person to solve a complicated task that involves the use of English, like programming an HD recorder, it is much harder than if the HD recorder had been localized. Also, just because a person knows how to express himself/herself in English in a given context, it doesn't necessarily mean that this person can express himself/herself in English in another context that would work out fine in his/her native language. In order to localize well, an application specialist should know the target language well enough to be able to inspect the localized result.

So, remember to localize, learn languages, and remember to teach your children English. And in the unlikely case that your native language is English, here is a sign not to laugh at, it's very serious:

If you're in doubt about what it means, use Google Translate.


Chris said...

Just a couple of minor comments...

'Only 1-2 decades ago, you would have looked at the size of countries, measured by population count and economic size, in order to find out what language to learn.'

Hardly, on both the timescale and the supposed change of rationale. English has been dominant in the West ever since the US became the dominant economic (and therefore political) power after WW2 - that Pascal uses English identifiers, despite being designed by a Swiss, is testament to that.

On the OLE thing, for sure the initial attempt at localising the Word and Excel macro languages (along with other stuff) turned out to be a bad idea, but that *binary* operability proved problematic was not something intrinsic to the idea of localising macro languages, but the specific implementation of that idea. Indeed, in principle, the design of IDispatch should avoid this problem, since method identifiers in it are intrinsically arbitrary (it's the numeric DISPID values that count).

'The absolutely biggest wikipedia uses the English language, and it is 3 times bigger than number 2: German.'

With respect to that particular example, 'lots more articles' doesn't necessarily mean 'better' - the English and German articles for Delphi are a case in point. More generally, the standard of philosophy articles on the German Wikipedia is typically superior to the English one too.

Lars D said...

@Chris: 20 years ago, we could receive Danish, Swedish and German TV but no English-language TV. When we went border-shopping, we needed to speak German, Swedish, Polish and maybe Dutch - it's all within 500 km - but English was not an option. If you wanted to study mechanical engineering at the university, you'd better be good at German, because the foreign books were in German.

This has changed. Most people in Europe are still not able to communicate in English, but for many people, English has become a vital part of a normal day.

A good example is that even Playmobil has now replaced "Polizei" on their toy cars with "Police" in many countries.

Xepol said...

Remember speed control? I suspect that's a phrase that translates very poorly.

I'm having trouble even guessing at what it means out of context. In the context of it being a yellow traffic sign, I can only imagine it means something like: There is construction ahead, decrease your speed before you kill someone or get a huge fine or lose your license/car.

It's a little like "slow children playing" - without a little context (and indeed some punctuation) the actual meaning can be a little obscure (ah, the comedians have fun with it).

This is an interesting discussion about how a single language is eventually going to replace the rest. One might be inclined to suggest that it will be English or Chinese, except the Chinese are too insular and do not have a singular language to rally around.

Eventually it will all be English unless you're on a farm run by a religious group OR live in Quebec. (quick, someone guess what country I live in! :-)

Lars D said...

The sign basically tries to remind the drivers that they are now approaching a police radar that measures the vehicle's speed.

English will not replace all other languages - but during the last 20 years it has replaced much of the Danish used in Danish universities, and it is slowly replacing a lot of the Danish used in Danish high schools. This also applies to much larger countries.

LDS said...

Well, many of us would not be here communicating with each other if we didn't speak the same language. IT was born in England and the USA, and has spoken English since the beginning.
But it is true that we should not forget other languages, nor assume that everybody using a computer knows English well enough. Localization was and still is a need - I just wish the Delphi localization tools worked far better - right now they don't work with widely used 3rd party controls (JVCL and DevExpress, for example), and they need polishing. But Delphi always had a USA-centric approach despite its success abroad.

Lars D said...

Actually, Delphi's origins are in Europe. Pascal was invented in Europe, and Anders Hejlsberg developed the first product in Denmark, not targeted at the American market. The first commercial computer that I used for CompasPascal / PolyPascal / TurboPascal was a Danish computer with Danish manuals and a Danish programming guide for Pascal. It was not necessary to know English in order to learn Pascal programming.

However, it was based on ASCII, which is surely American. The ASCII standard was a big roadblock for localization for many years.

With regard to localization, you may want to have a look at

LDS said...

That's why I said "Delphi", not Pascal, which has wholly European origins - both the language and its most successful implementation.
Also, localization > string translation. dxgettext does a decent job of translating strings, but it does not make it easy to adjust control sizes to accommodate the translated strings, and moreover it does not allow translation of non-string items. When you target cultures different from the western ones, even graphical elements may need localization to avoid misunderstanding or worse.
That's why I prefer resource-based localization tools over others.