As far as PayPal is concerned Canada only speaks English

We recently launched a new client site and as part of the process we needed to hook the site up to PayPal’s transaction processing screens.

A dig through their site / developers area didn’t really turn up any info on how to present the screens in localized form. After a few emails back and forth we were finally pointed at a PDF and buried deep in there we found the code needed.

PayPal was actually pretty smart about making it easy to enable the localized interface – it is essentially a hidden form field named “lc” where you input the language code.
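As a rough illustration of that mechanism, here is a minimal Python sketch that renders hidden form fields for a payment form. Only the field name `lc` comes from the PDF. The helper name `hidden_fields` is hypothetical, and the sample value `CA` is assumed to be the code listed for Canada. This is not PayPal's own code.

```python
from html import escape


def hidden_fields(fields):
    """Render a dict of name/value pairs as hidden HTML form inputs."""
    lines = []
    for name, value in fields.items():
        # Escape both name and value so they are safe inside attributes.
        lines.append(
            '<input type="hidden" name="{}" value="{}">'.format(
                escape(name, quote=True), escape(value, quote=True)
            )
        )
    return "\n".join(lines)


# "lc" is the field described above; "CA" is an assumed sample value.
print(hidden_fields({"lc": "CA"}))
```

The printed line would sit inside the form that posts to the payment screens, alongside whatever other fields the integration needs.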

This is where things start to fall apart.

On the next page is a list of “language codes” – only last I checked the following weren’t languages:

  • Canada
  • United States
  • Switzerland
  • United Kingdom
  • Belgium

Yep, all of the so-called language codes are actually country codes. No mention of what dialect / language has been used to do the localization. Just country name and code.

This is problematic on a couple of levels.

  1. Many translated sites aren’t targeting a specific country. Our client, for example, built this site for a global audience – so which country’s Spanish should they use? It’s not the end of the world, but it creates a cognitive burden for people setting up a site who may not be familiar with the subtle differences between regional variations of a language. Worse still, because there’s no indication of which regional variation was used to create each localized version, you’re left guessing. In the end the choice might not even matter, because PayPal could easily have used the same variation of Spanish for every country they deemed primarily Spanish-speaking.

  2. The even bigger issue, though, comes with bilingual countries. Let’s say I’m developing a site exclusively for the Canadian market. Canada is a bilingual nation (English & French). So what happens when I select “Canada” as a language code?

According to PayPal, Canada is an English-speaking country. Here is the response to my question, “How do I access the French Canadian localized interface?”:

The below are the country codes available for the french language.

FRENCH SOUTHERN TERRITORIES = TF
FRANCE = FR
FRENCH GUIANA = GF
FRENCH POLYNESIA = PF

Hmm, great. So for English I’m okay to use the “language” code for Canada but if I want French I need to pick another country???? What is someone who is not deeply immersed in language supposed to do with that? Is there any difference between those “languages”? Which is the closest to the most generic/global French? What are my French-Canadian users going to think of this?

This is the exact same problem as using flags to represent languages. Countries may have adopted a language as the official language (or languages) of their country but a country does not represent a language in any way. What “language” does a Canadian flag represent? A Belgian flag?

An easy solution

The solution to this isn’t all that difficult. The issue is really just someone at PayPal approaching the information provided to them for this process in a slightly incorrect manner. The language codes should correspond to actual, agreed-upon language codes (e.g. ISO 639-2). You can then take these codes and create a reference chart where users can either select from a list of languages that have been flagged as “global friendly” (i.e. the variant is close enough to most regional variations of a language that it is commonly accepted as the generic form of that language) or, if they have a specific target country, reference a chart which features the country name, languages spoken and recommended ISO codes for those languages.

Even better, the chart could specify the code for the commonly accepted generic variation as well as identify the codes for any regional dialects that are supported.
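To make the idea of a reference chart concrete, here is a small Python sketch of one possible shape for it. Everything in it is an assumption for illustration: the names `COUNTRY_LANGUAGES`, `GENERIC_LOCALIZATIONS` and `recommended_codes` are hypothetical, and the entries are sample data using ISO 639-2 codes, not a list of what PayPal actually supports.

```python
# Hypothetical reference chart: country -> ISO 639-2 codes of languages spoken.
# The entries are illustrative, not a list of what PayPal supports.
COUNTRY_LANGUAGES = {
    "Canada": ["eng", "fra"],
    "Belgium": ["nld", "fra", "deu"],
    "Switzerland": ["deu", "fra", "ita", "roh"],
}

# Hypothetical set of languages flagged as having a "global friendly" version.
GENERIC_LOCALIZATIONS = {"eng", "fra", "deu", "spa"}


def recommended_codes(country):
    """Return the codes for a country that have a generic localization."""
    spoken = COUNTRY_LANGUAGES.get(country, [])
    return [code for code in spoken if code in GENERIC_LOCALIZATIONS]


print(recommended_codes("Canada"))  # ['eng', 'fra']
```

With a chart like this, someone building a site for Canada sees both languages and picks one by language code rather than guessing at a country.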

At the end of the day PayPal should be commended for taking the steps to try and offer a broadly localized interface – it just needs a couple of tweaks to make it more user-friendly.


  • Norbert

    ISO 639 is the right direction, but by itself it is also inadequate. That’s because some languages come in variants that have diverged so much that they require separate localizations. There are differences in vocabulary and grammar between different countries in languages such as Spanish and Portuguese. There are also different writing systems used for languages such as Chinese (simplified and traditional characters) and Mongolian (Cyrillic and Mongolian writing). The current standard for identifying languages on the internet is RFC 4646, which addresses these issues.

  • Pat Hall

    Hi Ryan,

    Very interesting post, I hope PayPal comes around on this.

    In response to Norbert’s observations, it’s my own impression that the i18n and l10n world is really hoping that ISO 639-3 will prove flexible enough to resolve the sorts of problems he describes. They are certainly real.

    As far as things like the two writing systems for Mongolian, the emerging standard on script names will probably play a role, but the whole picture of describing languages in terms of writing systems, dialects, etc, still strikes me as pretty complex and beyond the ken of the average web developer.

    And there’s still a long way to go even in getting the script names right.

    But at least there’s a process in place which will take us beyond the current status quo… which is pretty static.