Google Integrates Automatic Translation into Gmail

Not exactly a stunning technical development, but still pretty interesting: Google has gotten around to integrating their translation tools right into Gmail through the Labs section. Once the “Message Translation” plug-in is activated, Gmail will detect whether an email is in a language other than your default and automatically give you the option to translate it. You can see the “Translate message to:” option in the screenshot below:

[Screenshot: autotranslate]
Source: Google Gmail Blog

While they readily admit “it’s not quite the universal translators we’re so fond of from science fiction,” they do make a comment that I thought was pretty interesting:

“If all parties are using Gmail, you can have entire conversations in multiple languages with each participant reading the messages in whatever language is most comfortable for them.”

This is an interesting concept, and it will certainly be handy for any non-mission-critical exchanges, although I have to wonder what some of the quoted text further down in the email will look like after multiple runs through the machine translation system. I’ve found in the past that even one trip through the grinder and back can leave the text pretty mangled. Who knows what several exchanges back and forth will leave it looking like?

Either way, it’s an interesting addition to the Gmail system and another tiny step towards knocking down the language barrier.

Move over America…

… China is about to become the largest online population. Between the end of 2006 and the end of 2007 China added roughly 73 Million users to the Internet.

73 MILLION.

To put it in perspective: even if Canada doubled its population and put an internet connection in the house of every man, woman and child in the country, we’d still come up about 7 Million people short.

On the flip side, though, there are two factors at work here. China’s population is roughly 1.3 Billion right now, which means a total user base of 210 Million is only a 16% penetration rate. In Canada we have ~65% penetration and the US has ~71%.
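
For anyone who wants to check the math, here is a quick sketch in Python. The China figures come straight from the numbers above; Canada's population of roughly 33 million is my assumption, inferred from the "7 Million people short" remark rather than stated anywhere in this post.

```python
# Figures quoted in the post (approximate, 2007-era).
china_users = 210_000_000
china_population = 1_300_000_000
users_added_in_2007 = 73_000_000

# Assumption: Canada's population (~33 million) is inferred from the
# "about 7 Million people short" remark, not stated in the post.
canada_population = 33_000_000

# Penetration rate = users / population.
china_penetration = china_users / china_population

# How far a doubled, fully connected Canada falls short of China's new users.
shortfall = users_added_in_2007 - 2 * canada_population

# Hypothetical: China's user count if it reached Canada's ~65% penetration.
users_at_canadian_rate = round(0.65 * china_population)

print(f"China penetration: {china_penetration:.0%}")
print(f"Shortfall vs. a doubled Canada: {shortfall / 1e6:.0f} million")
print(f"China at 65% penetration: {users_at_canadian_rate / 1e6:.0f} million users")
```

The last line is the one worth dwelling on: at Canadian-style penetration, China alone would have several times its current user base.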

India will no doubt pick up steam in the coming years and will definitely rank in the number 2, if not number 1, spot.

So what does this mean for the Internet in general?

The connected world’s borders are no longer geographical – they’re lingual.

The world may be flattening, but there are still a few big walls running across the landscape. The reality is the “hidden web” is going to keep growing. As I’ve posted about before, your ability to access information online revolves almost exclusively around the languages you can read/write.

As countries like China & India continue to pump new users online, more and more content will be generated in their native languages, likely invisible to you unless you speak (and search in) that language.

Google’s getting better and better at opening access to these sites through their machine translation tools, but the reality is there just isn’t enough CPU horsepower to run every Google search through machine translation for all the different language variations.

Language Weaver, through Kontrib, is also making an interesting attempt at opening up more content to a broader audience through a Digg-like portal. It’s a great idea, although I think they’re going to have a hard time getting the traction they need. I’d personally love to see them work with Digg directly instead and create a licensing deal similar to what my friends at Idee have done with their image duplication detection technology.

It’s going to be interesting to watch this story play out. Whoever busts the language barrier the most effectively first will dramatically change the search game. Google is clearly out in front, and the most likely victor, but you never know who’s running in stealth right now and could surprise us all.

In-chat Machine Translation via Google Talk

Saw on TechCrunch this morning that Google Talk now has a few bots you can add to your chats that will “translate” your conversations for you in real time.

[Screenshot: trans_bot]

It creates the translation through Google Translate so at the very least you’d want to be sure whoever you’re talking with understands that some translations might be downright wacky. Needless to say if it requires clarity and exact directions (talking through heart surgery, supporting nuclear power center operators or peace negotiations for example) this is not an appropriate tool.

The implementation is a little clunky: you need to add a bot to the chat for each language pair, in each direction. For example, to talk to someone in French you’d need both the English-to-French bot and the French-to-English bot. I’m guessing this is the result of someone’s 20% time at Google. (Edit: They’ve since confirmed it is.)

I’d hope that if they had actually roadmapped this feature, the translation option would have just been built into the tool. The system should really just know if the people who are talking to each other are using different language interfaces or preferences. If it detects that two people with different preferences have started chatting, it could just throw up a “We see you’re talking with someone who may speak a different language, would you like us to translate for you?” kind of message.

Right now it supports 29 language pairs, which is kind of odd as it leaves the conversation a little bit one sided… From my quick look it seems English-to-Bulgarian is the pair left out in the cold (but Bulgarian-to-English is supported.) (See edit below: real number is 24)

All in all, for the time being it’s a fun toy but it’ll be interesting to see how this functionality evolves…

Google Blog post

Edited to Add: If anyone out there can read the Chinese text in the screen cap I’d love to know how legible it actually is. The English is passable, which is probably why they used it, but I wouldn’t be surprised to find out there’s some crazy stuff happening on the other end.

EDIT: They published the wrong list of language pairs on the Google blog initially… there are actually 24: ar2en, de2en, de2fr, el2en, en2ar, en2de, en2el, en2es, en2fr, en2it, en2ja, en2ko, en2nl, en2ru, en2zh, es2en, fr2de, fr2en, it2en, ja2en, ko2en, nl2en, ru2en, zh2en
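
With the corrected list in hand, it's easy to check whether any pair is still one-sided. Here is a rough Python sketch that parses the 24 codes above and looks for pairs with no reverse direction. The `bots_for` helper is my own hypothetical name for illustration; it just returns the pair codes from the list, not anything official from Google.

```python
# The 24 pair codes from the corrected list, in "source2target" form.
PAIR_CODES = """
ar2en de2en de2fr el2en en2ar en2de en2el en2es en2fr en2it en2ja en2ko
en2nl en2ru en2zh es2en fr2de fr2en it2en ja2en ko2en nl2en ru2en zh2en
""".split()

# Turn each code into a (source, target) tuple, e.g. "en2fr" -> ("en", "fr").
pairs = {tuple(code.split("2")) for code in PAIR_CODES}

# A pair is one-sided if its reverse direction is not in the list.
one_way = sorted(p for p in pairs if (p[1], p[0]) not in pairs)


def bots_for(lang_a, lang_b):
    """Hypothetical helper: pair codes needed for a two-way chat.

    Returns an empty list if either direction is missing from the list.
    """
    needed = [(lang_a, lang_b), (lang_b, lang_a)]
    if all(pair in pairs for pair in needed):
        return [f"{src}2{dst}" for src, dst in needed]
    return []


print(len(pairs), "pairs;", "one-way pairs:", one_way)
print(bots_for("en", "fr"))
```

By my reading of the list, every pair now has its reverse, so the lopsided Bulgarian situation looks like it was just an artifact of the wrong list being published.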

Translation lets Google’s "cat" out of the bag?

Read an interesting article this morning (via Lifehacker) about how it looks like Google may be readying a new version of Gmail.

How did they get the “scoop”?

As most will know, Google allows its users to assist with and polish the translations it uses throughout its platform. Recently a user was logged in and doing some translation on the Gmail UI and got prompted to enter a translation for “New Version” with supporting context text that said “Link that users can click on if they are part of the trusted testers program to go to the new UI”.

Oops.

Click here to see the story/screen capture.