Translating the Web with free laborers

Updated: 2012-07-01 08:45

By Somini Sengupta (China Daily)

Language does not come naturally to machines. Satire and jokes? Algorithms have trouble with those. Irony? Wordplay? Cultural context? Forget it.

That human edge in decoding what things mean is what a computer scientist turned entrepreneur, Luis von Ahn, is betting on. His start-up, Duolingo, which opened to the public in June, aims to put armies of language students to work translating text on the Web.

For the learners, Duolingo offers basic lessons, followed by sentences to translate, one at a time, from simple to more difficult. For online content providers wanting translations, Duolingo offers, for now at least, free labor. There are no independent assessments available of how accurate or efficient it can be.

The site has been available by invitation only for the last five months and is now limited to English, Spanish, French and German. People can submit their content to Duolingo for translation, a service the company may begin to charge for.

"You're learning a language and at the same time, helping to translate the Web," Mr. von Ahn said. "You're learning by doing."

Google Translate, by contrast, relies entirely on machines to do the work - and while it usually captures the essence of a piece of text, it can sometimes produce bewildering passages.

Mr. von Ahn's last enterprise, ReCaptcha, makes use of those wavy letters and numbers that Web users transcribe every day on sites to ensure that they are not robots trying to break in. Mr. von Ahn gathered those squiggles from digitized images of old manuscripts, books and newspapers - including The New York Times. Every time Web users transcribe the wavy words, they provide free help in transcribing fading texts that are hard for a machine to read. Google bought his start-up in 2009.

Mr. von Ahn, an associate professor at Carnegie Mellon University in Pittsburgh, where Duolingo is based, came up with the translation idea when he noticed that friends and relatives in his native Guatemala had far less content available to them online if they did not know English.

Human and machine translation can work in different scenarios, said Alon Lavie, another Carnegie Mellon professor who has a machine translation company called Safaba, aimed at corporate clients.

Mr. von Ahn is thinking of taking on Wikipedia as his first project. The New York Times has been experimenting with Duolingo as a way to translate its digital content into other languages, said Marc Frons, the company's chief information officer.

"Where I think Duolingo's crowdsourcing makes a lot of sense," Mr. Lavie said, "is in scenarios where a consumer or enterprise has a small translation job that needs to be done quickly and cheaply, and the translation needs to come out at 'human' quality - similar to what a human translator or bilingual speaker would generate."

The New York Times