Twitter launches translation crowdsourcing, again

Twitter went live with its newly updated translation center today. This is the second iteration of the platform; it first launched in October 2009, but was closed less than a year after for an overhaul.

I gave it a quick tour. A number of people were complaining (via Twitter, naturally) about the slowness of the site, but it was fast enough on my end.

There are nine target languages as of today (six of which are already live). The three new languages are Indonesian, Russian, and Turkish. It’s fascinating to see Indonesian and Turkish as part of this first batch of languages — ahead of, say, Dutch or Swedish. Twitter is simply going where the users are — and Twitter is HUGE in Indonesia and Turkey.

Also, not surprisingly, Chinese is NOT on the list of target languages.

Overall, I liked the new design. The language translation interface is similar in many ways to Facebook’s UI. But what I found most intriguing (see above) was how the home page segments the text strings by platform (Android, Twitter.com, iPhone) as well as by audience and content type (Business, Open Source, and Help).

If you’re wondering why Twitter.com text strings are handled differently than iPhone text strings, consider the platforms. On a PC, you have a good deal more screen real estate to work with. On a mobile device, you may have only a fraction of that space, which calls for a much shorter text string. So you could have the same message translated differently depending on the target device or application.
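As a rough illustration of the idea, here is a minimal Python sketch that keeps one message with a platform-specific variant and falls back to a default translation. The catalog layout, the `follow_prompt` key, the platform names, the Spanish strings, and the `get_string` helper are all hypothetical; nothing here reflects how Twitter’s translation center actually stores its strings.

```python
# Hypothetical catalog: one message key, with variants per platform.
# Keys, platform names, and strings are illustrative assumptions only.
CATALOG = {
    "es": {
        "follow_prompt": {
            "default": "Sigue a esta persona para ver sus tweets en tu cronología",
            "iphone": "Seguir",
        },
    },
}


def get_string(locale, key, platform):
    """Return the platform-specific translation, else the default one."""
    variants = CATALOG[locale][key]
    return variants.get(platform, variants["default"])


print(get_string("es", "follow_prompt", "web"))     # falls back to the default
print(get_string("es", "follow_prompt", "iphone"))  # uses the short variant
```

The fallback means translators only need to supply a separate string where a platform genuinely requires a shorter one.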

Finally, I thought I’d share the “opt in” text that Twitter presents to potential volunteer translators. I like that Twitter is up front with users about the fact that they are giving away their time and text for free. I’m not sure, though, how Twitter plans to enforce the confidentiality rule:

Since you’ll be helping out Twitter (thanks again!) we want to let you know our ground rules. Please read the full agreement below before continuing. Here are some of the things you can expect to see:

  • We may show you confidential, yet to be released products or features and you must be willing to keep those secret.
  • You’ll be volunteering to help out Twitter and will not be paid.
  • Twitter owns the rights to the translations you provide. You are giving them to us so that we can use them however we want. Among other things, Twitter plans to share the translations with the Twitter development community. We want to help make all of the other great Twitter apps, not just Twitter.com, available in your language.

Now that Twitter has its new platform, will it match the record Facebook set a while back: translating into 70 languages in less than 18 months?

Adobe launches translation crowdsourcing in China

Facebook has demonstrated that you can crowdsource translations with high quality and rapid turnaround, leading many other companies to ask how they too can leverage the crowd to translate their content.

Enter Adobe and Lingotek.

Adobe has recently begun leveraging Lingotek’s software platform to enable the crowdsourcing of translations within China. As of now, there are 40 volunteer translators in China translating documentation.

Keeping in mind that this is a new and ongoing effort, I recently conducted a Q&A with Lingotek’s CEO Rob Vandenberg.

Here is the interview:

Q: What incentives did Adobe use to get Chinese users interested in translating content?

Adobe takes a very user-centric approach to volunteer translation. Instead of asking users to translate certain material, Adobe provides the content and tools for users to translate what they are interested in. They went to their user groups and offered community translation as an opportunity. This allowed them to find people who were already interested in translating – whether because they are resellers of the software, they want to put Adobe’s name on their résumé, or they are end-users who just want Adobe content in their language.

Q: Does the Lingotek platform stand alone or is it integrated into existing Adobe translation systems?

We have worked with Adobe to provide a number of integration points, including:

  1. Providing an API to allow community members to upload documents from an Adobe Flex application.
  2. Providing a version of our leaderboard that could be placed on the Adobe Groups site, as well as an API to get leaderboard data.
  3. Providing a version of our signup page that could be placed on the Adobe Groups site.

Q: How is quality managed with regard to the volunteers? Even Facebook relies on a vendor to ensure quality.

The primary means of producing quality translations in the Adobe communities is to limit who is allowed to participate. Adobe selects project managers whom it can trust, and these people are in charge of determining which translators should be allowed to participate.

Q: Are the project managers Adobe employees in China? And are they effectively the gatekeepers for quality?

As I understand it, there is a Community Manager who is the interface between Adobe and the community, but the project management is all done by community members. The translated content is then given to the community, and they publish it.

In addition, the Lingotek platform provides a number of tools that not only help translators work faster but also improve the quality of the translations, including:

  • Shared Translation Memories
  • Translation Voting
  • Notes on each segment
  • Terminology tools

Q: How does Adobe get rapid turnaround using volunteers? Are deadlines used?

The speed of translation is affected most by letting volunteers translate the things that they want to translate. In addition, Adobe brings attention to the project managers and translators who have done the most work.

Q: How does Adobe deal with customers who assume that they should not be required to translate content themselves?

Adobe focuses on the users who are eager to help them to translate. They don’t try to recruit general end-users, and I think that is why they have avoided most of this criticism.

Q: Why is Adobe doing this exactly?

The main driving factor is that Adobe’s community users are asking for translated content that isn’t in Adobe’s professional translation pipeline. By using Lingotek’s APIs and translation software, together with Adobe’s existing community, to translate content, we’re making new content available to Adobe users more quickly and at a much lower cost.

Q: How does Adobe license the Lingotek platform?

Lingotek is licensed on a concurrent user basis. We don’t share pricing information.

Q: Is this limited to only volunteers? That is, will the same platform be used not only for documentation but for product/software loc work?

The Lingotek platform is designed to support many different workflows. Some clients use their communities to provide the initial translation and then use internal reviewers to do the final review before publishing. Other clients use a traditional assigned workflow, without using community members. In Adobe’s case, so far they are only using their community members.

For more information, here’s the Lingotek press release.

Is Google the best machine translation engine? It depends…

Two weeks ago, I introduced Ethan Shen and his project to analyze the three major free machine translation (MT) engines — Google, Microsoft, and Yahoo! Babelfish — by relying on translator reviews.

Ethan has provided me with a mid-point summary of results, which I’ve included below. I was surprised to find that Microsoft and Babelfish are beating Google on some language pairs, as well as on shorter text strings. Although Google is emerging as the overall winner, and receiving some much-deserved attention from the media, it’s nice to see some healthy competition.

That said, quality is only one piece of the puzzle. The other piece, perhaps a much more important one, is usability. Now that Google has embedded its MT engine into Gmail and Reader, and now its Chrome client, I find I’m using Google exclusively as my MT engine.

Here are Ethan’s findings so far (emphasis mine):

At the highest level, it appears that survey participants prefer Google Translate’s results across the board.

In a few languages (Arabic, Polish, Dutch) the preference is overwhelming, with votes for Google doubling those of its nearest competitor.

However, once you remove voters who have self-defined their fluency in the source or target language as “limited,” the contest becomes closer in some of the heavily trafficked languages. For example:

  • Microsoft Bing Translator leads in German
  • Yahoo! Babelfish leads in Chinese
  • Google maintains its lead in Spanish, Japanese, and French

Observing only the self-defined “limited fluency” voters reveals a strong brand bias. If your fluency in the target language is limited, it stands to reason that your ability to assess the quality of the translation is also very limited. And yet…

  • Limited-fluency voters chose Google over Bing by 2 to 1
  • They also chose Google over Yahoo! Babelfish by 5 to 1
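To make this kind of comparison concrete, here is a small Python sketch that tallies votes per engine with and without limited-fluency voters. The vote data below is entirely made up for illustration and is not taken from the survey; the `tally` helper and its `exclude_fluency` parameter are likewise hypothetical.

```python
from collections import Counter

# Made-up sample votes, for illustration only: (engine, self-reported fluency).
votes = [
    ("Google", "limited"),
    ("Google", "limited"),
    ("Google", "limited"),
    ("Bing", "limited"),
    ("Bing", "fluent"),
    ("Bing", "fluent"),
    ("Google", "fluent"),
    ("Babelfish", "fluent"),
]


def tally(votes, exclude_fluency=None):
    """Count votes per engine, optionally skipping one fluency level."""
    return Counter(
        engine for engine, fluency in votes if fluency != exclude_fluency
    )


all_votes = tally(votes)
fluent_only = tally(votes, exclude_fluency="limited")
print(all_votes.most_common())
print(fluent_only.most_common())
```

In this toy data set, the overall leader changes once the limited-fluency votes are filtered out, which is the kind of shift described above.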

As I had guessed, Yahoo!’s and Microsoft’s hybrid rules-based MT models performed better on shorter text passages.

For phrases below 50 characters, Google’s lead in Spanish, Japanese, and French disappears. And Microsoft’s lead in German widens.

Beyond 50 characters, Google’s relative performance seems to improve across the board.

For passages that are only one sentence, the same effect is seen, though to a lesser extent than under 50 characters.

On March 4th, we made a few changes to our survey: hiding the brands and randomizing the positions of the text results before voting. Since then, we have not collected enough data to draw conclusions, but Babelfish seems to be receiving the biggest boost, perhaps showing the effects of the recent neglect of that tool.
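A blind setup like the one described above could be sketched roughly as follows in Python. This is only an assumption about how such a ballot might be built, not the survey’s actual code; the `blind_ballot` function, the letter labels, and the placeholder translations are all hypothetical.

```python
import random


def blind_ballot(results, rng=random):
    """Shuffle engine outputs and replace brand names with neutral labels.

    Returns (ballot, key): the ballot maps labels such as "A" to translated
    text; the key maps the same labels back to engines for later scoring.
    """
    engines = list(results)
    rng.shuffle(engines)  # randomize the position of each result
    labels = [chr(ord("A") + i) for i in range(len(engines))]
    ballot = {label: results[engine] for label, engine in zip(labels, engines)}
    key = dict(zip(labels, engines))
    return ballot, key


# Placeholder outputs; real ones would come from each MT engine.
results = {
    "Google": "translation 1",
    "Bing": "translation 2",
    "Babelfish": "translation 3",
}
ballot, key = blind_ballot(results, random.Random(0))
print(ballot)  # what the voter sees: labels and text, no brand names
```

Voters see only the ballot; the key stays on the server so that each vote can be mapped back to an engine afterward.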

Clearly, Ethan needs more data to arrive at more concrete conclusions. If you’re a translator and you want to lend a hand, here is the voting site.

PS: Here’s an interview with Google’s MT guru Franz Josef Och.

Forgetting English (literally)

I’m working on the Web Globalization Report Card, and this, plus my fascination with Facebook, inspired me to check out my Forgetting English page in several different languages.

Here it is in Spanish…

And Chinese…

And, my favorite, “pirate English”…

Thanks largely to volunteer translators, Facebook has localized from one to 70 languages in two years. (Personally, I think we need more of the goofy ones — I’d so much rather “Adjust me riggins” than “Change settings” or change the “Settins o’ me piracy” than my “Privacy Settings.” I’m thinking of volunteering to do “Snarky English” myself.)

If you’re a translator, there’s a link on Facebook (on the language setting page) where you can find out more. And if you’re interested in more where this came from, check out our new report, coming in 2010.