<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Will Google Kill the Translation Industry?</title>
	<atom:link href="http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/</link>
	<description>Adventures in Web Globalization</description>
	<lastBuildDate>Wed, 17 Mar 2010 14:06:34 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Josep Condal</title>
		<link>http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/comment-page-1/#comment-39</link>
		<dc:creator>Josep Condal</dc:creator>
		<pubDate>Sat, 04 Jun 2005 16:52:19 +0000</pubDate>
		<guid isPermaLink="false">http://globalbydesign.com/2005/05/28/will-google-kill-the-translation-industry/#comment-39</guid>
		<description>My opinion is that statistical-based translation has a bright future, because it has a great opportunity to understand how information collocates through brute-force word, term and snippet mining in its context, as long as it is able to perform the necessary pruning so that an acceptable output is rendered when translating a sentence.

Please note that one of the most important breakthroughs in terms of translation quality and dramatic shortening of the learning curve for new translators in human translation has been &quot;concordance&quot; (i.e. human-driven word, term and snippet mining of a translation memory).

If it worked very well for human beings, I&#039;m not sure why it should not also work very well for machines. If Google is around the corner with that technology, MT may become mainstream earlier than we think, as other players will be trying to catch up.

In any case, it will still remain true that &quot;nobody can translate what he does not know what it is&quot;, which is something that MT cannot deliver.  

It is not clear what the scenario will be in the future for translators who are currently translating content that they do not understand, but who use TM concordance for guidance, as these translators could eventually be replaced by MT and pushed professionally to other types of content that they understand and, therefore, where they can be competitive in terms of quality, but possibly as editors instead of as translators.
</description>
		<content:encoded><![CDATA[<p>My opinion is that statistical-based translation has a bright future, because it has a great opportunity to understand how information collocates through brute-force word, term and snippet mining in its context, as long as it is able to perform the necessary pruning so that an acceptable output is rendered when translating a sentence.</p>
<p>Please note that one of the most important breakthroughs in terms of translation quality and dramatic shortening of the learning curve for new translators in human translation has been &#8220;concordance&#8221; (i.e. human-driven word, term and snippet mining of a translation memory).</p>
<p>If it worked very well for human beings, I&#8217;m not sure why it should not also work very well for machines. If Google is around the corner with that technology, MT may become mainstream earlier than we think, as other players will be trying to catch up.</p>
<p>In any case, it will still remain true that &#8220;nobody can translate what he does not know what it is&#8221;, which is something that MT cannot deliver.  </p>
<p>It is not clear what the scenario will be in the future for translators who are currently translating content that they do not understand, but who use TM concordance for guidance, as these translators could eventually be replaced by MT and pushed professionally to other types of content that they understand and, therefore, where they can be competitive in terms of quality, but possibly as editors instead of as translators.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: fravia+</title>
		<link>http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/comment-page-1/#comment-38</link>
		<dc:creator>fravia+</dc:creator>
		<pubDate>Thu, 02 Jun 2005 22:43:39 +0000</pubDate>
		<guid isPermaLink="false">http://globalbydesign.com/2005/05/28/will-google-kill-the-translation-industry/#comment-38</guid>
		<description>Many here do not even grasp the concept and the philosophy behind this. This is NOT &quot;machine translation&quot;. It is not translating anything. It is RETRIEVING human translated snippets, putting them together with clever algos and creating a translation which will be as good as the corpora they are using will allow. The laws of statistics (and google algos) will put &#039;bad&#039; translations underneath and good translations on the surface.

They have already indexed ALL of the united nations and of the european union documents. We are speaking of BILLIONS of documents.

While the new european union languages may not have (yet) the necessary &quot;critical mass&quot;, so no latvian, no czech and no slovak, all the &quot;older&quot; languages have it: so yes german, yes italian, yes dutch, yes portuguese, yes greek, yes finnish, yes swedish, yes danish... and fantastic english, french and spanish, coz they will draw BOTH from zillions of UN-documents and EU-documents... HUMAN TRANSLATED. Hence you just need some stupid, quick algos: alignments ok? keep segments. Alignments screwed? throw segments away, who cares, we have enough.

Search google limiting the search to the europa.eu.int server or to the united nations one. See? All documents, even the most useless ones, are already there (often also cached btw).

I bet it all started with a request to have a quick translator into english from arabic and chinese, and someone snapped his fingers and said, well, why not the UN-documents? They are on line and they will deliver just that.

One thing leads to another...
and first they found spanish and french (and russian) as collateral advantage, now they added and are adding all the EU-documents to make it into the universal translator.

Quality?
Don&#039;t make me laugh. A billion documents (and a coupla easy-to-program algos) provide quality &quot;automagically&quot;, duh.</description>
		<content:encoded><![CDATA[<p>Many here do not even grasp the concept and the philosophy behind this. This is NOT &#8220;machine translation&#8221;. It is not translating anything. It is RETRIEVING human translated snippets, putting them together with clever algos and creating a translation which will be as good as the corpora they are using will allow. The laws of statistics (and google algos) will put &#8216;bad&#8217; translations underneath and good translations on the surface.</p>
<p>They have already indexed ALL of the united nations and of the european union documents. We are speaking of BILLIONS of documents.</p>
<p>While the new european union languages may not have (yet) the necessary &#8220;critical mass&#8221;, so no latvian, no czech and no slovak, all the &#8220;older&#8221; languages have it: so yes german, yes italian, yes dutch, yes portuguese, yes greek, yes finnish, yes swedish, yes danish&#8230; and fantastic english, french and spanish, coz they will draw BOTH from zillions of UN-documents and EU-documents&#8230; HUMAN TRANSLATED. Hence you just need some stupid, quick algos: alignments ok? keep segments. Alignments screwed? throw segments away, who cares, we have enough.</p>
<p>Search google limiting the search to the europa.eu.int server or to the united nations one. See? All documents, even the most useless ones, are already there (often also cached btw).</p>
<p>I bet it all started with a request to have a quick translator into english from arabic and chinese, and someone snapped his fingers and said, well, why not the UN-documents? They are on line and they will deliver just that.</p>
<p>One thing leads to another&#8230;<br />
and first they found spanish and french (and russian) as collateral advantage, now they added and are adding all the EU-documents to make it into the universal translator.</p>
<p>Quality?<br />
Don&#8217;t make me laugh. A billion documents (and a coupla easy-to-program algos) provide quality &#8220;automagically&#8221;, duh.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jerome</title>
		<link>http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/comment-page-1/#comment-37</link>
		<dc:creator>Jerome</dc:creator>
		<pubDate>Mon, 30 May 2005 06:34:46 +0000</pubDate>
		<guid isPermaLink="false">http://globalbydesign.com/2005/05/28/will-google-kill-the-translation-industry/#comment-37</guid>
		<description>I&#039;m less confident than other commenters that usable MT will forever remain &quot;just a few more years away.&quot;

Still, I&#039;m not extremely worried even though I do believe that this kind of corpus-based approach is probably going to yield the promised accuracy, if only because computing power and storage are now cheap enough to make a large enough corpus possible.

Two issues stand out, just for starters: even if the corpus becomes large enough (and the software sophisticated enough) that the system can decipher any metaphor, it then needs to be able to select a culturally appropriate target-language metaphor or even a literal rendering. That&#039;s the kind of decision that requires human intervention. (Let&#039;s not even mention a device like sarcasm: in an age when many humans can&#039;t detect it, I won&#039;t hold my breath waiting for a machine with the ability.)

The second, related, issue is style. Brute force will surely improve the accuracy of MT, but I don&#039;t see any likelihood of it ever producing stylistically interesting copy -- and that&#039;s something that many clients value very highly.

At the very least, human editors will remain a critical part of the process for many years to come.</description>
		<content:encoded><![CDATA[<p>I&#8217;m less confident than other commenters that usable MT will forever remain &#8220;just a few more years away.&#8221;</p>
<p>Still, I&#8217;m not extremely worried even though I do believe that this kind of corpus-based approach is probably going to yield the promised accuracy, if only because computing power and storage are now cheap enough to make a large enough corpus possible.</p>
<p>Two issues stand out, just for starters: even if the corpus becomes large enough (and the software sophisticated enough) that the system can decipher any metaphor, it then needs to be able to select a culturally appropriate target-language metaphor or even a literal rendering. That&#8217;s the kind of decision that requires human intervention. (Let&#8217;s not even mention a device like sarcasm: in an age when many humans can&#8217;t detect it, I won&#8217;t hold my breath waiting for a machine with the ability.)</p>
<p>The second, related, issue is style. Brute force will surely improve the accuracy of MT, but I don&#8217;t see any likelihood of it ever producing stylistically interesting copy &#8212; and that&#8217;s something that many clients value very highly.</p>
<p>At the very least, human editors will remain a critical part of the process for many years to come.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Patrick Hall</title>
		<link>http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/comment-page-1/#comment-36</link>
		<dc:creator>Patrick Hall</dc:creator>
		<pubDate>Mon, 30 May 2005 06:22:01 +0000</pubDate>
		<guid isPermaLink="false">http://globalbydesign.com/2005/05/28/will-google-kill-the-translation-industry/#comment-36</guid>
		<description>Keep in mind that when you&#039;re talking about statistical machine translation, or brute force translation, as you put it, the translations have to exist in the first place. These systems are trained on content created by authors and translators. 

It&#039;s quite possible that Google will be able to produce pretty good machine translations in languages for which large translated corpora exist. After all, they probably have access to more content than any organization in history. I suspect that their system will produce much better results than, say, Systran.

But that doesn&#039;t change the fact that for many pairs of languages, such corpora simply don&#039;t exist. Google isn&#039;t in the business of producing content, they&#039;re in the business of extracting information from content. So don&#039;t expect any good Japanese-Spanish or German-Chinese any time soon, to say nothing of languages with fewer speakers.

Of course, Google has a habit of doing things on an unprecedented scale. So we&#039;ll see how far this goes. There will certainly be repercussions for the translation industry here, but I don&#039;t think it will be a blanket effect.</description>
		<content:encoded><![CDATA[<p>Keep in mind that when you&#8217;re talking about statistical machine translation, or brute force translation, as you put it, the translations have to exist in the first place. These systems are trained on content created by authors and translators. </p>
<p>It&#8217;s quite possible that Google will be able to produce pretty good machine translations in languages for which large translated corpora exist. After all, they probably have access to more content than any organization in history. I suspect that their system will produce much better results than, say, Systran.</p>
<p>But that doesn&#8217;t change the fact that for many pairs of languages, such corpora simply don&#8217;t exist. Google isn&#8217;t in the business of producing content, they&#8217;re in the business of extracting information from content. So don&#8217;t expect any good Japanese-Spanish or German-Chinese any time soon, to say nothing of languages with fewer speakers.</p>
<p>Of course, Google has a habit of doing things on an unprecedented scale. So we&#8217;ll see how far this goes. There will certainly be repercussions for the translation industry here, but I don&#8217;t think it will be a blanket effect.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pet Computer</title>
		<link>http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/comment-page-1/#comment-35</link>
		<dc:creator>Pet Computer</dc:creator>
		<pubDate>Mon, 30 May 2005 01:51:28 +0000</pubDate>
		<guid isPermaLink="false">http://globalbydesign.com/2005/05/28/will-google-kill-the-translation-industry/#comment-35</guid>
		<description>Machine translation is still machine translation, and it&#039;s bound to stay just that for a long while still. Furthermore, the farther apart the languages are, the worse the outcome of the machine translation attempts will be. I had more than one chance to see it.</description>
		<content:encoded><![CDATA[<p>Machine translation is still machine translation, and it&#8217;s bound to stay just that for a long while still. Furthermore, the farther apart the languages are, the worse the outcome of the machine translation attempts will be. I had more than one chance to see it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David</title>
		<link>http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/comment-page-1/#comment-34</link>
		<dc:creator>David</dc:creator>
		<pubDate>Sun, 29 May 2005 20:13:18 +0000</pubDate>
		<guid isPermaLink="false">http://globalbydesign.com/2005/05/28/will-google-kill-the-translation-industry/#comment-34</guid>
		<description>Hi John,

I can&#039;t see why this would be a &quot;blow to Macromedia Flash&quot;.  It is a straightforward matter to keep the text of Flash content or a Flash app separate in XML and I can&#039;t see any reason one couldn&#039;t use a service like this as a Flash developer.

Further, why is AJAX a blow to Flash?  This is what we have been proposing for years--that the page-based model of web apps is weak and that a model that is Asynchronous (Flash offers this), uses JavaScript (Flash uses the same ECMAScript standard) and XML (Flash does this too) is a better model.  So if Google helps us convince the world that there is a better way to build web apps than just pages of HTML, that is a good thing.  If more people decide to innovate and compete on the quality and responsiveness of experience in web apps, lots of people may use &quot;AJAX&quot; and DHTML, but we think a whole lot will choose &quot;AJAX&quot; and Flash.  They aren&#039;t mutually exclusive, and can be used together, and also Flash adds many benefits over DHTML for many use cases.

-David

Regards,
David
Macromedia</description>
		<content:encoded><![CDATA[<p>Hi John,</p>
<p>I can&#8217;t see why this would be a &#8220;blow to Macromedia Flash&#8221;.  It is a straightforward matter to keep the text of Flash content or a Flash app separate in XML and I can&#8217;t see any reason one couldn&#8217;t use a service like this as a Flash developer.</p>
<p>Further, why is AJAX a blow to Flash?  This is what we have been proposing for years&#8211;that the page-based model of web apps is weak and that a model that is Asynchronous (Flash offers this), uses JavaScript (Flash uses the same ECMAScript standard) and XML (Flash does this too) is a better model.  So if Google helps us convince the world that there is a better way to build web apps than just pages of HTML, that is a good thing.  If more people decide to innovate and compete on the quality and responsiveness of experience in web apps, lots of people may use &#8220;AJAX&#8221; and DHTML, but we think a whole lot will choose &#8220;AJAX&#8221; and Flash.  They aren&#8217;t mutually exclusive, and can be used together, and also Flash adds many benefits over DHTML for many use cases.</p>
<p>-David</p>
<p>Regards,<br />
David<br />
Macromedia</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andreas Ramos</title>
		<link>http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/comment-page-1/#comment-33</link>
		<dc:creator>Andreas Ramos</dc:creator>
		<pubDate>Sat, 28 May 2005 21:50:26 +0000</pubDate>
		<guid isPermaLink="false">http://globalbydesign.com/2005/05/28/will-google-kill-the-translation-industry/#comment-33</guid>
		<description>There&#039;s a name for people who think that computers will one day be able to translate. They&#039;re called monolingual. Anyone who is fluent in several languages knows that languages aren&#039;t merely the same thing with different vocabulary. A machine can indeed convert an Arabic text into English, but a Washington politician simply won&#039;t understand it because it&#039;s not in his social context. MT won&#039;t kill the translation industry; on the contrary, as we globalize, countries and companies will discover they must work in multiple languages. There will be far more translation.</description>
		<content:encoded><![CDATA[<p>There&#8217;s a name for people who think that computers will one day be able to translate. They&#8217;re called monolingual. Anyone who is fluent in several languages knows that languages aren&#8217;t merely the same thing with different vocabulary. A machine can indeed convert an Arabic text into English, but a Washington politician simply won&#8217;t understand it because it&#8217;s not in his social context. MT won&#8217;t kill the translation industry; on the contrary, as we globalize, countries and companies will discover they must work in multiple languages. There will be far more translation.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Philipp Lenssen</title>
		<link>http://www.globalbydesign.com/blog/2005/05/28/will-google-kill-the-translation-industry/comment-page-1/#comment-32</link>
		<dc:creator>Philipp Lenssen</dc:creator>
		<pubDate>Sat, 28 May 2005 19:28:09 +0000</pubDate>
		<guid isPermaLink="false">http://globalbydesign.com/2005/05/28/will-google-kill-the-translation-industry/#comment-32</guid>
		<description>As for brand names, I suppose they just leave them as they are because they are also translated as such (I suppose &quot;Coca Cola&quot; would stay &quot;Coca Cola&quot; in many languages). But oddball terms and slang are a bit of a problem, I guess...</description>
		<content:encoded><![CDATA[<p>As for brand names, I suppose they just leave them as they are because they are also translated as such (I suppose &#8220;Coca Cola&#8221; would stay &#8220;Coca Cola&#8221; in many languages). But oddball terms and slang are a bit of a problem, I guess&#8230;</p>
]]></content:encoded>
	</item>
</channel>
</rss>
