
Google Translate Structure (TextStructure like function)

TextStructure is a very nice new function in Mathematica. It can create amazing things like:

TextStructure@"If love be blind, love cannot hit the mark."

[Image: TextStructure output]

Can we do the same for translations?

This piece of code downloads a JSON-like response from Google Translate without needing the official API (which I never bothered to learn).

(* Memoized: each distinct string is fetched only once. Source "pt" and target "en" are hard-coded. *)
GoogleTranslate[str_String] := GoogleTranslate@str = Import[
    StringTemplate["https://translate.googleapis.com/translate_a/single?client=gtx&sl=`1`&tl=`2`&dt=t&q=`3`"][
      "pt", "en", URLEncode@str], "JSON"][[1, 1, 1]]

And this other piece of code formats the translation.

MakeBoxes[TranslateElement[main_, down_], _] := GridBox[
  {{MakeBoxes@main}, {StyleBox[MakeBoxes@down, "TextElementLabel"]}},
  BaseStyle -> "TextElementGrid"]

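GoogleTranslateStructure[str_String] := Module[{sentence, words, phrase},
    (* split at "." and ",", then glue each punctuation mark back onto the clause before it *)
    sentence = StringSplit[str, p:"."|"," :> p] //. {a___String, s_String, p:"."|",", b___String} :> {a, s<>p, b};
    (* for each clause: translate word by word, then label the clause with its own translation *)
    phrase = Table[
       words = StringSplit[sentence[[i]], WhitespaceCharacter];
       TranslateElement[Row@Riffle[TranslateElement @@@ Transpose@{words, GoogleTranslate /@ words}, " "], GoogleTranslate@sentence[[i]]]
    , {i, Length@sentence}];
    (* a single clause is returned as is; several clauses are wrapped with the translation of the full text *)
    If[Length@sentence == 1,
       phrase[[1]],
       TranslateElement[Row@Riffle[phrase, " "], GoogleTranslate@str]
    ]
]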

A usage example would be:

GoogleTranslateStructure@"Se amor é cego, nunca acerta o alvo."

[Image: GoogleTranslateStructure output]

Changing the language from English to Japanese (which I don't speak, btw):

[Image: output with Japanese as the target language]

Or French:

[Image: output with French as the target language]
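For anyone who wants to switch languages without editing the URL by hand, here is a minimal sketch that takes the language codes as arguments. GoogleTranslateURL and GoogleTranslateLang are hypothetical names, not part of the code above, and the sketch assumes the endpoint keeps returning the same nested list, with the translated text at position [[1, 1, 1]].

```wolfram
(* Hypothetical helpers, not part of the original post. *)
(* from and to are language codes such as "pt", "en", "ja" or "fr". *)
GoogleTranslateURL[str_String, from_String, to_String] :=
  StringTemplate[
    "https://translate.googleapis.com/translate_a/single?client=gtx&sl=`1`&tl=`2`&dt=t&q=`3`"][
    from, to, URLEncode@str]

(* Memoized per (text, source, target) triple, like the single-argument version. *)
(* Assumption: the translated text sits at [[1, 1, 1]] of the imported JSON. *)
GoogleTranslateLang[str_String, from_String, to_String] :=
  GoogleTranslateLang[str, from, to] =
    Import[GoogleTranslateURL[str, from, to], "JSON"][[1, 1, 1]]
```

A call would then look like GoogleTranslateLang["If love be blind, love cannot hit the mark.", "en", "ja"]. Both language codes are always required here so that repeated calls hit the cached definition.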

POSTED BY: Thales Fernandes
8 Replies

Thales, thank you for an interesting post. I enjoyed your example from English to Japanese. However, it did not work well from Japanese to English in my environment. After adding a bit to your GoogleTranslate[], it worked well.

POSTED BY: Kotaro Okazaki

Of course I didn't mean to say this is not really cool. I especially liked your MakeBoxes code. I think it might be very useful to expose it to end users in some form; maybe it could be a way to typeset (tree) graphs or general expressions, or to show how evaluation works.

POSTED BY: Carlo Barbieri

You have earned a "Featured Contributor" badge, congratulations!

This is a great post and it has been selected for the curated Staff Picks group. Your profile is now distinguished by a "Featured Contributor" badge.

POSTED BY: EDITORIAL BOARD

We have "GoogleTranslate" in ServiceConnect already.

http://reference.wolfram.com/language/ref/service/GoogleTranslate.html

gt = ServiceConnect["GoogleTranslate"]

ServiceExecute["GoogleTranslate", "Translate", 
    {"Text" -> "Hola mundo!", "From" -> "Spanish", "To" -> "en"}]

but that might be a bit tedious to wrangle, as it requires a Wolfram Connector and an API key.

POSTED BY: Sam Carrettie

I really like this way of annotating text though. With some work, it might be useful for language learning.

That was my main intention: to show how cool we can make this. I just wish there was a less hacky way of doing it, perhaps with a TextStructureBox (or by making TextStructure more general).

POSTED BY: Thales Fernandes

Interesting idea, but there is one slight problem: Google Translate doesn't work word by word, so if a word means something in context, the API call on the whole sentence will return something different for the whole thing. Look at how "be" is translated to "être" first, and to "est" in the context of the sentence.

One can do only so much with just a few lines of code. There is another "tag" in the Google Translate results that can show word definitions with their respective weights, but parsing it would be overkill. I only intended to write a 5-minute piece of code to show one cool, but frail, usage.

POSTED BY: Thales Fernandes

The problem that Carlo mentions is even more serious in the Japanese example. If you could read Japanese, you'd see how very little the word-by-word translation had in common with the sentence translation. Take the word "If" which is translated as ??. Conditionals in Japanese are done by inflecting the verb. There's no equivalent of "If". In fact the only word level translations that make it into the sentence level translations are the words for "love", "cannot", and "mark".

If we could do word segmentation and stemming in languages other than English, it would be fun to see what percentage of the words in the word-by-word translation also appear in the sentence-level translation. It might be a useful way to measure how different the grammar of a language is from English.
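As a rough sketch of that measurement for languages written with spaces between words: TranslationOverlap is a hypothetical helper, not something from this thread. It does no stemming, so inflected forms count as misses, and text without spaces between words would need a real segmentation step first.

```wolfram
(* TranslationOverlap is a hypothetical helper, not from the thread. *)
(* It returns the fraction of word-level translations that also occur
   as words in the sentence-level translation. *)
TranslationOverlap[wordTranslations : {___String}, sentenceTranslation_String] :=
  Module[{toWords, sentenceWords, candidates},
    (* lower-case the text and keep only runs of letters, so "Mark." matches "mark" *)
    toWords = StringCases[ToLowerCase[#], LetterCharacter ..] &;
    sentenceWords = toWords[sentenceTranslation];
    (* a word-level translation may itself contain several words, e.g. "to be" *)
    candidates = Flatten[toWords /@ wordTranslations];
    If[candidates === {},
      0.,
      N[Count[candidates, _?(MemberQ[sentenceWords, #] &)]/Length[candidates]]]
  ]

(* made-up inputs for illustration, not actual Google Translate output *)
TranslationOverlap[{"if", "love", "to be", "blind"}, "If love is blind."]
(* 0.6 *)
```

In the made-up example, "if", "love" and "blind" appear in the sentence while "to" and "be" do not, which gives 3 out of 5.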

I really like this way of annotating text though. With some work, it might be useful for language learning.

POSTED BY: Sean Clarke

Interesting idea, but there is one slight problem: Google Translate doesn't work word by word, so if a word means something in context, the API call on the whole sentence will return something different for the whole thing. Look at how "be" is translated to "être" first, and to "est" in the context of the sentence.

POSTED BY: Carlo Barbieri