
Google Translate Structure (a TextStructure-like function)

TextStructure is a very nice new function in Mathematica. It can create amazing things like:

TextStructure@"If love be blind, love cannot hit the mark."

(Image: TextStructure output)

Can we do the same for translations?

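This piece of code downloads a JSON-like response from Google Translate without needing the official API (which I never bothered to learn). The source and target languages are hard-coded as "pt" and "en", and each result is memoized so that a given string is requested only once.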

GoogleTranslate[str_String] := GoogleTranslate@str = Import[
    StringTemplate["https://translate.googleapis.com/translate_a/single?client=gtx&sl=`1`&tl=`2`&dt=t&q=`3`"][
      "pt", "en", URLEncode@str], "JSON"][[1, 1, 1]]

And this other piece of code formats the translation.

MakeBoxes[TranslateElement[main_, down_], _] := GridBox[
  {{MakeBoxes@main}, {StyleBox[MakeBoxes@down, "TextElementLabel"]}},
  BaseStyle -> "TextElementGrid"]

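GoogleTranslateStructure[str_String] := Block[{sentence, words, phrase},
    (* Split at "." and ",", keep each delimiter, then reattach it to the preceding clause *)
    sentence = StringSplit[str, p:"."|"," :> p] //. {a___String, s_String, p:"."|",", b___String} :> {a, s<>p, b};
    (* For each clause: annotate every word with its own translation, then the clause with its translation *)
    phrase = Table[
       words = StringSplit[sentence[[i]], WhitespaceCharacter];
       TranslateElement[Row@Riffle[TranslateElement @@@ Transpose@{words, GoogleTranslate /@ words}, " "], GoogleTranslate@sentence[[i]]]
    , {i, Length@sentence}];
    (* A single clause is returned as is; several clauses are wrapped with the translation of the full string *)
    If[Length@sentence == 1,
       phrase[[1]],
       TranslateElement[Row@Riffle[phrase, " "], GoogleTranslate@str]
    ]
]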
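
To see what the MakeBoxes rule does on its own, without any network request, one can build a TranslateElement by hand. This is only a small sketch: the strings "amor" and "love" are typed in directly as an assumption and do not come from Google Translate.

```wolfram
(* In a notebook this displays "amor" with "love" as a label underneath *)
TranslateElement["amor", "love"]

(* Inspect the GridBox structure produced by the MakeBoxes rule defined above *)
ToBoxes[TranslateElement["amor", "love"]]
```

The "TextElementLabel" and "TextElementGrid" style names are the ones used in the MakeBoxes definition above, so the result should be styled like the TextStructure output.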

A usage example would be:

GoogleTranslateStructure@"Se amor é cego, nunca acerta o alvo."

(Image: GoogleTranslateStructure output)

Changing the language from English to Japanese (which I don't speak, btw):

(Image: GoogleTranslateStructure output in Japanese)

Or French:

(Image: GoogleTranslateStructure output in French)
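
Switching languages as described above means editing the "pt" and "en" codes inside GoogleTranslate. A minimal sketch of a variant that takes the codes as arguments could look like the following. The name GoogleTranslateTo is hypothetical (it does not appear in the original post), and the endpoint is the same unofficial one used above, so it may change or stop working at any time.

```wolfram
(* Hypothetical variant of GoogleTranslate with explicit language codes.
   from and to are two-letter codes such as "pt", "en", "ja" or "fr". *)
GoogleTranslateTo[str_String, from_String, to_String] :=
  GoogleTranslateTo[str, from, to] = Import[
    StringTemplate["https://translate.googleapis.com/translate_a/single?client=gtx&sl=`1`&tl=`2`&dt=t&q=`3`"][
      from, to, URLEncode@str], "JSON"][[1, 1, 1]]

(* Example call: English to French *)
GoogleTranslateTo["If love be blind, love cannot hit the mark.", "en", "fr"]
```

As in the original, the result is memoized per argument combination and only the first translated segment of the response is returned.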

POSTED BY: Thales Fernandes
5 months ago

Interesting idea, but there is one slight problem: Google Translate doesn't work word by word, so if a word's meaning depends on context, translating the whole sentence will return something different from the word-by-word result. Look at how "be" is translated to "être" on its own, and to "est" in the context of the sentence.

POSTED BY: Carlo Barbieri
5 months ago

"Interesting idea, there is one slight problem: Google Translate doesn't work word by word, so if a word means something in context, the API call on the whole sentence will return something different for the whole thing. Look at how 'be' is translated to 'être' first, and 'est' in the context of the sentence."

One can do only so much with just a few lines of code. There is another "tag" in the Google Translate results that can show word definitions with their respective weights, but parsing it would be overkill. I only intended to write a 5-minute piece of code to show one cool, but frail, usage.

POSTED BY: Thales Fernandes
5 months ago

Of course, I didn't mean to say this is not really cool. I especially liked your MakeBoxes code. I think it might be very useful to expose it to end users in some form, maybe as a way to typeset (tree) Graphs or general expressions, or to show how evaluation works.

POSTED BY: Carlo Barbieri
5 months ago

The problem that Carlo mentions is even more serious in the Japanese example. If you could read Japanese, you'd see how little the word-by-word translation has in common with the sentence translation. Take the word "If", which is translated as もし. Conditionals in Japanese are formed by inflecting the verb; there's no equivalent of "If". In fact, the only word-level translations that make it into the sentence-level translation are the words for "love", "cannot", and "mark".

If we could do word segmentation and stemming in languages other than English, it would be fun to see what percentage of the words in the word-by-word translation also appear in the sentence-level translation. It might be a useful way to measure how different the grammar of the language is from English.

I really like this way of annotating text though. With some work, it might be useful for language learning.
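
The overlap measure suggested above could be sketched roughly as follows. The helper name translationOverlap is hypothetical, and the tokenization is deliberately crude: it assumes words are separated by spaces or punctuation, does no stemming, and would therefore not work for Japanese without a real word segmenter. The input strings in the example are typed by hand, not actual Google Translate output.

```wolfram
(* Hypothetical helper: fraction of word-level translations that also occur
   as whole words in the sentence-level translation. *)
translationOverlap[wordTranslations : {__String}, sentenceTranslation_String] :=
  Module[{sentenceWords, hits},
    (* Crude tokenization: runs of letters and digits, lowercased *)
    sentenceWords = ToLowerCase /@ StringCases[sentenceTranslation, WordCharacter ..];
    (* Count the word-level translations found among the sentence words *)
    hits = Count[ToLowerCase /@ wordTranslations, w_ /; MemberQ[sentenceWords, w]];
    N[hits/Length[wordTranslations]]
  ]

(* Hand-made example: "be" is missing from the sentence, so 3 of 4 words match *)
translationOverlap[{"If", "love", "be", "blind"}, "If love is blind"]
(* 0.75 *)
```

Word-level translations that still carry punctuation (for example "blind,") would need to be cleaned first, since the comparison here is an exact match on lowercased words.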

POSTED BY: Sean Clarke
5 months ago

"I really like this way of annotating text though. With some work, it might be useful for language learning."

That was my main intention: to show how cool we can make this. I just wish there was a less hacky way of doing it, perhaps with a TextStructureBox (or by making TextStructure more general).

POSTED BY: Thales Fernandes
5 months ago

We have "GoogleTranslate" in ServiceConnect already.

http://reference.wolfram.com/language/ref/service/GoogleTranslate.html

gt = ServiceConnect["GoogleTranslate"]

ServiceExecute["GoogleTranslate", "Translate", 
    {"Text" -> "Hola mundo!", "From" -> "Spanish", "To" -> "en"}]

but that might be a bit tedious to wrangle, as it requires Wolfram Connector and an API key.

POSTED BY: Sam Carrettie
5 months ago

You have earned the "Featured Contributor" badge, congratulations!

This is a great post and it has been selected for the curated Staff Picks group. Your profile is now distinguished by a "Featured Contributor" badge.

POSTED BY: Moderation Team
5 months ago

Thales, thank you for an interesting post. I enjoyed your English-to-Japanese example. However, it did not work well from Japanese to English in my environment. After adding a bit to your GoogleTranslate[], it worked well.

POSTED BY: Kotaro Okazaki
2 months ago
