Google Translate API messing up Cyrillic
Thread poster: Susan Welsh
Susan Welsh United States Local time: 04:11 Russian to English + ...
I have been translating a Russian document in a CAT tool, using the GT API with no problems, and all of a sudden it starts throwing in garbage for Cyrillic segments (lots of % signs, Ds and Os). Universal Online Cyrillic converter identifies the encoding as KOI-7. Yesterday when this happened, it worked fine on other segments; but today it's happening again. Has anyone else had this problem?
esperantisto Local time: 11:11 Member (2006) English to Russian + ... SITE LOCALIZER No problem with Anaphraseus | Dec 28, 2011 |
Please provide more details on your environment. I experience no problem with the latest build of Anaphraseus in LibreOffice 3.4.4/OpenOffice.org 3.3.0 in Windows 7 and openSUSE 11.3/11.4 when translating ENG→RUS. Previously Anaphraseus returned strings of incorrectly decoded UTF-8 characters, but Ole solved the problem. (BTW, your description makes me think your problem may be the same, but you need to provide a sample of what you get.)
Didier Briel France Local time: 10:11 English to French + ... Are you translating long segments? | Dec 28, 2011 |
Susan Welsh wrote: I have been translating a Russian document in a CAT tool, using the GT API with no problems, and all of a sudden it starts throwing in garbage for Cyrillic segments (lots of % signs, Ds and Os). Universal Online Cyrillic converter identifies the encoding as KOI-7. Yesterday when this happened, it worked fine on other segments; but today it's happening again. Has anyone else had this problem?

Do your "garbage" segments begin with "Server returned HTTP response code: 414 for URL:"?

I can reproduce the issue in OmegaT if I try to translate long segments from Russian to English. Short segments are translated fine, but I get the 414 error for long segments. That's because Russian characters have to be encoded, so the strings are much longer than for "ASCII"-based languages. E.g.,

googleapis.com/language/translate/v2?key=xxxxx&source=RU&target=EN&q=%D0%92+1526+%D0%B3%D0%BE%D0%B4%D1%83+%D0%BF%D0%B5%D1%80%D0%B5%D0%B1%D1%80%D0%B0%D0%BB%D1%81%D1%8F

I know there is another method, which allows slightly longer strings to be sent. I'll check with Alex (he's more concerned than I am), but ultimately the problem will always exist for lengthy segments.

Didier
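To make the length argument concrete, here is a minimal Python sketch (not OmegaT's actual code) that URL-encodes the phrase from the example URL and compares lengths before and after. It uses only the standard library's `urllib.parse.quote_plus`; the helper name `encoded_growth` is made up for illustration. HTTP 414 means the request URI was too long for the server; the exact limit is not stated in this thread, so none is assumed here.

```python
from typing import Tuple
from urllib.parse import quote_plus


def encoded_growth(text: str) -> Tuple[int, int]:
    """Return (number of characters, length after URL-encoding as UTF-8)."""
    return len(text), len(quote_plus(text))


# The phrase behind the q= parameter in the example URL above.
sample = "В 1526 году перебрался"

# Each Cyrillic letter takes two bytes in UTF-8, and each byte becomes
# a three-character %XX escape, so one letter turns into six characters.
print(quote_plus(sample))
print(encoded_growth(sample))              # (22, 97)

# Plain ASCII text of similar size does not grow (spaces become '+').
print(encoded_growth("In 1526 he moved"))  # (16, 16)
```

A Russian segment therefore needs several times more URL characters than an English segment with the same number of letters, which is why a sentence that does not look "terribly long" can still hit the limit.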
Susan Welsh United States Local time: 04:11 Russian to English + ... TOPIC STARTER
Hi Didier and esperantisto,

Didier, you seem to have identified the problem (although I would not say that this segment is terribly long), because it does give that code (below). I am working with OmegaT 2.5.2 on Ubuntu Linux, OOo 3.2.0. (Esperantisto, I'm not familiar with Anaphraseus -- not sure what it is. I'll check when I get a chance.)

Thanks,
Susan

PS - After some editing, the garbage is no longer displaying in this message as I am seeing on my screen. It is exclusively full of %DO%BE%DO%B4 and stuff like that, with no Cyrillic words. I'm going to delete the example, except for the source text and the error code.

Выросши в холодной Сибири, постоянно с величайшим вниманием следя за описаниями полярных путешествий и многое узнав о них от покойного моего друга Норденшильда, совершившего ряд славных экспедиций в области льдов, я получил полное убеждение в возможности решительной победы над полярными льдами при помощи соответственных для того приспособлений и, главное, - ясного понимания сил, до сих пор препятствовавших кораблям проникнуть в неведомую околополюсную область, занимающую пространство около 4 млн кв.

Server returned HTTP response code: 414 ...
[Edited at 2011-12-28 14:49 GMT]
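The "garbage" described in this thread (runs of %D0, %BE, %B4 and so on) appears to be percent-encoded UTF-8 from the request URL echoed back in the error message, rather than a different Cyrillic encoding. A minimal Python sketch of decoding such a string, using the standard library's `urllib.parse.unquote_plus`; the helper name `decode_query_value` is hypothetical, and the input is taken from the example URL posted earlier in the thread:

```python
from urllib.parse import unquote_plus


def decode_query_value(encoded: str) -> str:
    """Turn a percent-encoded query value back into readable text.

    Assumes the bytes are UTF-8; invalid sequences raise an error
    instead of being silently replaced.
    """
    return unquote_plus(encoded, encoding="utf-8", errors="strict")


# Start of the q= value from the example URL in this thread.
print(decode_query_value("%D0%92+1526+%D0%B3%D0%BE%D0%B4%D1%83"))  # В 1526 году
```

If the decoded text matches the source segment, the text itself was not corrupted; the request simply failed and the tool displayed the URL it tried to send.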
Susan Welsh wrote: (Esperantisto, I'm not familiar with Anaphraseus -- not sure what it is. I'll check when I get a chance.)

Anaphraseus (http://anaphraseus.sourceforge.net/) is a Wordfast (Classic) "clone". It works in OpenOffice instead of MS Office, is considerably slower than Wordfast, and has a much smaller feature set.