Unicode and .po files
Thread poster: Russell Jones
Russell Jones
Russell Jones  Identity Verified
United Kingdom
Local time: 00:31
Italian to English
Apr 14, 2011

I have been translating some .po files (new to me) using Poedit.
The accented characters entered in Poedit turn out as hieroglyphs in the amended text document (Wordpad), for example the Italian "è già" appears as "Ã¨ giÃ ".

If I amend the text document, Poedit then refuses to open it, saying "unable to convert file to Unicode".

I have no idea whether this will cause the client problems; since Poedit is intended for translating, presumably this is how it is supposed to work?

I should be grateful for any expert advice.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 01:31
Member (2006)
English to Afrikaans
@Russell Apr 14, 2011

Russell Jones wrote:
The accented characters entered in Poedit turn out as hieroglyphs in the amended text document (Wordpad), for example the Italian "è già" appears as "Ã¨ giÃ ".


Before translating, make sure your input file (the file that you start to translate) is in UTF8 encoding. Then, when you open the translated file, open it in MS Word (not Wordpad) or an editor that supports UTF8 (such as Akelpad, which is freeware). If you open a UTF8 file in a text editor that thinks the file is not UTF8, it will display the characters incorrectly.

Make sure you use a recent version of PoEdit (later than 1.1.0).

I have no idea whether this will cause the client problems; since Poedit is intended for translating, presumably this is how it is supposed to work?


If the characters are incorrect, it will cause the client problems. The fact that PoEdit is intended for translation doesn't mean it won't break something under the wrong conditions.

Another PO editor you can try is Virtaal. Note that Virtaal opens only UTF8N files, not UTF8Y files, but it sounds to me like your source file is likely to be UTF8N anyway.
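
For readers unfamiliar with the labels: UTF8N and UTF8Y are commonly used to mean UTF-8 without and with a byte order mark (BOM). Assuming that meaning, here is a minimal Python sketch for checking which kind a file is. The function name and the sample bytes are made up for illustration.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

# Hypothetical sample data standing in for the contents of a .po file.
without_bom = 'msgstr "\u00e8 gi\u00e0"\n'.encode("utf-8")
with_bom = codecs.BOM_UTF8 + without_bom

print(has_utf8_bom(without_bom))  # False
print(has_utf8_bom(with_bom))     # True
```

For a real file, read it in binary mode, for example `open(path, "rb").read()`, and pass the resulting bytes to the function.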


 
Ambrose Li
Ambrose Li  Identity Verified
Canada
Local time: 19:31
English
Do not use WordPad Apr 14, 2011

I just tried opening a PO file with WordPad and from what I saw I do not recommend using WordPad to open any Unicode files. Even if I specify Unicode Text File it still failed to open the file as Unicode.

If you just want to check whether the file is OK, the easiest way would be to try opening the file in a web browser, then change the encoding to Unicode.


 
Russell Jones
Russell Jones  Identity Verified
United Kingdom
Local time: 00:31
Italian to English
TOPIC STARTER
Second thoughts Apr 14, 2011

Thank you Samuel and Ambrose

I'm afraid your suggestions are rather over my head but I'll try and research them.

However, if I open the files with Poedit, the characters appear correctly, so since the client also uses Poedit, will he/she be able to use this to create a usable format?


 
opolt
opolt  Identity Verified
Germany
Local time: 01:31
English to German
With Samuel and Ambrose Apr 14, 2011

Russell,

obviously, as Ambrose has said, Wordpad is not fully Unicode compliant. You might also have a font problem, i.e. lack of complete fonts (though after seeing your language combo, I think it's very unlikely).

Wordpad is the poor man's word processor (and that's a very poor man indeed), and it's not a text editor in the narrower sense. The same goes for Notepad wrt text editing.

I know it's probably not going to help you much with your current problem -- but I've said it before and I'm going to say it again: somehow, somewhere you're going to hit the limits of what Wordpad/Notepad have to offer anyway, so a decent, industrial-strength text editor should be in the toolbox of every translator out there. Samuel has mentioned one option. Another one is jEdit (www.jedit.org) which is a very well behaving Java program, and is also freely available.


 
Russell Jones
Russell Jones  Identity Verified
United Kingdom
Local time: 00:31
Italian to English
TOPIC STARTER
No progress Apr 14, 2011

Well, as I said the characters appear correctly in Poedit.

I've downloaded both jEdit and Akelpad but both convert these characters to the hieroglyphics mentioned before.


 
NMR (X)
NMR (X)
France
Local time: 01:31
French to Dutch
Don't know if this is useful, but Apr 14, 2011

POEdit can convert into html. Then a check should be possible.

But I agree, it is very rudimentary software. Next time ask the client if he can give you an Excel sheet.

[Edited at 2011-04-14 21:56 GMT]


 
Ambrose Li
Ambrose Li  Identity Verified
Canada
Local time: 19:31
English
Never used these before but Apr 14, 2011

Russell Jones wrote:

Well, as I said the characters appear correctly in Poedit.

I've downloaded both jEdit and Akelpad but both convert these characters to the hieroglyphics mentioned before.


In case they support more than one encoding, make sure you open the file as a UTF-8 file. The specific “hieroglyphics” you are seeing is the result of opening the file in the wrong encoding (specifically opening a UTF-8 file as an ISO-8859-1 file).

Have you tried opening your file in a browser?
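
If text has already been garbled this way, and nothing else has happened to it, the misreading can in principle be reversed. The Python sketch below assumes the text was decoded as ISO-8859-1 exactly as described above. Windows editors often use Windows-1252 instead, which differs for some characters. The function name is hypothetical.

```python
def undo_latin1_misread(garbled: str) -> str:
    """Reverse a UTF-8 file having been decoded as ISO-8859-1.

    Encoding back to ISO-8859-1 recovers the original bytes,
    which are then decoded as the UTF-8 they really are.
    """
    return garbled.encode("iso-8859-1").decode("utf-8")

# This is what "è già" looks like after the misreading; the last
# character (U+00A0) is an invisible non-breaking space.
garbled = "\u00c3\u00a8 gi\u00c3\u00a0"
print(undo_latin1_misread(garbled))  # è già
```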


 
opolt
opolt  Identity Verified
Germany
Local time: 01:31
English to German
Maybe a poedit bug then, though ... Apr 14, 2011

... I couldn't repeat the behaviour on my machine, even with some accented characters (like yours) thrown in. (Then again I'm on Linux, plus the version numbers of my software will be different.)

A few questions, let's see where the problem lies:

Did you receive the .po file from somewhere else, or did you create it yourself from a so-called .pot file?

Are you sure that the original .po file is really Unicode/UTF? It might not be.

After opening your .po file in jEdit, does the status line at the bottom say "gettext,none,UTF-8", or does it say something else? Are there any differences (as per the status line indication) between the original file and your file?

At any rate, I'm not sure what you are trying to achieve with your editing outside poedit, why is it necessary?

Cheers.

[Edited at 2011-04-14 22:19 GMT]

[Edited at 2011-04-14 22:33 GMT]
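
One way to check what encoding a .po file claims to use is to look at its header: gettext PO files normally carry a line such as "Content-Type: text/plain; charset=UTF-8\n" in the header entry. Below is a minimal Python sketch that extracts the declared charset. The function name and sample header are illustrative assumptions, and a declared charset is only a claim, not proof of what the file's bytes actually are.

```python
import re
from typing import Optional

def declared_charset(po_bytes: bytes) -> Optional[str]:
    """Return the charset named in the PO header's Content-Type line, if any."""
    # The header line is plain ASCII, so it can be searched in the raw bytes.
    match = re.search(rb"Content-Type:[^\n]*charset=([A-Za-z0-9_.:-]+)", po_bytes)
    return match.group(1).decode("ascii") if match else None

# Hypothetical header of a small .po file.
sample = (
    b'msgid ""\n'
    b'msgstr ""\n'
    b'"Content-Type: text/plain; charset=UTF-8\\n"\n'
    b'"Content-Transfer-Encoding: 8bit\\n"\n'
)
print(declared_charset(sample))  # UTF-8
```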


 
Ambrose Li
Ambrose Li  Identity Verified
Canada
Local time: 19:31
English
I am pretty sure his original file is UTF-8 Apr 14, 2011

The way the characters show up in the other editors is consistent with the hypothesis that the original file is UTF-8 but incorrectly opened as an ISO-8859-1 file. (“è già” in UTF-8 is c3 a8 20 67 69 c3 a0, which, when interpreted as ISO-8859-1, gives “Ã¨ giÃ” followed by a non-breaking space.) I’d say the probability that this is a POedit bug is very low.
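
The byte-level explanation above can be reproduced in a few lines of Python. This only illustrates the mechanism; it says nothing about what any of the editors mentioned do internally.

```python
text = "\u00e8 gi\u00e0"  # "è già"

# Encode the text as UTF-8 and show the bytes in hex.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.hex(" "))  # c3 a8 20 67 69 c3 a0

# Misread those same bytes as ISO-8859-1: every byte becomes one character,
# so each two-byte accented letter turns into two unrelated characters.
misread = utf8_bytes.decode("iso-8859-1")
print(len(text), len(misread))  # 5 7
print(misread == "\u00c3\u00a8 gi\u00c3\u00a0")  # True
```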

 
opolt
opolt  Identity Verified
Germany
Local time: 01:31
English to German
Makes sense Ambrose, though ... Apr 14, 2011

... I'm puzzled why it should be opened as ISO-8859-1 by two or three different editors. jEdit, for instance, autodetects UTF just fine.

@Russell: In jEdit, you can do this: "File -> Reload with Encoding -> UTF-8"

PS @Ambrose: Might this be related to how default locale settings are handled on Windows???

[Edited at 2011-04-14 23:13 GMT]


 
Russell Jones
Russell Jones  Identity Verified
United Kingdom
Local time: 00:31
Italian to English
TOPIC STARTER
No solution yet Apr 15, 2011

I had uninstalled jEdit in disgust!
I tried again with Akelpad though, this time making sure the encoding was UTF8, but it didn't help.
The original .po file was provided by the client and if I open it in Akelpad, the Settings show the Default codepage as 65001 (UTF-8).


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 01:31
Member (2006)
English to Afrikaans
One more possibility Apr 15, 2011

Russell Jones wrote:
The original .po file was provided by the client and if I open it in Akelpad, the Settings show the Default codepage as 65001 (UTF-8).


I find that Akelpad guesses encodings correctly 99% of the time. For the other 1% of cases, I use Babelpad. If Babelpad says the file is ISO-8859-1 but Akelpad says it's UTF8, then it may well be ISO-8859-1. The problem with Babelpad is that you're more limited in how you can resave the file.

If your client asked you to use PoEdit, and if the translation looks good in PoEdit, then I don't think you need to worry, even if your translation looks weird in another program.
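
A rough idea of how the encoding guessing mentioned above can work, as a minimal Python sketch. This is an assumption for illustration only, not how Akelpad or Babelpad actually detect encodings, and the function name is hypothetical.

```python
def guess_encoding(data: bytes) -> str:
    """Guess between UTF-8 and ISO-8859-1 for the given bytes.

    Text with accented characters saved as ISO-8859-1 is rarely valid UTF-8,
    so a successful strict UTF-8 decode is strong evidence of UTF-8.
    ISO-8859-1 accepts any byte sequence, so it serves only as a fallback.
    """
    try:
        data.decode("utf-8")  # strict by default: raises on invalid bytes
        return "utf-8"
    except UnicodeDecodeError:
        return "iso-8859-1"

print(guess_encoding("\u00e8 gi\u00e0".encode("utf-8")))       # utf-8
print(guess_encoding("\u00e8 gi\u00e0".encode("iso-8859-1")))  # iso-8859-1
```

Pure ASCII input is reported as UTF-8, which is harmless because ASCII bytes mean the same thing in both encodings.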


 
Russell Jones
Russell Jones  Identity Verified
United Kingdom
Local time: 00:31
Italian to English
TOPIC STARTER
Thank you Apr 15, 2011

That is very reassuring Samuel; I hope you're right.

I'll report back when I know.


 
Russell Jones
Russell Jones  Identity Verified
United Kingdom
Local time: 00:31
Italian to English
TOPIC STARTER
Good News Apr 15, 2011

I'm relieved to say my client has no problems with seeing these characters correctly.

Thank you everyone for your help; I clearly need to follow up your advice.


 