Tag soup - accents
Thread poster: Thomas Carey
Thomas Carey
Thomas Carey  Identity Verified
Local time: 20:03
French to English
+ ...
Apr 23, 2014

Hi all,

I have recently been experiencing problems with tags in source when it comes to apostrophes and accents (source text in French). I use MemoQ 2013 R2 (6.8.56).

At first, I thought it was due to the document being a converted .pdf, but I've been told it is actually an original Word file. Now, nearly every time I receive a document from this client, I have these tag problems.

The original file is a .doc file (97-2003). The main text font is Arial but the accents are in Arial Unicode MS (and the style is slightly different).

I tried saving the doc as docx; this sometimes works or helps, but not perfectly.

I also selected all the text and changed the font to normal Arial throughout. This removed the tags but messed up the text by dropping or repeating parts of sentences (I even ended up with what appeared to be some kind of Asian language!?!).

Has anyone else been experiencing such problems recently? Any suggestions?

Thanks,

Tom


 
esperantisto
esperantisto  Identity Verified
Local time: 21:03
Member (2006)
English to Russian
+ ...
SITE LOCALIZER
OCRd document? Apr 23, 2014

Is your document really in MS Word? Your description makes me think that it's a result of OCRing by ABBYY FineReader or a similar program. In such a case, it's not really DOC, but RTF. Try the following:
1. Re-save to make sure it's really DOC. Or use Apache OpenOffice / LibreOffice and save to ODF.
2. Make sure that the entire text is in the same (desired) language.
3. After verifying the language, apply uniform font attributes (font face, size, color) to the whole text.


 
Thomas Carey
Thomas Carey  Identity Verified
Local time: 20:03
French to English
+ ...
TOPIC STARTER
Solved Apr 23, 2014

Hi, thanks, esperantisto

Yes, pretty sure it was Word. Anyway, problem solved by selecting everything again, changing the font to normal Arial throughout and saving, and then saving once more as .docx. I don't know why that didn't work the first time I tried...

thanks again,

Tom


 
LEXpert
LEXpert  Identity Verified
United States
Local time: 13:03
Member (2008)
Croatian to English
+ ...
Similar problem, same solution Apr 23, 2014

Thomas Carey wrote:

Hi, thanks, esperantisto

Yes, pretty sure it was Word. Anyway, problem solved by selecting everything again, changing the font to normal Arial throughout and saving, and then saving once more as .docx. I don't know why that didn't work the first time I tried...

thanks again,

Tom


Glad that worked for you! I just saw your post, and I've had similar issues in MemoQ with umlauted characters in OCR'd German files - they often appear with a tag pair around every such letter.
Indeed, selecting all and changing the font to normal Arial usually resolves the problem. For good measure I also change the language to EN, but I'm not sure if that step makes any difference.
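
For anyone who wants to check a file before importing it, here is a minimal sketch of how the mixed-font pattern described in this thread could be detected in a .docx (not a .doc, which is a binary format). It assumes the main text of a .docx lives in word/document.xml inside the zip, and that per-run fonts are stored in `w:rFonts` elements. The function names `font_usage` and `strip_run_fonts` and the `SAMPLE` string are made up for illustration. Whether stripping the per-run fonts actually removes the tags in MemoQ is an assumption, not something confirmed here; doing "select all, change font" in Word itself is the fix reported above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML namespace used by word/document.xml inside a .docx
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"
ET.register_namespace("w", W_NS)  # keep the "w:" prefix when serializing


def font_usage(document_xml):
    """Count characters per explicitly set run font.

    Simplification: only the ascii/hAnsi attributes of w:rFonts are read;
    runs without an explicit font are counted as "(inherited)".
    """
    root = ET.fromstring(document_xml)
    usage = Counter()
    for run in root.iter(W + "r"):
        fonts = run.find(W + "rPr/" + W + "rFonts")
        name = None
        if fonts is not None:
            name = fonts.get(W + "ascii") or fonts.get(W + "hAnsi")
        text = "".join(t.text or "" for t in run.iter(W + "t"))
        usage[name or "(inherited)"] += len(text)
    return usage


def strip_run_fonts(document_xml):
    """Return the XML with every per-run w:rFonts element removed."""
    root = ET.fromstring(document_xml)
    for rpr in list(root.iter(W + "rPr")):
        for fonts in rpr.findall(W + "rFonts"):
            rpr.remove(fonts)
    return ET.tostring(root, encoding="unicode")


# Hypothetical fragment: the accented letter sits in its own run with a
# different font, which is the pattern that shows up as a tag pair.
SAMPLE = (
    '<w:document xmlns:w="%s"><w:body><w:p>'
    '<w:r><w:rPr><w:rFonts w:ascii="Arial" w:hAnsi="Arial"/></w:rPr>'
    '<w:t>Cr</w:t></w:r>'
    '<w:r><w:rPr><w:rFonts w:ascii="Arial Unicode MS" '
    'w:hAnsi="Arial Unicode MS"/></w:rPr><w:t>\u00e9</w:t></w:r>'
    '<w:r><w:rPr><w:rFonts w:ascii="Arial" w:hAnsi="Arial"/></w:rPr>'
    '<w:t>ation</w:t></w:r>'
    '</w:p></w:body></w:document>' % W_NS
)

if __name__ == "__main__":
    print(font_usage(SAMPLE))
    print(font_usage(strip_run_fonts(SAMPLE)))
```

To run this on a real file, the XML could be read with the standard `zipfile` module, e.g. `zipfile.ZipFile("file.docx").read("word/document.xml")`. A second font that only covers a handful of characters is a strong hint that every accented letter will come in wrapped in tags.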


 
David Turner
David Turner  Identity Verified
Local time: 20:03
French to English
+ ...
As correctly surmised by esperantisto... Apr 29, 2014

Thomas Carey wrote:
At first, I thought it was due to the document being a converted .pdf, but I've been told it is actually an original Word file. Now, nearly every time I receive a document from this client, I have these tag problems.
Tom


... the document is almost certainly a converted PDF if Arial Unicode MS is used for accents. PDFs really get in the way of agencies applying "Trados discounts" so some of them aren't too keen to pass on this information to translators in case they baulk at the idea ("no discounts for PDFs"). They are desperate to convert them and as a result there will usually be a whole host of other formatting and layout problems to be fixed before you can start the translation.


 
Thomas Carey
Thomas Carey  Identity Verified
Local time: 20:03
French to English
+ ...
TOPIC STARTER
Definitely Word Apr 29, 2014

David Turner wrote:

... the document is almost certainly a converted PDF if Arial Unicode MS is used for accents.


Hi, Thanks for the info.

Not this time. The sender confirmed the document was created in Word.


 
Sofia Costa
Sofia Costa  Identity Verified
Portugal
Local time: 19:03
English to Portuguese
+ ...
Umlauted letters between tags in MemoQ May 29, 2015

Thank you all.

Your comments helped me a lot.


 

