Fine Reader and Wordfast
Thread poster: teddd76

teddd76
Local time: 22:00
English to French
Nov 17, 2008

Hi,
I just bought Fine Reader 9. I used it this morning to convert a PDF file to Word. The conversion went OK and I was able to translate the document with Wordfast. However, when I tried to clean it up I got the following message: “Failure segmented document, analysis was dropped”. I suspect it has to do with the fact that the document was originally a PDF file. Do you have any idea what happened and how I could avoid this in the future? Thanks in advance for your help!


 

Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 23:00
Member (2008)
English to Russian
only manual work is of value Nov 17, 2008

Never let FR do anything in Automatic mode!

Segment the pages yourself. Only this way can you be sure the formatting is OK. After you recognise the text, check and adjust the formatting in Word (don't forget to switch on "show non-printing characters" to see the formatting marks).

Only then start translating. It is better to spend an hour preparing the document than to be stuck at the end, not knowing what to do.

FR is good at recognising, but PDF is not an ordinary text format, and the program does not process it as a scanned page. It tries to extract the text by copy-pasting where possible. (The proof is the error you get when you try to recognise a file with content-extraction protection on. If FR simply rendered the page to a bitmap and recognised that, rather than copying and pasting, it would not produce error messages.)

Any automation still requires good manual input.


 

teddd76
Local time: 22:00
English to French
TOPIC STARTER
Thank you! Nov 17, 2008

Thank you Sergei!

Just another (dumb) question: how do I segment pages myself? I've just bought FR and the only mode I know is "automatic conversion"!


 

Claire Cox
United Kingdom
Local time: 21:00
French to English
Have you added Fine Reader to your Word add-ins? Nov 17, 2008

Just a thought: I know that if you allow ABBYY to be installed as part of the Word set-up (which is the default option unless you do a custom install), it can mess up the settings for Wordfast big time! If Fine Reader is shown as an add-in under Templates and Add-ins, uninstall it and reinstall it without letting it become part of Word's set-up, and you should avoid conflicts with the Wordfast template. Apologies if you've already done this, but it just struck me as something worth checking.

Best of luck!

[Edited at 2008-11-17 23:04 GMT]


 

teddd76
Local time: 22:00
English to French
TOPIC STARTER
Thanks Nov 18, 2008

Ok I'll do that! Thanks for your help Claire and Sergei!

 

Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 23:00
Member (2008)
English to Russian
there are tools Nov 18, 2008

teddd76 wrote:
how do I segment pages myself?


There are many toolbars. Use the Image toolbar to draw the rectangular areas for OCR. You can change the shape of an area and add or cut space. Mark each area as text, an image or a table; FR will handle it differently depending on the type.

Play with the tools. Customize them. There are many more buttons than on the default toolbar.


 

Roman Bulkiewicz  Identity Verified
Poland
Local time: 22:00
Member (2004)
English to Ukrainian
segmenting and segmenting Nov 18, 2008

teddd76 wrote:
However, when I tried to clean it I got the following message: “Failure segmented document, analysis was dropped”. I suspect it’s got to do with the fact the document was originally a PDF file.


The "segmentation" referred to in WF's message is WF's own segmentation, not the segmentation you apply in FR for text recognition/conversion. The two have nothing in common. FR's segmentation determines how the various pieces of text are arranged relative to each other and thus may affect the document's formatting, but it should not leave any traces in the converted document that might interfere with WF's segmentation.

On the other hand, conversion from PDF does leave traces in the resulting document (regardless of the segmentation), and these are known to cause trouble when the document is opened in TagEditor. I've never had any problems with such documents in WF, though. Also, if the conversion were the cause, I would expect the WF segmentation problems to occur while you are translating. But if you completed the document successfully and only then WF could not clean it, that probably means the segmentation got messed up in the course of the translation. Did you check it? What Claire said makes sense, too.
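For anyone who wants to check the segmentation before running clean-up, here is a minimal sketch in Python. It assumes the bilingual document uses the Trados-compatible delimiters that Wordfast Classic writes around each segment ({0>source<}NN{>target<0}, where NN is the match percentage) and that the document has first been saved as plain text. The name find_broken_delimiters is made up for this illustration and is not part of Wordfast. The script only reports delimiters that appear out of order; it does not repair anything.

```python
import re

# Assumed delimiter layout: {0> source <}NN{> target <0}
MARKER = re.compile(r"\{0>|<\}\d{1,3}\{>|<0\}")

# After each kind of delimiter, the kind that must come next.
NEXT = {"open": "mid", "mid": "close", "close": "open"}


def classify(marker):
    """Map a delimiter string to 'open', 'mid' or 'close'."""
    if marker == "{0>":
        return "open"
    if marker == "<0}":
        return "close"
    return "mid"


def find_broken_delimiters(text):
    """Return (offset, delimiter, expected_kind) for each out-of-order delimiter."""
    problems = []
    expected = "open"
    for match in MARKER.finditer(text):
        kind = classify(match.group())
        if kind != expected:
            problems.append((match.start(), match.group(), expected))
        # Resynchronise on whatever delimiter was actually found.
        expected = NEXT[kind]
    if expected != "open":
        # The text ended in the middle of a segment.
        problems.append((len(text), "<end of text>", expected))
    return problems


if __name__ == "__main__":
    sample = "{0>Hello<}0{>Bonjour {0>World<}100{>Monde<0}"
    for offset, marker, expected in find_broken_delimiters(sample):
        print(f"offset {offset}: found {marker!r}, expected a {expected!r} delimiter")
```

In the sample, the first segment never gets its closing delimiter, so the second opening delimiter is flagged. A damaged or deleted delimiter of this kind is one plausible reason for clean-up to fail on a document that translated without trouble.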


 

