Help! Text boxes into text?
Thread poster: Eleni Makantani
Eleni Makantani
Greece
Local time: 17:33
English to Greek
Apr 23, 2012

Hello everyone,

I am working on an MS Word file that is highly repetitive, so I would very much like to process it with Trados.

The only problem is that all the text in that file is in text boxes (apparently, it was a PDF file converted to Word by the client using OCR - the PDF is not available), which makes it extremely hard to work with Word+Trados. It cannot be handled in TagEditor either, as there appear to be a million tags, even in the middle of single words.

The question is: do you know any way to convert text boxes into plain text without losing their formatting or content?

Many thanks for any answer!


 
Jean Lachaud
United States
Local time: 10:33
English to French
Try Werecat
Apr 23, 2012

http://www.volny.cz/ddaduc/werecat.html

Please read the warnings carefully.

I use Werecat with Wordfast, so if it does actually work with your version of Word, it ought to work with Trados too, I suppose.

FWIW, I used Werecat recently in Word 2010/Win 7, and it worked as usual, but it involved a limited number of text boxes.


 
Tony M
France
Local time: 16:33
Member
French to English
SITE LOCALIZER
Werecat
Apr 23, 2012

PDF OCR to DOC is a pain when it uses text boxes to 're-create' the formatting! Much better to turn off that option at the time of conversion, but of course, by the time we get to see it, it's too late.

Werecat is a very helpful little utility, downloadable freeware, which was originally designed for Wordfast users. It extracts the text from text boxes (with tags) into a Word .DOC that you can then translate as normal, and then it will put the text back into the right places for you!
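
To give a rough idea of what "extracting text from text boxes" involves, here is a minimal Python sketch. It is not Werecat's code and not how Werecat works internally. It assumes the file has first been saved as .docx (the newer XML-based format, not the binary .DOC discussed in this thread), and the function name `textbox_paragraphs` is made up for illustration. It only reads the text; it does not put anything back.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used in word/document.xml inside a .docx file.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def textbox_paragraphs(docx):
    """Return one list of paragraph strings per text box found in a .docx.

    `docx` may be a file path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    boxes = []
    # Text box contents are stored in <w:txbxContent> elements.
    for content in root.iter(W + "txbxContent"):
        paragraphs = []
        for para in content.iter(W + "p"):
            # A paragraph is split into runs; join the text of all runs.
            paragraphs.append("".join(t.text or "" for t in para.iter(W + "t")))
        boxes.append(paragraphs)
    return boxes
```

Note that Word frequently stores the same text box twice in the XML (once as a drawing, once as a fallback shape), so a real document may return each box more than once; this sketch makes no attempt to remove such duplicates.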

There are some provisos: you mustn't either add or remove any hard returns, otherwise this messes up the re-insertion; if your translation makes this unavoidable, then you MUST repair them after cleaning and before re-insertion.
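
A quick way to check the hard-return proviso before re-insertion is to compare the number of paragraph breaks in the source and in the translation. The sketch below is only an illustration under assumptions: it supposes both versions of the extracted text have been saved as plain text in which each hard return is one line break, and the function names are invented here, not part of Werecat, Wordfast or Trados.

```python
def count_hard_returns(text):
    """Count line breaks, treating Windows, Unix and old Mac endings alike."""
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.count("\n")


def hard_return_report(source_text, target_text):
    """Compare hard-return counts of the source and the translated text."""
    source_count = count_hard_returns(source_text)
    target_count = count_hard_returns(target_text)
    return {
        "source": source_count,
        "target": target_count,
        "ok": source_count == target_count,
    }
```

If `ok` is false, a hard return was added or removed somewhere in the translation and needs repairing before the text goes back into the boxes. Matching counts do not prove the breaks are in the right places, only that none went missing or crept in.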

Otherwise, it works superbly well for .DOC and .PPT files (at least up to Office XP; I don't know about the latest versions...)

If you are not sure of yourself, feel free to send me the files and I'll pre- and post-process them for you.


 
Eleni Makantani
Greece
Local time: 17:33
English to Greek
TOPIC STARTER
Thanks to both of you
Apr 23, 2012

Thank you for your answers, I will certainly try out Werecat to see how it works. I also very much appreciate Tony's offer to help. In the meantime, I found my way around the problem:

I converted the problematic Word file back into PDF using the doPDF freeware, and then I OCR-ed it again, taking care to avoid text boxes. Inconvenient as it may sound, this procedure worked wonders! I guess that a cool head and imagination are first-rate assets in our line of business...

Thank you again!

[Edited at 2012-04-23 21:53 GMT]


 
Maria Ramon
United States
Local time: 09:33
Dutch to English
Wordfast PRO
Apr 24, 2012

Wordfast PRO works wonders when there are text boxes in Word documents.
That is what I would recommend using.


 
Sergei Leshchinsky
Ukraine
Local time: 17:33
Member (2008)
English to Russian
Try a smarter PDF -> DOC converter,
Apr 24, 2012

... if you have the source PDF file.

(Try SolidDocuments PDFtoWord.)

[Edited at 2012-04-24 07:01 GMT]


 

