Translation from scanned images Thread poster: Tatsuhiro Sugihira
|
Hi guys, I have been asked to quote for a Japanese to English translation of a scanned copy of a manual-type document. I have given my rate, but they seem to be asking for the total amount. I tried to convert the original Japanese PDF file into a Word file using Nitro Pro 8 (which has a feature to convert PDF files into other file formats), but was unsuccessful. (It doesn't recognize Japanese, I think.) It seems that a manual word count or some kind of calculation using a formula is required. So, assuming the translation rate is USD 0.10 per source Japanese character, how much should I charge? Also, assume that I might ultimately have to type out the Japanese text from the scanned PDF files first, then translate the text into English. Additionally, the original file includes different font sizes, different text orientations, charts with words, etc. Please tell me your opinion or experience of working on a similar assignment. Thank you in advance for any comments. -Tatsu | | | You shouldn't have to type... | Aug 16, 2013 |
...all of it: for Optical Character Recognition I used the free http://en.wikipedia.org/wiki/Nuance_PDF_Reader for a number of years, in which opened PDFs are (were?) uploaded to their web site and then emailed back as doc etc. Now I'm using their PDF Converter 8 (50 USD), which is offline and therefore maybe a safer alternative. Also, the free way is (was?) sometimes slower when a lot of people are using it or there's a lot of volume in one file. I can't remember how good the results for Japanese were in the free Reader, but in the paid version they've been good most times, and only really bad when the PDF was almost illegible (which too often is the case with stuff from Japanese clients, bless their hearts...). Or you can set a price per PDF page, of course. But that's still going to be a less accurate estimate than OCR plus cleaning and estimating. Either way, best of luck, ganbare! Mårten
[Edited at 2013-08-16 00:23 GMT] | | |
I think if the PDF was made from a JPG, then OCR may not work. | | | Tatsuhiro Sugihira United States Local time: 08:28 English to Japanese + ... THREAD STARTER Trying OCR, so far no success | Aug 16, 2013 |
Marten, thank you for mentioning the OCR process. I didn't know about it and am trying it right now. So far I have tried OCR in a few programs without success. Srini, yes, you are kind of right about it. Actually, I'm not quite sure... OCR isn't working well because: -the file is in Japanese (a non-alphabetic language) -the PDF is based on scanned images. I'm downloading an OCR tool that specializes (or at least claims to) in East Asian languages. If this doesn't work... then I might have to go the manual way... =/ | |
|
|
Elina Sellgren Finland Local time: 18:28 Member (2013) English to Finnish + ...
I have 'highlighted' text in PDF files (with the mouse), then copied and pasted it into Word. All the formatting disappears, but you should be able to get the word count that way, if that's the main thing you need. Not sure how well it works with Japanese characters though. | | |
I try to charge more for PDF originals, although I do not work in Chinese, because this usually requires additional work of one kind or another. Or I ask the client to supply an editable version. Best, Branka | | | the worst... | Aug 16, 2013 |
Hi, I often get this kind of document as well (certificates, sent by fax to the agency and afterwards emailed to me). Normally I use ABBYY FineReader Professional for converting PDF to Word, which works quite well. But none of the mentioned methods work if you have a scanned document, because the whole text is saved as one picture. You cannot copy any part of it, use the 'extract text' function or similar. So you can neither use a CAT tool nor give an exact quote. I don't think it has anything to do with the Japanese characters. But if the formatting is difficult, converting is useless most of the time anyway. The layout work afterwards is so difficult that you are faster translating in a new Word document and formatting afterwards. But here is how I quote: I simply count the words on a full page and assume the same count for the other pages, plus an additional fee for all the layout work (because this can take quite some time...). When quoting this way (which is in my favor), I always tell the client that with a Word document I could give an exact quote, maybe give a small discount on repetitions, and be much faster... They simply have to learn that we cannot work with a badly scanned PDF or fax but need the original document (which in the case of a PDF is often a PowerPoint or Word file). You could also offer to invoice on the target word count, but remember to add your layout work to the price...
Kind regards, Sandra | | | Samuel Murray Netherlands Local time: 17:28 Member (2006) English to Afrikaans + ... Use the old-fashioned count method | Aug 16, 2013 |
Tatsu02 wrote: "I have given my rate quote, but they seem to be asking for the total amount." Take a few random lines, count the number of characters in them, then multiply the average characters per line by the number of lines. The price will be slightly inflated, but you can always offer a discount in the end, if you feel guilty about it. | |
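For anyone who wants to see the count method above as arithmetic, here is a rough Python sketch. It is only an illustration under assumptions: the function name `estimate_quote`, the five sample line counts and the 1,200-line total are all made up, and the USD 0.10 per source character is the rate assumed in the original question.

```python
from statistics import mean


def estimate_quote(sample_line_counts, total_lines, rate_per_char):
    """Estimate total characters and price from a few hand-counted lines.

    sample_line_counts: characters counted in a handful of random lines
    total_lines:        number of text lines in the whole document
    rate_per_char:      price per source character
    """
    if not sample_line_counts:
        raise ValueError("count at least one sample line")
    # Average characters per line, taken from the sample only.
    avg_chars_per_line = mean(sample_line_counts)
    # Extrapolate the sample average to the whole document.
    estimated_chars = round(avg_chars_per_line * total_lines)
    return estimated_chars, round(estimated_chars * rate_per_char, 2)


# Hypothetical figures: five counted lines, 1,200 lines in the manual.
chars, price = estimate_quote([38, 41, 35, 40, 36], total_lines=1200, rate_per_char=0.10)
print(f"about {chars} characters, roughly USD {price:.2f}")
```

With these made-up numbers the sample averages 38 characters per line, so 1,200 lines come to about 45,600 characters, or roughly USD 4,560. Any surcharge for layout or retyping would go on top of that, and counting whole pages instead of lines works the same way.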
|
|
Just make sure you're paid for your time | Aug 16, 2013 |
Tatsu02 wrote: "ultimately I might have to type Japanese text first from viewing scanned pdf files first, then translate the text into English." Surely you'll just read the phrases in Japanese and type them in the target language if you can't convert it, won't you? As someone else mentioned, CAT tools are probably going to be useless on this job even if the file can be converted. So I can't personally see any circumstances where this would be necessary, or even advisable, but if you do it, make sure the client knows about it and pays for the time taken. "Additionally, the original file includes, different font sizes, different text orientation, charts with words and etc etc..." Does the client need all that formatting? That's going to take time if you're knowledgeable about these things, and lots of time if you aren't. Ask the client what their requirements are, bearing in mind that perfection will cost extra. If it's an agency, you can bet your bottom dollar that they are charging their client more for that formatting! Remember, there's absolutely no point in working for half your normal hourly rate simply because the client wants something complicated that you can't deliver. The client must pay you correctly or go elsewhere. You're better off refusing the job and spending the free time researching how to deal with the next similar request (as you're doing here), so that next time you can approach the job differently. Sometimes, we just have to say "No". No client would ever pay me what it would take for me to deal with this type of job (in my pair, of course!), so I just politely refuse such jobs. Somewhere out there, there will be someone who can reconstruct that document in a flash, using all sorts of IT tricks that I know nothing about, and will charge their normal per-word rate plus 5% or so for formatting. They're welcome to the job. | | | Target word rate would also work | Aug 16, 2013 |
In cases like this, I use a target word rate (typically 1.5x source word rate for ZH-->EN) and impose a "premium" for the extra work of OCR-ing, etc. | | | Tatsuhiro Sugihira United States Local time: 08:28 English to Japanese + ... THREAD STARTER Tried all these OCR programs | Aug 16, 2013 |
ABBYY FineReader came closest to success after trying all these OCR programs. (It's discounted right now too. =D) Its OCR accuracy is around 60-70%, I think, which I don't really like because I have to go back and forth to check whether the OCR is correct. The reason I would like to turn the PDF into a proper Word file first is that a CAT tool might be beneficial on this project, considering it's technical (with repeated terms) and fairly large in volume. I'm just thinking about the benefit to the client of using accurate and consistent words. Another benefit would be for any future translation review (which is also for the client). Well, I decided to provide the total project fee based on a per-page rate, and also explained what work will be done and what the final product will be. Thanks guys for all your posts! =] | | | Manual and semi-manual solutions | Aug 16, 2013 |
You can always just type, whether or not you tell the client. Obviously, you can't type a long text in time to offer a reasonably quick quotation. Samuel's solution based on average counts per line, page etc. is also good, especially if you can find some standard formula to rely on for credibility. Alternatively, you can simply tell the client that the alternative is manually counting the words, so you suggest this or that method of approximation. Almost all clients should be reasonable and understand, and you really don't need to go to great lengths to avoid any remote possibility of charging a cent or two too much. Remember the approximate solution is just that, an approximation, and one that aims to make life easier by skipping the full manual count. So don't make it difficult. Also, Sandra may be right that simply counting the stuff manually may be less time-consuming than finding sophisticated ways around the problem. Sometimes it really takes less time to do the footwork than to avoid it. Also, yeah, target count. I use target count in such situations. So do my agencies. There are some people who don't really understand this, but they'd normally realise that they aren't experts, so they shouldn't be too difficult to deal with. If they are, well, just put your foot down. You're the pro there. Oh, and avoid the kind of OCR that's more trouble than it's worth. If OCR increases your workload instead of reducing it, dump the OCR.
Also, you could probably hire a student for typing if you need to. Get yourself a walk in the sunshine in the meantime. | | |