Do you know about any software tool to count the number of identical segments in a document?
Thread starter: Rafał Kotlicki
Rafał Kotlicki Poland Local time: 22:45 English to Polish + ...
May 7, 2009
Hello,
Do you know of any tool that can count the number of identical segments in a document? I have a text of more than 40,000 entries with a lot of repetitions, and I need to calculate the actual number of unique entries, not the total. Thank you in advance!
Rafał
[Subject edited by staff or moderator 2009-05-07 13:07 GMT]
Heinrich Pesch Finland Local time: 23:45 Member 2003 Finnish to German + ...
Any CAT-tool
May 7, 2009
Run an analysis with Trados Workbench or Wordfast and it will show the percentage of repetitions. But do you need to know how many times a specific segment is repeated? That would be difficult.
Regards
Heinrich
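If per-segment counts are what is needed, they are straightforward to get outside a CAT tool once the segments are available as plain strings. The following is a minimal sketch, not a feature of Trados or Wordfast; the function name `repetition_counts` and the sample segments are made up for illustration, and segments are compared as exact strings.

```python
from collections import Counter


def repetition_counts(segments):
    """Return {segment: occurrences} for segments that occur more than once."""
    counts = Counter(segments)
    return {seg: n for seg, n in counts.items() if n > 1}


# Hypothetical sample data; real input would be one string per segment.
segments = [
    "Press OK.",
    "Press Cancel.",
    "Press OK.",
    "Status: ready",
    "Press OK.",
]

repeated = repetition_counts(segments)

# List repeated segments, most frequent first.
for seg, n in sorted(repeated.items(), key=lambda kv: -kv[1]):
    print(n, seg)
```

Exact string comparison means that segments differing only in case or spacing count as different.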
Gerard de Noord France Local time: 22:45 Member 2003 English to Dutch + ...
Wordfast?
May 7, 2009
Hi Rafał,
A Wordfast analysis reports, for example, the following for each document and for all documents combined.
Regards,
Gerard
Number of files: 6. Totals:

Analogy       segments   words   char.     %
---------------------------------------------------------
Repetitions        240    2292   15286   39%
100%                78     134    1019    2%
95%-99%              3       4      31    0%
85%-94%              2      30     219    1%
75%-84%             14      53     378    1%
00%-74%            315    3426   23243   58%
Total              652    5939   40176
=========================================================
Note: The character count includes spaces.
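For anyone reading such a report, a quick consistency check may help. The sketch below copies the figures from the table above and assumes that the % column is each row's share of the total word count, rounded to a whole percent. That assumption is not stated in the report itself, but it fits the numbers shown.

```python
# Figures copied from the analysis above: (segments, words, characters, reported %).
rows = {
    "Repetitions": (240, 2292, 15286, 39),
    "100%": (78, 134, 1019, 2),
    "95%-99%": (3, 4, 31, 0),
    "85%-94%": (2, 30, 219, 1),
    "75%-84%": (14, 53, 378, 1),
    "00%-74%": (315, 3426, 23243, 58),
}

total_segments = sum(r[0] for r in rows.values())
total_words = sum(r[1] for r in rows.values())
total_chars = sum(r[2] for r in rows.values())
print(total_segments, total_words, total_chars)

# Assumption: the % column is the row's word count as a share of all words.
computed = {name: round(100 * r[1] / total_words) for name, r in rows.items()}
for name, r in rows.items():
    print(f"{name:<12} reported {r[3]:>2}%  computed {computed[name]:>2}%")
```

Under that assumption the column totals and every percentage agree with the report.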
Adam Łobatiuk Poland Local time: 22:45 Member 2009 English to Polish + ...
Excel would work too
May 7, 2009
If you need to calculate unique segments, you can paste the content into Excel.
In Word, replace all full stops with ".^p", all colons with ":^p" and all tabs (^t) with just the paragraph symbol (^p). Remove all empty paragraphs, and paste everything into Excel. This should give you all segments in separate rows. Now go to Data - Filter - Advanced filter, and check the Only unique records checkbox.
You can then paste the filtered rows back into Word to get the word count of the unique segments only.
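The same steps can also be scripted. The following is a rough Python sketch of the procedure, not an exact replica of what Word and Excel do: the function names and the sample text are invented for illustration, and segments are compared as exact strings, which may not match Excel's handling of case.

```python
import re


def split_segments(text):
    """Split text into segments roughly as in the Word steps above."""
    # Tabs become paragraph breaks (^t replaced with ^p).
    text = text.replace("\t", "\n")
    # Add a paragraph break after every full stop and colon.
    text = re.sub(r"([.:])", r"\1\n", text)
    # Trim surrounding spaces and drop empty paragraphs.
    return [line.strip() for line in text.split("\n") if line.strip()]


def unique_segments(segments):
    """Keep the first occurrence of each segment, preserving order."""
    return list(dict.fromkeys(segments))


# Hypothetical sample text.
sample = "Press OK. Press OK.\tStatus: ready\nPress Cancel."

segments = split_segments(sample)
unique = unique_segments(segments)

print(len(segments), "segments in total,", len(unique), "unique")
print(sum(len(s.split()) for s in unique), "words in unique segments")
```

As with the Word replacements, splitting at every full stop also breaks abbreviations and decimal numbers into separate segments, so the result is an approximation.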
Rafał Kotlicki Poland Local time: 22:45 English to Polish + ...
THREAD STARTER
Excel works
May 8, 2009
Thank you all for your replies. Yes, I needed to calculate the exact number of unique segments. Adam, your way does the trick most of the time. Thanks!