Do you know about any software tool to count the number of identical segments in a document?
Thread poster: Rafał Kotlicki

Rafał Kotlicki
Poland
Local time: 22:45
English to Polish
May 7, 2009

Hello,

Do you know of any tool that can count the number of identical segments in a document? I have a text with over 40,000 entries and a lot of repetitions, and I need to calculate the actual (not total) number of entries, that is, how many of them are unique. Thank you in advance!

Rafał

[Subject edited by staff or moderator 2009-05-07 13:07 GMT]


 

Heinrich Pesch
Finland
Local time: 23:45
Member (2003)
Finnish to German
Any CAT-tool May 7, 2009

Run an analysis with Trados Workbench or Wordfast and it will show the percentage of repetitions. But do you need to know how many times a specific segment is repeated? That would be difficult.
Regards
Heinrich


 

Gerard de Noord
France
Local time: 22:45
Member (2003)
English to Dutch
Wordfast? May 7, 2009

Hi Rafał,

A Wordfast analysis reports, for example, the following for each document and for all documents combined.

Regards,
Gerard


Number of files: 6. Totals:

Analogy       segments   words   char.     %
--------------------------------------------
Repetitions        240    2292   15286   39%
100%                78     134    1019    2%
95%-99%              3       4      31    0%
85%-94%              2      30     219    1%
75%-84%             14      53     378    1%
00%-74%            315    3426   23243   58%
Total              652    5939   40176
============================================
Note: The character count includes spaces.


 

Adam Łobatiuk
Poland
Local time: 22:45
Member (2009)
English to Polish
Excel would work too May 7, 2009

If you need to calculate unique segments, you can paste the content into Excel.

In Word, use Find and Replace to put each segment on its own line: replace all full stops with ".^p", all colons with ":^p" and all tabs (^t) with just the paragraph symbol (^p). Remove all empty paragraphs, and paste everything into Excel. This should give you one segment per row. Now go to Data - Filter - Advanced Filter, and tick the "Unique records only" checkbox.

You can then paste the filtered list back into Word to get the word count of the unique segments.
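
For anyone who would rather script this than go through Word and Excel, the same idea can be sketched in Python. This is only a rough sketch, assuming the document has been saved as plain text and that a segment ends at every full stop, colon, tab and line break, as in the Word replacements described here. The names `split_segments` and `count_segments` are made up for illustration, and a CAT tool's own segmentation rules (abbreviations, decimal numbers and so on) will give somewhat different counts.

```python
from collections import Counter


def split_segments(text):
    """Split plain text into segments, one per full stop, colon, tab or line break."""
    # Same replacements as in Word: "." -> ".^p", ":" -> ":^p", tab -> "^p".
    text = text.replace(".", ".\n").replace(":", ":\n").replace("\t", "\n")
    # Drop empty paragraphs and trim surrounding whitespace.
    return [line.strip() for line in text.splitlines() if line.strip()]


def count_segments(text):
    """Map each distinct segment to the number of times it occurs."""
    return Counter(split_segments(text))


if __name__ == "__main__":
    # Made-up sample text, just to show the calls.
    sample = "Press OK. Press Cancel.\nPress OK.\tFile: Open\nPress OK."
    counts = count_segments(sample)
    print("Total segments: ", sum(counts.values()))  # 6
    print("Unique segments:", len(counts))           # 4
    print("Most repeated:  ", counts.most_common(1))
```

`len(counts)` corresponds to what the Excel filter leaves behind, and `counts.most_common()` also shows how many times each individual segment is repeated.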


 

Rafał Kotlicki
Poland
Local time: 22:45
English to Polish
THREAD STARTER
Excel works May 8, 2009

Thank you all for your replies. Yes, I needed to calculate the exact number of unique segments. Adam, your way does the trick most of the time. Thanks!

 

