Lift technology - is it on its way?
Thread poster: Wojciech_ (X)
Wojciech_ (X)
Poland
Local time: 13:49
English to Polish
+ ...
Aug 21, 2015

A question to SDL specialists here. I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift and from what I understood with its help a translator will be able to retrieve numerous subsegment matches from their TM together with their translation (MemoQ offers something similar called "Guess translation" when using Concordance, but it's very inaccurate).

I have learnt that SDL has acquired the technology and my question is - will it be available soon in the next incarnation of Studio? I have seen a video where the technology is already implemented into one of the versions of Studio and the shown results were truly impressive.

Thank you.


 
RWS Community
United Kingdom
Local time: 13:49
English
It has been... Aug 21, 2015

pro-lingua wrote:

A question to SDL specialists here. I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift and from what I understood with its help a translator will be able to retrieve numerous subsegment matches from their TM together with their translation (MemoQ offers something similar called "Guess translation" when using Concordance, but it's very inaccurate).

I have learnt that SDL has acquired the technology and my question is - will it be available soon in the next incarnation of Studio? I have seen a video where the technology is already implemented into one of the versions of Studio and the shown results were truly impressive.

Thank you.


... in Studio 2015. It is very cool and returns results from TM lookups (100% and Fuzzy) as well as concordance chunks. So when this is all combined with standard AutoSuggest Dictionaries, and even Machine Translation AutoSuggest and Regex AutoSuggest you have a very impressive resource at your fingertips.

It also means you can start with an empty TM and you get results immediately based on what's in your TM, and this is something many users wanted for a long time. The AutoSuggest Dictionaries are very focussed and still provide excellent suggestions but this does add to the overall solution.

Regards

Paul
SDL Community Support


 
Roy Oestensen  Identity Verified
Denmark
Local time: 13:49
Member (2010)
English to Norwegian (Bokmal)
+ ...
Same as Deep Mining in Dejavu? Aug 21, 2015

I get the impression that what you describe is something similar to what Dejavu calls Deep mining, where Dejavu tries to find the best translation from the context in the TM. It sounds very good, but apparently it doesn't quite deliver what you would expect.

I am not convinced by a good presentation, which may show a special circumstance where it works well. It regrettably doesn't necessarily follow that it will do so in general usage. Instead it often gives worse results than having it turned off.

So I would rather wait and see.


 
Patrick Porter
United States
Local time: 07:49
Spanish to English
+ ...
already possible...sort-of Aug 21, 2015

pro-lingua wrote:
...from what I understood with its help a translator will be able to retrieve numerous subsegment matches from their TM together with their translation...


In general terms, this is basically what statistical machine translation accomplishes. You could get this kind of result by using your TMs to train a machine translation model. In fact, I do this all the time and it ends up being a really great resource to have while I'm working. The result is like an automatic recursive concordance search. Of course, there are many different ways you could implement that, with varying results, but there are some well-developed and mature SMT tools that make it easy.

I use the Moses SMT toolkit, which is relatively simple to use (once implemented) and has an easy-to-follow user manual if you are familiar with any Linux OS. There is also an open-source project called "Casmacat Home" which has a browser-based user interface to make uploading TMs and using them to train models fairly easy. The advantage of both is that my MT models reside on my machine at all times, so I don't even need internet access to use them.

As another alternative, it seems that Microsoft now has a portal for training your own private MT engines on their servers and then accessing via the Translator API. This may be simpler to use for some people.

I would recommend trying one of these methods. Even with relatively small TMs (well...small from the MT corpus point of view...like 50,000 segments), my results have been very effective as a resource. I mean...if you are realistic about the quality of the output, i.e. just looking for a way to help you automatically look up previously translated shorter subsegments, and not really looking to make your own full-fledged general MT engine, then an SMT tool works really well.

There is even a way to set up Moses so that you can update the models with every segment you translate, so it sort of learns as you work, without having to re-train the model every time you have new data. Right now I'm using this method in Trados Studio with a prototype plugin I've developed. My plan is to release this plugin as an open-source project in the next few weeks, so if you are interested, watch my GitHub profile for an update on this. I'm also thinking of making some videos/tutorials about this topic.

[Edited at 2015-08-21 12:48 GMT]


 
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:49
Member (2009)
Dutch to English
+ ...
Agree with Roy Aug 21, 2015

I have to see it to believe it, although it does sound interesting, and I have heard good things about this "Lift" thing.

Roy mentioned that it sounds like the Deep Miner in Déjà Vu. CafeTran has its own version, simply called "auto-assembly", which a few users love, and most of us leave switched off. It really depends on how carefully you prepare the resources it feeds off of, and of course, the type of work you do (preferably highly standardised, repetitive, consisting of small chunks, etc).

Michael


 
Meta Arkadia
Local time: 18:49
English to Indonesian
+ ...
Hits Aug 21, 2015

Michael Beijer wrote:
Roy mentioned that it sounds like the Deep Miner in Déjà Vu. CafeTran has its own version, simply called "auto-assembly"

Nope. It used to be called "Subsegment Matching" and is now "Hits." I have no idea how many CafeTran users use AA and/or Hits. I use AA extensively, but usually disable Hits, because the latter costs too much time and triggers too many false positives.

Hans


 
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:49
Member (2009)
Dutch to English
+ ...
@Hans: Aug 21, 2015

Meta Arkadia wrote:

Michael Beijer wrote:
Roy mentioned that it sounds like the Deep Miner in Déjà Vu. CafeTran has its own version, simply called "auto-assembly"

Nope. It used to be called "Subsegment Matching" and is now "Hits." I have no idea how many CafeTran users use AA and/or Hits. I use AA extensively, but usually disable Hits, because the latter costs too much time and triggers too many false positives.

Hans


Not sure if calling it merely "Hits" does it justice, but I see what you mean:

Both DVX and CT can auto-assemble stuff.

(1) In DVX, you can switch on "DeepMiner" to assist this process.

(2) The CafeTran counterpart is enabling "Fuzzy & Hits" in the "Matching type" drop-down menu in the TM settings.

Both DeepMiner and "Hits" involve the CAT tool trying to guess stuff.


 
Meta Arkadia
Local time: 18:49
English to Indonesian
+ ...
Explanation Aug 21, 2015

I'll try to explain the process(es), not to try to be the great educator, but more to see if I understand it myself. Which is why it's very simplistic, out of necessity. So please correct me if I'm wrong.

A CAT tool will first look if there are exact matches for the segment. If there aren't, some tools will try to find close matches, supplemented by matches in the termbase(s). This is still on segment level, and is usually called Auto-Assembly, either inserted, or not. If there are still missing parts, it will look for them in other segments of the TMs, thereby "leaving" the segment. Which is why it can take a lot of time, especially in the case of large and/or multiple TMs. This is called "Dynamic TM Analysis," or Deep Miner in DejaVu and Subsegment Matching (now Hits) in CafeTran. I like the term "Lift" because the process "lifts" a part of another segment, and drops it in the current one.
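
To make the order of those steps concrete, here is a minimal Python sketch of such a lookup cascade. It is only an illustration under my own assumptions: the function names, the 0.75 threshold, the character-based similarity score and the naive whitespace tokenisation are all hypothetical, and no actual CAT tool is claimed to work this way.

```python
from difflib import SequenceMatcher


def ngrams(words, n):
    """Return all runs of n consecutive words as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def lookup(segment, tm, fuzzy_threshold=0.75, min_words=2):
    """Try exact, then close, then subsegment lookup against a TM.

    tm is a list of (source, target) pairs (hypothetical sample data).
    Returns a (kind, results) tuple.
    """
    # Step 1: exact match on the whole segment.
    for source, target in tm:
        if source == segment:
            return "exact", [(source, target)]

    # Step 2: close matches, still on segment level.
    # The similarity score is an assumption, not any tool's real formula.
    close = [(s, t) for s, t in tm
             if SequenceMatcher(None, segment, s).ratio() >= fuzzy_threshold]
    if close:
        return "fuzzy", close

    # Step 3: "leave" the segment and look for shared word runs in
    # every TM entry, longest runs first.
    words = segment.lower().split()
    hits = []
    for n in range(len(words), min_words - 1, -1):
        for gram in ngrams(words, n):
            for source, target in tm:
                if gram in ngrams(source.lower().split(), n):
                    hits.append((" ".join(gram), source, target))
    return "subsegment", hits
```

Note that step 3 walks through every TM entry for every word run of the segment, which is consistent with the remark above that this stage can take a lot of time with large or multiple TMs.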

Related is Auto-Suggest/Auto-Complete etc., which also looks for terms in other segments (and termbases), but only inserts the match when you start typing. It doesn't Auto-Assemble.

Hans



[Edited at 2015-08-21 14:55 GMT]


 
Wojciech_ (X)
Poland
Local time: 13:49
English to Polish
+ ...
TOPIC STARTER
How I understand this. Aug 21, 2015

I actually think that subsegment matching is something different from the Assemble function, although the former can feed into the latter.

What I understand by the Assemble function (in DVX and CafeTran) is the app trying to literally assemble the whole segment from various sources (MT, TM, glossaries etc.), while subsegment matching is trying to find smaller chunks, i.e. phrases, in the user's TMs.

In the past I remember Wordfast Classic had a function wherein if there were no 100% or fuzzy matches, WF looked for phrases and the user could adjust how long (in words) the phrases were to be. This, I believe, was the predecessor of today's subsegment matching.

As I understand it, Lift searches for phrases (that would otherwise produce no match, because the rest of the sentence is different) and ALSO highlights the appropriate translation of the phrase in your TM's target language. As I mentioned before, MemoQ has the "guess the translation" function in its Concordance, but from what I could see it's very unreliable. Lift, however, is able to provide appropriate translations more frequently. Thanks to this, the target translation can be fed to you e.g. via AutoSuggest.
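
To show why guessing the target side of a phrase is the hard part, here is a rough Python sketch of one naive way to do it. To be clear, this is an assumption for illustration only: it is a crude co-occurrence heuristic (take the longest target word run shared by every TM entry whose source contains the phrase), not the method used by Lift or by MemoQ, and `guess_translation` is a made-up name.

```python
import re
from collections import Counter


def tokens(text):
    """Lower-case a string and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def guess_translation(phrase, tm, max_len=4):
    """Guess the target-side rendering of a source phrase.

    tm is a list of (source, target) pairs (hypothetical sample data).
    Simplification: the phrase is found by plain substring search.
    """
    phrase = phrase.lower()
    targets = [tokens(t) for s, t in tm if phrase in s.lower()]
    if not targets:
        return None

    # Count, per word run, in how many of the matching targets it occurs.
    counts = Counter()
    for words in targets:
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                seen.add(tuple(words[i:i + n]))
        counts.update(seen)

    # Keep runs present in every matching target; prefer the longest one
    # (ties are broken alphabetically so the result is repeatable).
    shared = [gram for gram, c in counts.items() if c == len(targets)]
    if not shared:
        return None
    return " ".join(max(shared, key=lambda g: (len(g), g)))
```

With only one matching TM entry, or with entries that happen to share other words, a heuristic like this quickly returns too much or the wrong words, which may be one reason such guesses feel unreliable in practice.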

Anyway, look at this video:
http://www.kftrans.co.uk/lift/


 
Samuel Murray  Identity Verified
Netherlands
Local time: 13:49
Member (2006)
English to Afrikaans
+ ...
WFC Aug 21, 2015

pro-lingua wrote:
In the past I remember Wordfast Classic had a function wherein if there were no 100% or fuzzy matches, WF looked for phrases and the user could adjust how long (in words) the phrases were to be. This, I believe, was the predecessor of today's subsegment matching.


I'm a long-time WFC user and I don't recall such a feature. I do recall, however, an old feature called "subfuzzy matching", wherein if the segment itself was very short (2-4 words), WFC would propose matches that it guessed contained useful words even though the match was below the fuzzy threshold. I also recall that it used to be possible to set WFC's fuzzy threshold really, really low, which would yield fuzzy matches that contained only phrase matches, but it wasn't an intelligent phrase matching service. The current version of WFC can't find matches below 50%, at all.
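
The effect of lowering the fuzzy threshold can be sketched in a few lines of Python. This is a toy illustration, assuming a word-based similarity score from Python's `difflib`; it is not how WFC computes its match percentages, and `fuzzy_matches` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def fuzzy_matches(segment, tm_sources, threshold):
    """Return (score, source) pairs at or above threshold, best first."""
    seg_words = segment.lower().split()
    scored = []
    for source in tm_sources:
        # Word-level similarity between 0.0 and 1.0 (assumed scoring).
        score = SequenceMatcher(None, seg_words, source.lower().split()).ratio()
        if score >= threshold:
            scored.append((round(score, 2), source))
    return sorted(scored, reverse=True)


tm_sources = [
    "check the oil level daily",
    "always check the oil level before starting the engine",
    "close the door",
]

# A normal threshold keeps only the near-identical segment.
print(fuzzy_matches("check the oil level", tm_sources, 0.75))
# A lower threshold also lets through a long segment that merely
# contains the phrase.
print(fuzzy_matches("check the oil level", tm_sources, 0.50))
```

At the lower threshold the second result is there only because it happens to contain the same phrase, which is the "phrase matches by accident" behaviour described above rather than intelligent phrase matching.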


[Edited at 2015-08-21 19:31 GMT]


 
Meta Arkadia
Local time: 18:49
English to Indonesian
+ ...
Find & Replace Aug 22, 2015

Meta Arkadia wrote:
...and is usually called Auto-Assembly...


Please replace all instances of "Auto-Assemble" by "Fuzzy Matches," it may make slightly more sense.

Cheers,

Hans


 
Dominique Pivard  Identity Verified
Local time: 14:49
Finnish to French
Which site? Aug 22, 2015

pro-lingua wrote:
I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift

Would you care to share this source?


 
Wojciech_ (X)
Poland
Local time: 13:49
English to Polish
+ ...
TOPIC STARTER
Link Aug 22, 2015

Dominique Pivard wrote:

pro-lingua wrote:
I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift

Would you care to share this source?


It's one of the articles listed below the video that I gave the link to in my previous post. Sorry I'm writing from my mobile, so it's slightly difficult to provide the link directly.


 
Dominique Pivard  Identity Verified
Local time: 14:49
Finnish to French
Already in 2014? Aug 22, 2015

SDL Community wrote:
It has been ... in Studio 2015.

According to this paper, it was already implemented in Studio 2014. Did 2015 bring something new (in that respect) compared to 2014?


 
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:49
Member (2009)
Dutch to English
+ ...
"… SDL has acquired the technology …" :-( Aug 22, 2015

pro-lingua wrote:

A question to SDL specialists here. I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift and from what I understood with its help a translator will be able to retrieve numerous subsegment matches from their TM together with their translation (MemoQ offers something similar called "Guess translation" when using Concordance, but it's very inaccurate).

I have learnt that SDL has acquired the technology and my question is - will it be available soon in the next incarnation of Studio? I have seen a video where the technology is already implemented into one of the versions of Studio and the shown results were truly impressive.

Thank you.


A pity. Wonder when they will gobble up memoQ, Wordfast and Across too.

SDLX + Trados
->
SDL Trados Studio
->
SDL Trados memoQ Studio
->
SDL Trados memoQ Wordfast Studio
->
SDL Trados memoQ Wordfast Across Studio


Bad for the industry.

[Edited at 2015-08-22 07:59 GMT]



 