When I studied maths as a boy, my Father, who was an engineer and very straightforward in his views, always used to say that 100% was the best you could give. It meant everything, so there was no more. Any talk of giving 101%, for example, wouldn’t be entertained for a second because you clearly hadn’t given 100% in the first place. It wasn’t possible, and anyone who said otherwise was probably in marketing or sales!
Whilst much of his influence lives on in me today, I am a little more accepting of many things, and in particular of why 100% isn’t enough, or why there’s a need for more, at least in terms of translation matching! So this article is a discussion of matching, and I hope that by the end of it you’ll understand some of the terminology we use when working with SDL Trados Studio and the effect it can have as you work.
BE WARNED… IT’S A LONG POST!
So let’s start off with matching in general and to do this we can take the following simple sentence:
The Project Manager works on location 3-days a week.
If this was translated into German it might be written like this:
Der Projektmanager arbeitet 3 Tage pro Woche vor Ort.
If I store that translation in my Translation Memory and later on come across a sentence like this:
The Project Manager works 3-days a week.
Then I might expect to get what we call a fuzzy match. This means a match that is less than 100% but more than 0%, because there is something similar about the sentence when compared to something you translated before. The measure of similarity is the match value. Studio will only go down as far as 30% because anything less is generally unusable in practice. In Studio it would look like this, and you can see I get an 84% fuzzy match because the translation should be “Der Projektmanager arbeitet 3 Tage pro Woche.” without the “vor Ort”.
If I confirm the correct translation, that is, add “Der Projektmanager arbeitet 3 Tage pro Woche.” to the translation memory, then you’ll see that I now get two matches for these sentences when I come across them again in a document:
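To make the idea of a match value a little more concrete, here is a minimal Python sketch. It is not how Studio calculates its scores (this article doesn’t describe that algorithm); it simply compares the two sentences word by word using difflib from the Python standard library, so the number it produces will not be the 84% mentioned above. The names match_value, lookup and MIN_MATCH are made up for this illustration; only the 30% floor comes from the text.

```python
from difflib import SequenceMatcher

MIN_MATCH = 30  # the article says Studio goes no lower than 30%


def match_value(new_source: str, stored_source: str) -> int:
    """Rough word-level similarity as a percentage (illustrative only)."""
    a = new_source.lower().split()
    b = stored_source.lower().split()
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def lookup(new_source: str, tm: dict) -> list:
    """Return (score, source, target) hits at or above MIN_MATCH, best first."""
    hits = [(match_value(new_source, src), src, tgt) for src, tgt in tm.items()]
    return sorted((h for h in hits if h[0] >= MIN_MATCH), reverse=True)


# A one-entry translation memory holding the sentence pair from the article.
tm = {
    "The Project Manager works on location 3-days a week.":
        "Der Projektmanager arbeitet 3 Tage pro Woche vor Ort.",
}

hits = lookup("The Project Manager works 3-days a week.", tm)
for score, source, target in hits:
    print(f"{score}% fuzzy match: {target}")
```

The point is only the shape of the idea: every stored source gets a score, anything below the floor is dropped, and the translator still has to fix the suggestion (here, removing “vor Ort”).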
Now, if my text actually said this:
Rules for James:
PROJECT MANAGER
The Project Manager works on location 3-days a week.
Then I would expect to see a 100% match for the third segment and it would be correct. But if my text said this:
Rules for Sally:
PROJECT MANAGER
The Project Manager works on location 3-days a week.
Then the 100% translation would be incorrect because here the gender changes and the translation should be:
Die Projektmanagerin arbeitet 3 Tage pro Woche vor Ort.
This is a different translation because of the context, which means that all of a sudden 100% is not enough! So to allow for some disambiguation, our development team (who must have known my Father and his views on a 101% match) decided to call it a Context Match when you also know the context of the translation. One very important point, however: to make sure you can do this without adding the segment as a new translation, so relying on the context alone, you must be quite specific about your settings. This is what I used:
So if I translate the three segments when James is the Project Manager and add this information to my Translation Memory then I might get this effect as I translate:
You can see that segments #10 and #11 are autopropagated 100% matches because they have a different context to segments #7 and #8. You know they are autopropagated because of the colour of the 100% match (this is configurable so you might see a different colour).
If I activate segment #7 (and apply the translation from the TM so the match value is added to the centre column) we have a 100% match as we’ve translated “PROJECT MANAGER” before and then we also have the same preceding segment (#6) for the source and the target. So the highlighted cells shown below match, making “PROJEKTMANAGER” a CM, or Context Match:
Similarly if we look at segment #8 we see the same pattern making this a CM as well:
If we look at segments #10 and #11, these are both 100% matches because they don’t meet the criterion of having the same translation in the preceding segment. In most other CAT tools segment #11 might still be considered their equivalent of a CM (sometimes referred to as a 101% match) because the target translation is not taken into consideration. So to handle these segments correctly for Sally we correct the translations and save them again by confirming the segments (note: I did not add them as new translations) so that we now see this:
Note that we see both possible translations in the Translation Memory Results window at the top but that the correct context is recognised and inserted automatically into the document. Similarly if I activate segment #11 I see this:
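For readers who prefer code to screenshots, the context rule used in this example can be sketched in a few lines of Python. This is only my reading of the behaviour described here (same source, plus the same source and target in the preceding segment), not Studio’s implementation. The TU class, classify and best_match are hypothetical names, and “PROJEKTMANAGERIN” as the heading for Sally is my own assumption; the two sentence translations are the ones given earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TU:
    source: str
    target: str
    prev_source: Optional[str]  # source of the preceding segment when stored
    prev_target: Optional[str]  # translation of that preceding segment


def classify(tu, source, prev_source, prev_target):
    """Return 'CM', '100%' or None for one stored TU against the active segment."""
    if tu.source != source:
        return None  # fuzzy matching is out of scope for this sketch
    same_context = (tu.prev_source, tu.prev_target) == (prev_source, prev_target)
    return "CM" if same_context else "100%"


def best_match(tm, source, prev_source, prev_target):
    """Prefer a context match over a plain 100% match."""
    rank = {"CM": 0, "100%": 1}
    hits = [(classify(tu, source, prev_source, prev_target), tu) for tu in tm]
    hits = [h for h in hits if h[0] is not None]
    return min(hits, key=lambda h: rank[h[0]], default=None)


SENTENCE = "The Project Manager works on location 3-days a week."
tm = [
    TU(SENTENCE, "Der Projektmanager arbeitet 3 Tage pro Woche vor Ort.",
       "PROJECT MANAGER", "PROJEKTMANAGER"),
    TU(SENTENCE, "Die Projektmanagerin arbeitet 3 Tage pro Woche vor Ort.",
       "PROJECT MANAGER", "PROJEKTMANAGERIN"),  # assumed heading for Sally
]

# The preceding source is identical in both cases; only its translation differs.
kind, tu = best_match(tm, SENTENCE, "PROJECT MANAGER", "PROJEKTMANAGERIN")
print(kind, tu.target)
```

Because the preceding source segment is “PROJECT MANAGER” for both James and Sally, a rule that looked at the source context alone could not tell the two stored translations apart. Including the preceding target is what separates them.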
Disambiguation between 100% matches is taken a step further by using the Document Structure Information shown in the right-hand column of Studio… so this one:
The basic idea is that when you have multiple 100% matches the Document Structure Information can be used to automatically ensure the correct 100% match value is presented first. This is far easier to explain with a video so Daniel Brockman kindly provided one for me with a very clear example of what this means:
I do have one more interesting thing to note after receiving a question from one of our resellers about why a TU (Translation Unit) is sometimes added to your TM when you work and why it is sometimes not. The reason is linked to the Context Match again, but I’d better explain the situation first.
Let’s take the following example:
If I confirm segment #1 and look in my TM I see one translation, TU #1.
If I then edit this same segment with a different translation and confirm it again, I still only have one TU, but now it’s numbered TU #2 because the database recognises this as an edited TU in the same context and replaces the first one with a completely new addition to the TM:
If I then translate segment #2 which is the same source as segment #1, but a different context, with a different translation and confirm it like this:
I will actually get a new TU like this, rather than simply overwriting the original:
So the context of your translation is very important in a number of ways: it improves the reliability of your Translation Memory leverage as you work, and it minimises the number of unnecessary TUs added to your TM. You may also have just found an explanation for why the numbering in your TM is not always consecutive throughout the file when viewed in the TM Maintenance window.
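The add-or-replace behaviour described above can be modelled with a toy Python class. Treat it as a sketch of the rule as I have explained it, not of how the TM database really works: TinyTM, confirm and the sample strings are all invented for the illustration, and “context” is reduced to a single label. The text only covers a different translation in a different context, so the sketch simply treats any new context as a new TU.

```python
from itertools import count


class TinyTM:
    """Toy translation memory keyed on (source, context)."""

    def __init__(self):
        self._next_id = count(1)
        self.units = {}  # (source, context) -> (tu_id, target)

    def confirm(self, source, target, context):
        key = (source, context)
        existing = self.units.get(key)
        if existing and existing[1] == target:
            return existing[0]  # nothing changed, keep the TU as it is
        tu_id = next(self._next_id)  # an edited or new TU takes a fresh number
        self.units[key] = (tu_id, target)  # same key replaces, new key adds
        return tu_id


tm = TinyTM()
first = tm.confirm("Hello", "Hallo", context="after heading A")
edited = tm.confirm("Hello", "Guten Tag", context="after heading A")
other = tm.confirm("Hello", "Servus", context="after heading B")

print(first, edited, other, len(tm.units))
print(sorted(tu_id for tu_id, _ in tm.units.values()))
```

After these three confirmations the TM holds two TUs rather than three, and number 1 no longer exists, which is the same kind of gap you may notice in the TM Maintenance window.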
The next part of this matching discussion should cover Perfect Match… but I think I’ll leave that for a separate article in the future… and also take it as a reminder that it’s Valentine’s Day tomorrow and I’m in big trouble!