“Not only is my short-term memory horrible, but so is my short-term memory.” I have no idea who this quote can be attributed to, and it’s certainly not original, but it is quite appropriate when I start to think about the evolution of Trados. Ever since Trados Studio was launched you can be sure to find many “experts” in places like ProZ and even the SDL Community recommending you don’t upgrade because there is no difference compared to the last version. To be fair, if you only use a fraction of the features despite having used the software for a decade, then it probably does look like this. The alternative is that these “experts” have very short-term memories.
I know there are always things some users have been asking for that still don’t make it into the software, and I know there are bugs that users have been asking to be fixed that are still not fixed. But that’s still a long way from there being nothing new at all, and more importantly it’s a very long way from there being nothing to resolve some of the biggest problems Trados users have been complaining about for a decade at least.
So I thought I’d start 2021 off with an article on some new features in Trados Studio 2021 that address one of these big problems that I can only assume has been forgotten. I also intended this to be brief as I originally wanted to make this an article on all things important about translation memories that I hadn’t covered before. After I drafted the things I wanted to cover it was definitely not brief! So I picked just one thing… but it’s still not brief at all and it’s only about one new feature in Trados Studio 2021. I’ll come back to the other things in the coming months, and in the meantime grab a coffee if I’ve still got your interest!
What is this decade-old problem?
First, here’s a reminder:
All of these articles deal with the problem of working with numbers, dates, currencies and measurements that are not recognised by Studio. I’m pretty sure by now the majority of users will know exactly what I’m referring to and even the memories of a few naysayers may have been jogged. The Trados Studio 2021 release provides a long-awaited solution to this problem out of the box. It doesn’t go as far as I’d have personally liked to see (more on this later), but nonetheless it’s a little-talked-about feature that really does do a great job of solving this problem for the vast majority of use cases.
What part of Trados Studio are we referring to?
If we think back to Trados Studio 2019 and earlier we have these features:
This was useful to some extent, if you even knew it was there: because the settings vary from language to language, you needed to navigate to the specific language pair (en(US) – es(ES) for example) as opposed to All Language Pairs to find it. This provides a few basic mechanisms for controlling how dates, times and measurements are handled in the target… but ONLY if they are recognised in the source. Hence the need for the aforementioned articles and endless explanations in the communities to help users manage texts where the source text isn’t quite so straightforward or you want to have something in the target that isn’t in the list.
Trados Studio 2021 features…
If we now take a look at Trados Studio 2021 we have something new. First of all in the language resources of your Translation Memory settings:
It’s probably a little hard to see but the additional options in your language resources relate to:
Each of these provides you with the ability to create your own custom recognisers. A recogniser is a pattern used to recognise a date or a number for example… like dd.mm.yyyy. These determine what constitutes a recognised token in the source text, and what this should look like in the target translation. A recognised token is something your translation memory knows is a number, or a date, or a currency symbol for example… so it only needs to store it once for all numbers, dates and currencies. We still have the options you had in 2019, but these have changed a little. First of all you won’t see them at all if you don’t have a TM available in the specific language pair you opened. Instead you’ll see this message.
For Dates and Times:
For Numbers and currency it’s a similar thing. The TM can be at the All Language Pairs level, but you still have to make changes to these settings at the specific language pair. Once you add your TM you’ll see the details which look like this.
For Dates and Times:
For Numbers and currency:
The options are mostly self-explanatory, and certainly the Dates and Times will look familiar. But there are some important concepts which I’m going to focus on.
First of all, the options here will change if you add your own custom recognisers. I’ll use number formats to illustrate this concept but the same principle, on the whole, applies to all of the new options here. If there are exceptions I’ll tackle them in a section further down… more apologies for the long article and a suggestion that you ready the resources for a second coffee!
For example, if I add this rather unlikely pair of separators (plus symbol for the thousands separator and a star symbol for the decimal separator) to an English source in my language resources for an en(US) – pl(PL) translation memory:
And this to the Polish target (minus symbol for the thousands separator and a backslash for the decimal separator):
I will now see my new custom recognisers in my Options or Project Settings when using this TM:
An important note here is that the new recogniser is not default, so don’t forget to check it once you’ve added it or it won’t be used. But once I do check it I now get this used in the TM and also for QuickPlace:
Note that the 2+400*65 has a blue underline in the source and it’s not broken at all which means that Studio is correctly seeing this as two thousand, four hundred point sixty five and not as separate numbers, and as a result is able to correctly autolocalise the target into the format I wanted. By “broken” I mean this:
In this image the translation memory has not recognised this as one number so I get three tokens, one for each of the three numbers Trados Studio thinks are here. The really important point about this is that since it’s properly recognised you will also find that QA checks will benefit from this too as well as your translation memory.
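To make the idea of a custom recogniser a little more concrete, here is a minimal Python sketch. It is not how Trados Studio is implemented. The names `build_number_pattern`, `find_numbers` and `localise` are made up for illustration, and the regular expression is just one simple way to model a number that has a thousands separator and a decimal separator.

```python
import re
from typing import List, Tuple

Separators = Tuple[str, str]  # (thousands separator, decimal separator)


def build_number_pattern(seps: Separators) -> "re.Pattern[str]":
    """Build a regex for numbers written with the given separators (illustrative only)."""
    thousands, decimal = (re.escape(s) for s in seps)
    # 1-3 leading digits, optional groups of 3 digits, optional fractional part
    return re.compile(rf"\d{{1,3}}(?:{thousands}\d{{3}})*(?:{decimal}\d+)?")


def find_numbers(text: str, seps: Separators) -> List[str]:
    """Return every number token the hypothetical recogniser finds in the text."""
    return build_number_pattern(seps).findall(text)


def localise(text: str, source: Separators, target: Separators) -> str:
    """Rewrite each recognised number using the target separators."""
    src_thousands, src_decimal = source
    tgt_thousands, tgt_decimal = target

    def convert(match: "re.Match[str]") -> str:
        integer, _, fraction = match.group(0).partition(src_decimal)
        integer = integer.replace(src_thousands, tgt_thousands)
        return integer + (tgt_decimal + fraction if fraction else "")

    return build_number_pattern(source).sub(convert, text)


if __name__ == "__main__":
    # One token with the custom "+" and "*" separators...
    print(find_numbers("2+400*65", ("+", "*")))
    # ...but three separate tokens with the default "," and "." separators.
    print(find_numbers("2+400*65", (",", ".")))
    print(localise("2+400*65", ("+", "*"), ("-", "\\")))
```

With the custom separators the whole of `2+400*65` is a single token and can be rewritten as `2-400\65`. With only the comma and period separators the same string falls apart into `2`, `400` and `65`, which is the "broken" behaviour described above.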
You should also note that if I enter the first segment which uses different separators I will get the same auto-localization because the comma and period used for the separators here are also recognised as they are actually default recognisers in my translation memory language resources as you can see from the images earlier on:
This brings me onto the next very important point…
For TM matches, copy the TM format…
The default settings for all the new features here will always take the formats you have used in your TM (translation memory) before, in preference to the format you checked in the image above. Since I created a brand new translation memory for the examples I have shown so far, the only option would be to use the tokens I instructed it to use. But the beauty of this option is that it will allow you to have multiple source formats handled by your TM in whatever way you have handled them before. So if I did have some other examples of how these should really be handled in my TM then I could achieve this when I pre-translate for example:
Instead of using the one I checked in my project settings Trados Studio actually used the correct context for the source, not based on it simply being a recognised number and using whatever format I instructed it to use, but identifying the formatting that I had used in previous translations and applying the same formatting for each one. In my translation results pane I will see something like this:
You can see I get a Context Match for the correct formatting, and a 100% match for the incorrect formatting. It’s a 100% match because it’s still a recognised number based on what I had translated in the past (which you can see was different to the current source content).
If I uncheck this option then Trados Studio will use the formatting in the target I instructed it to use and this would get me this sort of thing:
This time I have the hyphen and backslash separators for everything and there is no recognition of what I might have translated in the past. This behaviour pretty much mirrors the way Trados Studio always worked, but now you have more control over it and hopefully a better understanding of what’s going on under the hood.
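The decision described above can be boiled down to a few lines of Python. This is a deliberately simplified model, assuming the TM can be represented as a lookup from a source separator pair to the target separator pair used in earlier translations. `choose_target_format` and `tm_history` are hypothetical names and not anything in the product.

```python
from typing import Dict, Tuple

Separators = Tuple[str, str]  # (thousands separator, decimal separator)


def choose_target_format(
    source_format: Separators,
    tm_history: Dict[Separators, Separators],
    checked_format: Separators,
    copy_tm_format: bool = True,
) -> Separators:
    """Pick the target separators for a recognised number (simplified model).

    tm_history is an assumed stand-in for "how this source format was
    translated before"; checked_format is the format ticked in the settings.
    """
    if copy_tm_format and source_format in tm_history:
        # Option checked: reuse the formatting seen in previous translations.
        return tm_history[source_format]
    # Option unchecked, or nothing in the TM yet: use the checked format.
    return checked_format


if __name__ == "__main__":
    history = {(",", "."): (" ", ",")}
    checked = ("-", "\\")
    print(choose_target_format((",", "."), history, checked))
    print(choose_target_format((",", "."), history, checked, copy_tm_format=False))
```

In this model a brand new, empty TM always falls through to the checked format, which matches what happened in the first examples in this article.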
Now, I know I’ve really laboured this point but it’s very important to understand. You have the power with Trados Studio 2021 to take control of your auto-substitution and no longer have to resort to the sort of workarounds I mentioned at the start. But you also have the ability to fail to understand why the software may be returning the results you are seeing when they are not the ones you wanted. So have a good play with this (by “play” I mean make up a document of your own with different numbers, dates, currencies, measurements to practice on) and make sure you understand what’s going on.
But what about the little symbols?
What little symbols? I’m referring to these little “Status” symbols:
Mats Linder of TradosStudioManual fame asked me this question one night just before Xmas and this prompted an hour-long discussion between myself and Vlad Bondor from the SDL Support team trying to figure out what these symbols actually meant and how they worked. The link to the question may provide more context but in a nutshell the status symbols are there to help you when you work with multiple translation memories and start to see rather strange results.
If you have translation memories with all the same separators then you would see something like the screenshot above. If you hover over the status symbol you’ll see this:
But, if you make a change in one or more of the enabled translation memories you are using for your project then you may see something like this:
To achieve this I simply added a new translation memory, and then added a new custom recogniser to it using a space for the thousands separator and a hyphen for the decimal separator. The result is that neither of the translation memories has them both, and this could cause me to have unexpected results if I’m allowing the format to be taken from things I have translated before by checking the box “For TM matches, copy the TM format”. If I hover over the symbol I now see this:
All makes sense now, and starts to give us some idea of the complexity involved in trying to make things simple for Trados Studio users and probably why this problem was never solved before. I think it’s very smart work from Kevin Flanagan who is mostly responsible for the development of this feature.
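As a rough way of picturing what the status symbol is warning about, the sketch below compares the recognisers defined in each enabled translation memory and lists what each one lacks. This is only my reading of the behaviour described here, written as plain Python. `missing_recognisers` and the TM names are invented for the example.

```python
from typing import Dict, List, Set, Tuple

Separators = Tuple[str, str]  # (thousands separator, decimal separator)


def missing_recognisers(
    tms: Dict[str, Set[Separators]],
) -> Dict[str, List[Separators]]:
    """For each TM, list the recognisers that exist in some other TM but not in it."""
    everything = set().union(*tms.values())
    return {name: sorted(everything - own) for name, own in tms.items()}


if __name__ == "__main__":
    enabled = {
        "TM A": {(",", "."), ("+", "*")},
        "TM B": {(",", "."), (" ", "-")},
    }
    for name, missing in missing_recognisers(enabled).items():
        status = "consistent" if not missing else f"missing {missing}"
        print(name, status)
```

When every list comes back empty the translation memories agree with each other. Anything else is the situation where copying the format from the TM can start to give surprising results.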
We only looked at numbers?
I used numbers for this article because it was an easy way to explain how Trados Studio is working. But there are other gems.
You can see the settings for currencies in the screenshots above as they are of course closely linked to numbers and appear in the same pane. But now you have more granular control over this because you can use different separators for currencies than you are using for numbers, and you can specify what currency symbol to use for a particular language (including creating your own), and you can instruct Trados Studio to place the symbol at the end of the number or before:
The ability to set defaults for the currency symbols and what should be used are here in your language resources in your translation memory:
There are a few “funnies” here however. First of all, the idea was that this list of currencies should be a complete list of all the currencies that it makes sense to use as defaults. As there are only 12 currency labels included out of the box it’s hardly complete… but I’m told we may add the non-Latin or lowercase currency symbols from the 200+ available in dotNet in a future update. We don’t want to add them all as this could cause leverage issues when a previously recognised alphanumeric is now seen as a currency… a lot to think about isn’t there! But this isn’t a massive problem because you can add your own if the one you want isn’t in the current list of 12.
You might also find that you are getting recognition for a currency that isn’t in this list of 12. If you are, this is because we still have some legacy code in there that uses dotNet capabilities to say “format this number like a currency in this locale”. Useful… but it could work on one computer and not another because it’s based on the version of Windows you happen to be running. The list you can control is not platform dependent… it’s list dependent! Hence the reason we’ll probably revisit this and extend the default list for everyone.
It’s also important to always add the currency symbol to the source and to the target language. You may think it’s working fine with just adding to the source, but you could find that you won’t get proper auto-substitution from your translation memory if you don’t add it to both.
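The currency options can be pictured with a small Python sketch. It assumes a list of known currency symbols, a flag for whether the symbol goes before or after the number, and a spacing string. `relocate_currency` is a hypothetical helper and the simple number pattern is not the one Trados Studio uses.

```python
import re
from typing import List


def relocate_currency(
    text: str, symbols: List[str], symbol_after: bool, space: str = " "
) -> str:
    """Move a recognised currency symbol before or after its number (illustrative)."""
    sym = "|".join(re.escape(s) for s in symbols)
    number = r"\d+(?:[.,]\d+)?"
    # Accept the symbol on either side of the number in the source text.
    pattern = re.compile(
        rf"(?P<pre>{sym})\s?(?P<n1>{number})|(?P<n2>{number})\s?(?P<post>{sym})"
    )

    def convert(match: "re.Match[str]") -> str:
        symbol = match.group("pre") or match.group("post")
        amount = match.group("n1") or match.group("n2")
        if symbol_after:
            return f"{amount}{space}{symbol}"
        return f"{symbol}{space}{amount}"

    return pattern.sub(convert, text)


if __name__ == "__main__":
    print(relocate_currency("Price: $25.50", ["$", "PLN"], symbol_after=True))
    print(relocate_currency("Cost 100 EUR", ["$", "PLN"], symbol_after=True))
```

The second call leaves the text alone because `EUR` is not in the list passed in, which is the list-dependent behaviour mentioned above: a symbol that is not in the list is simply not treated as a currency.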
Dates and Times…
Just as we used to have in earlier versions of Trados Studio we have these lists:
The difference of course is that you can define what goes in them. You do that in the same way as we did for numbers and currencies by adding them to the language resources in your translation memory. Don’t forget however that the list in the image above relates to the target language only. There is no need to define the source, it’s only important we recognise it. So here I have 37 formats in my en(US) and only 12 in my pl(PL) by default. The settings in the image above only give me the 12 target formats.
When you add your own a good tip is to click on the dropdown arrow first and this will tell you what’s allowed when defining these patterns:
These patterns are controlled so Trados Studio always knows what to expect and how to transform into the intended target. If you make a mistake it’ll be obvious because you’ll get something like this:
So you now have the ability to cater for most date formats you are ever likely to see (not forgetting what I said at the start about not going far enough) in the source. To ensure you use the correct target you just make sure you select the one you want in here (emboldened selections are the ones I changed from default):
Exactly the same idea for times, so I won’t expand on these.
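If it helps to think of these patterns in code, here is a minimal Python sketch that reads a date using one pattern and writes it out using another. The token set (`dd`, `MM`, `yy`, `yyyy`) is an assumption borrowed from the usual dotNet style of date pattern and is only a small subset. It is not the list of tokens Trados Studio allows, and `to_strptime` and `convert_date` are made-up names.

```python
import re
from datetime import datetime

# Assumed subset of pattern tokens, mapped to Python's strptime/strftime codes.
TOKEN_MAP = {"yyyy": "%Y", "yy": "%y", "MM": "%m", "dd": "%d"}
TOKEN_RE = re.compile("yyyy|yy|MM|dd")


def to_strptime(pattern: str) -> str:
    """Translate a pattern such as dd.MM.yyyy into a Python format string."""
    parts = []
    position = 0
    for match in TOKEN_RE.finditer(pattern):
        # Keep literal text between tokens, escaping any percent signs.
        parts.append(pattern[position:match.start()].replace("%", "%%"))
        parts.append(TOKEN_MAP[match.group(0)])
        position = match.end()
    parts.append(pattern[position:].replace("%", "%%"))
    return "".join(parts)


def convert_date(text: str, source_pattern: str, target_pattern: str) -> str:
    """Parse text with the source pattern and re-emit it with the target pattern.

    Raises ValueError when the text does not fit the source pattern, which is
    the equivalent of the date not being recognised at all.
    """
    parsed = datetime.strptime(text, to_strptime(source_pattern))
    return parsed.strftime(to_strptime(target_pattern))


if __name__ == "__main__":
    print(convert_date("23.12.2020", "dd.MM.yyyy", "yyyy-MM-dd"))
```

The point of the sketch is the split the article describes: the source pattern only has to recognise the date, and the target pattern alone decides what ends up in the translation.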
In many ways measurements are my favourite aspect of this… at least it’s something I’ve been pushing every year since Trados Studio 2009 so I’m really happy to finally see it in the product. The settings we’ve pretty much always had are still the same:
What’s new is the ability to customise what Trados Studio will recognise as a unit of measurement. You do that in here in the language resources of your translation memory:
The reason I’m happy to see this is because I created a document many years ago containing every SI Unit and I wrote them twice… once written with the correct spacing between the number and the unit, and then again with the incorrect spacing. I religiously tested against this document with every release hoping for an improvement, and until the Trados Studio 2021 release the best we ever achieved in terms of being able to recognise these units was 19% recognised when correctly written, and bizarrely, 56% recognised when incorrectly written (improved with Trados Studio 2014 SP2). Today I can achieve 100% recognition in all cases… and I can even add more. So for technical translators this feature is surely a blessing! And I can finally throw that old document away!
Oh yes… don’t forget that the controls here for spaces between the number and the unit also control the currency settings. Hopefully you won’t have documents that contain currencies and other units of measure that use different spaces!
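Here is a small Python sketch of the measurement idea: recognise a number followed by a known unit, with or without a space, and write it back with whatever spacing you want in the target. The unit list is whatever you pass in, and `normalise_units` is a hypothetical name rather than anything from Trados Studio.

```python
import re
from typing import Iterable


def normalise_units(text: str, units: Iterable[str], space: str = " ") -> str:
    """Rewrite "<number><optional space><unit>" with a consistent separator."""
    # Try longer units first so "mm" wins over "m".
    ordered = sorted(units, key=len, reverse=True)
    alternatives = "|".join(re.escape(unit) for unit in ordered)
    # The lookahead stops "kg" from matching inside a longer word such as "kgs".
    pattern = re.compile(rf"(\d+(?:[.,]\d+)?)\s?({alternatives})(?!\w)")
    return pattern.sub(lambda m: f"{m.group(1)}{space}{m.group(2)}", text)


if __name__ == "__main__":
    print(normalise_units("5kg and 10 mm", ["kg", "mm", "m"]))
    # A non-breaking space between number and unit
    print(normalise_units("20 °C", ["°C"], space="\u00a0"))
```

Adding a unit to the list is all it takes for it to be recognised in both the correctly and incorrectly spaced forms, which is the same idea as adding your own unit to the language resources.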
What I’d have personally liked to see…
For all the right reasons Trados Studio as a product has been built (as much as possible) to save the translator from having to be technical. So no need to have to use regular expressions to construct the custom recognisers for all the dates, times, numbers etc. for example. This has its good points, but it also has negatives. The negatives are exactly why we had all the workarounds mentioned at the start of this rather long article (my apologies again… nearly done!). The negatives are also why we have tools like the Terminjector, and the Regex Match AutoSuggest Provider on the appstore. In fact there are so many places where a little knowledge of regular expressions is helpful that it’s virtually mandatory training for any translator… especially if they use other tools where this sort of technical knowledge is required to handle anything like the sort of things discussed in this article.
The reason I don’t think we’ve gone far enough… even though I think we have created a wonderful set of options with this new feature-set for Trados Studio 2021… is that we still can’t cover all possibilities even though what’s left may be quite small and rarely a problem.
Take these dates for example:
Some modern, and some old style… none of them totally unusual… none of them handled by Trados Studio and the out of the box customisation possible isn’t helpful here. None of this is insurmountable, but workarounds are required. Using a similar approach to the Regex Match AutoSuggest Provider, for example, would have made it possible to cater for absolutely anything so you could recognise these as follows:
- 12th November, in the year 1863 (recognise the entire string as a placeable and convert as needed)
- circa. 2011 (recognise the entire string as a placeable and convert as needed)
- bap y 23d of Dece’ber (recognise this part of the string as a placeable and convert as needed)
- March yr 6th (recognise this part of the string as a placeable and convert as needed)
I’m not going to demonstrate how this is achieved in this article, but will write another one on the Regex Match AutoSuggest Provider if anyone would like one. But this is what I meant when I said I didn’t think we went far enough… I’d have liked to be able to handle anything at all and not just dates, times, numbers, currencies and measurements.
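To give a flavour of the regular expression approach, here is a short Python sketch that picks three of the example strings above out of running text. These are plain Python patterns written for this illustration and are not the configuration format of the Regex Match AutoSuggest Provider. `find_placeables` is a made-up name, and the badly abbreviated “bap y 23d of Dece’ber” style is left out because it would need a much more forgiving pattern.

```python
import re
from typing import List

MONTHS = (
    r"(?:January|February|March|April|May|June|July|"
    r"August|September|October|November|December)"
)
ORDINAL = r"\d{1,2}(?:st|nd|rd|th)"

# One pattern per unusual date style from the list above.
PATTERNS = [
    re.compile(rf"\b{ORDINAL}\s+{MONTHS},\s+in\s+the\s+year\s+\d{{4}}\b"),
    re.compile(r"\bcirca\.?\s+\d{4}\b"),
    re.compile(rf"\b{MONTHS}\s+yr\s+{ORDINAL}\b"),
]


def find_placeables(text: str) -> List[str]:
    """Return every substring matched by one of the illustrative date patterns."""
    found = []
    for pattern in PATTERNS:
        found.extend(match.group(0) for match in pattern.finditer(text))
    return found


if __name__ == "__main__":
    print(find_placeables("Born 12th November, in the year 1863."))
    print(find_placeables("Built circa. 2011 and restored later."))
    print(find_placeables("Baptised March yr 6th in the parish."))
```

Recognising the string is only half the job of course. Converting it into the right form for the target language would still need a rule per pattern, which is exactly the kind of control the out of the box options don’t give you.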
Brilliant and clearly underrated new feature in Trados Studio 2021… I shouldn’t use my holidays to write articles like this as they always end up longer than anticipated… I clearly forgot that this is what happens so I also suffer from short-term memory problems… you may need another coffee!