Short term memories…

“Not only is my short-term memory horrible, but so is my short-term memory.”  I have no idea who this quote can be attributed to, and it’s certainly not original, but it is quite appropriate when I start to think about the evolution of Trados.  Ever since Trados Studio was launched you can be sure to find many “experts” in places like ProZ and even the SDL Community recommending you don’t upgrade because there is no difference compared to the last version.  To be fair, if you only use a fraction of the features despite having used the software for a decade, then it probably is like this.  The alternative being these “experts” have very short-term memories.

I know there are always things some users have been asking for that still don’t make it into the software, and I know there are bugs that users have been asking to be fixed that are still not fixed.  But that’s still a long way from there being nothing new at all, and more importantly it’s a very long way from there being nothing to resolve some of the biggest problems Trados users have been complaining about for a decade at least.

So I thought I’d start 2021 off with an article on some new features in Trados Studio 2021 that address one of these big problems that I can only assume has been forgotten.  I also intended this to be brief as I originally wanted to make this an article on all things important about translation memories that I hadn’t covered before.  After I drafted the things I wanted to cover it was definitely not brief!  So I picked just one thing… but it’s still not brief at all and it’s only about one new feature in Trados Studio 2021.  I’ll come back to the other things in the coming months, and in the meantime grab a coffee if I’ve still got your interest!

What is this decade-old problem?

First, here’s a reminder:

  • Working with placeables that are not automatically recognised
  • Search and replace with Regex in Studio – Regular Expressions Part 3
  • Spaces and Units…

All of these articles deal with the problem of working with numbers, dates, currencies and measurements that are not recognised by Studio.  I’m pretty sure by now the majority of users will know exactly what I’m referring to and even the memories of a few naysayers may have been jogged.  The Trados Studio 2021 release provides a long-awaited solution to this problem out of the box.  It doesn’t go as far as I’d have personally liked to see (more on this later), but nonetheless it’s a little-talked-about feature that really does do a great job of solving this problem for the vast majority of use cases.

What part of Trados Studio are we referring to?

If we think back to Trados Studio 2019 and earlier we have these features:

This was useful to some extent, if you even knew this was here, because you needed to navigate to the specific language pair (en(US) – es(ES) for example) as opposed to All Language Pairs to find it, since the settings will vary from language to language.  This provides a few basic mechanisms for controlling how dates, times and measurements are handled in the target… but ONLY if they are recognised in the source.  Hence the need for the aforementioned articles and endless explanations in the communities to help users manage texts where the source text isn’t quite so straightforward or you want to have something in the target that isn’t in the list.

Trados Studio 2021 features…

If we now take a look at Trados Studio 2021 we have something new.  First of all in the language resources of your Translation Memory settings:

It’s probably a little hard to see but the additional options in your language resources relate to:

  • Dates
  • Times
  • Numbers
  • Measurements
  • Currencies

Each of these provides you with the ability to create your own custom recognisers (a pattern to recognise a date, or number for example… like dd.mm.yyyy).  These determine what constitutes a recognised token (something your translation memory knows is a number, or a date or a currency symbol for example… so it only needs to store it once for all numbers, dates and currencies) in the source text, and what this should look like in the target translation.  We still have the options you had in 2019, but these have changed a little.  First of all you won’t see them at all if you don’t have a TM available in the specific language pair you opened.  Instead you’ll see this message.

For Dates and Times:

For Numbers and currency it’s a similar thing.  The TM can be at the All Language Pair level, but you still have to make changes to these settings at the specific language pair.  Once you add your TM you’ll see the details which look like this.

For Dates and Times:

For Numbers and currency:

The options are mostly self-explanatory, and certainly the Dates and Times will look familiar.  But there are some important concepts which I’m going to focus on.

Important Concepts…

Custom Recognisers…

First of all, the options here will change if you add your own custom recognisers.  I’ll use number formats to illustrate this concept, but the same principle, on the whole, applies to all of the new options here.  If there are exceptions I’ll tackle them in a section further down… more apologies for the long article, and a suggestion that you get ready for a second coffee!

For example, if I add this rather unlikely pair of separators (plus symbol for the thousands separator and a star symbol for the decimal separator) to an English source in my language resources for an en(US) – pl(PL) translation memory:

And this to the Polish target (minus symbol for the thousands separator and a backslash for the decimal separator):

I will now see my new custom recognisers in my Options or Project Settings when using this TM:

An important note here is that the new recogniser is not default, so don’t forget to check it once you’ve added it or it won’t be used.  But once I do check it I now get this used in the TM and also for QuickPlace:

Note that the 2+400*65 has a blue underline in the source and it’s not broken at all which means that Studio is correctly seeing this as two thousand, four hundred point sixty five and not as separate numbers, and as a result is able to correctly autolocalise the target into the format I wanted.  By “broken” I mean this:

In this image the translation memory has not recognised this as one number so I get three tokens, one for each of the three numbers Trados Studio thinks are here.  The really important point about this is that since it’s properly recognised you will also find that QA checks will benefit from this too as well as your translation memory.
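To make the difference between one token and three more concrete, here is a minimal Python sketch of the idea.  It is only an illustration of the principle and not how Trados Studio implements recognition internally; the function names and the way the pattern is built are my own assumptions.

```python
import re
from typing import List, Tuple

# (thousands separator, decimal separator) -- a hypothetical stand-in
# for one number recogniser in the TM language resources.
Separators = Tuple[str, str]


def build_number_pattern(seps: Separators) -> re.Pattern:
    """Build a regex that treats a grouped number as ONE token."""
    thousands, decimal = (re.escape(s) for s in seps)
    return re.compile(
        rf"(?:\d{{1,3}}(?:{thousands}\d{{3}})+|\d+)(?:{decimal}\d+)?"
    )


def tokens(text: str, seps: Separators) -> List[str]:
    """Return the number tokens found with the given separators."""
    return build_number_pattern(seps).findall(text)


def localise_numbers(text: str, source: Separators, target: Separators) -> str:
    """Rewrite every recognised source number using the target separators."""
    src_thousands, src_decimal = source
    tgt_thousands, tgt_decimal = target

    def swap(match: re.Match) -> str:
        integer, found, fraction = match.group(0).partition(src_decimal)
        integer = integer.replace(src_thousands, tgt_thousands)
        return integer + (tgt_decimal + fraction if found else "")

    return build_number_pattern(source).sub(swap, text)


if __name__ == "__main__":
    # With the custom recogniser the number is a single token...
    print(tokens("2+400*65", ("+", "*")))   # ['2+400*65']
    # ...without it, the default separators give three "broken" tokens.
    print(tokens("2+400*65", (",", ".")))   # ['2', '400', '65']
    # Auto-localisation into the unlikely Polish format used above.
    print(localise_numbers("2+400*65", ("+", "*"), ("-", "\\")))  # 2-400\65
```

The point of the sketch is simply that recognition comes first: only once the whole string is seen as one number can the separators be swapped reliably.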

You should also note that if I enter the first segment which uses different separators I will get the same auto-localization because the comma and period used for the separators here are also recognised as they are actually default recognisers in my translation memory language resources as you can see from the images earlier on:

This brings me onto the next very important point…

For TM matches, copy the TM format…

Exactly this!

The default settings for all the new features here will always take the formats you have used in your TM (translation memory) before, in preference to the format you checked in the image above.  Since I created a brand new translation memory for the examples I have shown so far, the only option would be to use the tokens I instructed it to use.  But the beauty of this option is that it will allow you to have multiple source formats handled by your TM in whatever way you have handled them before.  So if I did have some other examples of how these should really be handled in my TM then I could achieve this when I pre-translate for example:

Instead of using the one I checked in my project settings Trados Studio actually used the correct context for the source, not based on it simply being a recognised number and using whatever format I instructed it to use, but identifying the formatting that I had used in previous translations and applying the same formatting for each one.  In my translation results pane I will see something like this:

You can see I get a Context Match for the correct formatting, and a 100% match for the incorrect formatting.  It’s a 100% match because it’s still a recognised number based on what I had translated in the past (which you can see was different to the current source content).

If I uncheck this option then Trados Studio will use the formatting in the target I instructed it to use and this would get me this sort of thing:

This time I have the hyphen and backslash separators for everything and there is no recognition of what I might have translated in the past.  This behaviour pretty much mirrors the way Trados Studio always worked, but now you have more control over it and hopefully a better understanding of what’s going on under the hood.
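If it helps, the decision the checkbox controls can be pictured with a few lines of Python.  This is a deliberately simplified model and an assumption on my part, not the real logic inside the translation memory: the dictionary of “formats seen before” and both function names are hypothetical.

```python
from typing import Dict, List, Tuple

# (thousands separator, decimal separator)
Separators = Tuple[str, str]


def pick_target_format(
    source_format: Separators,
    formats_seen_in_tm: Dict[Separators, Separators],
    checked_format: Separators,
    copy_tm_format: bool = True,
) -> Separators:
    """Choose the target separators for a recognised number.

    formats_seen_in_tm maps a source format to the target format used in
    earlier translations (a hypothetical simplification of the TM content).
    """
    if copy_tm_format and source_format in formats_seen_in_tm:
        return formats_seen_in_tm[source_format]
    return checked_format


def format_number(groups: List[str], fraction: str, seps: Separators) -> str:
    """Join the integer groups and the fraction with the chosen separators."""
    thousands, decimal = seps
    whole = thousands.join(groups)
    return whole + decimal + fraction if fraction else whole


if __name__ == "__main__":
    seen = {(",", "."): (" ", ",")}   # how I translated this format before
    checked = ("-", "\\")             # the format ticked in project settings

    chosen = pick_target_format((",", "."), seen, checked)
    print(format_number(["2", "400"], "65", chosen))   # 2 400,65

    chosen = pick_target_format((",", "."), seen, checked, copy_tm_format=False)
    print(format_number(["2", "400"], "65", chosen))   # 2-400\65
```

With the option on, previous translations win; with it off, or when the TM has nothing for that source format, the checked format is used for everything.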

Now, I know I’ve really laboured this point but it’s very important to understand.  You have the power with Trados Studio 2021 to take control of your auto-substitution and no longer have to resort to the sort of workarounds I mentioned at the start.  But you also have the ability to fail to understand why the software may be returning the results you are seeing when they are not the ones you wanted.  So have a good play with this (by “play” I mean make up a document of your own with different numbers, dates, currencies, measurements to practice on) and make sure you understand what’s going on.

But what about the little symbols?

What little symbols?  I’m referring to these little “Status” symbols:

Mats Linder of the TradosStudioManual fame asked me this question one night just before Xmas and this prompted an hour long discussion between myself and Vlad Bondor from the SDL Support team trying to figure out what these symbols actually meant and how they worked.  The link to the question may provide more context but in a nutshell the status symbols are there to help you when you work with multiple translation memories and start to see rather strange results.

If you have translation memories with all the same separators then you would see something like the screenshot above.  If you hover over the status symbol you’ll see this:

But, if you make a change in one or more of the enabled translation memories you are using for your project then you may see something like this:

To achieve this I simply added a new translation memory, and then added a new custom recogniser to it using a space for the thousands separator and a hyphen for the decimal separator.  The result is that neither of the translation memories has both recognisers, and this could cause me to have unexpected results if I’m allowing the format to be taken from things I have translated before by checking the box “For TM matches, copy the TM format“.  If I hover over the symbol I now see this:

All makes sense now, and starts to give us some idea of the complexity involved in trying to make things simple for Trados Studio users and probably why this problem was never solved before.  I think it’s very smart work from Kevin Flanagan who is mostly responsible for the development of this feature.

We only looked at numbers?

I used numbers for this article because it was an easy way to explain how Trados Studio is working.  But there are other gems.

Currencies…

You can see the settings for currencies in the screenshots above as they are of course closely linked to numbers and appear in the same pane.  But now you have more granular control over this because you can use different separators for currencies than you are using for numbers, and you can specify what currency symbol to use for a particular language (including creating your own), and you can instruct Trados Studio to place the symbol at the end of the number or before:

The ability to set defaults for the currency symbols and what should be used are here in your language resources in your translation memory:

There are a few “funnies” here however.  First of all, the idea was that this list of currencies should be a complete list of all the currencies that it makes sense to use as defaults.  As there are only 12 currency labels included out of the box it’s hardly complete… but I’m told we may add the non-Latin or lowercase currency symbols from the 200+ available in dotNet in a future update.  We don’t want to add them all as this could cause leverage issues when a previously recognised alphanumeric is now seen as a currency… a lot to think about isn’t there!  But this isn’t a massive problem because you can add your own if the one you want isn’t in the current list of 12.

You might also find that you are getting recognition for a currency that isn’t in this list of 12.  If you are, this is because we still have some legacy code in there that uses dotNet capabilities to say “format this number like a currency in this locale”.  Useful… but it could work on one computer and not another because it’s based on the version of Windows you happen to be running.  The list you can control is not platform dependent… it’s list dependent!  Hence the reason we’ll probably revisit this and extend the default list for everyone.

It’s also important to always add the currency symbol to the source and to the target language.  You may think it’s working fine with just adding to the source, but you could find that you won’t get proper auto-substitution from your translation memory if you don’t add it to both.

Dates and Times…

Just as we used to have in earlier versions of Trados Studio we have these lists:

The difference of course is that you can define what goes in them.  You do that in the same way as we did for numbers and currencies, by adding them to the language resources in your translation memory.  Don’t forget however that the list in the image above relates to the target language only.  There is no need to define the source, it’s only important we recognise it.  So here I have 37 formats in my en(US) and only 12 in my pl(PL) by default.  The settings in the image above only give me the 12 target formats.

When you add your own a good tip is to click on the dropdown arrow first and this will tell you what’s allowed when defining these patterns:

These patterns are controlled so Trados Studio always knows what to expect and how to transform into the intended target.  If you make a mistake it’ll be obvious because you’ll get something like this:

So you now have the ability to cater for most date formats you are ever likely to see (not forgetting what I said at the start about not going far enough) in the source.  To ensure you use the correct target you just make sure you select the one you want in here (emboldened selections are the ones I changed from default):

Exactly the same idea for times, so I won’t expand on these.
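For anyone who likes to see the principle spelled out, here is a small Python sketch of what a controlled pattern buys you: because only known tokens are allowed, the source can be parsed and the target rebuilt without any guesswork.  The token set (dd, MM, yyyy) and the function names are assumptions for illustration only, and are far smaller than what Trados Studio allows in the dropdown.

```python
import re
from typing import Optional

# Hypothetical, deliberately tiny set of allowed pattern tokens.
TOKEN_REGEX = {
    "yyyy": r"(?P<year>\d{4})",
    "MM": r"(?P<month>\d{2})",
    "dd": r"(?P<day>\d{2})",
}
TOKEN_GROUP = {"yyyy": "year", "MM": "month", "dd": "day"}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a pattern such as 'dd.MM.yyyy' into a compiled regex."""
    parts = re.split(r"(yyyy|MM|dd)", pattern)
    return re.compile(
        "".join(TOKEN_REGEX.get(part, re.escape(part)) for part in parts)
    )


def convert_date(text: str, source_pattern: str, target_pattern: str) -> Optional[str]:
    """Re-render a recognised date in the target pattern, or return None."""
    match = pattern_to_regex(source_pattern).fullmatch(text)
    if match is None:
        return None  # not recognised, so nothing to auto-localise
    values = match.groupdict()
    result = target_pattern
    for token, group in TOKEN_GROUP.items():
        if group in values:
            result = result.replace(token, values[group])
    return result


if __name__ == "__main__":
    print(convert_date("23.12.2020", "dd.MM.yyyy", "yyyy-MM-dd"))  # 2020-12-23
    print(convert_date("12/23/2020", "dd.MM.yyyy", "yyyy-MM-dd"))  # None
```

The second call returns nothing because the source was never recognised, which is exactly the situation the custom recognisers are there to fix.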

Measurements…

In many ways my favourite aspect of this… at least it’s something I’ve been pushing every year since Trados Studio 2009 so I’m really happy to finally see it in the product.  The settings we’ve pretty much always had are still the same:

What’s new is the ability to customise what Trados Studio will recognise as a unit of measurement.  You do that in here in the language resources of your translation memory:

The reason I’m happy to see this is because I created a document many years ago containing every SI Unit and I wrote them twice… once written with the correct spacing between the number and the unit, and then again with the incorrect spacing.  I religiously tested against this document with every release hoping for an improvement, and until the Trados Studio 2021 release the best we ever achieved in terms of being able to recognise these units was 19% recognised when correctly written, and bizarrely, 56% recognised when incorrectly written (improved with Trados Studio 2014 SP2).  Today I can achieve 100% recognition in all cases… and I can even add more.  So for technical translators this feature is surely a blessing!  And I can finally throw that old document away!

Oh yes… don’t forget that the controls here for spaces between the number and the unit also control the currency settings.  Hopefully you won’t have documents that contain currencies and other units of measure that use different spaces!
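As a rough picture of what “recognised with or without the space” means, here is a short Python sketch.  The unit list and function names are hypothetical and this is not the Trados Studio implementation; it only shows why a customisable list of units, plus a single spacing setting, is enough to normalise both spellings.

```python
import re
from typing import Iterable


def build_measurement_pattern(units: Iterable[str]) -> re.Pattern:
    """Match a number followed by a known unit, with or without a space."""
    # Longest units first so 'mm' is preferred over 'm'.
    ordered = sorted(units, key=len, reverse=True)
    alternation = "|".join(re.escape(unit) for unit in ordered)
    return re.compile(
        rf"(?P<value>\d+(?:\.\d+)?)\s?(?P<unit>{alternation})\b"
    )


def normalise_measurements(text: str, units: Iterable[str], space: str = " ") -> str:
    """Rewrite every recognised measurement with the chosen spacing."""
    pattern = build_measurement_pattern(units)
    return pattern.sub(
        lambda m: m.group("value") + space + m.group("unit"), text
    )


if __name__ == "__main__":
    my_units = ["m", "kg", "mm"]  # hypothetical custom unit list
    print(normalise_measurements("5kg and 12 mm", my_units))
    # 5 kg and 12 mm
    print(normalise_measurements("5kg and 12 mm", my_units, space=""))
    # 5kg and 12mm
```

Add a unit to the list and it is recognised; leave it out and the number is just a number again.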

What I’d have personally liked to see…

For all the right reasons Trados Studio as a product has been built (as much as possible) to save the translator from having to be technical.  So no need to have to use regular expressions to construct the custom recognisers for all the dates, times, numbers etc. for example.  This has its good points, but it also has negatives.  The negatives are exactly why we had all the workarounds mentioned at the start of this rather long article (my apologies again… nearly done!).  The negatives are also why we have tools like the Terminjector, and the Regex Match AutoSuggest Provider on the appstore.  In fact there are so many places where a little knowledge of regular expressions is helpful that it’s virtually mandatory training for any translator… especially if they use other tools where this sort of technical knowledge is required to handle anything like the sort of things discussed in this article.

The reason I don’t think we’ve gone far enough… even though I think we have created a wonderful set of options with this new feature-set for Trados Studio 2021… is that we still can’t cover all possibilities even though what’s left may be quite small and rarely a problem.

Take these dates for example:

Some modern, and some old style… none of them totally unusual… none of them handled by Trados Studio, and the out-of-the-box customisation possible isn’t helpful here.  None of this is insurmountable, but workarounds are required.  Using a similar approach to the Regex Match AutoSuggest Provider, as an example, would have made it possible to cater for absolutely anything so you could recognise these as follows:

  • 12th November, in the year 1863 (recognise the entire string as a placeable and convert as needed)
  • circa. 2011 (recognise the entire string as a placeable and convert as needed)
  • bap y 23d of Dece’ber (recognise this part of the string as a placeable and convert as needed)
  • March yr 6th (recognise this part of the string as a placeable and convert as needed)

I’m not going to demonstrate how this is achieved in this article, but will write another one on the Regex Match AutoSuggest Provider if anyone would like one.  But this is what I meant when I said I didn’t think we went far enough… I’d have liked to be able to handle anything at all and not just dates, times, numbers, currencies and measurements.
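Purely to show what “recognise the entire string as a placeable” could look like in regular expression terms, here is a plain Python sketch for the first date in the list.  It is ordinary Python and not the configuration of the Regex Match AutoSuggest Provider; the pattern, the ISO output format and the function name are all my own assumptions.

```python
import re
from typing import List, Tuple

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]

# Matches strings like "12th November, in the year 1863" as one unit.
LONG_DATE = re.compile(
    r"(?P<day>\d{1,2})(?:st|nd|rd|th)\s+"
    r"(?P<month>" + "|".join(MONTHS) + r"),\s+in the year\s+"
    r"(?P<year>\d{4})"
)


def find_long_dates(text: str) -> List[Tuple[str, str]]:
    """Return (matched string, ISO-style date) for each date recognised."""
    results = []
    for match in LONG_DATE.finditer(text):
        month_number = MONTHS.index(match.group("month")) + 1
        iso = "{:04d}-{:02d}-{:02d}".format(
            int(match.group("year")), month_number, int(match.group("day"))
        )
        results.append((match.group(0), iso))
    return results


if __name__ == "__main__":
    print(find_long_dates("Born 12th November, in the year 1863."))
    # [('12th November, in the year 1863', '1863-11-12')]
```

Each unusual format would need its own pattern, which is the price of being able to handle anything at all.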

Conclusion…

Brilliant and clearly underrated new feature in Trados Studio 2021… I shouldn’t use my holidays to write articles like this as they always end up longer than anticipated… I clearly forgot that this is what happens so also suffer with short term memory problems… you may need another coffee!

Feature rich… it’s overflowing!

I first wrote about the Glossary Converter on September 17, 2012… over three years ago.  Not only is it a surprisingly long time ago, but I still meet people at every conference I attend who have never heard of this marvelous little tool, and in some cases never heard of the OpenExchange either.  So when I toyed with the idea of writing an article about Xmas coming early and talking about the OpenExchange and all the goodies inside, part of me couldn’t resist writing about this tool again.  In the three years since it was first released it’s morphed beyond all recognition and today it’s awash with features that belie its appearance.

I like to take a little credit for the emergence of this tool because back in 2012 I asked around trying to get someone to create one so that it was straightforward for anyone to create a MultiTerm Glossary from a simple two column spreadsheet… the sort of glossary that most translators use for their day to day needs.  I was over the moon when Gerhard (the developer) was interested and created the tool I wrote about back then.  But I can take no credit whatsoever for what the tool has become today and it’s well worth revisiting!


Converting Wordfast resources… out with the old!

This article is all about out with the old and in with the new in more ways than one!  In the last week I have been asked three times about converting Wordfast translation memories and Wordfast glossaries into resources that could be used in Studio and MultiTerm.  Normally, for the TXT translation memories I get I would go the traditional route and use a copy of Wordfast to export as TMX.  Then it’s simple, but what if you don’t have Wordfast or don’t want to have to try and use it?  Wordfast glossaries are new territory for me as I’d never looked at these before.  But on a quick check it looked as though they are also TXT files so I decided to take a better look.

Before I get into the detail I’ll just add that I’m not very familiar with Wordfast so I’m basing my suggestions on the small number of files I have received, or created, and the process I used to convert them to formats more useful for a Studio user.  I’ll start with the glossaries as this is where I got the idea from.  I’d better explain my opening statement too… this is because after I did an initial conversion using the Glossary Converter from the SDL OpenExchange I was asked to explain how this would work with MultiTerm Convert.  This of course made me think about the old versus the new… I wouldn’t compare Wordfast and Studio in this way at all 😉

IATE, the last word… maybe!

By now I think we’ve discussed the import of an IATE TBX into CAT tools as much as we can without going over old ground again.  But if you’re reading this and don’t know what I’m talking about then perhaps review these two articles first:

  • What a whopper!  – which is all about the difficulties of handling a TBX the size of the one that is available from the IATE download site.
  • A few bilingual TBX resources – which is a short article sharing a few of the TBX files I extracted for a few users who were having problems dealing with the 2.2Gb, 8 million term whopper we started with.

So why am I bringing this up again?  Well I do like to have the last word…  don’t we all… but this time I wanted to share the work of Henk Sanderson who has put a lot of time and effort into breaking the IATE TBX into bite sized chunks and at the same time cleaning them up so they can be more useful to a translator.  I also wanted to share the successful import of the complete original TBX from IATE directly into MultiTerm Server:

Export for External Review – a detour

***Updated 24 June 2017***
When Studio 2009 was launched one of the first applications on the new SDL OpenExchange was the SDLXLIFF Converter for Microsoft Office.  This was an excellent application created by Patrik Mazanek that paved the way for some of the new features you see in Studio 2014 today.
The idea back then was born out of a requirement to export the contents of an sdlxliff file to Microsoft Excel but with no re-import to update the translation.  If you were an SDLX user you’d probably recognise that this was something you could do in SDLX, and the request that this would be possible in Studio was coming from many SDLX users.
Déjà Vu, another translation tool, had this concept of “External Views” where you could export the contents of your translation into a couple of formats, one of them being an RTF document formatted as a table containing the source and target text.  But the neat thing about this was that you could reimport the RTF and update your translation with whatever edits had been made in the RTF.  This was very cool, and as far as I’m aware no other tool had this capability at the time, short of working in Microsoft Word on a Bilingual DOC in the first place.  So when Patrik produced his first build of the converter and announced that he had included a similar capability using DOCX in addition to the Excel export this was very exciting!

Upgrading your leverage

I’m onto the subject of leverage from upgraded Translation Memories with this post, encouraged by the release of a new (and free) application on the SDL OpenExchange called the TM Optimizer.  Before we get into the geeky stuff I want to elaborate on what I mean by the word “leverage” because I’m not sure everyone reading this will know.

Let’s assume you have been a translator for years (English to Chinese), and you always worked with Microsoft Word and Translators Workbench.  TagEditor came along, but you didn’t like that too much so you kept working with Word and Workbench.  It had its problems, but until Studio came along, and in particular Studio 2014, you were still quite happy to work the same way you had for years.  But now you want to buy a new computer, and you really like the things you’ve been reading about Studio 2014 so you take a leap and purchase a license of Studio.  The first thing you want to do is upgrade your old Workbench Translation Memories so they can be reused in Studio.  You’ve got around 60,000 Translation Units in one specialised Translation Memory and you really need to be able to have this available as soon as possible to help with a job you know is just around the corner.  You upgrade the Translation Memory and this works perfectly!


Life without Trados!

The launch of SDL Trados Studio 2014 this month brings with it the news that SDL Trados 2007 Suite will no longer be supported from the end of this year.  I don’t think this will come as a surprise to anyone as SDL had already ceased to support SDL Trados 2007 at the end of 2012, and with the releases of the 2009, 2011 and now 2014 versions of SDL Trados Studio it’s inevitable that the 2007 Suite version will follow suit.

"Memory is the mother of all wisdom"

I believe this interesting quote can be found in “Prometheus Bound”, a play by a Greek dramatist called Aeschylus.  I haven’t read the play, but I like the quote, and it certainly lends itself to the importance of memory… even when we refer to a Translation Memory rather than your own built in capability.  It’s because your Translation Memory is such an important asset to you that you need to regularly maintain it, and also reuse it wherever possible to expand the benefits you get from it.

If I knew then what I know now!

People often tell me that using Studio is complicated.  Other people, who have been working with Studio, tell me it’s actually quite logical once you get your mind around it.  I clearly lean towards the latter and whilst I always try hard to see the difficulties, the conclusion I always come back to, rightly or wrongly, is that many users who used Trados in the past expect Studio to be similar and then struggle when they discover it’s not.