Archive

CAT Tools

Is English (Europe) the new language on the other side of the Channel that we’ll all have to learn if Brexit actually happens… will Microsoft ever create a spellchecker for it now they added it to Windows 10?  Why are there 94 different variants of English in Studio coming from the Microsoft operating system and only two Microsoft Word English spellcheckers?  Why don’t we have English (Scouse), English (Geordie) or English (Brummie)… probably more distinct than the differences between English (United States) and English (United Kingdom) which are the two variants Microsoft can spellcheck.  These questions, and similar ones for other language variants are all questions I can’t answer and this article isn’t going to address!  But I am going to address a few of the problems that having so many variants can create for users of SDL Trados Studio.

Updating a translation memory

Emma Goldsmith wrote “SDL Trados Studio TM isn’t updating! My translation memory is empty!” in 2013 and it’s a very useful article helping users to answer the question about why their translation memories won’t update.  But there’s another one I have come across a few times recently and it’s all related to these pesky variants!  Let’s say you start using Studio and you have been working on projects using the “translate single document” approach for de(DE) into en(US), so you also have a translation memory with these variants.  Then one day you’re asked to translate into en(UK), so you create your project using de(DE) -> en(UK) thinking you’ll add the translation memory later since it didn’t appear in the way it usually does, only to discover you can’t use the translation memory you have because the language variants are different.  Fortunately someone tells you about AnyTM and you add your TM to the project anyway and all is well!

All is well until you are sent ten files and you decide to create a standard Studio project for this.  You haven’t done this before but as you work through the settings all seems well and you don’t notice that you have two target languages in the project wizard:

Even if you did notice the two languages you might not realise the implications since the one you want is there, English (United States).  You don’t even notice something could be wrong after adding your files since you see your translation memory all set up and ready to go:

You carry on working through the wizard and open the first file to work on it by double clicking on the first one in the list, you translate the file and as you work through it you notice that your TM is not being updated and you see this message in the translation results window:

You check your Project Settings and the translation memory is there and checked to update.  What’s going on?  You google the problem and review Emma’s blog… still can’t sort it, so you post into the SDL Community and nobody can help you there either!  So let’s examine the facts.

  1. You created a multilingual project that is for de(DE) -> en(UK) and de(DE) -> en(US).
  2. You have a translation memory for de(DE) -> en(US) but not for de(DE) -> en(UK)
  3. You opened the first file in the list without checking the language so alphabetically this is en(UK)

A multilingual project means you have one source language and at least two target languages.  You translate the files for each language by selecting the appropriate language from the drop down in the Files View shown above.  What this means is that the default language will be the first in the list and in this case it’s en(UK).  However you only have a translation memory for en(US) so opening the file in your project without changing the language in the drop down first means you are now translating the file without a translation memory at all and this of course explains why you get the message “No open translation memories…” and why your en(US) translation memory is not being updated.

ok… all understood so far but why does Studio show the translation memory in the Project Settings if it’s not there?  The reason is explained by the way the Language Pairs are organised in your settings:

The way Studio works is that it provides a way to apply settings and translation memories for all the languages in a multilingual project by adding them to “All Language Pairs“.  This is very handy and can save a lot of work on projects with 30 or 40 target languages.  You can also specify different settings for any one or all of the target languages by checking this box for the languages in question, in this case de(DE) -> en(UK):

If, after checking that box, I now go back to my “All Language Pairs” I would see that I am also told that the de(DE) -> en(UK) language pair is now using a different Translation Provider, followed up by the very helpful message that there are “No translation providers” because I only have a translation memory added for de(DE) -> en(US).  There are no memories available for an en(UK) target language because I have never added one.  If I applied this checkbox to the en(US) target language as well then I’d see this:

Now it starts to make sense because I can clearly see I don’t have a translation memory for one of the language pairs and it’s clear which one it is.  But unless I was always using the specific language pair settings it’s easy to miss until you understand what’s happening behind the scenes.  In fact unless you worked with multilingual projects on purpose you could be forgiven for not noticing this at all.

So, now we’ve hopefully cleared that up what are the options?  You have a few:

  1. The easiest solution if you don’t care about the differences between en(UK) and en(US) is to remove your translation memory from your settings and then add it back in again using AnyTM.  This way the same translation memory will be used by both language pairs, OR
  2. Add an existing translation memory with the right language pair to your project, OR
  3. Create a new translation memory with the missing language pair and add it to your project.  This would only really be sensible if you actually wanted to use en(UK) in the first place and it was important to have different translation memories, OR
  4. Close the file, switch to the correct target language (en(US)) in the Files View and import your translated SDLXLIFF files from the wrong language into your TM.  Make sure you don’t exclude the language variants:

    Then pretranslate your files to get back to where you were.

So not a huge problem with a few ways to resolve it.  But I think worth noting how this happens and ensure you pay attention to the number of target languages to avoid something like this happening in the future.

Merging Translation Memories

Around five years ago I created a video showing how to merge translation memories which works really nicely and shows the power of SDL Trados Studio for things like like this.  But what it doesn’t handle for merging is variants.  If I have three en(US) -> de(DE) translation memories and one en(UK) -> de(AT) translation memory then the best this feature can offer when I try to merge based on language pair is this:

Two translation memories instead of four because it can’t ignore the variants in the same way an import can.  So if you find yourself in this situation and wish to create a single English to German TM that you intend to use for all variants because you don’t worry about the differences, or you have some other way of identifying differences using Fields and Attributes for example, then the process would be this:

  1. Merge all your translation memories that are the same variants as shown in the video referred to above to get one translation memory
  2. Export the variant translation memories (Im starting to feel as though I’m being racist!) to TMX
  3. Import the TMX files into your one SDLTM and make sure you don’t check the option to exclude variants as we did for the SDLXLIFF files earlier on

So this is also fairly straightforward to solve and it probably easier than editing the language codes in the TMX file.  If you do want to use field values to identify the variants and work with one TM then you need to export all of your TMs to TMX (after merging to reduce the number if you have a lot of them) and then create a new SDLTM containing the appropriate fields like this for example:

Then import your TMX files and choose the variant that is appropriate like this for example:

The same approach would then apply to your future translations… you would select the appropriate field values for your projects and stamp each translation unit accordingly.  If you don’t remember to do this religiously then the whole idea of using fields and attributes in this way becomes a nonsense… so think carefully before you decide to work this way:

My personal opinion is that this explanation of fields and attributes is more of a “what’s possible” than “what’s sensible” and if you wish to retain the ability to uniquely identify differences between variants then you should use multiple translation memories.  This is really simple and explained in this article.  What’s also interesting is that when I wrote that one there were apparently 16 variants of English, and here we are, over 5 years later and we have 94!  How did that happen!  More importantly, if you actually receive work with these variants and you use field values just imagine how complicated it could get!

And the last thing of course (already explained in the articles I’ve referenced) if you do decide to use one translation memory for all your work is that you’ll need to use AnyTM to add your translation memory in your options and your project templates.  This will ensure your translation memory will always be used irrespective of the variants you use.

Spellchecking

The last thing I wanted to cover in relation to these pesky variants is spell checking. When you complete your translation and look for the spell check you are going to see one of these things and it doesn’t matter whether you are using Microsoft Word as your default spellchecker or Hunspell… you can still have this problem, although there is potentially better coverage with Hunspell:

It’s pretty annoying when you get the “not supported” message, but let’s look at why this is and I’ll use English as the target language for this illustration.  94 variants and Microsoft Word covers two of them, en(US and en(UK).  Now, having said that the Microsoft Word spell checker will kick in with more varieties… in fact these ones; Australia, Belize, Canada, Ireland, India, Jamaica, Malaysia, New Zealand, Philippines, Singapore, South Africa, United Kingdom and United States.  Why is this?  I have no idea but perhaps there’s a quick win in there somewhere once we find out!

For now we know you can spellcheck 13 English variants using Microsoft Word while Hunspell doesn’t fare so well out of the box as it only supports 7 variants… Caribbean, Australia, Canada, New Zealand, South Africa, United Kingdom and United States.  But Hunspell does have a significant advantage over Microsoft Word and that is… it’s a doddle to add new ones!  Here’s how it’s done.

  1. Navigate to the Studio program folder and you’ll find the HunspellDictionaries folder.  In Studio 2017 it’ll be in here:
    c:\Program Files (x86)\SDL\SDL Trados Studio\Studio5\HunspellDictionaries\
    In Studio 2015 it’s in Studio 4 and so on.
  2. In this folder you’ll find a pair of files for each language that is supported, an .AFF file and a .DIC file.
    For example en_GB.aff and en_GB.dic are the files relevant to English (United Kingdom), en_AU.aff and en_AU.dic are the files relevant to English (Australia).  The file names use the Windows Language Code Identifier (LCID) reference, or culture identifiers.
  3. You will also find a config file, there’s only one of these, called spellcheckmanager_config.xml
  4. To add a new language variant you can just make a copy of the .AFF and .DIC files for the variant you consider most similar and name them with the appropriate LCID reference, and then add a couple of lines into the spellcheckmanager_config.xml so Studio can find them.
    1. Quick tip: if you’re not sure how to get the appropriate LCID then just open any file as a single file translation and select the language variant you want.  Studio will name the file using the language codes.  Sometimes you could guess them, so en-GI is English Gibralter for example and you would name the files en_GI.aff and en_GI.dic, but I doubt you’d guess English Europe which is en-150, or English World which is en-001.
  5. Once you’ve got your .AFF and .DIC files you just make some space in the spellcheckmanager_config.xml using a decent text editor and add a few lines like this for example and save the file:
    <language>
    <isoCode>en-GI</isoCode>
    <dict>en_GI</dict>
    </language>

    1. Quick tip: pay attention to the differences in the way the codes are written in each line.  The isoCode element uses a hyphen to separate the language and variant, the dict element uses an underscore.
  6. You may find you need to have admin rights to make these changes to files in the Studio program folders.  I find it’s easier to copy them into a folder where I don’t need these rights, make the changes and save them, and then copy them back into the correct folder with admin rights afterwards.

There is a knowledgebase article here which explains how to do this, but since I’ve seen many people still struggling here’s a short video demonstrating how it’s done using English Europe, or en-150.

Duration: 9 minutes 13 seconds

 

There’s always been the occasional question appearing on the forums about data protection, particularly in relation to the use of machine translation, but as of the 25th May 2018 this topic has a more serious implication for anyone dealing with data in Europe.  I’ve no intention of making this post about the GDPR regulations which come into force in May 2016 and now apply, you’ll have plenty of informed resources for this and probably plenty of opinion in less informed places too, but just in case you don’t know where to find reliable information on this here’s a few places to get you started:

With the exception of working under specific requirements from your client, Europe has (as far as I’m aware) set out the only legal requirements for dealing with personal data.  They are comprehensive however and deciphering what this means for you as a translator, project manager or client in the translation supply chain is going to lead to many discussions around what you do, and don’t have to do, in order to ensure compliance.  I do have faith in an excellent publication from SDL on this subject since I’m aware of the work that gone into it, so you can do worse than to look at this for a good understanding of what the new regulations mean for you.

Read More

I’m pretty sure that when we started to build the new Customer Experience Team in Cluj last year that there was nothing in the job description about being competitive… but wow, they are!!!  I’d be lying if I said I wasn’t competitive, because I know I am, but it’s been a long time since I’ve had these kinds of feelings that keep me up at night.

To some extent I think the training requirements at SDL are the perfect fuel for this type of environment and I haven’t made up my mind yet whether it’s healthy or not.  But in their roles the team speak with customers through the online chat, in the community, via email… basically anywhere anyone comes in with a question because they don’t have a support contract or an account manager to ask and they didn’t know about the SDL Community which is of course the best place to go for help.  To be able to answer the variety of technical questions we see, all the team have either completed or are working through the various SDL Certifications available at a rate of knots and are learning more about the sort of problems faced by translators and project managers just by having to help people every day.  They are doing a fantastic job!

Read More

In the years that the SDL AppStore has been around I get asked one question on a fairly regular basis… “How can I find out about new apps or updates to existing apps?”.  A very reasonable question of course and one that has not been addressed particularly well, albeit there have been ways to keep yourself informed.  The ultimate solution we all want to see is the AppStore embedded into SDL Trados Studio, but as that isn’t going to happen for a while here’s a couple of ways you can still keep yourself aware of the updates.  The first is via twitter and this has been around for a while; the second is using an RSS feed which is brand new as of today!

Read More

Using segmentation rules on your Translation Memory is something most users struggle with from time to time; but not just the creation of the rules which are often just a question of a few regular expressions and well covered in posts like this from Nora Diaz and others.  Rather how to ensure they apply when you want them, particularly when using the alignment module or retrofit in SDL Trados Studio where custom segmentation rules are being used.  Now I’m not going to take the credit for this article as I would not have even considered writing it if Evzen Polenka had not pointed out how Studio could be used to handle the segmentation of the target language text… something I wasn’t aware was even possible until yesterday.  So all credit to Evzen here for seeing the practical use of this feature and sharing his knowledge.  This is exactly what I love about the community, everyone can learn something and in practical terms many of SDLs customers certainly know how to use the software better than some of us in SDL do!

Read More

It’s true… I’m a die hard desktop user.  I love the benefits I get from my mobile phone, using dropbox, the benefits of machine translation, Netflix and all the cool things that come with being able to use online features in the cloud.  But I’ve still been reticent to wholeheartedly embrace online technology and talk about it in this blog.  When I ask myself why that is, the first thing that crosses my mind is the unreliability of online connectivity.  Some people have a view of me as being a calm and patient person, and I do try hard to be that person, but when it comes to a lack of connectivity I turn into Mister Angry and Frustrated very quickly!  So the very idea of working with solutions that only offer an online capability for everything leaves me cold.  It’s one thing being unable to watch a film, share files, pick up my email or use my phone, but not being able to work at all is another thing altogether.  If I was working as an independant translator with all the benefits that can bring of being able to work anywhere, then having a good offline capability would be essential.  Studio of course offers me the offline capability, but today (and in a few more articles as there’s a lot to cover) I want to talk about the cloud and in particular SDL GroupShare.  Many of you may wonder if this has any relevance for you, but hopefully you’ll see it does because the solutions SDL offer in this space give you the flexibility you need when working with the cloud and even as a freelance translator you may get asked to work in that environment.  I’m going to tackle a few scenarios to explain, starting with creating projects. Read More

Using stylesheets to enhance the translators experience when working with XML files can be very helpful and sometimes essential.  It allows you to pull details from the XML and display them in a preview pane so that the translator has more context around the translatable text.  It can also provide a mechanism for displaying text that you don’t want extracted from the XML for translation at all.  This is nothing new of course and localisation engineers and experienced translators have been doing this for years.  In fact I’ve even written about this in the past providing a simple example of how it’s done and some reading resources for anyone who would like to learn how.  So why am I bringing this up again?

Read More

%d bloggers like this: