Every now and then I see an application and I think… this one is going to be a game changer for Studio users.  There have been a few, but the top two for me have been the “SDLXLIFF to Legacy Converter” which really helped users working with mixed workflows between the old Trados tools and the new Studio 2009, and the “Glossary Converter” which has totally changed the way translators view working with terminology and in my opinion has also been responsible for some of the improvements we see in the Studio/MultiTerm products today.  There are many more, and AnyTM is a contender, but if I were to only pick my top three where I instantly thought WOW!, then the first two would feature.  So what about the third?  You could say I have the benefit of hindsight with the first two although I’m not joking about my reaction when I first saw them, but the third is brand new and I’m already predicting success!

Not so long ago we saw a plugin called the SDL Trados Studio Word Cloud which I believe the developer created just because he could… and it looked cool.  Fair enough, I liked it and that’s a good enough reason for me.  But then I wondered how much better this might be if we could just string a few things together to solve a problem… and so I wrote this article “It’s not all head in the clouds“.  The article turned out to be quite popular, not least because it opened up the possibility to extract term candidates from a Studio project which is something you still can’t do today without a bit of a workaround using the SDLXLIFF to Legacy Converter to get a TTX or a BilingualDOC from your files and then use these in SDL MultiTerm Extract.  But if the process I went through in the article to achieve this was made simple with a single plugin it would of course be even better.  And then we got lucky!!

We got lucky because we took on a couple of interns in our app development team, Laura Paraschivescu and Adrian Maniu, and tasked them with creating this very solution.  Between them they have created an application to complete my top three… an application I love playing around with because it has really put the fun into terminology extraction.  If you know me, I bet that’s not something you’d expect me to say!

projectTermExtract

The application they have developed is called projectTermExtract and you can download it from the appstore for use with Studio 2015 or 2017, the currently supported versions of Studio.  In a nutshell the way it works is this:

  1. Select your Studio Project or file(s) in your Project
  2. Extract the term candidates as a visual display
  3. Narrow down the potential candidates
  4. Add a file containing these candidates to your Studio Project for translation
  5. Convert the translated file to a Termbase and add it to your Project

I’ll run through the process in a video at the end, but I thought a quick visual would be interesting just to show how simple this really is.  All I did was right-click on the files in my project, select “Extract Project Terms” to extract the terms and then cut down the candidates with a few rules:

The rules are straightforward and make sense without getting too complex.  Perhaps over time and depending on the reaction from users when they play with this the extraction algorithm could be enhanced, but for the time being it’s very simple and just extracts single words from your Studio Project.  You then have the ability to do the following:

  1. Create a blacklist of words you do not wish to be added to a termbase
    • regular expressions are supported so you can filter out product codes, numbers, dates etc. in one go
    • the list can be saved for the active project and pulled up again easily should you wish to stop and come back to the task later
    • words can be added or deleted from the list
  2. Set the number of times a word is allowed to occur in the Project before it gets extracted
  3. Set the minimum length of a word before it gets extracted
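
For anyone curious how these three rules fit together, here is a minimal sketch in Python.  It is not the plugin's actual code: the function name `extract_candidates`, its parameters, and the choice to lower-case words are all assumptions made for illustration, and I have read the occurrence setting as a minimum count.

```python
import re
from collections import Counter


def extract_candidates(segments, blacklist_patterns=(), min_occurrences=1, min_length=1):
    """Return {word: count} for single words that pass all three rules.

    Hypothetical helper for illustration; not the plugin's real API.
    """
    # Each blacklist entry is treated as a regular expression that must
    # match the whole word (assumption), ignoring case.
    compiled = [re.compile(p, re.IGNORECASE) for p in blacklist_patterns]

    # \w+ picks out runs of letters, digits and underscores; lower-casing
    # merges "Term" and "term" into one candidate (assumption).
    words = [w.lower() for seg in segments for w in re.findall(r"\w+", seg)]
    counts = Counter(words)

    return {
        word: n
        for word, n in counts.items()
        if n >= min_occurrences
        and len(word) >= min_length
        and not any(p.fullmatch(word) for p in compiled)
    }


segments = [
    "The termbase stores terms.",
    "A termbase helps the translator.",
    "Order AB123 and 2017 termbase",
]

# Blacklist a stop word, pure numbers, and a made-up product code pattern.
blacklist = [r"the", r"\d+", r"[a-z]{2}\d+"]

strict = extract_candidates(segments, blacklist, min_occurrences=2, min_length=4)
loose = extract_candidates(segments, blacklist, min_occurrences=1, min_length=4)
print(strict)
print(sorted(loose))
```

Tightening or loosening the two numbers and re-running is the same loop you go through when you regenerate the wordcloud.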

All the time you play with these simple options you can preview the effect by regenerating the wordcloud in an instant.  I find this part of the application to be strangely addictive just watching the difference your changes make to the potential candidates!

Once you are satisfied with your setup you click on the “Include terms file to the project” button and this adds a file to your project and prepares it for translation.  You won’t see it immediately, but if you press F5 this will refresh the view and the file will be there.  Now you can translate it and once you are finished just right-click on the file in Studio and select “Generate Termbase”.  The termbase is added to your project straight away and you’ll start to see results that help your productivity and consistency:

  • term recognition you can apply via autosuggest
  • fuzzy match repair where you have similar sentences with differences in terminology

If there were extracted terms in the final list which you don’t want in your termbase then just don’t translate them.  Any target segments that are left empty will not be added to your termbase.  So if you have a lot of words in your Project and it wasn’t possible to see them clearly in the word cloud you still have a way to exclude them as you work.  The word cloud just delivers a very nice and visual way to clean out the obvious stuff before you get to work on the detail.  It’s also worth noting that whether the translations are confirmed or not they will be added to the termbase.  They just have to be present in the target and they will be added.
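
The rule itself is simple enough to sketch in a few lines of Python.  This only illustrates the behaviour described above, it is not how the plugin is implemented: the `(source, target, confirmed)` tuple shape and both function names are hypothetical, and the tab-delimited output is just a convenient way to show the result.

```python
import csv
import io


def glossary_rows(segments):
    """Keep every segment that has a non-empty target.

    segments: iterable of (source, target, confirmed) tuples (hypothetical shape).
    """
    rows = []
    for source, target, _confirmed in segments:
        # Confirmation status is deliberately ignored; only an empty
        # (or whitespace-only) target keeps a term out of the list.
        if target and target.strip():
            rows.append((source, target.strip()))
    return rows


def to_tab_delimited(rows):
    """Render the rows as tab-delimited text, one term pair per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


translated = [
    ("termbase", "Termbank", True),
    ("project", "", True),          # left empty, so it is skipped
    ("segment", "Segment", False),  # unconfirmed, but still included
    ("cloud", "   ", False),        # whitespace only, so it is skipped
]

rows = glossary_rows(translated)
print(to_tab_delimited(rows))
```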

What we need now is a nice way to deliver a termbase like this to your customer if they don’t use any tools at all.  We do have the RTF export from MultiTerm… but this doesn’t always deliver what you expect and you need to be an expert with Rich Text syntax to be able to change things to meet your needs.  I’ve had this on my “reluctant to learn todo list” for years!  Perhaps we’ll see a nice little plugin for this in the future and support an easily customised export to DOCX… I certainly think it would be helpful!  In the meantime your best options (if the RTF export doesn’t work for you) are probably these:

  • share the SDLTB (if they use Studio)
  • export to another format like Excel or Tab Delimited text and edit to suit (Glossary Converter will make short work of this)

Some things to watch out for

The projectTermExtract plugin works brilliantly and I really love it, but there are a couple of things to look out for.  The first is languages that the app can’t tokenize, such as Japanese and Chinese.  The algorithm that creates the wordcloud is very simple at this stage so if there are no spaces between words it can’t determine them.  Trados Studio has only been able to achieve this since Studio 2017 using its upLift technology so the app would need to be able to leverage that.  Perhaps something for a future version.
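
A quick way to see the tokenization problem for yourself is to split on spaces, which is roughly what a simple word-based approach relies on.  This is a sketch in Python and an assumption about the general technique, not the plugin's code; the sample sentences are my own.

```python
def whitespace_tokens(text):
    """Split text into tokens wherever whitespace occurs."""
    return text.split()


english = "Add the termbase to your project"
japanese = "プロジェクトに用語ベースを追加します"

english_tokens = whitespace_tokens(english)
japanese_tokens = whitespace_tokens(japanese)

# English yields one token per word; the Japanese sentence has no spaces,
# so the whole sentence comes back as a single "word".
print(len(english_tokens), english_tokens)
print(len(japanese_tokens), japanese_tokens)
```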

Also, if you decide, after generating your termbase, that you’d like to create another in the same Project using the plugin then this doesn’t work easily.  There are some restrictions in the capability of the API and despite everything the developers tried they could not manage this via the APIs.  So if you do this you’ll get this message:

The instructions are quite self-explanatory and this works perfectly.  Hopefully this won’t be something you need to do very often and creating one termbase per Project using the plugin will be enough, so you may never need to do this.  Perhaps in a future release the team will be able to overcome this.  For now, I hope you like this as much as I do… so much so that when I think of my top three apps the winner is this one!  It’s been a long standing omission from the Studio solution and I think even with its currently simplistic extraction method it’s going to make a lot of users very happy… nice work Laura and Adrian!

The process

I have created a short video for anyone who would like to see how this all works in practice… hopefully you’ll find this helpful.  If you do then download it today!!

Approx. running time: 14 minutes 31 seconds

I’m back on the topic of PDF support!  I have written about this a few times in the past with “I thought Studio could handle a PDF?” and “Handling PDFs… is there a best way?“, and this could give people the impression I’m a fan of translating PDF files.  But I’m not!  If I was asked to handle PDF files for translation I’d do everything I could to get hold of the original source file that was used to create the PDF because this is always going to be a better solution.  But the reality of life for many translators is that getting the original source file is not always an option.  I was fortunate enough to be able to attend the FIT Conference in Brisbane a few weeks ago and I was surprised at how many freelance translators and agencies I met dealt with large volumes of PDF files from all over the world, often coming from hospitals where the content was a mixture of typed and handwritten material, and almost always on a 24-hr turnaround.  The process of dealing with these files is really tricky and normally involves using Optical Character Recognition (OCR) software such as Abbyy Finereader to get the content into Microsoft Word and then a tidy up exercise in Word.  All of this takes so long it’s sometimes easier to just recreate the files in Word and translate them as you go!  Translate in Word…sacrilege to my ears!  But this is reality and looking at some of the examples of files I was given there are times when I think I’d even recommend working that way!


According to wikipedia there are some 9.6 to 12 million people speaking Haitian Creole worldwide.  I had no idea it was such a widely spoken language until I was asked a question this week about why the Google Translate machine translation provider in Studio returned French translations when the project was en(US) – fr(HT) (French-Haiti).

In fact I had no idea that French-Haiti was most likely intended to be the language that should be used in Studio for Haitian Creole as this isn’t a language I come across very often.

But before I can ask a developer to fix this problem I have to be able to understand it myself, so the first thing I wanted to know was whether French-Haiti was the same as Haitian Creole or not.  And for anyone interested, as I was, to read more on this I found these three interesting links below explaining how the language came about, and it does have a very interesting history:

I’ve always had a secret desire to be able to program computers… the problem is it’s not something you can do just like that!  I can recall starting off with a Commodore PET 2001 some time in the late 70’s and I can remember how enjoyable it was to be able to create simple scripts that could react to whatever you pressed on the keyboard.  I should have realised back then it would have been smart to focus on technology, but instead I took a bit of a detour in my career and computers didn’t feature at all until around 1987 when I was introduced to the HP41c from Hewlett Packard.  This had very basic programming functions, a magnetic card reader and a thermal printer and I loved it!  In fact I loved the way HP calculators worked so much I had an 11c for years until I dropped it trying to align a laser while being dangled headfirst into a catchpit on a construction site!  And we think the Studio alignment process is tricky 😉


SDL Trados Studio is up to Studio 2017 which is the fifth major version since Studio 2009 was first released some eight years ago now.  During these eight years I think it’s fair to say we have seen less and less requirement for the old Trados features, yet despite that we do see some interesting tools making an appearance in the SDL AppStore that mirror some of the old functionality.  In fact some of these apps are quite recent and seem to have been driven by requests from users who miss some of the things you could do in Trados but still cannot do in the out of the box Studio solution.  So I thought it might be fun to take a look at some of these apps and if you are one of those translators who remembers all the good things Trados could do… and can I say forgotten the things it could not… then perhaps you’ll find these apps useful!


A nice picture of a cutie cat… although I’m really looking for a cutie linguist and didn’t think it would be appropriate to share my vision for that!  More seriously the truth isn’t as risqué… I’m really after Qt Linguist.  Now maybe you come across this more often than I do so the solutions for dealing with files from the Qt product, often shared as *.TS files, may simply roll off your tongue.  I think the first time I saw them I just looked at the format with a text editor, saw they looked pretty simple and created a custom filetype to deal with them in Studio 2009.  Since that date I’ve only been asked a handful of times so I don’t think about this a lot… in fact the cutie cat would get more attention!  But in the last few weeks I’ve been asked four times by different people and I’ve seen a question on proZ so I thought it may be worth looking a little deeper.


There have been a few ups and downs getting SDL Analyse off the ground, but it’s finally there and it’s worth it!  If you have no idea what I’m referring to then perhaps review this article first for a little history.  This app was actually released as the 200th app on the SDL AppStore in February this year, but in addition to the applause it received for its functionality there have been less positive aspects for some users that needed to be addressed.

But first, what does it do?  Quite simply it allows you to get an analysis of your files without even having to start Studio, or without having to create a Project in Studio.  If you’re a regular reader of this blog you may recall I wrote an article in 2014, and in 2011 before that, on how to do an analysis in Studio by using a dummy project.  In all that time there has been only one app on the appstore that supports the analysis of files without having to use Studio and this is goAnalyze from Kaleidoscope.  In fact goAnalyze can do a lot more than SDL Analyse but there is one significant difference between these apps that makes this one pretty interesting… you don’t require the Professional version of Studio to use it.  But it’s also this difference that has been the cause of the ups and downs for some users since SDL Analyse was released.  In order to resolve the problem of needing to use the Project Automation API, which needs the Professional version of Studio, the app needed to use a windows service that was hooked into Studio.  For the technically minded we had a few things to resolve:

