Creating a TM from a Termbase, or Glossary, in SDL Trados Studio

Update: 21 Dec 2015

This article is pretty old now… still interesting, but pretty old.  If you are looking for help on how to do this, take a look at the Glossary Converter from the SDL OpenExchange, which can convert a termbase to TMX with a drag and drop!  There are a few recent articles on this tool now, like these:

Glossaries made easy…

Great news for terminology exchange

Studio 2015 also has a Bilingual Excel filetype as an alternative to the CSV option.

In the last week or two the question of how to create a Translation Memory from a glossary, or from a termbase exported to Excel, has come up a few times.  There have also been some interesting and clever responses… but notably not the easiest one.

Studio has a CSV filetype that provides some very interesting options, like this:

CSV isn’t great for retaining clever formatting, but I think I’d be safe in saying that most glossaries are not formatted anyway, so this presents us with some interesting possibilities.  The simplest and the one referred to at the start of this post is converting your glossary that’s in Microsoft Excel to a Translation Memory.

To do this you only need a few simple steps.

Step 1
Save your Excel file as a CSV (Comma delimited) file:

Step 2
Set up your CSV filetype in Studio as follows:

  1. Go to Tools – Options – Filetypes – Comma Delimited Text (CSV)
  2. In my file I have the source language in the first column of my spreadsheet and the target in the second column.  So I make sure I have the columns set up like this:
  3. I also make sure the delimiter is set to be comma:

    This is because if I look in my CSV file the two columns are separated by commas, like this:
  4. I then check the option to automatically confirm the translations so I don’t have to do this (assuming I know the translations are good)
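
To make the column and delimiter settings above a little more concrete, here is a minimal Python sketch of how a two-column, comma-delimited file maps onto source and target. It is only an illustration of the file layout, not how Studio reads the file, and the sample terms and the `read_pairs` helper are made up for this example.

```python
import csv
import io

# Hypothetical glossary content, as it would appear in the saved CSV file:
# source term in the first column, target term in the second.
sample = "house,casa\ncar,carro\n"

def read_pairs(text, delimiter=","):
    """Return (source, target) tuples from two-column delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Skip rows that do not have both a source and a target column.
    return [(row[0], row[1]) for row in reader if len(row) >= 2]

pairs = read_pairs(sample)
for source, target in pairs:
    print(f"{source} -> {target}")
```

If your file uses a different separator, such as a semicolon or a tab, the same idea applies with `delimiter=";"` or `delimiter="\t"`.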

Step 3
Open the CSV filetype in Studio and add the TM you wish to update, or create a new one as you go.  The view of the file in the Editor looks like this:

So you can see both columns from the spreadsheet in the Editor as source on the left and target on the right, all confirmed and ready to update into your Translation Memory.  Note in the translation results window there are currently no results showing.

Step 4
Run a batch task to update your TM.  An easy way to do this from here is to use File – Batch Tasks – Update Main Translation Memory (not forgetting to save the project first – see the earlier blog post on using Open Document).

And Bob’s your Uncle… all done.  I now get results from my translation memory window:

My Translation Memory now contains all these Translation Units:

And if I wish to export this as a TMX I can right click on the TM in this view and select export:

If I was only interested in the TMX, then there are also a couple of applications on the SDL OpenExchange that can create a TMX directly from an SDLXLIFF.  So you would only need to open the file in Studio, save it, and then convert the SDLXLIFF… also very simple, and one of them can save in either direction, so English to Portuguese or Portuguese to English (in my example):

SDLXliff2Tmx by Costas Nadalis, TMServe

SDLXLIFF to Legacy Converter by Patrick Hartnett, Logos s.p.a.
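
If you are curious what ends up inside a TMX file, the following Python sketch builds a very small one from source/target pairs. It is not what Studio or either of the applications above does internally. The element and attribute names follow the TMX 1.4 layout as I understand it, while the `pairs_to_tmx` helper, the `creationtool` value and the `en-US`/`pt-PT` language codes are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name out as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def pairs_to_tmx(pairs, src_lang="en-US", tgt_lang="pt-PT"):
    """Build a minimal TMX document (as a string) from (source, target) pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "glossary-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "csv",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # One translation unit per glossary row, one variant per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")

tmx_text = pairs_to_tmx([("house", "casa"), ("car", "carro")])
print(tmx_text)
```

Swapping the two language codes and the order of each pair gives the reverse direction, which is the same idea as the "either direction" option mentioned above.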

Once you get familiar with this filetype you’ll see there are other interesting applications for it, for example:

  • Use Studio to QA your terminology
  • Use Studio to create your terminology by translating glossaries and having the target text placed into the second column automatically
  • Use Studio to add another translation to an existing glossary by placing the translation of either the first or second column into the third column, etc.
  • Use Studio to translate Excel files where the source is in one column and the target is partially complete in another.  Here you can make use of the locked segment ability to confirm and lock the segments already completed so you don’t waste time working on them unnecessarily
  • Take comments with instructional material that are in a separate column and convert to structural context to help with the file… for example this:

    Can look like this in Studio with this CSV filetype:

    Where we see the existing translation is locked and confirmed already and the comments are now reflected in the right hand column as COM.  You can also click on these to see this:

So just a few interesting uses of this filetype… no doubt you can think of many more.
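
The partially translated spreadsheet case from the list above can be pictured with a short Python sketch. It only mimics the idea (rows that already have a target are treated as done, and the comment column travels along as context) and says nothing about how the Studio filetype is implemented. The sample rows and the `classify` helper are made up for this example.

```python
import csv
import io

# Hypothetical three-column sheet: source, partial target, comment.
sample = (
    "Open file,Abrir ficheiro,Menu item\n"
    "Save as,,Menu item\n"
    "Cancel,Cancelar,Button label\n"
)

def classify(text):
    """Split rows into already-translated ones and ones still needing work."""
    done, todo = [], []
    for source, target, comment in csv.reader(io.StringIO(text)):
        entry = {"source": source, "target": target, "context": comment}
        # A non-empty target means the row could be confirmed and locked.
        (done if target.strip() else todo).append(entry)
    return done, todo

done, todo = classify(sample)
print(len(done), "rows could be confirmed and locked")
print(len(todo), "rows still need translating")
```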

26 comments
  1. Laura Hernandez said:

    Simple, brilliant and free. Thank you very much, Laura

  2. renata v koerber said:

    Really super!!

  3. Christian said:

    Very helpful. Thank you.

    Gets a little more difficult if there are commas in the source and/or target text. There may be an easier way, but I opened the Excel in OpenOffice, Saved As a CSV file but selected semicolon as the delimiter. You then need to make semicolon (Other) the delimiter in the Studio File Types options.

    • Hi Christian, I could be wrong but I think Excel places quotes around the text if there are commas in the text. You can then use the option in the CSV filter “Text is enclosed in double quotes” and all should be well.
      But your workaround works too…

      • Christian said:

        That is easier, thanks.

        I’ve been doing a lot of this lately and have trouble with certain languages’ (mainly Asian) characters being corrupted by the conversion to CSV.

        To get around this, I have been saving the Excel files as “Unicode (.txt).” The file is then opened as “Tab Delimited Text” in Studio. This system seems to be working well.

        Can you think of any pitfalls to doing it this way?

        Thanks!

  4. Miguel said:

    This is brilliant! Thanks! The only thing is that, similar to Christian’s comment, for languages that use accents, working from CSV results in garbled text. To solve this, I pasted the content of my Excel glossary into a text file and applied the settings given in this blog to “Tab Delimited Text” in Studio. The only pitfall I can think of is that if your encoding is not right, the text in the resulting txt.xliff file might be garbled. To avoid this, I saved the .txt file in Unicode (UTF-16). Of course the best encoding to use could vary depending on the languages involved.

    • Another way that works for me, and I was reminded to add this here today, is to use a neat Excel Addin that was developed by a chap called Jaimon Mathew: http://goo.gl/H9bmvX
      This just adds a neat addition to your ribbon in Excel and allows you to open/save Excel files as CSV UTF-8 which solves this issue most of the time for me… in fact every time for me so far.

  5. Andrzej said:

    Laura, you’re great! You saved me a lot of time! Thanks.

    BTW, isn’t it a shame that Trados is mentally incapable of producing such clear and simple guidelines? And – shouldn’t we all users of Trados revolt against them and bump them with demands (by mail or otherwise) that they provide reasonably edited help files instead of their quite general and almost useless help products? They demand huge money for their products as if the products were fully usable and functional, but Trados does not seem to be aware that they owe us huge money for time lost because of their errors, deficits, nonsense concepts etc.

    • Hmnnn… I wonder if you’re confused, or have I missed something Laura said? I may be in a little trouble if I revolt against SDL… but I’m supportive of an uprising against Trados!

  6. Kristina said:

    Finding this very useful. Just wondering is it possible to link online multilingual dictionaries such as http://www.termdat.ch to Studio as a termbase?

    • Hi Kristina, currently you’d have to convert this to MultiTerm if you wanted it to work exactly as MultiTerm does in Studio. But you could create a lookup via the OpenExchange. This is something many people have created plugins for, but it does require a developer. Very soon we will have extended the API capability so that a developer could create a terminology provider for anything, and then that could be used instead of, or in addition to, MultiTerm. So the latter solution will be the best one… I also think there will be many such things too since this is something many people have been asking for. So in the same way you see lots of Machine Translation Providers as plugins, you will see lots of Terminology Providers as plugins.
      Do you have access to a developer?

  7. Robert said:

    Perfect solution. Thank you! I did that with a tab delimited txt file and it works fine. But trying the same with a 3 column txt file (1 source, 2 target, 3 comments) does not show me the comments… I have activated “Extract comments as structure information”. Any idea what I am doing wrong?

    • Did you click on the document structure column on the right of the target segment? This is where the comment should be. You can’t see it as you translate, which is a drawback of this filetype. One interesting option might be to create an XML file from the Excel file… fairly straightforward for a file like this… and then you can have a custom stylesheet that allows you to preview the comments and see the translation in the realtime preview window.

  8. Robert said:

    Thanks for your quick response. Yes, I click on the document structure column and see a kind of “general” comment, like “a paragraph of text”, but not what is mentioned in the txt file. The preview works already with the txt file. No need to convert to XML. Thank you!

    • Thank you… until right now I have never even looked at the preview in a tab delimited text filetype! Good to know… every day you learn something new 🙂

  9. Vojta said:

    Hi, I’m not sure if my question is off-topic, but what if I don’t want to create a TM from a termbase but want to include the termbase in the file analysis (I don’t want to count words which are translated in the termbase as ‘No match’ or ‘Fuzzy match’)? I’m new to Trados and I’m struggling a bit with SDL Support. Thanks very much.

    • Hi, probably a little off topic but interesting nonetheless!
      This is actually not possible because even though the wordcount is possible the analysis is done at segment level from your Translation Memory. If we were to start looking up a termbase too then we have a number of problems to contend with… at least these three and probably many more:

      1. An 80% fuzzy, for example, would have to be reconstructed using the term, and reported as a 90% fuzzy perhaps. Not simple.
      2. If you had synonyms in your termbase how would the analysis know which one was correct… if any?
      3. If the tense for a particular word changed in translation how would the analysis cater for this?

      Interesting… and fraught with complexity I think. It’s more the kind of thing you don’t cater for before you start, but then you can use the resources you have to make it possible to handle the work more productively afterwards. I can just imagine the feedback from a translator who was offered less for their proposed translation as a result of them having matching words in their termbase! Personally, I don’t think this is a very good idea.

      • Vojta said:

        Thanks Paul, I work for an LSP and my bosses asked me whether it was possible or even a good idea. I’ll explain it to them thoroughly.

  10. Halim R said:

    What if my CSV file contains many languages, i.e. many delimiters are included? Any idea how to create a TM from such a CSV file?

    • If you use the Glossary Converter this is very straightforward. You just drag and drop the CSV into the Glossary Converter using the switch to convert to TMX. This will create a multilingual TMX. You can then upgrade the TMX in Studio and this will create as many bilingual SDLTMs as you have language pairs in your file.

      • Halim R said:

        Great. Good to know there’s Glossary Converter. Thanks sir.

  11. Christian said:

    I am trying to use this process to update a TM with content that was later edited outside of Studio. I am working with a two-column (source-target) Excel file. The problem is that the .sdlxliff is coming out segmented by the Excel cell, not by sentence. Since the original content was segmented by sentence, this new .sdlxliff file is not going to help me overwrite the existing TM segments (for the cells/segments with multiple sentences).

    Can anyone think of a solution?

  12. Maggie Zou said:

    Hi, Paul,

    I couldn’t find the “minimum number of columns” option in that dialog box. I unloaded the Trados 2015 upgrade patch some time before. Is this why? Does this matter?

    Maggie

    • Hi Maggie, this article is almost 4 years old now and the current version of the CSV filetype doesn’t use that option. Don’t worry about it.
      Paul

      • Maggie Zou said:

        Thank you, Paul. I have successfully converted the CSV file into the TM the way I wanted.
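
Several of the comments above deal with commas inside the terms themselves and with accented or Asian characters getting garbled on the way to CSV. The following Python sketch shows both points in one go: fields that contain a comma are wrapped in double quotes, and the file is written as UTF-8 with a byte order mark. It is only an illustration under stated assumptions. The sample rows and file name are made up, and nothing here is claimed about how Excel or Studio behave internally.

```python
import csv
import os
import tempfile

# Hypothetical rows: one term pair contains commas, the others use
# accented and CJK characters.
rows = [
    ("Yes, please", "Sim, por favor"),
    ("translation", "tradução"),
    ("glossary", "用語集"),
]

path = os.path.join(tempfile.mkdtemp(), "glossary.csv")

# "utf-8-sig" writes a byte order mark, which helps many tools detect UTF-8.
# newline="" lets the csv module control the line endings itself.
with open(path, "w", encoding="utf-8-sig", newline="") as handle:
    csv.writer(handle).writerows(rows)

# Look at the raw text: only the fields containing a comma are quoted.
with open(path, encoding="utf-8-sig") as handle:
    raw_text = handle.read()
print(raw_text)

# Reading the file back with the same encoding returns the original rows.
with open(path, encoding="utf-8-sig", newline="") as handle:
    round_trip = [tuple(row) for row in csv.reader(handle)]
```

The tab-delimited and semicolon workarounds mentioned in the comments correspond to passing `delimiter="\t"` or `delimiter=";"` to `csv.writer` and `csv.reader`.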
