Ever since the release of Studio 2009 we have had the concept of Language Resource Templates, and ever since then I’d risk a bet that most users don’t know what they’re for or how to use them. To be fair this is hardly a surprise, since their use is quite limited out of the box and the goodies inside are pretty hard to get at. Users used to complain about them a long time ago, but for some years now I have rarely seen them mentioned. This article, I hope, might change that.
If there’s one thing I firmly believe it’s that all translators should learn a little bit of regex, or regular expressions. In fact it probably wouldn’t hurt anyone to know how to use them a little bit, simply because they are so useful for manipulating text, especially when it comes to working in and out of spreadsheets. When I started to think about this article today I was thinking about how to slice up text so that it’s better segmented for translation, and about what data to use. I settled on lists of data, as this sort of question comes up quite often in the community, and to create some sample files I used this Wikipedia page. It’s a good list, so I copied it as plain text straight into Excel, which got me a column of fruit formatted exactly as I would like to see it if I was translating it: one fruit per segment. But as I wanted to replicate the sort of lists we see translators getting from their customers, I copied the list into a text editor and used regex to replace the hard returns (\r\n) with a comma and a space, then broke the file up alphabetically… it took me around a minute to do. I’m pretty sure that kind of simple manipulation would be useful for many people in all walks of life. But I digress….
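The hard-return replacement described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the exact steps used in the text editor: the function name `join_lines` and the sample fruit names are made up for the example, and the pattern also accepts a bare `\n` in case the file uses Unix line endings.

```python
import re


def join_lines(text: str, separator: str = ", ") -> str:
    """Collapse a one-item-per-line list into a single separated line.

    Hypothetical helper for illustration; the name is not from any tool.
    """
    # strip() removes the trailing hard return so no separator is left dangling.
    # The pattern matches Windows (\r\n) as well as Unix (\n) line endings.
    return re.sub(r"\r?\n", separator, text.strip())


# Sample data standing in for the column of fruit copied out of Excel.
fruit_column = "Apple\r\nApricot\r\nAvocado\r\nBanana\r\n"
print(join_lines(fruit_column))
```

In a text editor with regex search and replace, the equivalent is searching for `\r\n` and replacing with a comma followed by a space.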
Is English (Europe) the new language on the other side of the Channel that we’ll all have to learn if Brexit actually happens… will Microsoft ever create a spellchecker for it now that they have added it to Windows 10? Why are there 94 different variants of English in Studio coming from the Microsoft operating system and only two Microsoft Word English spellcheckers? Why don’t we have English (Scouse), English (Geordie) or English (Brummie)… probably more distinct from each other than English (United States) and English (United Kingdom), which are the two variants Microsoft can spellcheck. These questions, and similar ones for other language variants, are all ones I can’t answer and this article isn’t going to address! But I am going to address a few of the problems that having so many variants can create for users of SDL Trados Studio.
Using segmentation rules on your Translation Memory is something most users struggle with from time to time. The struggle is not just with creating the rules, which is often just a question of a few regular expressions and is well covered in posts like this from Nora Diaz and others, but rather with how to ensure they apply when you want them, particularly when using the alignment module or retrofit in SDL Trados Studio where custom segmentation rules are being used. Now I’m not going to take the credit for this article, as I would not have even considered writing it if Evzen Polenka had not pointed out how Studio could be used to handle the segmentation of the target language text… something I wasn’t aware was even possible until yesterday. So all credit to Evzen here for seeing the practical use of this feature and sharing his knowledge. This is exactly what I love about the community: everyone can learn something, and in practical terms many of SDL’s customers certainly know how to use the software better than some of us in SDL do!
The handling of numbers and units in Studio is always something that raises questions, and over the years I’ve tackled it in various articles. But one thing I don’t believe I have specifically addressed, and I do see this rear its head from time to time, is how to handle the spaces between a number and its unit. So I thought it might be useful to tackle it in a simple article so I have a reference point when asked this question, and perhaps it’ll be useful for you at the same time.
I have a background in Civil Engineering so when I think about this topic I naturally fall back to “The International System of Units (SI)” which has a clear definition on this topic:
“More power to the elbow”… this is all about getting more from the resources you have already got, and in this case I’m talking about your Translation Memories. In particular I’m talking about enabling them for upLIFT. upLIFT, in case you have not heard about this yet despite all the marketing activity and forum discussions since August this year, is a technology that is being used in SDL Trados Studio 2017 to enable some pretty neat things. I’m not going to devote this article to what upLIFT is all about as Emma Goldsmith has written a really useful article today that does a far better job than I could have done. You can find Emma’s article here, called “SDL Trados studio 2017 : fragment recall and repair“. But a quick summary to get us started is that upLIFT enables things like this:
- fragment matching
  - whole Translation Units
  - partial Translation Units
- fuzzy match repair
  - from fragment matching
  - from your termbase
  - from Machine Translation
Back in July 2013 I wrote an article called “Fields and Attributes in Studio” which was all about adding different types of metadata to your Translation Units every time you confirmed a segment to make it easier, or more complex depending on what you’ve done, to manage your Translation Memories. If you’re not sure what I mean by this take a look at the article as I won’t repeat a lot of that here… at least I’ll try not to! This capability in Studio is probably quite familiar to most users of the old SDL Trados 2007 and earlier, and was even essential to some extent because you could only use a single Translation Memory at a time.