Ever since Trados came about, one of the features translators have requested most is the ability to merge across hard returns, or paragraph breaks.  For handling the translation it makes a lot of sense to be able to merge fragments of a sentence that clearly belong together, but despite this it’s never been possible.  Why is this?  You can be sure this question has come up every year, and whilst everyone agrees it would be great to have this capability, Trados has never supported it in the product.  The reason for the reluctance is that when you merge across paragraph units (the name given to translation units separated by a paragraph break) you probably need to be able to decide how this change to the structure of the file should be handled in the target document.  Sometimes this might be simple, other times it might not be, and the framework that Trados products use is not designed in a way that supports altering the look and feel of the target file across every filetype the product can open.  Even the Studio suite of products still uses the same basic idea of handling the bilingual files directly rather than importing them into a black box, and whilst this does offer many advantages, the problem of merging over paragraph units remains… until now.

The Concept

I wanted to write about the concept behind this so that it’s clear what is happening when you use this new feature in Studio 2017.  So think back to the days when you could not edit the source in Studio either.  One of the reasons many users wanted to edit the source was so that they could resolve poor segmentation.  There are other reasons of course, but I’m focusing on this one.  The image below explains how you used Edit Source to resolve this: you enabled the feature in the Project Settings first, then edited the segment (in this example #2), cut it to your clipboard and pasted it into #1.


So now you have dealt with the problem of merging with this workaround, but you are left with an empty segment in #2.  In Studio 2017 you no longer need to use Edit Source to achieve this.  You simply enable the feature to merge across paragraph units, which is disabled by default, and you can then merge these same segments to achieve this:


Now it looks as though I was able to do this with a simple operation and no longer have to deal with the empty segment… but this is not the case.  If you pay attention to the segment numbering you’ll see that #2 is not there anymore; at least it’s not visible anymore.  If I disable the option to Hide empty segments that have been merged in the Automation settings then you’ll see this:


So it’s basically an automation of the manual workaround, which does save a lot of time, and it has the additional benefit of locking the segment and setting the status to whatever you like automatically.  You can do this for any filetype too.  But do you want to?

The effect on target

This brings me back to where I started in describing the initial reluctance to support this in Trados products.  If you merge across paragraph units and then save your target file, what will that mean for your target document, and are you able to do anything about this before you send the file to your customer?  Let’s take a look at two examples, the first in Word and the second an XML file.  If I merge the segments so that I now have a complete sentence that will be simple to handle, what will that do to the target file?


You could probably have guessed it, particularly if you were used to implementing the Edit Source workaround in the past:


The empty segments will be sent through to the target file and now you need to clean it up.  This probably isn’t much of a problem, and if the reason for this poor segmentation was that the user needed to make something fit the space available in the document in the first place, then it’s fairly likely some clean up will be required in the target file afterwards anyway because of text expansion or contraction in the target language.  It’s also something that is fairly simple to achieve in Microsoft Word because everyone, as we know, is a Microsoft Word expert!

But what if it’s an XML file?  What happens then?  Taking exactly the same example, but where the paragraph units are created by separate elements in the XML file, you could find yourself with this:


Note that all the elements that were “emptied” are now empty elements in the XML file.  This might not be acceptable to your client at all and the effort involved in attempting to correct the XML file afterwards, as you would with the Word file, might not be worth it at all.
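If you want to check a target XML file for this before it goes back to the client, a few lines of Python can list the elements that ended up empty.  This is only a minimal sketch using the standard library: the `doc` and `para` element names and the sample content are made up for illustration and are not taken from any real project file.

```python
import xml.etree.ElementTree as ET

# Hypothetical target file content; element names are invented for this example.
TARGET_XML = """<doc>
  <para>Complete merged sentence.</para>
  <para></para>
  <para/>
  <para>Another sentence.</para>
</doc>"""


def find_empty_elements(xml_text):
    """Return the tags of elements that have no child elements and no text."""
    root = ET.fromstring(xml_text)
    empty = []
    for elem in root.iter():
        has_children = len(elem) > 0
        has_text = bool(elem.text and elem.text.strip())
        if not has_children and not has_text:
            empty.append(elem.tag)
    return empty


empty_tags = find_empty_elements(TARGET_XML)
print(len(empty_tags), "empty element(s):", empty_tags)
```

Bear in mind that an element can be empty in the source file for perfectly good reasons, so running the same check on the source and comparing the two results is a safer guide than the target count alone.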

In fact both of these files could potentially create quite a bit of work for anyone trying to align files afterwards as you will have changed the original structure of the files so they could be quite different to the original source.  Of course I may be considering an extreme case where the alignment would not work, and you would have a point asking me why you would align them as you already have a bilingual file.  But I just want to reinforce the point that when you do this you are changing the structure of the target file and it will no longer be the same as the one your client provided.  Most of the time it probably won’t matter at all… but be aware that sometimes it might!

The Options

I know this article might seem a little out of order, but I wanted to just cover the concept and its effect first.  So if you’re still up for this nice automation of the previous more manual approach then here are the settings you need to know about.  First of all you have to allow this in your Project Settings:


The default is that source editing is not allowed, and merging across paragraphs is disabled.  So you do these two things:

  1. Enable source editing
  2. Disable the “Disable merging segments across paragraphs” option

Once you have done this you will be able to merge across paragraph units in your current project.  You have to do this every time as there is no possibility to set this as the default in your Project Templates.  But hopefully this is the exception as opposed to the rule.

The next thing you might want to do is display the empty segments.  It’s not essential, but if you want to merge across an empty segment you need to unlock it first and to do this you have to be able to see it.  You enable this in the File Options under Editor -> Automation:


In here you can do two things:

  1. Disable the hiding of the empty segment so you can see it, and
  2. set the translation status for empty segments to something other than “Translated”

What’s missing here is the ability to set whether the segments should be locked or not.  I expect this to be available in an update to this initial release.  But it’s perhaps worth thinking about why you would want them unlocked.  I reckon it’s probably because you might want to merge across multiple segments.  So in my example if I merge #1 and #2, and then try to merge #1 and #3 and then #1 and #4, I won’t be able to unless I unlock these segments first.

However, if I do it in this order, #3 and #4, then #2 and #3, then #1 and #2 then I don’t have a problem.  So perhaps this is a useful way to look at it until things are changed.  Or just select all four segments at the start and merge all four in one go… that’s probably the easiest way!
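To make the effect of the merge order concrete, here is a small Python model of it.  It is only a sketch of the behaviour described above, not of how Studio is implemented: the `Segment` class, the `merge` function and the sample text are all hypothetical, and the rule that a merge is refused when any segment in the range is locked is my assumption based on what I see in the editor.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    number: int
    text: str
    locked: bool = False


def merge(segments, first, last):
    """Merge segments first..last (inclusive) into segment `first`.

    Assumed rule: the merge is refused if any segment in the range is locked.
    Emptied segments stay in the file and are locked afterwards.
    """
    span = [s for s in segments if first <= s.number <= last]
    if any(s.locked for s in span):
        return False
    head, rest = span[0], span[1:]
    head.text = " ".join(s.text for s in span if s.text)
    for s in rest:
        s.text = ""
        s.locked = True
    return True


def fresh():
    # Hypothetical sentence split over four paragraph units.
    return [
        Segment(1, "This sentence"),
        Segment(2, "was split"),
        Segment(3, "across four"),
        Segment(4, "paragraphs."),
    ]


top_down = fresh()
print(merge(top_down, 1, 2), merge(top_down, 1, 3))  # second merge hits locked #2

bottom_up = fresh()
print(merge(bottom_up, 3, 4), merge(bottom_up, 2, 3), merge(bottom_up, 1, 2))

all_at_once = fresh()
print(merge(all_at_once, 1, 4), all_at_once[0].text)
```

In this model the top-down order stalls at the second merge because #2 is already empty and locked, whereas working from the bottom up, or selecting all four segments in one go, never has to cross a locked segment.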

Important edit: 23 Nov 2016

It’s also important to note that, as a Project Manager preparing Projects/Packages, you have some control over whether this feature can be enabled by the translator receiving the Project/Package.  If you do not enable merging across paragraphs then the options above will be greyed out, making it impossible for the translator to merge in this way.  So this is a good precaution to take if you have any doubts over whether you want to see merging of this nature in the SDLXLIFF files:


The Video

I also thought it might be useful to have a video on this process looking at a few filetypes as well (Word, PowerPoint, XML and XLIFF) as it’s quick and might suit some people more to see this in practice.

Video: approx. 8 minutes long

“More power to the elbow”… this is all about getting more from the resources you have already got, and in this case I’m talking about your Translation Memories.  In particular I’m talking about enabling them for upLIFT.  upLIFT, in case you have not heard about this yet despite all the marketing activity and forum discussions since August this year, is a technology that is being used in SDL Trados Studio 2017 to enable some pretty neat things.  I’m not going to devote this article to what upLIFT is all about as Emma Goldsmith has written a really useful article today that does a far better job than I could have done.  You can find Emma’s article here, called “SDL Trados studio 2017 : fragment recall and repair”.  But a quick summary to get us started is that upLIFT enables things like this:

  • fragment matching
    • whole Translation Units
    • partial Translation Units
  • fuzzy match repair
    • from fragment matching
    • from your termbase
    • from Machine Translation


CAT tools typically calculate wordcounts based on the source material.  The reason, of course, is that this way you can give your clients an idea of the cost before you start the work… which seems a sensible approach as you need to base your estimate on something.  You can estimate the target wordcount by applying an expansion factor to the source words, and this is a principle we see with pseudotranslate in Studio where you can set the expansion per language to give you some idea of the costs for DTP requirements in the finished document before you even start translating.  But what you can’t do, at least what you have never been able to do in all the Trados versions right up to the current SDL Trados Studio, is generate a target wordcount for those customers who pay you for work after the translation is complete and are happy to base this on the words you have actually translated.

It’s all about the termbase definition when you want to merge termbases, or import data into MultiTerm termbases.  The XDT… otherwise known as the MultiTerm Termbase Definition file, is the key to being able to ensure you are not trying to knock square pegs into round holes!  I’ve written in the past about the flexibility of MultiTerm and it’s this flexibility that can make it tricky for new users when they try to merge their collections of termbases together, or add to their data by importing a file from a colleague.

So what do we mean by definition?  Let’s think about keys as I think this is quite a good analogy… the four keys in the image on the right will all open a lock, but they won’t all open the same lock.  If you want one of these keys to open another lock then you need to change its shape, or its “definition”, to be able to open the lock.  A termbase definition works in a similar way because MultiTerm is flexible enough to support you creating your own lock.  That lock might be the same as someone else’s, but theirs could also have a different number of pins and tumblers, which means your key won’t fit.


Everyone knows, I think, that an SDL Trados Studio package (*.sdlppx) is just a zip file containing all the files that are needed to allow you to create your Studio project with all the settings your customer intended.  At least it’ll work this way if you use Studio to open the package… quite a few other translation tools these days can open a package and extract the files inside to use, but not a single one can help you work with the project in the way it was originally set up.  One or two tools do a pretty good job of retaining the integrity of the bilingual files most of the time so they can normally be returned safely; others (like SmartCAT for example… based on a few tests that verified this quite easily) do a very poor job and should be used with caution.


Wow… how time flies!  Over three years ago I wrote an article called AutoCorrect… for everything! which explained how to use AutoHotkey so you had similar functionality to Microsoft Word for autocorrect, except it worked in all your Windows applications.  This was, and still is, pretty cool I think and I still use AutoHotkey today for many things, and not just autocorrect.  Since writing that article we released Studio 2015, and in fact Studio 2017 is just around the corner, so it was a while back and some things have moved on.  For example, Studio 2015 introduced an autocorrect feature into Studio which meant things should be easier for all Studio users, especially if they had not come across AutoHotkey before.


… and hundreds or thousands of heads are better than two!!

I wrote an article a little while back called “Vote now… or have no say!” which was a follow up to the SDL AppStore competition SDL ran for a few months.  I wanted to remind everyone to go and vote if they wanted to have an opportunity to see an app developed that would be useful for them.  Well the competition is over now and we have a winner, so now we can move onto the task of creating it.

The winning idea from Marta, a Spanish freelance translator, was the “Quick Wordcount” idea and we have encouraged all users to contribute to this so it’s as useful as we can make it for as many users as possible whilst ensuring we deliver the intent of the original idea.

