Ever since Trados came about, one of the features translators have requested most has been merging across hard returns, or paragraph breaks. For handling the translation it certainly makes a lot of sense to be able to merge fragments of a sentence that clearly belong together, but despite this it’s never been possible. Why is this? You can be sure this question has come up every year, and whilst everyone agrees it would be great to have this capability, Trados has not supported it in the product. The reason for the reluctance is that when you merge across paragraph units (the name given to translation units separated by a paragraph break) you probably need to be able to decide how this change to the structure of the file should be handled in the target document. Sometimes this might be simple, other times it might not be, and the framework that Trados products use is not designed in a way that supports altering the look and feel of the target file across every filetype the product can open. Even the Studio suite of products still uses the same basic idea of handling the bilingual files directly rather than importing them into a black box, and whilst this does offer many advantages, the problem of merging across paragraph units remained… until now.
I wanted to write about the concept behind this so that it’s clear what is happening when you use this new feature in Studio 2017. So think back to the days when you could not edit the source in Studio either. One of the reasons many users wanted to edit the source was so that they could resolve poor segmentation. There are other reasons of course, but I’m focusing on this one. The image below shows how you used Edit Source to resolve this: you enabled the feature in the Project Settings first, then edited the segment (in this example #2), cut its content to your clipboard and pasted it into #1.
So now you have dealt with the problem of merging with this workaround, but you are left with an empty segment in #2. In Studio 2017 you no longer need to use Edit Source to achieve this. You simply enable the feature to merge across paragraph units, which is disabled by default, and you can then merge these same segments to achieve this:
Now it looks as though I was able to do this with a simple operation and no longer have to deal with the empty segment… but this is not the case. If you pay attention to the segment numbering you’ll see that #2 is not there anymore; at least it’s not visible anymore. If I disable the option to Hide empty segments that have been merged in the Automation settings then you’ll see this:
So it’s basically an automation of the manual workaround, which does save a lot of time, and it has the additional benefit of locking the segment and setting the status to whatever you like automatically. You can do this for any filetype too. But do you want to?
The effect on target
This brings me back to where I started in describing the initial reluctance to support this in Trados products. If you merge across paragraph units and then save your target file, what will that mean for your target document, and are you able to do anything about this before you send the file to your customer? Let’s take a look at two examples, the first in Word and the second an XML file. If I merge the segments so that I now have a complete sentence that will be simple to handle, what will that do to the target file?
You could probably have guessed it, particularly if you were used to implementing the Edit Source workaround in the past:
The empty segments will be sent through to the target file and now you need to clean it up. This probably isn’t much of a problem. If the reason for the poor segmentation was that the user needed to make something fit the space available in the document in the first place, then it’s fairly likely some clean up will be required in the target file afterwards anyway because of text expansion or contraction in the target language. It’s also something that is fairly simple to achieve in Microsoft Word because everyone, as we know, is a Microsoft Word expert!
But what if it’s an XML file? What happens then? Taking exactly the same example, but where the paragraph units are created by separate elements in the XML file, you could find yourself with this:
Note that all the elements that were “emptied” are now empty elements in the XML file. This might not be acceptable to your client at all and the effort involved in attempting to correct the XML file afterwards, as you would with the Word file, might not be worth it at all.
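If you want to check a target XML file for these leftovers before delivery, something like the following might help. It’s only a minimal sketch using Python’s standard library; the sample XML, the `para` element name and the `find_empty_elements` helper are all made up for illustration and have nothing to do with Studio itself. Bear in mind that some elements are legitimately empty in the source too, so you would want to compare against the original file rather than treat every hit as a problem.

```python
import xml.etree.ElementTree as ET

# Hypothetical target XML after four paragraph units were merged into the first.
SAMPLE_TARGET = """<doc>
  <para>All four fragments now sit in this one element.</para>
  <para></para>
  <para/>
  <para>   </para>
</doc>"""


def find_empty_elements(xml_text):
    """Return (tag, position) for every element with no text and no children."""
    root = ET.fromstring(xml_text)
    empty = []
    # iter() walks the tree in document order, starting with the root (position 0).
    for position, element in enumerate(root.iter()):
        has_text = bool(element.text and element.text.strip())
        has_children = len(element) > 0
        if not has_text and not has_children:
            empty.append((element.tag, position))
    return empty


if __name__ == "__main__":
    for tag, position in find_empty_elements(SAMPLE_TARGET):
        print(f"empty <{tag}> at document position {position}")
```

Running the same check over the source file and comparing the two lists would show which empty elements were introduced by merging.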
In fact both of these files could potentially create quite a bit of work for anyone trying to align files afterwards as you will have changed the original structure of the files so they could be quite different to the original source. Of course I may be considering an extreme case where the alignment would not work, and you would have a point asking me why you would align them as you already have a bilingual file. But I just want to reinforce the point that when you do this you are changing the structure of the target file and it will no longer be the same as the one your client provided. Most of the time it probably won’t matter at all… but be aware that sometimes it might!
I know this article might seem a little out of order, but I wanted to just cover the concept and its effect first. So if you’re still up for this nice automation of the previous, more manual approach then here are the settings you need to know about. First of all you have to allow this in your Project Settings:
The default is that source editing is not allowed, and merging across paragraphs is disabled. So you do these two things:
- Enable source editing
- Disable the “Disable merging segments across paragraphs” option
Once you have done this you will be able to merge across paragraph units in your current project. You have to do this every time as there is no possibility to set this as the default in your Project Templates. But hopefully this is the exception as opposed to the rule.
The next thing you might want to do is display the empty segments. It’s not essential, but if you want to merge across an empty segment you need to unlock it first and to do this you have to be able to see it. You enable this in the File Options under Editor -> Automation:
In here you can do two things:
- Disable the hiding of the empty segment so you can see it, and
- set the translation status for empty segments to something other than “Translated”
What’s missing here is the ability to set whether the segments should be locked or not. I expect this to be available in an update to this initial release. But it’s perhaps worth thinking about why you would want them unlocked. I reckon it’s probably because you might want to merge across multiple segments. So in my example, if I merge #1 and #2, and then try to merge #1 and #3 and then #1 and #4, I won’t be able to unless I unlock these segments first.
However, if I do it in this order, #3 and #4, then #2 and #3, then #1 and #2 then I don’t have a problem. So perhaps this is a useful way to look at it until things are changed. Or just select all four segments at the start and merge all four in one go… that’s probably the easiest way!
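To make the ordering point a bit more concrete, here is a toy model of the behaviour described above. It is not Studio’s API or its real logic, just a sketch under one assumption of mine: a merge is refused if any segment in the selected range is locked, and every segment emptied by a merge is locked afterwards. The `Segment` class and the `merge`, `fresh` and `run` helpers are all invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    locked: bool = False


def merge(segments, first, last):
    """Merge segments first..last (1-based, inclusive) into the first one.

    Toy rule (an assumption): refuse the merge if any segment in the range
    is locked. Segments emptied by the merge are locked afterwards.
    """
    span = segments[first - 1:last]
    if any(s.locked for s in span):
        return False
    span[0].text = " ".join(s.text for s in span if s.text)
    for s in span[1:]:
        s.text = ""
        s.locked = True
    return True


def fresh():
    """Four fragments of one sentence, each in its own paragraph unit."""
    return [Segment(t) for t in ("This sentence", "was split", "over four", "paragraphs.")]


def run(steps):
    """Apply a list of (first, last) merges; return each outcome and segment #1."""
    segments = fresh()
    return [merge(segments, a, b) for a, b in steps], segments[0].text


if __name__ == "__main__":
    print("front to back:", run([(1, 2), (1, 3), (1, 4)]))
    print("back to front:", run([(3, 4), (2, 3), (1, 2)]))
    print("all in one go:", run([(1, 4)]))
```

In this model the front-to-back order gets stuck after the first merge because #2 is now locked, whereas working back to front, or selecting all four at once, never has to cross a locked segment.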
Important edit: 23 Nov 2016
It’s also important to note that as a Project Manager preparing Projects/Packages you have some control over whether this feature can be enabled by the translator receiving the Project/Package. If you do not enable merging across paragraphs then the options above will be greyed out, making it impossible for the translator to merge in this way. So this is a good precaution to take if you have any doubts over whether you want to see merging of this nature in the SDLXLIFF files:
I also thought it might be useful to have a video on this process looking at a few filetypes as well (Word, PowerPoint, XML and XLIFF) as it’s quick and might suit some people more to see this in practice.
Video: approx. 8 minutes long