Working with Studio Alignment

#01

The new alignment tool in Studio SP1 has certainly attracted a lot of attention, some good, some not so good… and some where learning a few little tricks might go a long way towards improving the experience of working with it.  As with all software releases, the features around this tool will be continually enhanced and I expect to see more improvements later this year.  But I thought it would be useful to step back a bit because I don’t think it’s that bad!

When Studio 2009 was first launched one of the first things that many users asked for was a replacement alignment tool for WinAlign.  WinAlign has been around since I don’t know when, but it no longer supports the modern file formats that are supported in Studio so it has been overdue for an update for a long time.

It wasn’t until SDL released Studio 2014 in the third quarter of 2013 that a new alignment tool appeared.  The new tool was based on the premise that most of the time aligning files is a waste of your time!  Many users find themselves being provided with a bunch of files, maybe even hundreds, with matching translations.  To make matters worse, one is often a PDF while the other is a DOC file, so the alignment effort is out of proportion to the value of the resulting Translation Memory.  Many translators have told me they have spent days aligning files (normally over the weekend!) to create translation memories that gave them very little value when translating the project they aligned for in the first place, and that they may never get any value from again!

So the idea behind the original Studio 2014 alignment tool was to allow you to very quickly create a usable Translation Memory based on a sliding scale of alignment quality.  You threw your hundred document pairs into the alignment tool and made a decision on what sort of quality you wanted.  In practice this could be a little tricky, and it paid to do a couple of trial runs with some smaller documents to make sure you had it right.  Then, with almost one click, your Translation Memory was magically created and you could concentrate on the real work of translating.

#02

This desire to make things easy for the translator was a worthy one, and most of the time the tool produced a pretty good alignment and did so quickly.  But it didn’t allow for these:

  • Really poor quality files that needed some sort of manual touches to ensure a decent alignment and useful Translation Memory.
  • Alignment Projects!  It’s not uncommon for a translator or a company to be tasked with creating the best possible Translation Memory from all the monolingual documents available in both source and target languages.

So, when Studio SP1 was released SDL added an alignment editor to allow both of these things to be catered for.  The SP1 release is the first one, and there will be continual improvements to the editor, but the first incarnation does a reasonable job; especially if you know a few simple tricks and ways to work with it.

So, I have created a video, around 17 mins long where I aligned a couple of files and explained some of these tricks as I went along.  Hopefully by the time you have watched it to the end you will have a better idea of how to get the most from the existing version and can use it happily while waiting for future enhancements that will improve it even more.  I put the video at the end of this post because first of all I thought it would be helpful to just note a few important things that are useful to know when working with the alignment tool.

  1. You can still use the quick alignment and just throw 500 document pairs into the tool and have the Translation Memory created without any effort on your part at all.  Every aligned segment has a quality value added as a Field Attribute, along with the filenames of the source and target files used in the alignment process, so you can further refine how you work with this Translation Memory in your Project:
    #03
  2. You can have two kinds of alignment projects.  One works with multiple files and has all the alignment files created and saved, ready to open and work with.  The other is a quick file pair alignment that opens immediately in the editor after you select the files and a Translation Memory:
    #04
  3. You can change the segmentation rules and other language resources for the source and/or the target file by selecting the appropriate language in the Translation Memory settings:
    #05
  4. The filetype settings for the files being aligned are based on the active Project in Studio.  This is because the assumption is that you will always be aligning for a specific Project and so this will be active before you start.  However, we know from the reasons above that this might not always be the case, so a useful tip might be to create a Project just for use in Alignment Projects and then you can change the filetype settings in your dummy Project as needed, and make it the active Project (select it and press the Enter key so it goes bold), whenever you carry out any alignment work:
    #06
  5. If you start the alignment and find that the segments are not automatically aligned very well at all because of differences between the source and target files, then use the Realign function.  This is a very powerful way to improve the alignment quite quickly by doing the following:
    #07

    1. Disconnect all or some of your segments.
    2. Connect some of the segments in your alignment projects around the worst affected areas and then click Realign.
    3. This will take advantage of your “help” and vastly improve the alignment process, reducing your effort and making it easier to work through the file.
  6. Alignment can be carried out using the icons in the ribbon, with the mouse, with keyboard shortcuts or using the Alignment Edit mode (described in the video).
  7. The finished alignment can be imported into a Translation Memory, or saved as an SDLXLIFF.  If you use the latter you can then use Studio to perform quality assurance checks and do any further refined editing you consider necessary if you are preparing a high quality Translation Memory for yourself, or your client.  The Quick Import will just import everything into your Translation Memory that you have confirmed… so only quality values of 100.  The Advanced Import will allow you to import based on the quality values of the aligned pairs; so you use the slider to set the value and everything above that value will be imported:
    #08
  8. When you do the alignment the coloured lines have a meaning.  Solid green lines are confirmed and have a quality value of 100.  Dotted lines are unconfirmed.  Red lines have a poor quality value, and the greener the lines get, the better the software believes the alignment to be.
    #09
  9. There is a limit to how many segments you can select to merge.  In both Alignment Edit Mode and normal aligning mode you cannot select more than three segments at a time.  In Edit mode the Connect n:n command greys out, and in normal mode you will see a small no entry symbol displayed when you try and select the fourth one:
    #10
  10. You cannot split segments.  If you need to split segments because you want two Translation Units then this could be carried out by saving the file as an SDLXLIFF and making the changes in there.
  11. You cannot delete or insert segments.  So in the example video where I have added new segments to one of the files on purpose you would simply align around them.  You could not insert segments to provide a source translation for them, nor could you delete them to avoid the misalignment.
  12. Alignment penalty… Studio adds a 1% penalty to all Translation Memory results that come from an alignment by default.  So you will only get a 99% match even though you confirmed the alignment and it now has a quality value of 100.  You can change this to zero and turn your 99% matches into 100% matches here:
    #11
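To make the import options in point 7 and the penalty in point 12 a little more concrete, here is a minimal Python sketch of the logic as described above.  It is not how Studio is implemented, and the names AlignedPair, quick_import, advanced_import and match_score are hypothetical, made up purely for illustration.  It also assumes the slider value itself is included in an Advanced Import, which may not be exactly how the slider behaves.

```python
from dataclasses import dataclass


@dataclass
class AlignedPair:
    """One aligned source/target pair (hypothetical model, not a Studio type)."""
    source: str
    target: str
    quality: int  # 0-100; a confirmed pair has a quality value of 100


def quick_import(pairs):
    """Quick Import: only confirmed pairs, i.e. quality value 100."""
    return [p for p in pairs if p.quality == 100]


def advanced_import(pairs, threshold):
    """Advanced Import: pairs at or above the slider value.

    Treating the threshold as inclusive is an assumption of this sketch.
    """
    return [p for p in pairs if p.quality >= threshold]


def match_score(base_match, alignment_penalty=1):
    """Apply the alignment penalty (1% by default) to a TM match value."""
    return max(0, base_match - alignment_penalty)


pairs = [
    AlignedPair("Hello", "Bonjour", 100),   # confirmed
    AlignedPair("Goodbye", "Au revoir", 85),
    AlignedPair("Thanks", "Bonsoir", 40),   # probably misaligned
]

confirmed_only = quick_import(pairs)
good_enough = advanced_import(pairs, 80)
default_score = match_score(100)            # penalty of 1 applied
no_penalty_score = match_score(100, 0)      # penalty set to zero
```

With the sample data, the Quick Import keeps only the confirmed pair, an Advanced Import at 80 keeps the first two, and a confirmed pair comes back as a 99% match until the penalty is set to zero.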

I have tried to explain anything else I thought was important in the video.  17 minutes is longer than anything I would normally expect you to sit through, but I hoped it would be useful to work through a complete file and tackle the sort of things you are likely to come across in the process, and the 17 minutes were over almost as soon as I started… or at least it felt like that!

4 comments
  1. Christophe said:

    Every time I aligned documents with WinAlign it was worth it. However, before I started working with WinAlign, I always analyzed the old source files first, and then I analyzed the new source files NOT against the TM but against the previous analysis. You remember the “Use TM from previous analysis” option in TWB?
    If the analysis would give me a lot of full and fuzzy matches, I knew it would be worth aligning the old files.
    With Studio, there is now a way to align files, but is there a way to use the “Use TM from previous analysis” option?

    • Two ways I guess… in Studio you have what’s referred to as an internal analysis. So if you create a project and add both files, probably make sure the original source is first in the list, and then analyse the files against an empty TM. The internal analysis will be based on the assumption that you are getting value from the first file in the list. So it assumes you translated it and then analyses the second file on this basis. I think this will achieve the same result.
      The other way is a little more manual. Open the first source file with an empty TM. Copy source to target and update the TM. Now analyse the second file against that TM. This should also do it.
      I can see your use case though and I think it would be pretty cool to have a feature specifically for quickly checking this and returning a value that gave you a quick idea of whether it would be worth it or not. Maybe something as a plugin for the OpenExchange. I’ll look into it.

    • Maybe also worth remembering that the new alignment in Studio also does a reasonable job of aligning files quickly with no effort. So this is also an option to consider as you don’t have to spend any time on the alignment at all, other than to run it. It’s not always perfect, but it does generally do a fair job.

      • Christophe said:

        The second solution is more manual indeed, but it works better. The first solution only reports repetitions between the first and the second file. It is not accurate, but still a good indication to know whether aligning is worth doing or not.
        Thanks!
