The PerfectMatch…

Image: A heart made of puzzle pieces with a piece missing.

In the world of translation, Trados Studio’s PerfectMatch feature is like the overachieving student who always gets straight A’s, and its academic partner is the brilliant but slightly disorganised professor.  PerfectMatch, with its meticulous and precise matching capabilities, often finds itself patiently sorting through the professor’s vast but somewhat chaotic repository of knowledge.  Picture PerfectMatch as the diligent student, poring over texts late into the night, determined to find that one perfect translation match.  Meanwhile, the academic partner is the genius who wrote the book on translation but can’t quite remember where they put it.  Together, they form an unlikely but unstoppable duo – the PerfectMatch feature meticulously cross-referencing every word while the academic partner brings a wealth of knowledge, albeit sometimes hidden under a pile of papers.  It’s a partnership where precision meets wisdom, creating translations that are not just accurate, but also enlightened!

A lovely thought… but perhaps not quite reality.  I was prompted to write this article after a couple or three coinciding events.  The first was a user in the RWS Community asking how to make a particular feature related to the use of PerfectMatch work; the second was a new course released by our training teams called “Working with Perfect Match”; the third was the Localization Engineer course I went through in my last article.  The feature in question is one of the three options you have when PerfectMatching files in a project… and I should add that the PerfectMatch feature is only available in the Professional version of Trados Studio.  Freelance users have a neat feature called “Update file” that achieves a similar result and uses PerfectMatch in the background to do it, but it only handles one file at a time and will actually replace the files in your existing project.  The Professional version is a little more sophisticated and is great for managing mid-project updates when working with large projects and multiple target languages.

The PerfectMatch feature works something like this:

  1. Document Analysis: PerfectMatch starts by analyzing the new version of a document that needs updating, comparing it with the previous version of the same document.
  2. Identification of Matches: It identifies segments (sentences or phrases) in the new document that exactly match those in the previous version.  This includes both untranslated source text and already translated target text.
  3. Locking of Matches: Once these perfect matches are identified, PerfectMatch can optionally lock these segments.  Locking means these segments will not need to be translated again, as they are identical to what was previously translated.
  4. Focus on New and Changed Content: The translator can then focus on translating only the new or changed segments in the document.  This significantly reduces the amount of work required, as previously translated content is reused.
  5. Consistency and Efficiency: This process ensures consistency in translation, especially for documents that are frequently updated, like manuals or product descriptions.  It also saves time and resources by avoiding re-translation of unchanged content.
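
To make the five steps above a little more concrete, here is a deliberately simplified Python sketch.  It is a toy model only: it treats a match as identical source text and nothing more, and the real feature will certainly be more careful than that.  The names Segment, apply_previous_translations and segments_to_translate are made up for this illustration and are not part of Trados Studio or its APIs.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One bilingual segment (hypothetical structure for this sketch)."""
    source: str
    target: str = ""
    locked: bool = False


def apply_previous_translations(new_sources, previous_segments, lock=True):
    """Reuse translations for source segments that are unchanged.

    new_sources: source segments of the updated document, in order.
    previous_segments: Segment objects from the previously translated version.
    lock: optionally lock the reused segments so they are not edited again.
    """
    # Index the previous bilingual content by its exact source text,
    # ignoring segments that were never translated.
    previous = {seg.source: seg.target for seg in previous_segments if seg.target}

    updated = []
    for source in new_sources:
        if source in previous:
            # Unchanged segment: carry the old translation over.
            updated.append(Segment(source, previous[source], locked=lock))
        else:
            # New or changed segment: left empty for the translator.
            updated.append(Segment(source))
    return updated


def segments_to_translate(segments):
    """Return the source text of every segment that still needs translating."""
    return [seg.source for seg in segments if not seg.target]
```

In this model the translator would only see whatever segments_to_translate returns, which is the "focus on new and changed content" idea from step 4.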

So in a nutshell it’s a very handy feature for a translation office.

What this article is specifically about is how you use it in practice, and in particular this part:

Screenshot of Trados Studio's PerfectMatch feature, showing menu options 'Matching Previous Documents from Folder...', 'Specific Previous Document...', and 'Previous Documents from Map...'

These are the three options for adding matching files.  The first two are simple and well documented in the product help, in videos explaining how to use them, and in the new training course.  What’s not so well documented is how to use “Previous Documents from Map…”.  It is in the product help if you know exactly what to search for (PerfectMatch map files), but as far as I can see it’s not part of the new training course (that’s only Level 2, so there may be more in the pipeline).  This feature is particularly useful for a translation office that probably automates a lot of its work.  When you have projects going into 20+ target languages, and files being sent to different locations for translation, managing this task and creating the map files isn’t simple… or rather it’s a logistically challenging task rather than a complex one.

So coming back to the triggers for this article they boil down to being able to provide an answer to the community question, an explanation of the feature that is omitted from the available documentation and training, and an opportunity to show why the localization engineering course is so useful.  But first let me explain how this is intended to work.

Some Product Help…

To make this simple I’m going to cheat a little and this is because the tech pubs team pointed me to the 2022 SR2 version of the help where they do explain quite nicely how this works:

Key concepts -> About Perfect Match -> PerfectMatch map files

The basic idea being you create a tab delimited file with a .txt extension formatted like this:

Sample Folder A\SampleFileAV1.docx [...]\Users\jsmith\Documents\Studio 2015\Projects\Project 7\de-DE\SampleFileAV2.docx.sdlxliff

Sample Folder B\SampleFileBV1.docx [...]\Users\jsmith\Documents\Studio 2015\Projects\Project 7\de-DE\de-DE\SampleFileBV2.docx.sdlxliff

Sample Folder C\SampleFileCV1.docx [...]\Users\jsmith\Documents\Studio 2015\Projects\Project 7\de-DE\SampleFileCV2.docx.sdlxliff

The key concepts being that this is a 2-column file with the first column being a relative path to the file in your project (without the sdlxliff extension) and the second column being an absolute path to the previously translated bilingual file (that does have an sdlxliff extension).
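
To make the two-column layout concrete, here is a minimal Python sketch that checks each pair against the rules just described and writes a tab delimited .txt file.  It is not taken from the product help: the function names are invented for this illustration, the example paths are hypothetical, and the UTF-8 encoding is an assumption that should be checked against what Trados Studio actually accepts.

```python
from pathlib import PureWindowsPath


def map_row(project_relative_path, previous_sdlxliff):
    """Build one map-file line: relative project file, TAB, previous bilingual file."""
    new_file = PureWindowsPath(project_relative_path)
    previous = PureWindowsPath(previous_sdlxliff)

    # Column 1: relative path to the file in the project, without .sdlxliff.
    if new_file.is_absolute():
        raise ValueError(f"first column must be a relative path: {new_file}")
    if new_file.suffix.lower() == ".sdlxliff":
        raise ValueError(f"first column must not end in .sdlxliff: {new_file}")

    # Column 2: absolute path to the previously translated bilingual file.
    if not previous.is_absolute():
        raise ValueError(f"second column must be an absolute path: {previous}")
    if previous.suffix.lower() != ".sdlxliff":
        raise ValueError(f"second column must end in .sdlxliff: {previous}")

    return f"{new_file}\t{previous}"


def write_map_file(map_path, pairs):
    """Write one line per (relative path, previous sdlxliff) pair."""
    # Assumption: UTF-8 text; adjust if Trados Studio expects something else.
    with open(map_path, "w", encoding="utf-8") as handle:
        for relative_path, previous_sdlxliff in pairs:
            handle.write(map_row(relative_path, previous_sdlxliff) + "\n")
```

A call might look like write_map_file("de-DE.txt", [(r"Sample Folder A\SampleFileAV1.docx", r"C:\Previous\de-DE\SampleFileAV1.docx.sdlxliff")]), where both the map file name and the C:\Previous location are placeholders.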

Seems simple enough until you consider an example of how this might be used in practice.

My Translation Office…

So to test this out I mocked up an office that has a localization engineer preparing projects and four different translation teams.  These teams all work with project templates prepared by the localization engineer to ensure that all projects are managed internally using the appropriate customised templates to suit the work being carried out.  This also means in my office we don’t need to work with project packages and we instead work with the bilingual files.  We could work with packages, but I quite like the file-based approach as it’s light and convenient and I can easily manage the work this way without any difficulty at all.

So my Translation Offices are set up like this on my drive:

A screenshot depicting a folder structure related to translation work. The main folder '[TranslationOffices]' contains subfolders for different countries: '[France]', '[Germany]', '[Italy]', and '[Spain]'. Inside the '[France]' folder, there is a further subfolder named '[New_Projects]', which itself contains '[003 - Files Only]' and '[004 - Files and Folders]'. The folders are depicted with traditional folder icons, and the structure indicates an organized file system for managing translation projects.

Nothing overcomplicated, and I only work with four target languages in my fictional company.  I have a location on my hard drive for each office, a folder for new projects within each office and a list of the projects being worked on in the new projects folder.

My Localization Office…

I also have a Localization Office where I have folders matching the project names of those I create in Trados Studio and these will contain my Map files:

Screenshot of a directory tree with a main folder titled '[LocalizationOffice]'. Within it are two subfolders: '[003 - Files Only]' and '[004 - Files and Folders]'.

Again nothing complex and I’m using this folder for my map files in this PerfectMatch process.  The original Trados Studio projects are created by me in the default Trados Studio projects location and I copy the sdlxliff files from the target language folders into the appropriate locations for my Translation Office.

All nice and simple, but it could be a logistical nightmare and easily prone to errors without careful management and control of the process.  So let’s talk about what the Localization Office can do here to automate this process somewhat.  To do this let’s look at what we need to do in practice for a PerfectMatch process to work:

  1. Create a Trados Studio Project with the original source files
  2. Copy the sdlxliff files from this project into the Translation Office folders based on language so each office can manage their own translation process
  3. Create a Map file so Trados Studio will know where these files are for a PerfectMatch operation if needed later in the process

To try and help visualise this a little better I mapped it out… roughly:

Flowchart detailing the process for managing translation projects in different European countries. The central loop outlines the core steps: creating Trados Studio projects, copying bilingual files to translation offices, creating map files for a perfect match, and updating projects. Four spokes lead to individual processes for Italy, Spain, France, and Germany, each involving creating country-specific projects, translating and reviewing, and exporting bilingual files.

This then boils down to just three simple things.  The first, creating your Studio project, is something I briefly discussed when I introduced PowerShell automation to my blog many years ago, in 2013, with an article called “The Powershell what?”.  It briefly explains using PowerShell to work with the APIs in Trados Studio to really speed up the things you do.  It would also be remiss of me not to mention an excellent GitHub project created by Evzen Polenka called “SDL Trados Studio Automation Kit (STraSAK)”… although it might need rebranding these days to “TraSAK”.  It needs updating, as Evzen has retired from localization and it only supports up to the 2021 version of Trados Studio, but the content provides a nice overview of the sort of things a localization engineer can do for you if you hire one!  In fact I’ll reproduce the table Evzen created here as it’s a nice eye opener for anyone who wasn’t aware:

Command: Description
New-Project: Create new project – optionally based on project template or another project – in specified location, using specified source and target languages and TMs from specified location. Get source files from specified location and automatically convert them to translatable format and copy them to target languages. Optionally also pre-translate and analyze the files, saving results to Trados 2007-formatted log.
PseudoTranslate: Pseudo-translate specified project, using specified pseudo-translation options. Optionally also export the pseudo-translated target files.
Export-Package: Create translation packages from specified project, using specified package options, and save them to specified location.
Import-Package: Import return packages from specified location in a specified project.
ConvertTo-TradosLog: Convert Studio XML-based report to Trados 2007-formatted log.
Export-TargetFiles: Export target files from specified project to specified location.
Update-MainTMs: Update main translation memories of specified project.
New-FileBasedTM: Create new translation memory in specified location, using specified options.
Export-TMX: Export one or more Trados Studio translation memories to TMX, optionally applying a filter.
Import-TMX: Import content from TMX file in a specified TM, optionally applying a filter.

For this article however, I’m not going to include the project automation part as I want to show the Studio UI as we work through this.  The second and third items I can address with another script of some sort, not using the Trados Studio APIs, but just copying files around and creating a tab delimited text file for the mapping in PerfectMatch.

Not being an expert on scripting, but at least knowing what I need it to do, I used ChatGPT to create this script.  It took me at least a dozen iterations of the script to get it right, working in small chunks exactly as I would do the process manually.  I did have to go back and edit a few things but all in all this was a surprisingly simple task.  I actually started by asking ChatGPT what the best way to do this would be and it came back with some suggested code for moving files around written in Python.  This felt like overkill and so I asked if using a DOS script and Robocopy might be easier.  Based on my initial question the answer was positive, but then I realised I wanted to point to an sdlproj file in the Trados Studio project folders and actually parse it to see what the project name was, where it was, what the relative paths to the files were, and of course what languages were in the project.  So for this Python was looking preferable.  But as I have used PowerShell in a limited way before I opted for that, since PowerShell is also pretty good at parsing XML… and ChatGPT of course obliged.

A dozen iterations later, having built up the script in bite-sized chunks (you cannot just ask ChatGPT to do the whole task), I have a script that allows me to work like this:

  1. Create my Trados Studio Project (the manual way for this exercise)
  2. Run my script and provide the path to the created project
  3. The script then copies the files into the appropriate Translation Office language folders and creates the Map files (one for each language), which it places into the appropriate Localization Office folder
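
The copy-and-map part of that workflow (steps 2 and 3) can be sketched roughly as follows.  This is a hedged Python illustration and not the PowerShell script used for this article: it does not parse the sdlproj file, and instead assumes the project folder name is the project name, that each target language has its own folder of sdlxliff files inside the project folder, and that the updated source files will keep the same names and relative paths.  The function name distribute_and_map and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def distribute_and_map(project_dir, offices, map_root):
    """Copy each language's sdlxliff files to its office and write one map per language.

    project_dir: Trados Studio project folder containing one folder per target language.
    offices: dict of language code -> that office's new-projects folder.
    map_root: Localization Office folder; maps go into a subfolder named after the project.
    Returns a dict of language code -> path of the map file written.
    """
    project_dir = Path(project_dir)
    # Assumption: the folder name doubles as the project name.
    map_dir = Path(map_root) / project_dir.name
    map_dir.mkdir(parents=True, exist_ok=True)

    written = {}
    for language, office_dir in offices.items():
        language_dir = project_dir / language
        destination_root = Path(office_dir) / project_dir.name
        rows = []
        for bilingual in sorted(language_dir.rglob("*.sdlxliff")):
            relative = bilingual.relative_to(language_dir)
            destination = destination_root / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(bilingual, destination)
            # Column 1: project-relative path without the .sdlxliff extension.
            # Column 2: absolute path to the copy the office will translate.
            rows.append(f"{relative.with_suffix('')}\t{destination.resolve()}")

        map_path = map_dir / f"{language}.txt"
        map_path.write_text("".join(row + "\n" for row in rows), encoding="utf-8")
        written[language] = map_path
    return written
```

Writing one map file per language mirrors the one-map-per-language behaviour described above; the copies in the office folders are the files the map points to, so the translated versions need to stay in those locations for a later PerfectMatch to find them.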

To run it I just saved the script as a *.ps1 file and associated PowerShell with this extension in Windows, so now I can double-click my script file.  The whole thing takes a few seconds to run after I have created my initial Trados Studio project.  My Translation Offices all have their files, I have my mapping files and now I’m ready for when I receive updated files from my client.

I’m not going to document everything I did because there are many ways to work in a translation business and I am not suggesting for a second that this is what everyone should do.  This is just an exercise to explain how the map files in PerfectMatch work, and also (I hope) a further intro to the benefits that employing a localization engineer could bring to your business today.

But all of this is best demonstrated live, so here’s a video of the process in operation and examples of how to PerfectMatch with a Map file.

Approx. 25 mins… so get a cup of tea before watching!

Post Article Update:
And to conclude with a post-publication update… if you use this feature and agree that it would be better to be able to update a project with one map file, as opposed to one map file for each language, then vote for this idea:

Improve the PerfectMatch “Previous documents from Map…” feature

Back to school… again!

The image depicts a stylised portrait of a person wearing academic attire. The individual has a neatly trimmed beard and moustache, and they are wearing a blue graduation cap with a tassel on the right side. They appear to be smiling contentedly, with their eyes closed in a serene expression. The figure's graduation gown is dark blue, and they are wearing a white shirt with a green tie underneath. The image has a clean and modern vector art style, with flat colours and simple shapes for features.

After I did my last studies, apart from all the endless mandatory HR type training we have to endure these days, I thought that would be it for any sort of formal training for me.  In fact the main reason for me doing my last formal studies, the TCLoc Master’s degree at the University of Strasbourg, was to fill the gaps I thought I had, given a complete lack of education in the field I’ve been working in for the last 17 years.  That degree was very useful and I definitely learned a lot and filled some gaps, but whilst there was an element of technical localization to it I think it only scratched the surface and didn’t really cover the sort of skills that I think are needed, not just for localization engineers, but also for professional translators and project managers, working in technical localization today.


XML… unravelling chaos

Image of a ball of wool unravelling around the letters XML

Whilst I would definitely not claim to be an expert, writing this blog has allowed me to learn a reasonable amount about XML over the years.  Most of the articles I’ve written have been about explaining how to manage the many amazing features in the filetypes that are supported by Trados Studio… and of course how to deal with the many changes over the years as the filetypes have become more and more sophisticated catering for the demands of our customers and the changes in the technologies applied to XML in general.  The result of these changes has led to some… let’s say… less than user friendly interfaces and features and you’d certainly be forgiven if you thought things were becoming a little chaotic!


Working under a cloud!

Image of a cloud of a thunderstorm with rain.

In the heart of LingoVille, translator Trina was renowned for her linguistic prowess but was a bit behind in the tech world.  When her old typewriter finally gave out, she received a sleek new laptop, which came with OneDrive pre-enabled.  Initially hesitant about this “cloud magic,” she soon marvelled at the convenience of securely storing her translations online, accessible from anywhere, safeguarding her precious work from life’s unpredictabilities.  This modern twist turned Trina from a tech-sceptic into a cloud enthusiast overnight.

And then she woke up!!


Going, going…. gone!

@paulfilkin twitter profile image

It may be a little small to read but my social highlights for Twitter were:

      • joined in July 2010
      • tweeted 24.3K times
      • follow 16 users
      • followed by 1878 users

With the exception of YouTube, Twitter was the only social media account I had retained.  YouTube is more of a place to host and share videos and less of a platform I have to visit for anything else, so my exposure to the material in there is limited.  Twitter on the other hand… I can’t bring myself to call it X, which is one of the most stupid marketing decisions I have seen in years… was a tool I liked to use because through TweetDeck I could easily filter out the nonsense and only be exposed, pretty much, to what I wanted to see.


Linguistic Alchemy to unlock AutoHotkey

A photorealistic image of a wizard performing linguistic alchemy, digital art.

In the echoing halls of the Tower of Babel, myriad languages tangled, creating a confusion of tongues and leaving humans estranged.  Fast forward to the present day, professional translators stand as the modern-day heroes, bridging linguistic divides and fostering global connections.  Yet, these linguists often grapple with the technical juggernaut of AutoHotkey scripting.


Helping the Help!

Image created with DALL·E, an AI system by OpenAI - “Helping the Help in the style of Richard Estes.”

I really like this image created by DALL·E of a man… maybe a businessman… on a wall, putting down his newspaper and reaching down to offer help to the worker with a ladder.  Created with only this prompt – “Helping the Help in the style of Richard Estes.”  When we read about how ChatGPT is “only” an advanced autosuggest we really need to think about how it must have some understanding of what was previously said to be able to predict the suggestion.  DALL·E really demonstrates this well because it had to have enough of an understanding of the concept of help in terms of not only helping, but also the use of the word help as someone who could be employed to help (in this case maybe a caretaker or janitor)… and then think about how this could be represented as an image, and in the style of a photorealist painter I mentioned by name.  Then do all that in a matter of seconds.  Quite astonishing really.

Unlocking Linguistic Success: Navigating the Path to Translation and Localization Mastery for Academia’s Rising Stars

Created by DALL·E: “Create an ink sketch of the Vitruvian Man wearing a students mortar board in the style of Leonardo da Vinci.”

The Studious Translator, a pen-and-ink illustration inspired by Leonardo da Vinci’s style, depicts a student immersed in the world of translation and localization at a University participating in the RWS Campus academic programme.  Just as the Vitruvian Man embodies the ideal human proportions outlined by the Roman architect Vitruvius, this diligent student exemplifies the harmonious balance of linguistic mastery, cultural understanding, and technical acumen required for success in the field.  The drawing showcases the student in two (hidden) overlapping positions: one representing the precision of translation within a square, and another showcasing the adaptability of localization within a circle.  This intriguing illustration not only highlights the student’s dedication to comprehending essential concepts but also their aspiration to innovate and refine them.  Although not the first to capture the essence of translation and localization, the Studious Translator gains iconic status as a symbol of the modern Renaissance in language and technology.  It serves as a testament to the interdisciplinary nature of these fields, weaving together mathematics, linguistics, and art.  The original drawing is carefully preserved in a climate-controlled archive at RWS Campus, exemplifying the programme’s commitment to nurturing the next generation of translation and localization professionals.


The elephant in my room…

“In the style of Dali: the elephant in the room, sitting at the boardroom table discussing artificial intelligence.” DALL·E

The reaction I rarely see when discussing artificial intelligence with anyone is indifference.  The reactions I usually see are split between overflowing enthusiasm and overflowing concern.  I rarely have a conversation about them both.  But after writing a few articles on how useful it is, and obviously I spend most of my time in the overflowing with enthusiasm camp, I wanted to address the elephant in the room.


The elusive regex with GPT-4

A DALL·E generated image of running digits

Whilst the solving of regular expressions with ChatGPT seems like a great way to give yourself superpowers I have stayed away from writing about this use case till now.  Yes, ChatGPT is great for those simple things that anyone with some basic knowledge could probably write themselves in the time it took to explain what was wanted.  But I like regular expressions… I’m definitely not a real expert, but I do like to play around with them and would consider myself an above average user.  So when I decided to test ChatGPT with a regular expression I asked it to solve something I have never been able to achieve on my own.  In fact I have never seen anyone else do this either… although I’m certain there are many people out there who would be very capable of doing it.  But when I’ve asked I have never had a satisfactory solution without using code, or without using multiple search & replace operations.
