After attending the xl8cluj conference in Romania a few weeks ago, which was an excellent, and very technical conference for translators, I thought it was about time I wrote an article around the things you can do with the Regular Expression Delimited Text filter since it is so useful for solving all kinds of tasks related to text based files that don’t fit any of the out of the box formats available in the product. Files such as software string files and csv files are common examples of where understanding how to work with this customisable file type can yield many benefits. So this article is food for thought and a few things that might be helpful to you in the future. It’s also pretty long (I’m not kidding!), so maybe grab a cup of coffee before you start to go through it!
The origin of Chad (if you’re British) or Kilroy (if you’re American) seems largely supposition. The most likely story I could find, or rather the one I like the most, is that it was created by the late cartoonist George Edward Chatterton ‘Chat’ in 1937 to advertise dance events at a local RAF (Royal Air Force) base. After that Chad is remembered for bringing attention to any shortages, or shortcomings, in wartime Britain with messages like Wot! No eggs!!, and Wot! No fags!!. It’s not used a lot these days, but for those of us aware of the symbolism it’s probably a fitting exclamation when you can’t save your target file after completing a translation in Trados Studio! At least that would be the polite exclamation since this is one of the most frustrating scenarios you may come across!
At the start of this article I fully intended this to be a simple description of the problems around saving the target file, but like so many things I write it hasn’t turned out that way! But I found it a useful exercise so I hope you will too. So, let’s start simple despite that introduction because the reasons for this problem usually boil down to one or more of these three things:
- Not preparing the project so it’s suitable for sharing
- Corruption of a project file
- A problem with the source file or the Studio filetype
At the beginning of each year we probably all review our priorities for the New Year ahead so we have a well balanced start… use that gym membership properly, study for a new language, get accredited in some new skill, stop eating chocolate… although that may be going just a bit too far, everything is fine with a little moderation! I have to admit that moderating chocolate isn’t, and may never be, one of my strong points even though it’s on my list again this year! But the idea of looking at our priorities and setting them up appropriately is a good one so I thought I’d start off 2018 with a short article explaining why this is even important when using SDL Trados Studio, particularly because I see new users struggling with, or just not being aware of, the concepts around the prioritisation of filetypes. If you don’t understand them then you can find code doesn’t get tagged correctly despite you setting it up, or non-translatable text is always getting extracted for translation even though you’re sure you excluded it, or even files being completely mishandled. Continue reading
There are well over 200 applications in the SDL AppStore and the vast majority are free. I think many users only look at the free apps, and I couldn’t blame them for that as I sometimes do the same thing when it comes to mobile apps. But every now and again I find something that I would have to pay for but it just looks too useful to ignore. The same logic applies to the SDL AppStore and there are some developers creating some marvellous solutions that are not free. So this is the first of a number of articles I’m planning to write about the paid applications, some of them costing only a few euros and others a little more. Are they worth the money? I think the developers deserve to be paid for the effort they’ve gone to but I’ll let you be the judge of that and I’ll begin by explaining why this article is called double vision!!
From time to time I see translators asking how they can get target documents (the translated version) that are fully formatted but contain the source and the target text… so doubling up on the text that’s required. I’ve seen all kinds of workarounds ranging from copy and paste to using an auto hotkey script that grabs the text from the source segment and pastes it into the target every time you confirm a translation. It’s a bit of an odd requirement but since we do see it, it’s good to know there is a way to handle it. But perhaps a better way to handle it now would be to use the “RyS Enhanced Target Document Generator” app from the SDL AppStore? Continue reading
I’m back on the topic of PDF support! I have written about this a few times in the past with “I thought Studio could handle a PDF?” and “Handling PDFs… is there a best way?“, and this could give people the impression I’m a fan of translating PDF files. But I’m not! If I was asked to handle PDF files for translation I’d do everything I could to get hold of the original source file that was used to create the PDF because this is always going to be a better solution. But the reality of life for many translators is that getting the original source file is not always an option. I was fortunate enough to be able to attend the FIT Conference in Brisbane a few weeks ago and I was surprised at how many freelance translators and agencies I met dealt with large volumes of PDF files from all over the world, often coming from hospitals where the content was a mixture of typed and handwritten material, and almost always on a 24-hr turnaround. The process of dealing with these files is really tricky and normally involves using Optical Character Recognition (OCR) software such as Abbyy Finereader to get the content into Microsoft Word and then a tidy up exercise in Word. All of this takes so long it’s sometimes easier to just recreate the files in Word and translate them as you go! Translate in Word…sacrilege to my ears! But this is reality and looking at some of the examples of files I was given there are times when I think I’d even recommend working that way!
A nice picture of a cutie cat… although I’m really looking for a cutie linguist and didn’t think it would be appropriate to share my vision for that! More seriously the truth isn’t as risqué… I’m really after Qt Linguist. Now maybe you come across this more often than I do so the solutions for dealing with files from the Qt product, often shared as *.TS files, may simply role off your tongue. I think the first time I saw them I just looked at the format with a text editor, saw they looked pretty simple and created a custom filetype to deal with them in Studio 2009. Since that date I’ve only been asked a handful of times so I don’t think about this a lot… in fact the cutie cat would get more attention! But in the last few weeks I’ve been asked four times by different people and I’ve seen a question on proZ so I thought it may be worth looking a little deeper.
Years ago, when I was still in the Army, there was a saying that we used to live by for routine inspections. “If it looks right, it is right”… or perhaps more fittingly “bullshit baffles brains”. These were really all about making sure that you knew what had to be addressed in order to satisfy an often trivial inspection, and to a large extent this approach worked as long as nobody dug a little deeper to get at the truth. This approach is not limited to the Army however, and today it’s easy to create a polished website, make statements with plenty of smiling users, offer something for free and then share it all over social media. But what is different today is that there is potential to reach tens of thousands of people and not all of them will dig a little deeper… so the potential for reward is high, and the potential for disappointment is similarly high.
One of my favourite features in Studio 2017 is the filetype preview. The time it can save when you are creating custom filetypes comes from the fun in using it. I can fill out all the rules and switch between the preview and the rules editor without having to continually close the options, open the file, see if it worked and then close the file and go back to the options again… then repeat from the start… again… and again… I guess it’s the little things that keep us happy!
I decided to look at this using a YAML file as this seems to be coming up quite a bit recently. YAML, pronounced “Camel”, stands for “YAML Ain’t Markup Language” and I believe it’s a superset of the JSON format, but with the goal of making it more human readable. The specification for YAML is here, YAML Specification, and to do a really thorough job I guess I could try and follow the rules set out. But in practice I’ve found that creating a simple Regular Expression Delimited Text filetype based on the sample files I’ve seen has been the key to handling this format. Looking ahead I think it would be useful to see a filetype created either as a plugin through the SDL AppStore, or within the core product just to make it easier for users not comfortable with creating their own filetypes. But I digress…
We all know, I think, that translating a PDF should be the last resort. PDF stands for Portable Document Format and the reason they have this name is because they were intended for sharing with users on any platform irrespective of whether they owned the software used to create the original file or not. Used to share so they could be read. They were not intended to be editable, in fact the format is also used to make sure that the version you are reading can’t be edited. So how did we go from this original idea to so many translators having to find ways to translate them?
I think there are probably a couple or three reasons for this. First, the PDF might have been created using a piece of software that is not supported by the available translation tool technology and with no export/import capability. Secondly, some clients can be very cautious (that’s the best word I can find for this!) about sharing the original file, especially when it contains confidential information. So perhaps they mistakenly believe the translator will be able to handle the file without compromising the confidentiality, or perhaps they have been told that only the PDF can be shared and they lack the paygrade to make any other decision. A third reason is the client may not be able to get their hands on the original file used to create the PDF.
If you’ve never come across Microsoft Publisher before then here’s a neat explanation from wikipedia.
“Microsoft Publisher is an entry-level desktop publishing application from Microsoft, differing from Microsoft Word in that the emphasis is placed on page layout and design rather than text composition and proofing.”
It’s actually quite a neat application for newbies to desktop publishing like me, but it’s a difficult tool to handle if you receive *.pub files (the format used by MS Publisher) and are asked to translate them. And I do see requests from translators from time to time asking how they can handle them. The file itself is a binary format and even with Office 2016 (which includes Publisher if you have the Professional version) the only export formats of PDF, XPS and HTML are not importable. So very tricky indeed if you need to be able to provide your client with a translated version of the pub format.