A nice picture of a cutie cat… although I’m really looking for a cutie linguist and didn’t think it would be appropriate to share my vision for that! More seriously, the truth isn’t as risqué… I’m really after Qt Linguist. Now maybe you come across this more often than I do, so the solutions for dealing with files from the Qt product, often shared as *.TS files, may simply roll off your tongue. I think the first time I saw them I just looked at the format with a text editor, saw they looked pretty simple and created a custom filetype to deal with them in Studio 2009. Since then I’ve only been asked a handful of times so I don’t think about this a lot… in fact the cutie cat would get more attention! But in the last few weeks I’ve been asked four times by different people and I’ve seen a question on ProZ, so I thought it may be worth looking a little deeper.
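To give an idea of why the format looked simple in a text editor, here is a minimal sketch in Python that lists the source and translation pairs from a small TS file. The sample content and the function name `list_messages` are my own inventions for illustration, and the label "finished" is just what I print when the `type` attribute is absent; real files can carry more elements than this (locations, comments, plural forms), so treat it as a rough look at the structure and not a complete parser.

```python
import xml.etree.ElementTree as ET

# Invented sample, trimmed to the elements a translator usually cares about.
SAMPLE_TS = """<!DOCTYPE TS>
<TS version="2.1" language="de_DE">
<context>
    <name>MainWindow</name>
    <message>
        <source>Open file</source>
        <translation type="unfinished"></translation>
    </message>
    <message>
        <source>Quit</source>
        <translation>Beenden</translation>
    </message>
</context>
</TS>
"""


def list_messages(ts_text):
    """Return (context, source, translation, status) for every message."""
    root = ET.fromstring(ts_text)
    rows = []
    for context in root.iter("context"):
        name = context.findtext("name", default="")
        for message in context.iter("message"):
            source = message.findtext("source", default="")
            translation = message.find("translation")
            if translation is None:
                rows.append((name, source, "", "missing"))
                continue
            # No "type" attribute is reported here as "finished" (my own label).
            status = translation.get("type", "finished")
            rows.append((name, source, translation.text or "", status))
    return rows


if __name__ == "__main__":
    for row in list_messages(SAMPLE_TS):
        print(row)
```

The point is only that the translatable text sits in predictable `source` and `translation` elements, which is what makes a custom filetype feasible.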
Years ago, when I was still in the Army, there was a saying that we used to live by for routine inspections. “If it looks right, it is right”… or perhaps more fittingly “bullshit baffles brains”. These were really all about making sure that you knew what had to be addressed in order to satisfy an often trivial inspection, and to a large extent this approach worked as long as nobody dug a little deeper to get at the truth. This approach is not limited to the Army however, and today it’s easy to create a polished website, make statements with plenty of smiling users, offer something for free and then share it all over social media. But what is different today is that there is potential to reach tens of thousands of people and not all of them will dig a little deeper… so the potential for reward is high, and the potential for disappointment is similarly high.
One of my favourite features in Studio 2017 is the filetype preview. The time it can save when you are creating custom filetypes comes from the fun in using it. I can fill out all the rules and switch between the preview and the rules editor without having to continually close the options, open the file, see if it worked and then close the file and go back to the options again… then repeat from the start… again… and again… I guess it’s the little things that keep us happy!
I decided to look at this using a YAML file as this seems to be coming up quite a bit recently. YAML, which rhymes with “camel”, stands for “YAML Ain’t Markup Language” and I believe it’s a superset of the JSON format, but with the goal of making it more human readable. The YAML Specification sets out the rules, and to do a really thorough job I guess I could try and follow them. But in practice I’ve found that creating a simple Regular Expression Delimited Text filetype based on the sample files I’ve seen has been the key to handling this format. Looking ahead I think it would be useful to see a filetype created either as a plugin through the SDL AppStore, or within the core product, just to make it easier for users not comfortable with creating their own filetypes. But I digress…
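To show the kind of rule I mean, here is a minimal sketch in Python of the regular expression idea: treat everything up to the colon as structure and the value after it as the translatable text. The pattern, the sample content and the function name `extract_translatable` are assumptions of mine for illustration, not the rules from any Studio filetype, and the pattern deliberately ignores multi-line values, lists and the other things the full specification allows.

```python
import re

# Assumed pattern: optional indent, a key, a colon, then a single-line value
# that may be wrapped in matching single or double quotes.
LINE_RE = re.compile(
    r'''^(?P<indent>\s*)(?P<key>[\w.-]+):\s+(?P<quote>["']?)(?P<text>.+?)(?P=quote)\s*$'''
)

# Invented sample in the style of a localisation file.
SAMPLE = '''\
en:
  greeting: "Hello, world"
  # a comment
  farewell: Goodbye
  menu:
    title: 'Main menu'
'''


def extract_translatable(yaml_text):
    """Return (line_number, key, text) for each single-line value found."""
    results = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = LINE_RE.match(line)
        if match:
            results.append((number, match.group("key"), match.group("text")))
    return results


if __name__ == "__main__":
    for item in extract_translatable(SAMPLE):
        print(item)
```

Keys with no value on the same line, such as `en:` and `menu:`, are left alone because there is nothing after the colon to translate. A rule this simple only works when the sample files are this regular, which is why I base it on the files I have actually seen.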
We all know, I think, that translating a PDF should be the last resort. PDF stands for Portable Document Format, and the name reflects the intention: files that can be shared with users on any platform, irrespective of whether they own the software used to create the original file. They were meant to be shared so they could be read, not edited; in fact the format is also used to make sure that the version you are reading can’t be edited. So how did we go from this original idea to so many translators having to find ways to translate them?
I think there are probably a couple or three reasons for this. First, the PDF might have been created using a piece of software that is not supported by the available translation tool technology and with no export/import capability. Secondly, some clients can be very cautious (that’s the best word I can find for this!) about sharing the original file, especially when it contains confidential information. So perhaps they mistakenly believe the translator will be able to handle the file without compromising the confidentiality, or perhaps they have been told that only the PDF can be shared and they lack the paygrade to make any other decision. A third reason is the client may not be able to get their hands on the original file used to create the PDF.
If you’ve never come across Microsoft Publisher before then here’s a neat explanation from Wikipedia.
“Microsoft Publisher is an entry-level desktop publishing application from Microsoft, differing from Microsoft Word in that the emphasis is placed on page layout and design rather than text composition and proofing.”
It’s actually quite a neat application for newbies to desktop publishing like me, but it’s a difficult tool to handle if you receive *.pub files (the format used by MS Publisher) and are asked to translate them. And I do see requests from translators from time to time asking how they can handle them. The file itself is a binary format, and even with Office 2016 (which includes Publisher if you have the Professional version) the only export formats are PDF, XPS and HTML, and none of these can be imported again. So it’s very tricky indeed if you need to provide your client with a translated version in the pub format.
What the heck is a good bug? I don’t know if there is an official definition for this so I’m going to invent one.
“An unintended positive side effect as a result of computer software not working as intended.”
I reckon this is a fairly regular occurrence and I have definitely seen it before. So for example, in an earlier version of Studio you could do a search and replace in the source and actually change the source content. This was before “Edit source” was made available… sadly it was fixed pretty quickly and you can no longer do this unless you use the SDLXLIFF Toolkit or work in the SDLXLIFF directly with a text editor. In the gaming world it happens all the time, possibly the most famous being the original Space Invaders game where the aliens got faster and faster as you killed more of them. This was apparently not by design but the result of the processor speed being limited, so as you killed the aliens there were fewer graphics to draw and the rendering got faster and faster… now all games behave this way! Another interesting example in the Linux/Unix world is using a dot at the start of a filename to hide it from view. This was apparently a bug that was so useful it was never “fixed”.
“Gabriela descended from the train, cautiously looking around for signs that she may have been followed. Earlier in the week she’d left arrangements to meet László at the Hannover end of Platform 7, and after three hours travelling in a crowded train to get there was in no mood to find he hadn’t got her message. She walked up the platform and as she got closer could recognise his silhouette even though he was facing the opposite direction. It looked safe, so she continued to make her way towards him, close enough to slip a document into the open bag by his side. She whispered ‘Read this and I may have to shoot you!’ László left without even a glance in her direction, only a quick look down to make sure there was no BOM.”