“Tags” are something we normally like to avoid, whether it’s graffiti or documents prepared for translation in a CAT tool, and you can find articles and forum threads all over the internet about how to avoid them. But what if you want them… the ones in a CAT tool? Let’s say you receive a project from your client in a package, and they didn’t prepare the files as well as you would have liked, leaving you to deal with strings you’d rather have protected as tags, or even tags you don’t want to have to tackle at all. In a nutshell, if you’re using Studio you’re stuffed! You can prepare the files again as you like (possibly), translate them in your own project, and then pre-translate the real project afterwards from your TM, correcting any tag differences before returning the package to your client.
But how many people really do that? My guess is not too many. But judging by the number of times I’m asked whether it’s possible to tag up the files in Studio after the project has been prepared, or remove tags from a file after the project has been prepared, I think the number who’d like to is rather higher. Well, you still can’t do this in Studio out of the box, but you can do it with the help of a clever plugin developed by a translating developer, or developing translator (not sure which way around this should be), Jesse Good. The application Jesse developed does a lot more than just this, but I’m going to focus on a couple of interesting tagging components.
It’s called “Cleanup Tasks” and it’s freely available from the SDL AppStore (make sure you have at least version 1.2 if you already downloaded this). You can read all the details in a very useful blog article written by Jesse where he explains, “in a nutshell”, how you can handle the following:
- You can lock segments based on structure or content
- You can remove unwanted tags in the source
- You can modify the source or target text as you like and create “settings” files for easy reuse
- You can create placeholders for fixed words or phrases
- You can use StrConv from Visual Basic to perform many useful tasks, such as:
  - case conversion
  - full-width to half-width character conversion
  - conversion between Hiragana and Katakana
  - conversion between Simplified and Traditional Chinese characters
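To give a feel for what the StrConv-style conversions in that list actually do to the text, here is a minimal sketch in Python. It is not the plugin's code (the plugin relies on StrConv from Visual Basic); the function names are my own, and the Unicode NFKC normalisation used for the width conversion is only an approximation, since it folds other compatibility characters as well. Simplified/Traditional Chinese conversion needs a mapping table, so it is left out.

```python
import unicodedata

# Hiragana U+3041..U+3096 line up with Katakana U+30A1..U+30F6,
# so converting between them is a fixed code point offset.
KANA_OFFSET = 0x60


def to_half_width(text: str) -> str:
    """Fold full-width letters and digits to half-width (approximation via NFKC)."""
    return unicodedata.normalize("NFKC", text)


def hiragana_to_katakana(text: str) -> str:
    """Shift Hiragana characters into the Katakana block; leave the rest alone."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


def katakana_to_hiragana(text: str) -> str:
    """Shift Katakana characters into the Hiragana block; leave the rest alone."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


if __name__ == "__main__":
    print("studio".upper())                  # case conversion
    print(to_half_width("ＡＢＣ１２３"))       # full-width to half-width
    print(hiragana_to_katakana("ひらがな"))   # Hiragana to Katakana
    print(katakana_to_hiragana("カタカナ"))   # Katakana to Hiragana
```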
The plugin creates two new batch tasks in Studio that support you running a “cleanup” on your project file(s) before or after they’ve been translated. I’d really recommend you take a careful look at the article Jesse has written because he does a better job than I could of explaining how it works. But I just want to explain how it could benefit you when faced with a file like this for example:
Clearly I made the file up to show a couple of typical problems, but I think it works for this example. There are three things to note in my file.
- It’s a good example of an awful PDF conversion (segment #1)
- It contains embedded HTML in a file (segments #3 – #11)
- There are some product names I may wish to protect (segments #12 & #13)
You can find plenty of examples all over the internet on how to prepare files to tackle this properly. Well, when I say properly, I mean how to deal with an improperly prepared file! For example:
How to get rid of a tag soup in Trados Studio – an article by Emma Goldsmith on dealing with the tag soup
Regex for Microsoft Word… is there no end? – an article by me on using regular expressions in Word to hide the tags in the embedded content
Both of these require work prior to bringing the files into Studio. Not too helpful if you get a package and the Project Manager didn’t prepare them well for you, or if the files in question require a separate product that you don’t own and don’t wish to purchase either! But now thanks to the help of this little plugin from Jesse tackling these types of things can be a lot easier. It should be noted that care should be taken with this because you could easily remove things that should really be there, but used carefully this application could become one of the most popular apps on the SDL AppStore. So read his blog article carefully.
The process is really simple… you just run a couple of new batch tasks created when you install the plugin: “Cleanup Source” to deal with the tags and other things, and “Cleanup Target and Generate Files” if you added tags to protect text from translation. If you didn’t, and if saving the target will work (this is where you must take care), then you can simply save the target files as normal. But you should have a play with this before using it in anger to make sure you understand the implications for your work.
Running a single batch task on my test file above converted it to this in a couple of seconds and I can save the settings file I created to reuse in future files:
Now this is a very crude example, but the result is going to be much easier to handle, and as you can see the cleanup is even smart enough to retain some of the formatting you might want to keep, such as bold and italic tags. The HTML conversion was also pretty clever because with one expression it was even able to distinguish between tag pairs and placeholders. Very smart… but remember the earlier warning: if you have not prepared your expressions carefully, the target file might not be what you expect. I know the HTML part isn’t the best. Let’s face it, this isn’t really the most appropriate way to handle HTML files in the first place, so we are just looking at different ways to tackle situations which are undesirable and could easily be avoided if a little more thought were given to the localisation process.
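If you’re wondering how a single expression can tell tag pairs from placeholders, the idea can be sketched roughly like this in Python. To be clear, this is not the expression I used in the plugin, nor how the plugin works internally. The pattern, the `VOID_ELEMENTS` set and the `classify_tags` function are all assumptions made up for illustration: one regular expression captures an optional leading slash, the tag name and an optional trailing slash, and those three pieces are enough to decide what each tag is.

```python
import re

# One pattern for every embedded tag:
#   group 1: "/" if it is a closing tag
#   group 2: the tag name
#   group 3: "/" if the tag is self-closing
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)\b[^<>]*?(/?)>")

# Hypothetical short list of elements that never have a closing partner.
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}


def classify_tags(segment: str) -> list[tuple[str, str]]:
    """Return (tag text, kind) pairs, where kind is open, close or placeholder."""
    found = []
    for match in TAG.finditer(segment):
        closing, name, self_closing = match.group(1), match.group(2).lower(), match.group(3)
        if closing:
            kind = "close"
        elif self_closing or name in VOID_ELEMENTS:
            kind = "placeholder"
        else:
            kind = "open"
        found.append((match.group(0), kind))
    return found


if __name__ == "__main__":
    for tag, kind in classify_tags('Click <a href="x.html">here</a><br/> now'):
        print(kind, tag)
```

An open/close couple with the same name would become a tag pair and anything marked as a placeholder would stand alone. A tag the pattern fails to anticipate is exactly the sort of thing that can leave you with a target file you didn’t expect.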
I’m not going to reproduce the excellent article Jesse has written or explain how to do all the things it’s capable of, but I will show you a quick video using the file I created above. I think you’ll get the point… and I think you’ll like it!
Excellent job Jesse! I’m going to enjoy using this one to solve all kinds of interesting problems and here’s another link to his article in case you missed it on the way down:
Cleanup Plug-in Tool by Jesse Good.
… and the plugin itself!
Cleanup Plug-in on the SDL AppStore