My son asked me how my day had gone and before I could answer he said in a slightly mocking tone “blah blah blah… XML… blah… XML … blah blah”. Clearly I spend too much time outside of work talking about work, and clearly his perception of what I do is skewed towards the more technical aspects I like the most! Aside from the note to self “stop talking about this stuff after I leave the office!” it got me thinking about why I think about XML as much as I apparently do, and how I could help others avoid the very same compulsion! I’ve written articles in the past about how to use regular expressions in Studio, and an article on using XPath, and I’ve probably touched on handling XML files from time to time in various articles. But I don’t think I’ve ever explained how to create an XML filetype in the first place, or why you would want to… after all Studio has default filetypes for XML and this is just another filetype that the CAT tool should be able to handle… right?
On the first day of Christmas my Studio gave to me…
12 Verification SDK and API!
11 QA Checker Profiles
10 Segments to exclude
9 Punctuation checks
8 Regular Expressions
7 Terminology Verification
6 Trademark checks
5 Number checks
4 Segment Verifications
3 Length Verification checks
2 Word Lists
… and no Inconsistencies by default
The Quality Assurance features in Studio are quite extensive, and they are often loved and hated all at the same time. Loved because when used correctly they can provide excellent assurance that you’ll have happy clients… hated because the automated recognition of numbers, dates etc. in Studio follows the settings of your computer and sometimes these are not what you need.
Since Studio 2014 was launched it’s been interesting to see what some users were waiting for. Did they want the Quickmerge, Alignment, AutoSave, improved navigation, blistering speed, automatic concordance search, improved filters, enhanced locking functionality, custom TM user ID, improvements to the term recognition threshold, more options in the display filter, auto-substitution for acronyms and a host of other improvements? No… and I genuinely don’t mean this in a mean way… it seems for some users an easier way to handle typographical quotes is the order of the day and this hasn’t radically changed since TagEditor.
The release of Studio 2014 will bring a number of new OpenExchange applications to the App Store. One of these is already becoming well known based on the name alone… the SDLXLIFF Toolkit! The name suggests this is a tool for working with an SDLXLIFF and being able to take it to pieces and interact with all of its components… and this is probably a good explanation of what it actually does.
By taggy files I mean “embedded xml or html content” that is written into an Excel file alongside translatable text. In the last article I wrote I documented a method sometimes used by people to handle tagged content in a Word file… funnily enough I came across a Word file containing the XML components of an IDML file today and I guess it must have been prepared in a very similar way judging by the enormous number of tags using the tw4win style to hide them when opened by any SDL Trados version! Proof for me that this practice is sadly alive and well. But I digress… because this time I want to cover how to handle a similar problem when you find HTML or XML tagged content in an Excel file. This crops up quite a bit on ProZ so I thought it might be better to document it once and for all so I have something else to refer to in addition to the Studio help.
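To give a feel for what “taggy” content looks like once it is separated out, here is a minimal Python sketch that splits the text of a single cell into tags and translatable text. It only illustrates the idea and is not how Studio processes Excel files; the `TAG` pattern and the `split_tags` helper are hypothetical names of my own, and the pattern is deliberately simple (it will not cope with comments, CDATA or a stray `<` in the text).

```python
import re

# Assumption: a "tag" is anything that looks like <name ...> or </name>.
TAG = re.compile(r"</?[A-Za-z][^<>]*>")


def split_tags(cell):
    """Split cell text into ("tag", ...) and ("text", ...) pieces, in order."""
    parts = []
    pos = 0
    for match in TAG.finditer(cell):
        # Anything between the previous tag and this one is translatable text.
        if match.start() > pos:
            parts.append(("text", cell[pos:match.start()]))
        parts.append(("tag", match.group()))
        pos = match.end()
    # Trailing text after the last tag, if any.
    if pos < len(cell):
        parts.append(("text", cell[pos:]))
    return parts


if __name__ == "__main__":
    for kind, value in split_tags("Click <b>Save</b> now"):
        print(kind, repr(value))
```

The point of the split is simply that only the “text” pieces need translating, while the “tag” pieces should be protected from accidental edits.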
Unfortunately the practice of being asked to translate a Microsoft Word file that contains HTML code doesn’t look as though it will go away any time soon for some translators. But it’s not the end of the world and it’s often all in the preparation of the Word file before you translate it.
When I first started adding articles about how to use regular expressions I thought I’d only write three… but I had an interesting question from one of our resellers, Agenor (actually Agenor always asks me the hardest questions!), about how to use the display filter to find segments that contain one word, but not another. It was tricky, but once you have it you can use the expression all the time. I have a collection of such things from when people ask me, so I thought I’d share how this problem was solved and also post a list of some of the useful regular expressions I have saved for the display filter in Studio 2011.
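For anyone curious about the general shape of a “contains one word, but not another” expression, here is a small Python sketch using a negative lookahead. This is one common way to write such a pattern and not necessarily the expression from the article; the helper name `contains_one_not_other` and the sample words are my own, and the display filter in Studio may behave differently from Python's `re` module in the details.

```python
import re


def contains_one_not_other(segment, wanted, unwanted):
    """Return True if `segment` contains `wanted` as a whole word
    and does not contain `unwanted` as a whole word."""
    pattern = re.compile(
        # (?!...) is a negative lookahead anchored at the start of the segment:
        # it fails the match if the unwanted word appears anywhere.
        rf"^(?!.*\b{re.escape(unwanted)}\b).*\b{re.escape(wanted)}\b",
        re.IGNORECASE | re.DOTALL,
    )
    return pattern.search(segment) is not None


if __name__ == "__main__":
    segments = [
        "Open the file menu",
        "Open the file in the folder",
        "Open the folder",
    ]
    for segment in segments:
        print(segment, "->", contains_one_not_other(segment, "file", "folder"))
```

Written out for the words “file” and “folder”, the pattern is `^(?!.*\bfolder\b).*\bfile\b`, so only the first sample segment matches.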
When I started writing this blog the first article I wrote was about the SDL OpenExchange. I thought I’d start this year off by sharing my favourite applications … my favourite FREE applications. We had a fair few of these over the course of the year but I’ll pick out six that I think are well worth a look. In no particular order (well… alphabetical order) these six are:
- Glossary Converter
- Package Reader
- SDLXLIFF Compare
- SDLXLIFF to Legacy Converter
Handling number only segments is a question that comes up a fair bit, and for a number of reasons. Mostly it’s the simpler question of how to handle them at all; sometimes they are recognised and Studio can auto-localize them; sometimes they aren’t recognised and you need to work around this a little. This question I’ve addressed a few times, so here are a few links as a reminder.
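As a rough illustration of what “number only” means in practice, here is a minimal Python sketch that tests whether a segment consists of nothing but a number. The `NUMBER_ONLY` pattern and the `is_number_only` helper are assumptions of mine and not Studio's own recognition rules, which depend on your language settings; this version simply accepts digits grouped by a comma, full stop or space, with an optional sign.

```python
import re

# Assumption: digits, optionally signed, with groups separated by
# a comma, a full stop or whitespace (e.g. 1,234.56 or 1 000).
NUMBER_ONLY = re.compile(r"^\s*[+-]?\d+(?:[.,\s]\d+)*\s*$")


def is_number_only(segment):
    """Return True if the whole segment is a single number."""
    return NUMBER_ONLY.match(segment) is not None


if __name__ == "__main__":
    for segment in ["2014", "1,234.56", "1 000", "-42", "12 apples", ""]:
        print(repr(segment), "->", is_number_only(segment))
```

Anything with letters in it, such as “12 apples”, is rejected, which is the sort of segment you would still want to see in the editor.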
A few articles ago I spent time explaining how to use the TermInjector OpenExchange application from Tommi Nieminen which allows you to create dynamic variables based on regular expressions.
It is a pretty complex article and I had to reread it a couple of times to get my head around it again, and I needed expert help from Tommi, but it was worth the effort because this tool could prove to be invaluable for users who regularly have to deal with numbers in a document that are not recognised by Studio, or currencies that are not written in a way that lets Studio automatically localize them for you.