According to Wikipedia there are some 9.6 to 12 million people speaking Haitian Creole worldwide. I had no idea it was such a widely spoken language until I was asked a question this week about why the Google Translate machine translation provider in Studio returned French translations when the project was en(US) – fr(HT) (French-Haiti).
In fact I had no idea that French-Haiti was most likely the language intended to be used in Studio for Haitian Creole, as this isn’t a language I come across very often.
But before I can ask a developer to fix this problem I have to be able to understand it myself, so the first thing I wanted to know was whether French-Haiti was the same as Haitian Creole or not. For anyone as interested as I was in reading more on this, I found the three links below, which explain how the language came about, and it does have a very interesting history:
- Wikipedia – Haitian Creole
- Indiana University Bloomington – Creole: The National Language of Haiti
- about world languages – Haitian Creole
The next thing I did was see if it was really so different to French… using a few basic sentences. I used Google Translate to check this, since the request was to make Google’s Haitian Creole available when fr(HT) is the target language in a Studio project. Based on this example alone it certainly looks like a different language, and not just a few words here and there as we have between English US and English UK.
I guess Google would not have provided it if 12 million Haitians were wrong! I can’t judge whether the translation is good or not, but that doesn’t matter as much as making sure that Haitian Creole is returned when you use fr(HT) in your project. Perhaps it’s also wrong for the language to be called French-Haiti in the first place and not Haitian Creole? Maybe we should have ht(HT), ht(BS), ht(CA) etc. I don’t know, and more to the point I can’t do anything about that either as Studio uses the Windows Language Code Identifiers (LCID) which are all fully qualified with a country code and a language code. There is no Haiti on its own, only fr(HT). But based on Google and these basic sentences it certainly looks as though it could be a separate language altogether.
So the task was to stop the machine translation results coming back in French, given that Studio has no Haitian Creole available to it because of the LCIDs. I turned to the SDL AppStore team, and Andrea-Melinda Ghişa in particular, as she had done some work in bringing Google NMT and Microsoft NMT to the MT Enhanced plugin on the SDL AppStore. In the space of an hour or so Andrea had the answer! After a little research she found this site, Google Web Interface and Search Language Codes, which shows that Google is not using the fully qualified LCIDs, only a web interface code, which in this case is ht. When Studio used fr(HT) as the target language the code being sent to Google was fr. That makes sense, as there are 47 varieties of French in Studio, most of them very similar to each other, and Google certainly doesn’t have 47 corresponding varieties to apply machine translation to. But fr(HT) is clearly an exception, so Andrea changed the code in the MT Enhanced plugin so that when Studio presents fr(HT) the plugin sends ht to Google.
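The plugin’s actual change isn’t reproduced here, but the idea is easy to illustrate. What follows is only a minimal sketch in Python, not the MT Enhanced code: the function name `google_language_code` and the table `GOOGLE_CODE_OVERRIDES` are made up for illustration, and the only mapping taken from the story above is fr(HT) to ht. It assumes, as described above, that everything else is sent as the bare language code.

```python
# Hypothetical override table: Studio target language -> code sent to Google.
# Only the fr-HT -> ht entry comes from the article; nothing else is implied.
GOOGLE_CODE_OVERRIDES = {"fr-HT": "ht"}


def google_language_code(studio_code: str) -> str:
    """Return the code to send to Google for a Studio language such as fr(HT)."""
    # Accept either "fr(HT)" (as written in this post) or "fr-HT".
    normalized = studio_code.strip().replace("(", "-").replace(")", "")
    language, _, region = normalized.partition("-")
    language = language.lower()
    key = f"{language}-{region.upper()}" if region else language
    # Check the exceptions first; otherwise drop the region and send
    # only the language part, e.g. fr(CA) -> fr.
    return GOOGLE_CODE_OVERRIDES.get(key, language)
```

With this sketch, `google_language_code("fr(HT)")` gives `ht` while `google_language_code("fr(CA)")` still gives `fr`, which is the behaviour described above. Other languages may well need their own exceptions, so treat the fallback as an assumption rather than a rule.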
As simple as that, and the result really shows the benefit of having the API approach for these kinds of things because it means anyone who knows how to develop can implement a solution in their own time. There’s no need to submit a request to the product management team for SDL Trados Studio and then work through the request lifecycle based on its position in the overall priority list and then all the process and controls that take place after that. You can implement a solution right now! Under your own control without the need to engage the development team for the core product at all. This is a very powerful capability offered by the Studio platform… of course you do need to be, or have access to, an Andrea!!
In Studio, if I use the out of the box Google Translate alongside the updated MT Enhanced plugin (now v1.7) for my fr(HT) project you can see the difference:
Longer term the community API team would like to add a mapping feature to this plugin, so that if Google adds more languages in the future, which is quite likely, a user can change the mapping table themselves in a simple UI. Now that would be cool! But all good things have to wait their turn, even in the AppStore team. As the code is open source, if you are a developer and would like to implement this idea and share it with others then please go ahead and do it… the source code is here! In fact you can find more information about the SDL OpenSource Community here.
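For any developer tempted by the mapping table idea, here is one way the data side of it might look. This is purely a sketch in Python and an assumption on my part, not the plugin’s design: the names `DEFAULT_MAPPINGS` and `load_mappings` are hypothetical, the table is assumed to be stored as a JSON object that a UI could read and write, and `xx-YY` / `zz` in the usage note are placeholders rather than real language codes.

```python
import json

# Built-in default: the one mapping described in this post.
DEFAULT_MAPPINGS = {"fr-HT": "ht"}


def load_mappings(json_text: str) -> dict:
    """Merge user-edited mappings (a JSON object) over the built-in defaults."""
    user_mappings = json.loads(json_text) if json_text.strip() else {}
    if not isinstance(user_mappings, dict):
        raise ValueError("mapping table must be a JSON object")
    merged = dict(DEFAULT_MAPPINGS)  # copy, so the defaults are never modified
    for studio_code, google_code in user_mappings.items():
        # Reject empty or non-text values rather than sending them to Google.
        if not isinstance(google_code, str) or not google_code:
            raise ValueError(f"invalid Google code for {studio_code!r}")
        merged[studio_code] = google_code
    return merged
```

Calling `load_mappings('{"xx-YY": "zz"}')` would return the default fr-HT entry plus the new placeholder entry, so a user could add a language without anyone touching the plugin code.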
Looking forward to any contributions!!