Hunspell dictionaries in Studio

When I write these articles I always start by thinking about the image at the top.  I do this for two reasons: the first is that it usually helps me think of some bizarre introduction (like this!) that gets me writing, and the second is that every now and again I like to play around with GIMP, the free image software I occasionally use.  It’s always nice to spend a little time doing something frivolous because it’s good thinking time without being distracted by the job!  I don’t really know how to use this software at all, but it’s fun seeing what turns out… and I confess I often use a combination of PowerPoint and GIMP simply because some things are just easier in PowerPoint!  Eventually I might actually learn how to use it properly… I’ll keep practising anyway.

Now, onto the point of the article, which is Hunspell dictionaries.  I’ve mentioned these before in an earlier article (scroll to the end of it to find the relevant section) where I explained how to add new Hunspell dictionaries to Studio, because Studio supports significantly more languages than there are dictionaries available for spell checking.  I used English as the example as there are 94 variants in Studio and only 7 supported by Hunspell.  The result is the rather annoying effect on the right when you want to check your spelling in an unsupported variant:

You can add new dictionaries based on the existing ones by following the process in the article or the video I created to explain it.  However, this is still a process many users are unable to follow because of the things that can get in the way: not having admin rights, being unable to see the relevant files and folders because they are hidden in Windows, not having a good text editor, not knowing what language code to use, being unsure about making changes to an XML file, and so on.  Interestingly enough, an idea was even logged on the SDL ideas site yesterday, or the day before, where the user wanted more dictionaries or an easier way to add them based on the process I just mentioned.  The timing was good because the SDL AppStore team has had this on their to-do list for a very long time and finally managed to make time to fit it in, so now the process is really easy!

Hunspell Dictionary Manager

This is a standalone application which you can find on the SDL AppStore.  You can run it from an icon on your desktop, or from the shortcut you’ll find in the Studio navigation menu after installing it.  The process is simple and based on this idea:

  1. Select the existing dictionary that will be the baseline spellchecker for your new one
  2. Select the language variant of your new one
  3. Click “create” and the dictionary is ready
  4. Restart Studio so the new dictionary is available for use

That’s certainly a lot easier than the manual way (check this wiki for a more detailed step by step or watch the video at the end) and I think this will be a very valuable tool for anyone who works with a language that is not supported by Hunspell in Studio but can sensibly be based on an existing one.  This last point, “sensibly based on an existing one”, is important to understand: the application is not going to create spellcheckers for any main language that is not already supported, as there is nothing to base it on… well, you could, but it may not make sense.

The Hunspell dictionaries currently in Studio are actually quite old and based on a different technology to the newer ones, so changing them to use newer and possibly more flexible formats, as far as languages are concerned, is a serious bit of work for the Studio development team.  This meant the AppStore team were unable to find an easy way to use Hunspell dictionaries from the various websites around the internet that might offer better options for languages that are not supported at all.  What do I mean by this?  There is one Arabic flavour, for example, Arabic (Algeria), and you could use this as the basis of a new dictionary for any of the other 26 variants of Arabic that are supported by Studio.  But you could not create a dictionary for Burmese or Irish, for example, because there is nothing in there to base the dictionary on.  So I think there is still a need for the Studio development team to improve this situation, but when they do so will depend on all the other priorities they have.
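If you’re curious what “basing a new dictionary on an existing one” boils down to at file level, here is a minimal Python sketch.  It only relies on the fact that a Hunspell dictionary is a pair of files, an .aff (affix rules) and a .dic (word list), and simply copies that pair under a new name.  The function name, the file names ar_DZ and ar_MA, and the folder are my own assumptions for illustration; this is not how the Hunspell Dictionary Manager is implemented, and it does not cover the XML configuration change Studio also needs in the manual process.

```python
import shutil
import tempfile
from pathlib import Path


def clone_hunspell_dictionary(dict_dir, base_code, new_code):
    """Copy the base .aff/.dic pair to files named after new_code.

    Hypothetical helper for illustration only; the naming scheme
    (e.g. "ar_DZ.aff") is an assumption, not Studio's documented layout.
    """
    dict_dir = Path(dict_dir)
    pairs = []
    # Validate everything first so a failure never leaves half a dictionary.
    for ext in (".aff", ".dic"):
        source = dict_dir / f"{base_code}{ext}"
        target = dict_dir / f"{new_code}{ext}"
        if not source.is_file():
            raise FileNotFoundError(f"Baseline file is missing: {source}")
        if target.exists():
            raise FileExistsError(f"Refusing to overwrite: {target}")
        pairs.append((source, target))
    for source, target in pairs:
        shutil.copyfile(source, target)
    return [target for _, target in pairs]


if __name__ == "__main__":
    # Demonstrate on a throwaway folder rather than a real installation.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "ar_DZ.aff").write_text("SET UTF-8\n", encoding="utf-8")
        (folder / "ar_DZ.dic").write_text("1\nexample\n", encoding="utf-8")
        for path in clone_hunspell_dictionary(folder, "ar_DZ", "ar_MA"):
            print("created", path.name)
```

The sketch refuses to overwrite existing files and checks both files before copying anything, which mirrors the kind of care you need when doing this by hand in a folder that may need admin rights.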

In the meantime the Hunspell Dictionary Manager is going to help you make sensible changes, such as create Arabic (Morocco) from Arabic (Algeria) because they are both Western Arabic belonging to the Maghrebi Arabic language family (note this is quite an assumption on my part… but I hope it illustrates the point):

If you don’t like watching videos then visit the wiki, which is also linked from the About tab in the app and explains in detail how to use the application.  But if you’re a video person then here’s a short one to explain how it works:

Duration: 5 mins 18 seconds

 

Thoughts on “Hunspell dictionaries in Studio”

  1. Hi Paul,

    On a somewhat related subject, could you explain (or point me to a reference that explains) the advantages of using the Hunspell Spell Checker vs. the MS Word Spell Checker in Trados Studio? Our team only uses the English (Canada) and French (Canada) language variants. We do have Microsoft Word but don’t use the Word dictionary for terminology.

    Thanks!

    1. Hi Christine, I think it comes down to personal preference in your case. I don’t think either spellchecker is useful for terminology, but perhaps that’s not what you meant. Maybe something like Antidote would be more useful for you as this is considered an excellent solution for French and they also have an English checker… so might be very useful in your case.

      1. Hi Paul,

        Thanks for your reply! No, indeed, when I mentioned terminology in Word dictionaries, I was just repeating something I read in SDL’s resources… But I think you’ve confirmed that for general spell checking, there are no major differences between the two options that we should know about. That’s all I wanted to know. 🙂

        We do also use Antidote and we love the plugin! I hadn’t started reading multifarious back in 2016 so I hadn’t seen your article about it – very interesting. Thank you for your hard work in making the plugin happen.

        Christine

        1. That’s great… and to be fair to the Antidote team they have been very helpful since then. They even gave us some licences for testing and demonstrating the product.

  2. Dear Paul,

    Thanks a lot for this wonderful post! 🙂

    Do you know if it’s possible to use the spell checker for the following languages:

    Hindi, Nepali and Tamil

    We would like to run a spell check on these in Studio, and apparently there is nothing we can use to verify the files.

    Kind regards,

    David

    1. Hi David, Hindi and Marathi are supported by Hunspell in Studio, but Nepali and Tamil are not. I don’t know if either of these can be used as the basis for a new dictionary to support what you need. Perhaps you can tell me?
      Microsoft offer support for all of these and more with their proofing tools if you download and install the appropriate language pack and then Studio will use that.
      LanguageTool also offers support for Tamil and we have a plugin for LanguageTool on the AppStore too.
      Perhaps some of this information will help you?
