Adding some clarity…
I wrote my first article where I started to dabble in the use of AI in January 2023… it seems a long time ago. Thankfully, here we are over two years later, and we haven’t been replaced by AI yet! Back then my foray into AI, like many others’, was more about seeing how good it was at making up silly poems, and trying hard to catch it out to see how much it hallucinated and made things up. This slowly moved into translating documents, playing with images, and seeing how it could help with some of the basics we use in our daily localisation work… such as regular expressions, XPath, and SQL statements. I eventually realised how useful it was for creating XML previews with stylesheets; also how bad it was at working with AutoHotkey version 2! Over time it became a go-to resource for information on just about anything. It does still lie occasionally, and can get things wrong, but since I know this and never take it as gospel, it has become a tool that gives me instant gratification: I can focus on doing what I want, and not so much on researching how to get there.
Now this of course isn’t always a good thing. If I had young children today it would definitely be questionable how much they should be exposed to it, because of the value in that research journey and doing things for themselves… but at my age I value the opportunity to do things I simply don’t have the time to learn myself. What sort of things? Well, at the end of 2023 I wrote an article on a localization course delivered by Carlos García Gómez, where I was encouraged to dabble in some things I knew little about… like programming in Python, for example, or writing PowerShell scripts. I had dabbled in these before and had some understanding of the absolute basics, but I hadn’t thought much further than that. I knew the sort of things clever localisation engineers could do with these technologies, but putting my ideas about how something could be solved into practice was beyond my skillset. But not anymore… at least it’s not beyond my reach anymore!
I have had a GitHub account for some years, but mainly used it to help write/edit readme files and documentation, and the occasional AutoHotkey script. In the last 12 months this has changed: with the help of AI I’ve been able to tackle more and more complex projects, and to create solutions for many of the problems I knew could be solved but would previously have had to ask a developer about. Today I have shared many of the things I’ve been building solutions for, and maybe they have helped others… I don’t know… but you can find them at git.filkin.com, where I was even able to build a nice little index page (I think so anyway!) with the help of AI that updates itself whenever I add something new. Some of the technologies I’ve dabbled in I have written about in recent months, but there is a lot more, and my GitHub site isn’t a complete list of everything I’ve been playing with:
- .NET applications
- AutoHotkey scripts
- PowerShell scripts
- Python scripts
- SQLite
- JavaScript and web technologies
- VB scripts (mostly for Office applications)
It’s a whole new world where, for everything you always wanted to build but didn’t know how… you can now have a go!! Over time the sort of things I have built with these AI-fuelled capabilities have grown in complexity, and I’ve been able to create things like a plugin for Trados Studio and an add-on for NVDA. In fact it’s the add-on for NVDA that I want to talk about today, because while I did manage to build a working add-on, it was still difficult and didn’t run very well. Some of the problems I was trying to solve around accessibility were very difficult to resolve with an add-on alone, and all the research I carried out suggested a browser extension would be needed to support NVDA by surfacing the things I needed to access.
But before I get carried away let me wind back a bit because you may not know what I’m referring to at all when I mention NVDA and accessibility. So let me start there!!
Introduction
What is NVDA?
NVDA (NonVisual Desktop Access) is a free, open-source screen reader for Windows. It reads out what’s on the screen using synthetic speech or Braille, allowing blind or visually impaired users to interact with software, documents, and the web.
What is Accessibility?
Accessibility means designing digital tools (websites, apps, software) so everyone can use them, including people with disabilities. In this context, it often means making sure:
- Buttons, links, and menus can be accessed with a keyboard.
- Screen readers like NVDA can read out what’s happening on the screen.
- Visual information has text alternatives (like image alt-text).
In short:
- NVDA is a tool that makes Windows usable without sight.
- Accessibility is the practice of ensuring your software can be used by everyone and not just those who can see, hear, or use a mouse.
Why would a website be a problem for a screen reader?
Before I get into the nuts and bolts of this it’s probably helpful to understand the specific challenges that screen reader users face with many modern web applications. Screen readers are assistive technologies that convert digital text and interface elements into speech or braille. They work by reading the underlying code (HTML) of web pages and then announcing that content in a linear, sequential manner. A good way to relate to this is to think about having someone read a book aloud to you… they start at the beginning and work their way through systematically.
The Challenge with Modern Web Applications
Visual Layout vs. Linear Reading
Traditional websites are relatively simple because they have a clear beginning, middle, and end that screen readers can follow naturally. But modern web applications are built more like desktop software, with:
- Multiple panels and sections displayed simultaneously
- Dynamic content that appears and disappears without warning
- Complex navigation structures with nested menus and toolbars
- Interactive elements scattered across the interface
Imagine trying to navigate a busy airport terminal while blindfolded, with someone only able to describe one small area at a time. That’s what screen reader users experience with complex web applications.
The “Tab Trap” Problem
Screen reader users primarily navigate using the Tab key to jump between interactive elements. On a complex web page this might mean:
- Pressing Tab 50+ times just to reach an important button
- Getting lost in navigation menus and toolbars
- Losing track of where they are in the interface
- Missing important content that’s not in the tab order
It’s like being forced to walk through every single room in a large office building, floor by floor, to reach the conference room… instead of taking the elevator directly to the right floor. In fact, if you try to work using only the keyboard on an application like Trados Cloud, you’ll see what I mean. It’s difficult without a mouse!
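One classic remedy for this is a “skip link”: an invisible link placed first in the tab order that jumps focus straight to the main content, so one keystroke replaces dozens of Tab presses. Here’s a minimal sketch in JavaScript, assuming a main region with the id main-content (my own illustrative assumption, not anything from a real page):

```javascript
// Sketch: inject a skip link so keyboard users can bypass the "tab trap".
// The id "main-content" and class "skip-link" are illustrative assumptions.

// Pure helper: build the label the screen reader will announce.
function skipLinkLabel(targetName) {
  return `Skip to ${targetName}`;
}

// Browser-only wiring: put the link first in the tab order.
if (typeof document !== 'undefined') {
  const link = document.createElement('a');
  link.href = '#main-content';                  // assumes the main region has this id
  link.textContent = skipLinkLabel('main content');
  link.className = 'skip-link';                 // styled to appear only when focused
  link.addEventListener('click', (event) => {
    event.preventDefault();
    const main = document.getElementById('main-content');
    if (main) {
      main.setAttribute('tabindex', '-1');      // make the region programmatically focusable
      main.focus();                             // one keystroke instead of 50 Tabs
    }
  });
  document.body.prepend(link);                  // first element keyboard users reach
}
```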
Silent Interface Changes
Modern web applications frequently update content dynamically, especially since they tend to be SaaS solutions where you are always on the latest version whether you like it or not:
- Dropdown menus appear without announcement
- Modal dialogs open silently
- Status messages flash briefly without being read aloud
- Page content changes without notification
For screen reader users, this is like having someone rearrange furniture in a room while their eyes are closed – they have no way of knowing what changed.
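To make these silent changes audible, a page (or an extension) can watch the DOM and speak up through an ARIA live region. Here’s a minimal sketch of the idea in JavaScript; the names in it (summariseMutation, liveRegion) are my own illustrative inventions, not code from tradosClarity:

```javascript
// Sketch: announce dynamic page changes to screen readers via an aria-live
// region fed by a MutationObserver. Names here are illustrative only.

// Pure helper: turn a mutation summary into a short spoken message.
function summariseMutation(added, removed) {
  const parts = [];
  if (added > 0) parts.push(`${added} item${added === 1 ? '' : 's'} added`);
  if (removed > 0) parts.push(`${removed} item${removed === 1 ? '' : 's'} removed`);
  return parts.join(', ');
}

// Browser-only wiring: a hidden live region plus an observer on the page body.
if (typeof document !== 'undefined') {
  const liveRegion = document.createElement('div');
  liveRegion.setAttribute('aria-live', 'polite'); // announced when the user is idle
  liveRegion.className = 'sr-only';               // visually hidden via CSS
  document.body.appendChild(liveRegion);

  const observer = new MutationObserver((mutations) => {
    const added = mutations.reduce((n, m) => n + m.addedNodes.length, 0);
    const removed = mutations.reduce((n, m) => n + m.removedNodes.length, 0);
    const message = summariseMutation(added, removed);
    if (message) liveRegion.textContent = message; // the screen reader reads this out
  });
  observer.observe(document.body, { childList: true, subtree: true });
}
```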
Missing Context and Landmarks
When you can see what’s on the screen you can quickly scan a page to understand its structure:
- “The main navigation is at the top”
- “The file list is in the center”
- “Action buttons are on the right”
Screen reader users, however, cannot do this, so they need the same structural information provided programmatically through proper HTML markup and ARIA labels. When this is missing, it’s like being given a 500-page document with no table of contents, headings, or page numbers.
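For the curious, this structural information is something you can actually query from a page yourself. Here’s a small JavaScript sketch that counts up a page’s landmarks to give the kind of overview a screen reader builds from them; the helper name landmarkOverview is just an illustrative invention:

```javascript
// Sketch: collect a page's landmark roles into a one-line overview, the
// programmatic equivalent of a sighted user's quick visual scan.

// Pure helper: summarise a list of landmark roles.
function landmarkOverview(roles) {
  const counts = {};
  for (const role of roles) counts[role] = (counts[role] || 0) + 1;
  return Object.entries(counts)
    .map(([role, n]) => (n === 1 ? role : `${role} (x${n})`))
    .join(', ');
}

// Browser-only: gather roles from landmark elements and explicit role attributes.
if (typeof document !== 'undefined') {
  const selector = 'main, nav, header, footer, aside, form, [role]';
  const roles = Array.from(document.querySelectorAll(selector))
    .map((el) => el.getAttribute('role') || el.tagName.toLowerCase());
  console.log('Page landmarks:', landmarkOverview(roles) || 'none found');
}
```

Run that in the F12 console on a well-built page and you get a tidy list; on a page with no landmarks you get “none found”, which is exactly the 500-page-document-with-no-headings situation.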
Chrome Extension
So, with that introduction out of the way, let’s come back to the Chrome Extension I mentioned and why I went that route. NVDA add-ons are plugins that extend what the screen reader can do. They’re powerful and can provide deep integration with NVDA’s features, but they would only help NVDA users; people using JAWS, VoiceOver, or other screen readers would still be stuck. In addition, modern web applications often bury their interactive elements so deep in complex JavaScript frameworks that NVDA add-ons struggle to find and interact with them reliably… at least I struggled with this! Browser extensions, on the other hand, work directly inside the web page, where they can inject JavaScript and modify the page’s structure from the inside.
Notwithstanding that… building a browser extension should work for everyone – NVDA users, JAWS users, VoiceOver users, and even people who just prefer keyboard navigation.
So in the end, and after many long nights and weekends trying to progress with the NVDA add-on, I decided to look at building a Chrome Extension instead. Why Chrome and not another browser? Primarily for one reason: I’m targeting the use of Trados Cloud and not websites in general. Chrome is the browser of choice for the Trados development team, and it’s the browser they recommend for users as well.
How did I get there?
This has been an interesting journey, and I have learned so much in a short period… especially about web technologies and what’s now available to anyone with a bit of technical curiosity, the right questions, and the ability to recognise when something isn’t quite right. For example…
- F12 – Developer Tools: these are built-in debugging tools that come with every modern web browser. Press F12 in Chrome, Firefox, or Edge and a panel opens showing you the website’s underlying code – the HTML structure, CSS styling, and JavaScript that makes the page work. You can even run code on a website to make it behave in a different way, or simply test what it’s doing.
- TamperMonkey: this is a browser extension that lets you run custom JavaScript code on any website. It’s a way to modify websites on-the-fly, so you can write a script that automatically fills out forms, changes how a page looks, or adds new functionality.
These are two things that have been instrumental in helping me build a Chrome Extension… well, OK, there are of course three things, AI being the biggest, since I would not have been able to work with the developer tools or TamperMonkey without it in the first place. I think F12 is the go-to tool for any website developer because it provides transparency into what’s going on, and you can generate logs that help you find problems as you’re trying to make your scripts work. To give you a very simple explanation of how F12 works… right-click on an image from this blog and then select “Inspect”. You’ll see something like this, where you can read the alt-text that was added when the image was uploaded, ensuring a screen reader can tell the user what that object is about:
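As a taste of what you can do beyond inspecting a single element, here’s a small snippet you could paste into the F12 console to audit a whole page for images missing their alt-text. It’s an illustrative sketch only, not part of tradosClarity:

```javascript
// Sketch: audit a page's images for missing or blank alt-text from the
// F12 console. The helper name auditAltText is an illustrative invention.

// Pure helper: given { src, alt } records, return the sources lacking alt-text.
function auditAltText(images) {
  return images
    .filter((img) => !img.alt || !img.alt.trim()) // empty or whitespace-only counts as missing
    .map((img) => img.src);
}

// Browser-only: collect every <img> on the page and report the gaps.
if (typeof document !== 'undefined') {
  const images = Array.from(document.querySelectorAll('img')).map((img) => ({
    src: img.src,
    alt: img.getAttribute('alt') || '',
  }));
  const missing = auditAltText(images);
  console.log(`${missing.length} of ${images.length} images lack alt-text:`, missing);
}
```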
Now that was a very simple use case; it can be used for a lot more than that. But to really get more from it you do need to be a developer, or have good experience in reading the information, finding what you’re looking for, and interacting with it. For very simple tasks I have become reasonably adept at this while working on some of the projects I started; but I found it quite complicated when I was troubleshooting, or needed to provide the AI with more information about particular components to help me build something. This is where I found TamperMonkey to be an invaluable tool. Using it I was able to create scripts that would let me very easily click on the elements of a page I wanted information about; the script would gather chapter and verse, then save that information to a file I could provide to the AI. I created the script using AI to deliver this:
- Complete Accessibility Data – Captures the full accessibility tree including ARIA attributes, computed roles, accessible names/descriptions, focus states, and screen reader compatibility information
- Comprehensive Element Geometry – Records all positioning data including bounding rectangles, offset properties, scroll positions, viewport coordinates, and exact dimensions for layout analysis
- Exhaustive Styling Information – Extracts complete computed styles covering layout properties, box model details, typography, flexbox/grid settings, colors, backgrounds, and visual effects
- Multiple Selector Types – Generates various ways to target the element including ID selectors, CSS selectors, XPath expressions, nth-child selectors, and data attribute selectors for automation
- Element Relationships & Context – Maps parent/child hierarchies, sibling relationships, ancestor chains, and form associations to understand element placement within the DOM structure
- Dynamic Properties & Performance – Monitors real-time changes through mutation observers, tracks visibility states, captures performance metrics like layout shifts, and records interactive properties like event listeners
The script essentially creates a “bible” of information about any web element I click, making it particularly valuable for providing complete element context to help the AI understand the problems I was asking it to solve. It looks like this, for example:
Here I just hover over the part I want more information about, click, and it’s captured to a location I can pick up later. This way I can quickly explore lots of things in one go and export all the information the AI needs to help me. That was a massive time saver!
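The full capture script is far too long to show here, but the core idea can be sketched as a TamperMonkey-style userscript: Alt+click an element, gather some basic facts about it, and log them as JSON you can save to a file. Everything below (the describeElement helper, the Alt+click convention) is an illustrative simplification, not the actual script:

```javascript
// Sketch: a TamperMonkey-style userscript that captures basic details of
// any element you Alt+click. Names and conventions are illustrative only.

// ==UserScript==
// @name     Element Capture Sketch
// @match    *://*/*
// @grant    none
// ==/UserScript==

// Pure helper: build a compact description from plain element data.
function describeElement(data) {
  // data: { tag, id, role, rect: { x, y, width, height } }
  const selector = data.id ? `#${data.id}` : data.tag;
  return {
    selector,
    role: data.role || '(none)',
    geometry: `${data.rect.width}x${data.rect.height} at (${data.rect.x}, ${data.rect.y})`,
  };
}

// Browser-only wiring: capture whatever is clicked while holding Alt.
if (typeof document !== 'undefined') {
  document.addEventListener('click', (event) => {
    if (!event.altKey) return; // Alt+click = capture; normal clicks untouched
    const el = event.target;
    const rect = el.getBoundingClientRect();
    const info = describeElement({
      tag: el.tagName.toLowerCase(),
      id: el.id,
      role: el.getAttribute('role'),
      rect: {
        x: Math.round(rect.x), y: Math.round(rect.y),
        width: Math.round(rect.width), height: Math.round(rect.height),
      },
    });
    console.log(JSON.stringify(info, null, 2)); // copy from the console into a file for the AI
    event.preventDefault();
  }, true);
}
```

The real script goes much further (ARIA attributes, computed styles, selectors, DOM relationships, mutation observers), but the capture-on-click pattern is the same.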
tradosClarity
tradosClarity is the name I have given to the Chrome Extension I’ve been building. I have made this an open-source project because my goal is to encourage developers who really do know what they’re doing to contribute, and to make it a really useful extension for screen reader users working with any of the Trados Cloud offerings in the Trados portfolio of products. You can find the project on my GitHub site, under its own repository, here:
https://github.com/paulfilkin/tradosClarity
I’m certainly no expert, but I have tried to set this up so it’s easier to manage as a project, and I’m fairly sure I’ll adapt this over time if I do get any traction and attract the attention of developers who really know what they’re doing! So far I have started an issues list where you can see what I’m working on already, and a project tracker in the form of a kanban board – all out-of-the-box GitHub features:
- issues list: https://github.com/paulfilkin/tradosClarity/issues
- project tracker: https://github.com/users/paulfilkin/projects/1
So far the features I have added are these:
- 🧭 Quick Navigation Dialog (Alt+Shift+N) – Opens visual menu to jump directly to main sections (M=Menu, S=Tabs, A=Actions, T=Table)
- 🏠 Direct Navigation Shortcuts – Alt+Shift+M (Main Menu), Alt+Shift+S (Section Tabs), Alt+Shift+A (Action Buttons), Alt+Shift+T (Content Table)
- 🎯 Focus Action Button (Alt+Shift+A) – Instantly finds and focuses important buttons like “Accept Task” or “Complete Task”
- 🔄 Restart Product Tours (Alt+Shift+R) – Clears tour progress data and refreshes page so all tours restart from step 1
- 🧭 Tour Navigation – Use arrow keys to navigate tour steps, Escape to exit, with full screen reader announcements
- ⚙️ Customizable Shortcuts – Change any keyboard shortcut through settings, with conflict detection to prevent duplicates
- 🎵 Smart Adaptation – Automatically detects available sections on each page and adapts to different Trados Cloud layouts
- 🔊 Audio Feedback – Provides screen reader announcements for navigation actions and context awareness
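To give a feel for how shortcuts like these can work under the hood, here’s a simplified sketch of a content script that matches an Alt+Shift keystroke and moves focus to the relevant section. The data-clarity-section attribute and the function names are hypothetical; this is not the actual tradosClarity code:

```javascript
// Sketch: a content script matching Alt+Shift shortcuts and moving focus.
// The shortcut map and data attribute are illustrative assumptions.

// Pure helper: map a keyboard event's fields to a section name, or null.
function matchShortcut(event) {
  if (!event.altKey || !event.shiftKey) return null;
  const map = {
    KeyM: 'main-menu',
    KeyS: 'section-tabs',
    KeyA: 'action-buttons',
    KeyT: 'content-table',
  };
  return map[event.code] || null;
}

// Browser-only wiring: listen for the keystroke and jump focus.
if (typeof document !== 'undefined') {
  document.addEventListener('keydown', (event) => {
    const target = matchShortcut(event);
    if (!target) return;
    event.preventDefault();
    // Hypothetical: each section is tagged with a data attribute we can find.
    const el = document.querySelector(`[data-clarity-section="${target}"]`);
    if (el) {
      el.setAttribute('tabindex', '-1'); // make the section programmatically focusable
      el.focus();                        // the screen reader follows the focus move
    }
  });
}
```

Keeping the keystroke-to-section mapping in one plain object is also what makes customisable shortcuts with conflict detection straightforward: the settings UI only has to edit that map.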
You can see from the issues list in GitHub that we already have a growing list of enhancements I intend to work on when I have time. But what would be really great would be to have other developers with an interest in this topic contributing to, and probably improving, the code I have written already. So I’m looking forward to seeing that happen! I’m also looking forward to seeing it used by more screen reader users, as the only feedback I have so far is from the RWS accessibility experts who have tested it, and who are doing a great job of identifying the work that needs to be addressed in Trados Cloud to make it even more accessible than it already is. So thanks, and massive kudos, to Anna Rita de Bonis, a blind translator working as an accessibility expert with RWS, and to Ana-Maria Matefi, who wears many hats at RWS: working with the customer experience team to support users looking for help, working on the RWS Campus program supporting the RWS academic partners, and, in the time she has left, testing the Trados products for accessibility with Anna Rita.
The extension itself is only going to work when using Trados Cloud as it targets that application, but it looks like this when the extension icon is clicked:
And this for the full settings interface:
You can find the user documentation here if you want to take a look without trying the extension yourself: tradosClarity Help Documentation
Chrome Store
My plan was to get this to a sensible point where I had features worth working with, and then submit it to the Google Chrome store so we could make it available to more users. When I investigated this, it not only looked complicated (for a non-developer), but I was also under the impression it could take some weeks to get published. So I duly completed all the necessary steps, paid the $5 fee, and submitted the extension for approval, expecting to add a few more features before it got published. But Google astonished me, because when I logged on a few hours ago the extension was approved and it’s now live! So this article has been written in its entirety in the last few, well 4 or 5, hours on a Sunday evening!! Hopefully it’s readable and interesting nonetheless.
You can find the extension here if you wish to test it out: tradosClarity
It is best used with a screen reader of course, so take your pick, but here are the two I’m targeting specifically:
A note on Trados Cloud accessibility
To finish up I want to write a brief note about Trados Cloud. The fact that I have created this extension is not a statement about the inaccessibility of Trados Cloud. In fact the development team at RWS have done some fantastic work to ensure that Trados Cloud, and Trados Studio for that matter, are accessible products. Are they perfect… no. Can they be used in practice… yes. But as I mentioned earlier in this article, modern web applications are complicated beasts, and true accessibility is not just about a screen reader being able to read what’s on the screen. It’s also about making what’s readable usable! This means providing a way for a screen reader user to work with as few keystrokes as possible. When you are sighted this isn’t a problem, because you can click anywhere you like with minimal effort. If you are forced to use the keyboard it’s not the same.
So the tradosClarity extension will mostly focus on addressing the usability aspects based on feedback from screen reader users. There will also be some things more related to how the application was built – things a screen reader has difficulty seeing – and I hope we can use this extension to address the most important of these until they get fixed in the product. But for now, the focus is to add to the features the Trados Cloud team have already built into the product. It’s much faster and easier for a small team to do this with an extension like tradosClarity, which doesn’t have to fit into the rigours and complexity of a much bigger development organisation.
Trados Cloud Accessibility Features
Trados Cloud has an optimised setting for screen reader users which can be enabled through this navigation:
- User Settings > Appearance > Accessibility features > Keyboard navigation and screen reader optimization
This is great, and it will be enhanced over time to include more support than it does already to help optimise user interface elements for screen readers. I am convinced there will be a time when tradosClarity isn’t needed at all!
There is just one small thing I would like to see improved as soon as possible… and that would be to make this option available on the first screen you get to when you start a Trados Cloud instance. It should be the first thing a screen reader calls out when a new user goes into Trados Cloud for the first time so it’s easy to enable, and helps them navigate around the product immediately. It’s a bit hard to get there when you cannot see what you’re looking at!!
A final few words
Accessibility isn’t just a problem for developers to solve. I’d recommend that everyone tries working with a screen reader, because it really will give you the best sense of what it is like for blind and visually impaired users who work in a digital environment. Until you have tried this yourself it’s quite difficult to imagine the problem, and it might encourage you to do a little to improve things for these users, just by adopting a more accessible approach to the way you work:
- Use proper headings: Apply real heading styles (e.g. Heading 1, Heading 2) instead of bold or enlarged text to organise content.
- Add descriptive alt-text: Provide meaningful alternative text for images, diagrams, and icons so screen readers can convey their purpose.
- Avoid vague link text: Use clear labels like “Download the report” instead of “Click here”.
- Use lists properly: Format bullet points and numbered lists using the correct tools, not just manual dashes or numbers.
- Keep tables simple and labelled: Use headers and avoid merging cells unnecessarily so screen readers can navigate them correctly.
- Ensure good contrast and readable fonts: This supports users with low vision and helps screen reader users orient themselves on the page.
- Test with a screen reader: If possible, try using NVDA briefly to hear how your content is read out.
I started this article by talking about how my own use of AI to help create small scripts, solving small problems here and there, has slowly but surely moved on to far more complex solutions. The extension I have managed to get working here has (so far) 4,251 lines of code created from scratch. Not much compared to others… the Trados Studio plugin I built has 35k lines of code… but that is mostly due to the Visual Studio extensions RWS provide to help you get started. So the real geniuses are the developers who built all of that without AI in the first place!! However, the point I’m making is that AI is enabling me to build a growing number of ambitious projects I could never have contemplated beforehand. When you only have a few hours in the evenings and at weekends, it would take a very long time to become good enough to develop anything substantial on top of a day job!
So whilst I do have mixed feelings about AI, and my mind is full of ethical questions about how far we should go, right now it has opened the doors to a world I always wanted to enjoy but didn’t have the time, and probably not the technical ability, to enter: doing what my colleagues in our R&D teams can do on their own!