Can AI write your accessibility documentation?
After wanging on to Amy Hupe about the perils of letting AI write design system documentation, my brain went wandering into a dangerous place: could you get AI to write your accessibility docs?
Design system docs should be tailored to the needs of the consumers and the brand. While there are many similarities between generic design system components, how they’re applied in products varies. It’s also important to consider your audience: the docs I write for engineers are succinct and technical, while folks in marketing might appreciate a more brand-focused approach.
But accessibility guidance for components?
There’s some variation based on context, but the core principles remain consistent: component usage, keyboard interaction and focus order, and WCAG and platform compliance standards. These don’t change much from system to system. I thought I’d have a pop at aggregating all this into a prompt to save me searching the web every time.
Prompt 1: Aggregate
When I’m researching the accessibility guidelines for a component, I generally start with WCAG, then pull in various sources from bookmarks. I got ChatGPT to help me write a prompt to do this, thus demonstrating how the world is slowly eating itself. Here’s what we came up with:
☞ Generate accessibility guidance
The results are helpful and mostly accurate. The danger is that everything is formatted neatly so it looks authoritative. Despite my prompt explicitly referencing only trusted sources, it still makes ridiculous shit up, hallucinating entire components. Here are some of my favourite fabrications:
My favourite hallucinated components
Prompt 2: Write docs
Once I’ve gathered guidance from credible sources, I need to condense it into succinct, actionable accessibility guidance for design system sites. So I wrote a second prompt that takes all that research and outputs clean markdown ready to copy and paste.
If there’s already a Storybook, zeroheight site or component docs, I’ve tried to tailor results to the system.
You could easily switch up the output by referencing a brand tone of voice or specifying a particular format. Test it out and let me know what you think; I’d love feedback on whether it works for you and whether it’s actually useful.
How I built this
With a site built in Astro, it’s surprisingly straightforward to set up a little app like this. I’m a designer and I’m not super technical. I wrote a JS function that calls the OpenAI API with my prompt. Netlify runs the function and keeps the API key tucked away safely in environment variables. I’m finding Astro is so fast and light for little experiments.
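Here’s roughly what that looks like. It’s a sketch rather than my exact code: the file name and response shape are illustrative, but the OpenAI chat completions endpoint and the environment-variable pattern are real.

```js
// netlify/functions/generate-guidance.js (hypothetical file name)
exports.handler = async (event) => {
  const { component } = JSON.parse(event.body || "{}");

  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // The key never reaches the browser: Netlify injects it
      // from an environment variable at runtime.
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [
        {
          role: "user",
          content: `Generate accessibility guidance for a ${component} component.`,
        },
      ],
    }),
  });

  const data = await response.json();
  return {
    statusCode: 200,
    body: JSON.stringify({ guidance: data.choices[0].message.content }),
  };
};
```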
To keep things neat, I put the core logic and prompt structure in a shared JS module. Both of my functions import it, so I don’t have to duplicate everything and can keep adding new bits in one place.
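Something along these lines, with hypothetical paths and names:

```js
// netlify/functions/lib/prompts.js (hypothetical path)
// Shared scaffolding that both functions import.
const TRUSTED_SOURCES = [
  "https://www.w3.org/WAI/WCAG22/quickref/",
  "https://www.w3.org/WAI/ARIA/apg/",
];

function buildPrompt(component, mode) {
  const task =
    mode === "docs"
      ? "Condense the research into succinct, copy-paste-ready markdown."
      : "Aggregate accessibility guidance, citing only the trusted sources.";
  return [
    task,
    `Component: ${component}`,
    `Trusted sources: ${TRUSTED_SOURCES.join(", ")}`,
  ].join("\n");
}

module.exports = { buildPrompt, TRUSTED_SOURCES };
```

Each function just imports buildPrompt and passes its own mode, so a tweak to the source list only has to happen once.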
To appease my conscience, I’m using the GPT-4o-mini model, a smaller and more efficient version of OpenAI’s GPT-4o. Because it’s lighter, each request apparently uses less computing power, which in turn reduces the electricity required and lowers the carbon footprint compared to larger models. While the exact savings depend on the data-centre hardware, GPT-4o-mini is designed to deliver useful results with a much smaller environmental impact. Apparently.
What struck me most about this whole ‘vibe coding’ experiment is how much you can build without actually understanding what you’re doing. It’s genuinely unsettling. I don’t know how the code works, or if it even works the way I think it does. I’m copying and pasting ChatGPT’s suggestions, wrestling with Copilot until the compiler errors disappear, then iterating until something functional emerges. Seems harmless enough for a side-quest experiment, but the thought that significant chunks of the web are now being built this way? Absolutely terrifying.
Where I landed
This is genuinely useful for quick aggregation and a solid time-saver. The possibilities are interesting, but the risks are very real. Hallucinations are paired with the overconfidence of a crypto influencer explaining NFTs to your Nan. LLMs are significantly better than they were a year ago, but they still regularly get it wrong. Given that the material they’re fed is flawed, can we ever rely on them to be 100% correct? The WebAIM Million still shows that 94% of the internet’s top million homepages have WCAG failures, so I feel like we’re asking the impossible. I’m keen to investigate RAG (retrieval-augmented generation), where the model pulls in up-to-date info from sources I specify. This could help mitigate hallucinations and improve accuracy. Maybe I’ll test this next.
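If I do, the retrieval step could start as simply as fetching those trusted sources and grounding the prompt in their text. A naive sketch, with hypothetical names and deliberately crude text extraction:

```js
// Fetch a trusted page and strip it to plain text. A real pipeline
// would parse, chunk and rank the content properly.
async function fetchExcerpt(url) {
  const res = await fetch(url);
  const html = await res.text();
  return html.replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").slice(0, 4000);
}

// Build a prompt grounded in the fetched excerpts, so the model leans
// on the sources rather than whatever its training data half-remembers.
async function buildGroundedPrompt(component, sources) {
  const excerpts = await Promise.all(sources.map(fetchExcerpt));
  return [
    `Using ONLY the excerpts below, write accessibility guidance for a ${component}.`,
    "If the excerpts don't cover something, say so instead of guessing.",
    ...excerpts.map((text, i) => `Source ${i + 1} (${sources[i]}):\n${text}`),
  ].join("\n\n");
}
```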
More importantly, it’s got me thinking about how, as an accessibility practitioner, I can use AI responsibly amidst such serious ethical and environmental costs. It’s hard to ignore that powerful AI systems rest on the invisible work of underpaid labour. I also worry that an over‑reliance on AI can erode critical thinking, turning me from a mildly engaging human into a zombified prompt engineer. I’m determined to keep writing blog posts from scratch so I don’t lose my own voice.
Cue: my existential AI crisis
You know, I actually enjoy researching and writing this stuff from scratch. That way I know it’s correct. While I could potentially bend this prompt to a bunch of other things like:
- A designer’s checklist for handover
- Accessibility acceptance criteria for component tickets
- A VPAT row generator for procurement docs
My concern is that if this gets packaged up neatly, people will treat the AI output as absolute truth. These tools work best when they handle the heavy lifting, maybe that first 60%, while you bring the human insight that understands actual users, real-world context, and the stakes involved. When someone’s ability to access critical information or complete essential tasks hinges on our guidance being accurate, there’s no substitute for thoughtful human oversight and lived experience.
While I was writing this, Mike Monteiro published a piece called How to not build the Torment Nexus. He talks about the importance of not just building things because we can, but considering the impact of what we create. It’s a timely reminder that while AI can be a powerful tool, it should be used thoughtfully. Without getting all mystical about it, I firmly believe the universe puts things in my path for a reason, and this one encouraged me to park the whole thing in the experiment zone.
I have huge respect for anyone brave enough to swerve the hype wagon and decide not to engage with AI. My brain currently swings between quiet excitement and guilt-spiralling. My inner Geri, the one who rescues spiders from baths and feels bad for unused emojis, is horrified that I’m publicly admitting to even this level of moral compromise.
Resources
For transparency, here are the sites my prompt references; they’re the sources I trust for accessibility guidance. All links open in a new tab.
- WCAG 2.2 Quick Reference
- Mobile Content Accessibility Guidelines
- WCAG plain-English explanations
- ARIA Authoring Practices
- Apple Human Interface Guidelines
- Material 3 components
- Atomic A11y
- MDN Web Docs
If you have opinions, hit me up on Bluesky.