Case study

Designing documentation for retrieval, not just reading

As LLMs became more integrated into support and product workflows, I noticed a recurring pattern: technically correct documentation still produced unreliable AI answers.

I led an initiative to rethink documentation structure through the lens of retrieval and chunking. The result was a "chunkability" framework that evaluates whether content can survive fragmented retrieval and remain understandable, trustworthy, and actionable in isolation.

AI Content Strategy, Documentation IA, RAG, Information Architecture

Impact

Made documentation survive fragmented retrieval

Reduced major AI inaccuracies by up to 83%

After restructuring documentation using the chunkability framework, several test sets saw major inaccuracies drop from 6/10 responses to as low as 0-1/10 across multiple LLMs.

Improved complete-answer rates from 0/10 to 7/10

In one enterprise connectivity test set, models that initially gave no complete answers (0/10) returned complete answers in 7 of 10 responses after structural documentation updates.

Created a reusable framework for AI-readable documentation

Built a "chunkability" audit system that evaluated documentation against retrieval-focused criteria like identity persistence, table survivability, scope locality, and boundary clarity.

Reframed hallucinations as information architecture problems

The project demonstrated that many unreliable AI responses were caused less by missing information and more by documentation structure that failed under fragmented retrieval.

The problem

Documentation that worked for humans was failing under fragmented retrieval

As the company expanded AI-assisted support and retrieval workflows, documentation quality became increasingly difficult to evaluate. Pages that worked well for human readers often produced incomplete or misleading AI responses once ingested into retrieval systems.

Large comparison tables lost context when chunked. Generic headings like "Selectors" or "Allow" became meaningless in isolation. Important constraints were frequently separated from the actions they governed. Even when the correct information existed, models struggled to consistently retrieve and reconstruct it.

The challenge was compounded by the lack of shared standards for AI-readable documentation. Most existing content guidance focused on readability, completeness, or writing quality rather than retrieval survivability.

The solution

A chunkability framework for AI-readable documentation

To better understand why technically accurate documentation was producing unreliable AI answers, I researched retrieval-augmented generation systems, chunking strategies, and retrieval behavior in modern LLM pipelines. I also completed specialized coursework focused on RAG architecture, retrieval evaluation, and context management.

Using that research, I developed a chunkability evaluation framework designed to test whether documentation could remain understandable and trustworthy after retrieval fragmentation. Instead of evaluating content purely for readability, the framework evaluated how well information survived chunk isolation, missing surrounding context, truncated tables, and retrieval ordering issues.

I then built an AI-assisted auditing skill that analyzed existing documentation against retrieval-focused criteria and generated structured findings, risk summaries, severity ratings, and proposed rewrites tied directly to retrieval failure patterns.

To complement the auditing workflow, I developed a remediation skill that translated findings into retrieval-friendly structural revisions. This created a repeatable feedback loop where documentation could be audited, revised, retested against LLM prompts, and continuously improved based on measurable answer quality outcomes.
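As a minimal sketch of what a structured audit finding and its risk summary could look like, consider the Python below. The `Finding` fields, the `Severity` levels, and the `risk_summary` helper are all hypothetical names chosen for illustration; the actual auditing skill may structure its output differently.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical three-level scale; the real audit may use other ratings.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Finding:
    criterion: str         # e.g. "identity persistence"
    section: str           # heading of the audited section
    severity: Severity
    failure_pattern: str   # retrieval failure the issue is expected to cause
    proposed_rewrite: str  # suggested structural revision


def risk_summary(findings: list[Finding]) -> dict[str, int]:
    """Count findings per severity level, highest severity first."""
    summary = {level.name: 0 for level in sorted(Severity, reverse=True)}
    for finding in findings:
        summary[finding.severity.name] += 1
    return summary


# Illustrative findings only, not real audit output.
findings = [
    Finding(
        criterion="identity persistence",
        section="Connection types",
        severity=Severity.HIGH,
        failure_pattern="chunk cannot be tied to a product",
        proposed_rewrite="Private Network Interconnect connection types",
    ),
    Finding(
        criterion="table survivability",
        section="Connection types",
        severity=Severity.MEDIUM,
        failure_pattern="table rows retrieved without a caption",
        proposed_rewrite="Add a descriptive table caption",
    ),
]

print(risk_summary(findings))
```

Keeping each finding tied to a named criterion and a failure pattern is what would let the same records drive the revise and retest steps of the loop.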

Identity persistence

Each retrievable section carries enough product, version, or surface context to identify itself in isolation.

Heading clarity

Headings describe the specific product, action, or outcome a user might search for.

Table survivability

Captions and row labels preserve meaning when tables are chunked without headers or surrounding text.

Scope locality

Prerequisites, warnings, and constraints stay near the action or decision they govern.

Procedural atomicity

Each step contains one action and one outcome, with branching logic clearly separated.

Query-term alignment

Documentation uses formal product terms alongside the language users are likely to search with.
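Two of the criteria above, identity persistence and heading clarity, lend themselves to simple heuristic checks. The sketch below is an assumption about how such checks could be approximated, not the framework's actual implementation: `PRODUCT_TERMS`, `GENERIC_HEADINGS`, and `audit_chunk` are hypothetical names, and the sample chunk text is invented for illustration.

```python
# Hypothetical values; a real audit would load these per product or doc set.
PRODUCT_TERMS = ["Private Network Interconnect"]
GENERIC_HEADINGS = {"selectors", "allow", "connection types", "overview"}


def audit_chunk(heading: str, body: str) -> list[str]:
    """Return the names of the criteria this chunk appears to fail."""
    failures = []
    chunk_text = f"{heading}\n{body}".lower()

    # Identity persistence: the chunk names its product somewhere.
    if not any(term.lower() in chunk_text for term in PRODUCT_TERMS):
        failures.append("identity persistence")

    # Heading clarity: the heading is more than a bare generic label.
    if heading.strip().lower() in GENERIC_HEADINGS:
        failures.append("heading clarity")

    return failures


before = audit_chunk(
    "Connection types",
    "The following table lists the available options.",
)
after = audit_chunk(
    "Private Network Interconnect connection types",
    "Private Network Interconnect supports the connection types below.",
)

print(before)  # ['identity persistence', 'heading clarity']
print(after)   # []
```

String matching like this only catches the most obvious cases. Criteria such as scope locality or procedural atomicity depend on meaning and would likely need human or model-assisted review.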

Before revision: Private Network Interconnect documentation with a generic "Connection types" heading and a table without a descriptive caption.

After revision: Private Network Interconnect documentation with clear product identity, a specific section heading, and a self-describing connection types table. The revised version improves identity persistence by repeating the full Private Network Interconnect context in the section heading and table setup. It also improves table survivability with a descriptive caption and row labels that remain meaningful if the table is retrieved without surrounding page context.
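To illustrate why a caption and row labels matter, the sketch below renders each table row as a standalone string that carries the caption and column names with it, which approximates what a retriever sees when it returns a single row. The `rows_to_chunks` helper, the column names, and the placeholder cell values are all hypothetical; they are not taken from the actual documentation.

```python
def rows_to_chunks(
    caption: str, headers: list[str], rows: list[list[str]]
) -> list[str]:
    """Render each row as a standalone string that repeats the caption
    and pairs every cell with its column name."""
    chunks = []
    for row in rows:
        label, *cells = row
        details = "; ".join(
            f"{name}: {value}" for name, value in zip(headers[1:], cells)
        )
        chunks.append(f"{caption} | {headers[0]}: {label} | {details}")
    return chunks


# Placeholder table content for illustration only.
caption = "Private Network Interconnect connection types"
headers = ["Connection type", "Bandwidth", "Use case"]
rows = [
    ["Type A", "<bandwidth A>", "<use case A>"],
    ["Type B", "<bandwidth B>", "<use case B>"],
]

chunks = rows_to_chunks(caption, headers, rows)
for chunk in chunks:
    print(chunk)
```

A bare row such as `Type A, <bandwidth A>, <use case A>` gives a model no product or column context. The rendered version stays interpretable even if it is the only fragment retrieved.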

Metrics

Small structural changes produced large answer-quality gains

The work showed that AI answer quality could improve dramatically without rewriting the underlying technical content. In several test sets, restructuring documentation around chunkability reduced major inaccuracies from 6/10 responses to as low as 0-1/10 across multiple LLMs.

In one enterprise connectivity test set, complete-answer rates improved from 0/10 to 7/10 after structural documentation updates. The improvements came from making product identity, scope, constraints, and relationships more resilient when retrieved as fragments.
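The arithmetic behind the headline reduction can be checked with a short calculation. This assumes the 83% figure corresponds to a drop from 6 to 1 major inaccuracies per 10 responses, which is a reading of the numbers above rather than something stated explicitly; `relative_reduction` is a hypothetical helper.

```python
def relative_reduction(before: int, after: int) -> float:
    """Fraction by which a count dropped, relative to its starting value."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# 6 major inaccuracies per 10 responses, reduced to 1 per 10.
print(f"{relative_reduction(6, 1):.0%}")  # 83%
```

Under the same formula, a test set that dropped from 6/10 to 0/10 would show a complete reduction.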

Up to 83% reduction in major inaccuracies

0/10 to 7/10 complete-answer improvement

Reusable retrieval audit criteria

Repeatable audit, revise, retest workflow

Reflection

AI-ready documentation is fundamentally an information architecture problem

Here are the principles I would carry into any content system that needs to remain accurate when retrieved, fragmented, or read out of order:

Retrieval quality depends on structure, not just writing quality

The biggest lesson from this work was that retrieval quality depends heavily on structure, not just writing quality. Documentation now has to function both as a human reading experience and as a machine-readable knowledge system.

Retrieval failures are often predictable

Hallucinations consistently mapped back to structural weaknesses like anonymous headings, fragmented tables, missing scope, or disconnected constraints.

Small structural changes can dramatically improve AI reliability

Some of the highest-impact fixes were deceptively lightweight: adding product identity to section openers, rewriting generic headings, adding table captions, moving constraints earlier, and using more retrieval-oriented phrasing.