Building an AI Chat for the Nexus UI Docs

7 min read • Jun 2026

For the past couple of months, I've been building Nexus UI, an open-source component library for AI interfaces. The docs site was already in good shape. Every component has install steps and examples, and the search bar works well when you know what you're looking for.

What search can't do is answer questions. It can't tell you which components fit together for a specific pattern, or walk you through a follow-up like "okay, how does this component relate to that one?"

So I built Nexus AI: a chat panel inside the docs where you can talk to the documentation directly. Press ⌘/ on any page, ask in plain language, and get a streamed answer with links back to the real docs.

Nexus AI on the docs — hit ⌘/, ask a question, answer streams in

Starting with search and a model

My initial approach was simple: run FlexSearch over the docs, pipe the user's question to a model through OpenRouter, and stream back a reply. I knew the AI SDK from earlier work, and FlexSearch was already on the site for search. So I assumed building the chat UI would be the hard part and the backend would be much quicker.

It wasn't. The model's responses sounded great but were often wrong. It would invent npm packages, guess import paths, and describe props that don't exist. The model knew nothing about Nexus UI on its own, so it could only guess from the little that search turned up. That was a big problem, and I figured I needed to build a more robust system.

That's when I started reading about RAG. I learned that a better way to ground a model is to augment its prompt with external knowledge retrieved at query time, instead of relying on whatever it picked up in training. I spent a lot of time in Cursor going back and forth, trying things, breaking things, until the responses actually tracked with what I'd written in the documentation.

The retrieval pipeline

To stop the model from making things up, I built a RAG pipeline. What I ended up with has three steps: chunk the docs into a searchable corpus, find the right chunks for each question, and inject them before the model answers.

Building the corpus

On server startup, I pull every docs page, split it by section heading, and chunk each section into ~1,000 character pieces. I also index component source files and a small hand-written facts block with the correct install command and import path.
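Here's a rough sketch of what the split-and-chunk step could look like. It is not the actual Nexus AI code: the `DocChunk` shape, the function names, and the paragraph-packing strategy are all assumptions for illustration. Only the heading split and the ~1,000 character target come from the description above.

```typescript
// Hypothetical shape for one searchable piece of the corpus.
interface DocChunk {
  page: string;    // e.g. "/docs/prompt-input" (illustrative)
  heading: string; // section heading the text came from
  text: string;
}

const MAX_CHUNK_CHARS = 1000; // the ~1,000 character target

// Split a markdown page into sections at heading lines ("#" to "######").
// Simplification: a "#" line inside a fenced code block is also treated as a heading.
function splitByHeading(markdown: string): { heading: string; body: string }[] {
  const sections: { heading: string; body: string }[] = [];
  let current = { heading: "", body: "" };
  for (const line of markdown.split("\n")) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      if (current.heading || current.body.trim()) sections.push(current);
      current = { heading: (match[1] ?? "").trim(), body: "" };
    } else {
      current.body += line + "\n";
    }
  }
  if (current.heading || current.body.trim()) sections.push(current);
  return sections;
}

// Pack paragraphs into pieces of at most maxChars.
// A single paragraph longer than maxChars is hard-split.
function chunkText(text: string, maxChars: number = MAX_CHUNK_CHARS): string[] {
  const chunks: string[] = [];
  let buffer = "";
  for (const paragraph of text.split(/\n{2,}/)) {
    const trimmed = paragraph.trim();
    if (!trimmed) continue;
    // Flush the buffer if adding this paragraph would exceed the limit.
    if (buffer && buffer.length + 2 + trimmed.length > maxChars) {
      chunks.push(buffer);
      buffer = "";
    }
    if (trimmed.length > maxChars) {
      for (let i = 0; i < trimmed.length; i += maxChars) {
        chunks.push(trimmed.slice(i, i + maxChars));
      }
      continue;
    }
    buffer = buffer ? buffer + "\n\n" + trimmed : trimmed;
  }
  if (buffer) chunks.push(buffer);
  return chunks;
}

// Turn one docs page into chunks that remember their page and section.
function chunkPage(page: string, markdown: string): DocChunk[] {
  const out: DocChunk[] = [];
  for (const section of splitByHeading(markdown)) {
    for (const text of chunkText(section.body)) {
      out.push({ page, heading: section.heading, text });
    }
  }
  return out;
}
```

Keeping the page and heading on every chunk is what lets a later step link an answer back to the section it came from.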

…

Source code chunks help the model but shouldn't become links in answers. So each chunk has a citeable flag. Docs and facts are citeable. Source files are internal only. I also specified in the system prompt that links should be directed to /docs URLs, never to raw source paths.
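As a sketch of how a citeable flag could work in practice (the type names, paths, and the `/docs` prefix check are assumptions, not the real implementation): every retrieved chunk can inform the answer, but only citeable ones are allowed to turn into links.

```typescript
type ChunkKind = "docs" | "facts" | "source";

// Hypothetical corpus entry. `path` is a docs URL for docs and facts,
// and a file path for source chunks.
interface CorpusChunk {
  kind: ChunkKind;
  path: string;
  text: string;
  citeable: boolean;
}

// Docs and facts are citeable; source files are internal only.
function makeChunk(kind: ChunkKind, path: string, text: string): CorpusChunk {
  return { kind, path, text, citeable: kind !== "source" };
}

// Collect unique links from the retrieved chunks, skipping anything that is
// not citeable or does not point at a /docs URL.
function citationLinks(retrieved: CorpusChunk[]): string[] {
  const links: string[] = [];
  for (const chunk of retrieved) {
    const isDocsUrl = chunk.path.indexOf("/docs") === 0;
    if (chunk.citeable && isDocsUrl && links.indexOf(chunk.path) === -1) {
      links.push(chunk.path);
    }
  }
  return links;
}
```

Filtering in code is a second line of defense next to the system prompt instruction: even if the model tries to cite a source path, that path never shows up in the list of allowed links.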

Finding the right chunks

FlexSearch indexes the corpus in memory. For a component library where people literally ask about "Prompt Input" and there's a page called Prompt Input, keyword search gets you further than I expected. I'm planning on using embeddings for vaguer questions, but this was a fine starting point.

One bug that I ran into early was that I was only searching on the latest message. Follow-ups like "show me an example" retrieved nothing because the query was just that phrase, with no mention of the component we'd been discussing. I fixed this by concatenating all user messages in the thread to form the search query:

…
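A minimal sketch of that fix, assuming a simplified message shape with a plain `content` string (real chat SDK messages are often structured differently, and the names here are hypothetical):

```typescript
interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

// Build the search query from every user message in the thread, not just the
// latest one, so a follow-up still carries the component name from earlier.
function buildSearchQuery(messages: ChatMessage[]): string {
  return messages
    .filter((m) => m.role === "user")
    .map((m) => m.content.trim())
    .filter((text) => text.length > 0)
    .join(" ");
}
```

With a thread of "How do I use Prompt Input?" followed by "show me an example", the query becomes both phrases joined together, so keyword search still matches the Prompt Input page.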

After search, I rank hits and cap how many chunks any single page section can contribute.
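The ranking and capping step might look something like this. The `SearchHit` shape, the score field, and the limits of 2 per section and 6 overall are assumptions picked for the example, not the values Nexus AI uses.

```typescript
interface SearchHit {
  page: string;
  heading: string;
  text: string;
  score: number; // higher is better; how it is computed is outside this sketch
}

// Sort hits by score, then walk down the list and keep at most
// `maxPerSection` chunks from any one page section, up to `maxTotal` overall.
function selectChunks(
  hits: SearchHit[],
  maxPerSection: number = 2,
  maxTotal: number = 6,
): SearchHit[] {
  const ranked = hits.slice().sort((a, b) => b.score - a.score);
  const usedPerSection = new Map<string, number>();
  const selected: SearchHit[] = [];
  for (const hit of ranked) {
    if (selected.length >= maxTotal) break;
    const key = hit.page + "#" + hit.heading;
    const used = usedPerSection.get(key) ?? 0;
    if (used >= maxPerSection) continue;
    usedPerSection.set(key, used + 1);
    selected.push(hit);
  }
  return selected;
}
```

The cap keeps one long section from crowding everything else out of the context, which matters when a question touches two or three components at once.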

Getting it in front of the model

What mattered most here was that retrieval happens on every message, before the model runs. It's not an optional tool called only when the model asks for it.

…

I'd initially given the model a search tool and hoped it would use it. It often didn't. It would answer from memory, sound confident, and be wrong. Injecting context upfront fixed that. The search tool still exists, but as a backup for when the first pass misses something or the user pivots the conversation. Most of the time pre-retrieval was enough and the answer would just stream in.
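To make the difference concrete, here is a small sketch of retrieval running on every request before the model is called. Everything in it is hypothetical: the message and chunk shapes, the prompt wording, and the `retrieve` callback that stands in for the search step.

```typescript
interface ThreadMessage {
  role: "user" | "assistant";
  content: string;
}

interface RetrievedChunk {
  title: string;
  url: string;
  text: string;
}

// Put the retrieved excerpts into the system prompt so the model sees them
// before it writes a single token.
function buildSystemPrompt(basePrompt: string, chunks: RetrievedChunk[]): string {
  if (chunks.length === 0) {
    return basePrompt + "\n\nNo documentation matched this question. Say so instead of guessing.";
  }
  const context = chunks
    .map((c, i) => "[" + (i + 1) + "] " + c.title + " (" + c.url + ")\n" + c.text)
    .join("\n\n");
  return basePrompt + "\n\nAnswer using only the documentation excerpts below.\n\n" + context;
}

// Runs unconditionally for each request: search first, then hand the
// grounded system prompt and the thread to whatever streams the answer.
function prepareModelInput(
  basePrompt: string,
  messages: ThreadMessage[],
  retrieve: (query: string) => RetrievedChunk[],
): { system: string; messages: ThreadMessage[] } {
  const query = messages
    .filter((m) => m.role === "user")
    .map((m) => m.content)
    .join(" ");
  const chunks = retrieve(query);
  return { system: buildSystemPrompt(basePrompt, chunks), messages };
}
```

The point of the design is that the model never gets to decide whether to search. A search tool can still sit alongside this as a fallback, but the first answer is already grounded.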

Answer with citation badges and source links

The UI, built from the library

I'm a frontend-leaning engineer, so this was the fun part. The chat is built from Nexus UI components: Thread and Message for the conversation, PromptInput for the input, Suggestions for starter prompts, Citation for source links, ChainOfThought when a search runs, and Toaster for feedback. I also used shadcn/ui underneath for buttons, tooltips, and the chat shell (Sheet on desktop, full dialog on mobile).

I care deeply about craft and all the little details that make a product feel polished and professional, so I added transitions and contextual animations to some components in the panel.

Nexus AI is a working example of what Nexus UI components look like when used in a real product. Building this revealed a few rough edges I wouldn't have caught from writing the doc examples alone.

Nexus AI UI — panel, suggestions, and prompt input

Rate limiting on a public site

Nexus AI sits on a public docs site with no authentication, so I had no user account to attach limits to. Any visitor could overuse the chat, and every message costs me money. So once the chat itself worked, I had to add guardrails.

I store daily limits per IP in Upstash Redis, resetting at midnight UTC. On Vercel, I read the real client IP from a forwarded header. When I can't resolve an IP, I fall back to a random ID that the client keeps in localStorage, with a lower cap since it's easy to reset.
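A sketch of how the daily limit could be wired up. To keep it self-contained, an in-memory `MemoryStore` stands in for Redis, and several details are assumptions: the `x-forwarded-for` header, the caps of 20 and 5, the key format, and every function name. Putting the UTC date in the key is one way to get a reset at midnight UTC; with Redis you would also set an expiry so old keys get cleaned up.

```typescript
// Minimal counter interface; a Redis-backed version would do an atomic increment.
interface CounterStore {
  incr(key: string): Promise<number>; // returns the value after incrementing
}

class MemoryStore implements CounterStore {
  private counts = new Map<string, number>();
  async incr(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
}

const IP_DAILY_LIMIT = 20;      // assumed value
const FALLBACK_DAILY_LIMIT = 5; // assumed lower cap for the resettable client ID

// The header can hold a comma-separated chain; the first entry is the client.
function resolveClientIp(forwardedFor: string | null): string | null {
  if (!forwardedFor) return null;
  const first = (forwardedFor.split(",")[0] ?? "").trim();
  return first.length > 0 ? first : null;
}

// "YYYY-MM-DD" in UTC, so a new key (and a fresh count) starts at midnight UTC.
function utcDay(now: Date): string {
  return now.toISOString().slice(0, 10);
}

function limitFor(
  forwardedFor: string | null,
  fallbackId: string,
  now: Date,
): { key: string; limit: number } {
  const ip = resolveClientIp(forwardedFor);
  const identity = ip !== null ? "ip:" + ip : "anon:" + fallbackId;
  return {
    key: "chat:" + utcDay(now) + ":" + identity,
    limit: ip !== null ? IP_DAILY_LIMIT : FALLBACK_DAILY_LIMIT,
  };
}

// Count this message and report whether it is allowed and how many remain.
async function checkLimit(
  store: CounterStore,
  forwardedFor: string | null,
  fallbackId: string,
  now: Date = new Date(),
): Promise<{ allowed: boolean; remaining: number }> {
  const { key, limit } = limitFor(forwardedFor, fallbackId, now);
  const used = await store.incr(key);
  return { allowed: used <= limit, remaining: Math.max(0, limit - used) };
}
```

The `remaining` value is what a client could display next to the input so the cap never comes as a surprise.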

I also show the remaining messages in the prompt input so the limit isn't a surprise. When someone hits the cap, I show a warning toast. I also capped messages at 1,000 characters on the client.

What's next

There are a few things that could be improved: embeddings for better semantic search, trimming conversation history so long threads don't get expensive, and server-side input validation.

I'm happy with the results so far. It's working, and people are actually using it.

The chat is live at nexus-ui.dev/docs. Open it or press ⌘/, and ask something. The source code is available on GitHub.

Keep building!
