Why the chatbot on this site uses two different AIs
One AI answers your questions fast. A different AI grades every answer for quality. Here's why that split exists and how it works.
The chatbot on this site has two AIs behind it. A fast one (Cerebras Llama) answers the user because speed matters more than depth for a portfolio chatbot. A thoughtful one (Claude) scores every answer afterward on accuracy, voice, privacy, and helpfulness. Low-scoring answers get flagged for review. Over time, those become test cases that prevent bad answers from shipping.
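The answer-then-grade loop can be sketched in a few lines. This is a minimal illustration, not the code running on this site: `answer_fn` and `judge_fn` are hypothetical stand-ins for the calls to the answering and judging models, and the 1 to 5 scale and the threshold of 3 are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# The four criteria named above.
CRITERIA = ("accuracy", "voice", "privacy", "helpfulness")
# Assumption for this sketch: scores run 1-5 and anything below 3 is "low".
FLAG_THRESHOLD = 3


@dataclass
class JudgedAnswer:
    question: str
    answer: str
    scores: Dict[str, int]
    flagged: bool


def answer_and_judge(
    question: str,
    answer_fn: Callable[[str], str],                 # hypothetical: wraps the fast model
    judge_fn: Callable[[str, str], Dict[str, int]],  # hypothetical: wraps the judging model
    review_queue: List[JudgedAnswer],
) -> JudgedAnswer:
    # The fast model produces the reply the visitor sees.
    answer = answer_fn(question)
    # Scoring happens after the answer exists; a real deployment would
    # likely run this step off the request path.
    scores = judge_fn(question, answer)
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"judge omitted criteria: {missing}")
    # One low score on any criterion is enough to flag the answer.
    flagged = any(scores[c] < FLAG_THRESHOLD for c in CRITERIA)
    result = JudgedAnswer(question, answer, scores, flagged)
    if flagged:
        review_queue.append(result)
    return result
```

Entries that land in `review_queue` are the raw material for the test cases mentioned above.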
Different jobs need different strengths. The answering AI needs to be fast enough that visitors don't bounce. The judging AI needs to be honest about quality. A judge from the same AI family as the answerer tends to be too kind to that family's output, so cross-family judging is harder to game.
Claude is also the AI I trust most for following rules. The chatbot has strict guidelines, for example: never name specific employers; redirect politely instead. With Claude checking answers against that rule, it holds more reliably than it would if the answering AI were left on its own.
Start with one AI for answering. Add quality scoring later when you have enough conversations to see patterns. Use a cheaper AI model for judging, and judge a sample of conversations rather than every single one (unless your volume is low enough that it doesn't matter).
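One way to sketch the "judge a sample unless volume is low" rule is a deterministic check keyed on the conversation ID. Everything here is an assumption for illustration: the `should_judge` name, the 10% sample rate, and the 500-per-month cutoff below which every conversation gets judged.

```python
import hashlib


def should_judge(
    conversation_id: str,
    monthly_volume: int,
    sample_rate: float = 0.1,    # assumed: judge roughly 10% at higher volume
    judge_all_below: int = 500,  # assumed cutoff for "low enough not to matter"
) -> bool:
    # At low volume, judging everything is cheap, so skip sampling entirely.
    if monthly_volume < judge_all_below:
        return True
    # Hash the ID into a stable bucket in [0, 1) so the same conversation
    # always gets the same decision, even across restarts.
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < sample_rate
```

Hashing instead of calling a random number generator keeps the decision reproducible, which helps when you later want to trace why a given conversation was or wasn't scored.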
The real investment is in the system prompt. A detailed, well-written set of instructions is worth more than upgrading to a bigger model. Tell the AI exactly what it knows, how it should talk, and what it should refuse to answer.
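A system prompt built around those three questions might be assembled like this. It's a sketch under stated assumptions: the section titles and the `build_system_prompt` helper are made up for illustration, and every bullet except the employer rule (which comes from the guidelines described above) is a placeholder, not the real prompt.

```python
from typing import List


def build_system_prompt(knows: List[str], voice: List[str], refuses: List[str]) -> str:
    """Join the three instruction groups into one prompt string."""

    def section(title: str, items: List[str]) -> str:
        # One heading line followed by one bullet per instruction.
        return "\n".join([title] + [f"- {item}" for item in items])

    return "\n\n".join(
        [
            section("WHAT YOU KNOW", knows),
            section("HOW YOU TALK", voice),
            section("WHAT YOU REFUSE", refuses),
        ]
    )


PROMPT = build_system_prompt(
    # Placeholder content, for illustration only.
    knows=["Only the projects and writing published on this site."],
    voice=["Short, plain sentences. No sales language."],
    refuses=[
        # From the guidelines described earlier in this post.
        "Never name specific employers; redirect politely instead.",
        # Placeholder.
        "If the answer is not in what you know, say so instead of guessing.",
    ],
)
```

Keeping the three groups as separate lists makes it easy to tighten one (say, the refusals) without rereading the whole prompt.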
If this chatbot handled thousands of messages a month instead of fifty, I'd flip the architecture. Use a cheap AI for routing (figuring out what the user wants), a powerful AI for the answers that matter most, and judge only a random sample. The current setup judges every call, which is fine at low volume but wasteful at real scale.
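The flipped architecture could look roughly like the sketch below. All names are hypothetical: `classify`, `cheap_answer`, `strong_answer`, and `judge` stand in for model calls, the intent labels in `HIGH_VALUE_INTENTS` are invented, and the 5% judging rate is an assumption. The sketch also assumes that anything not routed to the powerful model goes to a cheaper answering model, which the paragraph above doesn't spell out.

```python
import random
from typing import Callable, Optional

# Hypothetical intent labels for "the answers that matter most".
HIGH_VALUE_INTENTS = {"hiring", "collaboration"}


def handle_message(
    message: str,
    classify: Callable[[str], str],       # cheap model: returns an intent label
    cheap_answer: Callable[[str], str],   # assumed fallback answerer
    strong_answer: Callable[[str], str],  # powerful model for high-value intents
    judge: Callable[[str, str], None],    # judging model, called on a sample only
    judge_rate: float = 0.05,             # assumed: judge about 5% of traffic
    rng: Optional[random.Random] = None,  # injectable for deterministic tests
) -> str:
    if rng is None:
        rng = random.Random()
    # Step 1: the cheap model figures out what the user wants.
    intent = classify(message)
    # Step 2: only high-value intents pay for the powerful model.
    answer_fn = strong_answer if intent in HIGH_VALUE_INTENTS else cheap_answer
    answer = answer_fn(message)
    # Step 3: judge a random sample instead of every call.
    if rng.random() < judge_rate:
        judge(message, answer)
    return answer
```

The cost profile inverts compared with the current setup: the expensive calls are reserved for a minority of messages, and judging becomes a spot check rather than a per-call tax.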
Product leader shipping across enterprise SaaS, AI in production, and 0→1. Writing about what actually ships — not what sounds good in a deck.