Mar 2025 · AI in product · 5 min read

When AI Is a Liability, Not a Feature

The cases where adding AI actively makes the product worse. Recognize them before you spec.

The pressure to add AI to existing products has produced a wave of features that are actively worse than the deterministic versions they replaced. The typical story: a team replaces a perfectly serviceable rules-based system with an LLM that sometimes works, often doesn't, and now requires a quality monitoring pipeline that didn't exist before. Net product quality went down. AI got added. Everybody calls it progress.

Three patterns are particularly common. Recognize them. Refuse to ship them.

Pattern 1: AI replacing a working deterministic system

The product had a workflow. The workflow had rules. The rules were boring but reliable. Someone proposed replacing the rules with an LLM. The argument: 'the AI can handle edge cases the rules don't.' The team built it.

Now the feature is wrong 8% of the time on the easy cases the rules used to handle perfectly. The team has gained handling of a few new edge cases — but they've lost reliability on the common cases that drove 90% of feature usage. Users notice. The rules were boring; they also worked. The AI is exciting; it also doesn't.

The right move here is layered: keep the rules for cases they handle well. Use AI only for the cases the rules can't handle. The AI is augmenting, not replacing. Net quality goes up because the rules' reliability is preserved AND new edge cases get handled.
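A minimal sketch of what that layering can look like. The names here (match_rules, call_llm, the "support ticket routing" framing) are hypothetical placeholders, not a specific product or library; the point is only the shape: rules decide first, the LLM is consulted only when the rules abstain.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    label: str
    source: str  # "rules" or "llm"; lets you monitor whether the LLM path degrades easy cases

def match_rules(request: str) -> Optional[str]:
    """Deterministic rules. Return a label, or None if no rule applies."""
    text = request.lower()
    if "refund" in text:
        return "billing"
    if "password" in text:
        return "account"
    return None  # rules abstain: this is the genuine edge case

def call_llm(request: str) -> str:
    """Placeholder for the LLM call, invoked only on cases the rules can't handle."""
    # A real implementation would call your model provider and validate the output.
    return "needs_review"

def classify(request: str) -> Decision:
    rule_label = match_rules(request)
    if rule_label is not None:
        return Decision(rule_label, source="rules")  # common case: reliability unchanged
    return Decision(call_llm(request), source="llm")  # edge case: new coverage
```

One side benefit of this shape: the source field makes it cheap to check, post-launch, whether the AI path is eroding the cases the rules used to handle perfectly.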

Replacing a working deterministic system with an LLM is the most common form of product regression in 2025.

Pattern 2: AI adding a step instead of removing one

An AI feature is proposed. The team builds it. After launch, the feature works as designed, but adoption is low. The team is confused.

Usually the problem is that the feature added a step instead of removing one. The user used to take three steps to accomplish task X. The team added an AI that helps with task X, but the AI requires the user to first specify what they want, then review the AI's output, then accept or edit. Task X now takes five steps with AI, three without. Users see the AI as friction.

The principle: AI features that remove an existing step adopt fast. AI features that add a step adopt slowly or never. Before specifying any AI feature, audit the user journey. Count the steps. If your feature adds steps net, redesign or kill it.

Pattern 3: AI in regulated or high-stakes domains without designed fallbacks

AI in a medical, legal, or financial workflow has a different cost-of-wrong than AI in a copywriting tool. Wrong is not just 'bad UX.' Wrong is malpractice exposure, regulatory penalty, lost money.

In these domains, AI as a primary decisioning layer is almost always a liability. AI as a suggestion layer, with human-in-the-loop and clear ownership, can work — but it has to be designed carefully. Most teams underestimate the design work and ship something where the AI's role is fuzzy. The human assumes the AI checked something. The AI assumes the human will catch errors. Neither happens. The product becomes a liability that the legal team finds out about after the fact.

The rule of thumb: if a wrong AI output would trigger a lawsuit, a regulatory review, or a customer refund > $1,000, the AI should be a suggestion layer with a deterministic guardrail underneath. Not the primary decisioner. Not even close.
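A sketch of what "suggestion layer with a deterministic guardrail" can mean in practice. The helpers here (llm_suggest_refund, the tier limits) are invented for illustration; the structure is what matters: the AI proposes, hard-coded rules decide whether the proposal is even allowed to be shown, and a human owns the final call.

```python
from typing import Optional

def llm_suggest_refund(ticket_text: str) -> float:
    """Placeholder for an LLM call that proposes a refund amount."""
    # A real implementation would call your model and parse/validate the number.
    return 42.0

def guardrail(amount: float, customer_tier: str) -> bool:
    """Deterministic checks the suggestion must pass before a human ever sees it."""
    limits = {"standard": 200.0, "premium": 1000.0}  # illustrative policy limits
    return 0 <= amount <= limits.get(customer_tier, 0.0)

def propose_refund(ticket_text: str, customer_tier: str) -> Optional[float]:
    suggestion = llm_suggest_refund(ticket_text)
    if not guardrail(suggestion, customer_tier):
        return None  # out of bounds: no suggestion shown, fall back to the manual workflow
    return suggestion  # surfaced as a suggestion only; a named human approves or edits it
```

The design choice this encodes: the cheapest failure mode should be "no suggestion shown," never "wrong suggestion acted on."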

The honest test

Before specifying any AI feature, run this test. Imagine three months after launch. Adoption is 12%. NPS is unchanged. Internal champions are quietly disappointed.

Which of these three explanations is most likely?

1. The model isn't good enough. (You'd improve the model.)
2. The feature added a step instead of removing one. (You'd redesign the flow.)
3. The deterministic version was already good enough and you replaced something working. (You'd roll back.)

If you can't honestly answer this question before building, you don't have enough conviction to ship the feature. Either kill it or do another round of validation. The worst case isn't 'feature flops.' The worst case is 'feature flops AND the team commits to making it work for two quarters because nobody wants to admit it was the wrong feature.' That's the trap.