Every AI customer support vendor in 2026 will show you a deflection-rate dashboard within the first minute of a demo. Big green number, usually 60%-something, sometimes 70%-something. The story is that the AI handled most of your tickets so your humans didn't have to. The conclusion the slide nudges you toward is that you should buy the product.
We've sat through enough of these to say plainly: deflection rate, as most vendors compute it, is a marketing number. It tells you very little about whether your customers got help. Worse, it can quietly go up while your CSAT goes down, and the dashboard will still call that a win. Here's what's wrong, and what to track instead.
What deflection rate actually measures
A "deflection" is supposed to mean a ticket that didn't need a human. In practice, vendors compute it in one of three ways, all of them flattering:
- Self-service avoidance. A user visited the help center and didn't open a chat or submit a ticket. Count it as deflected. This counts people who left because they couldn't find the right page.
- Bot-only conversation. The AI replied and the user never asked for a human. Count it as deflected. This counts people who gave up on the bot and emailed your founder directly.
- No follow-up within N hours. The AI replied and the user didn't write back within 24, 48, or 72 hours. Count it as deflected. This counts the customer who got a wrong answer and walked.
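To see how little any of these definitions has to do with resolution, here's a minimal pandas sketch. The conversation log and every column name in it are invented for illustration, not pulled from any vendor's schema; the point is that all three definitions count the same unhelped customers as wins.

```python
import pandas as pd

# Illustrative conversation log; every column name here is made up for this sketch.
convos = pd.DataFrame({
    "opened_ticket":      [False, False, False, True],   # did a human ticket get created?
    "asked_for_human":    [False, False, False, True],
    "replied_within_72h": [False, False, False, True],
    "problem_solved":     [False, False, True,  True],   # the thing none of the definitions see
})

definitions = {
    "self-service avoidance": ~convos["opened_ticket"],
    "bot-only conversation":  ~convos["asked_for_human"],
    "no follow-up in 72h":    ~convos["replied_within_72h"],
}

for name, deflected in definitions.items():
    rate = deflected.mean()
    solved = convos.loc[deflected, "problem_solved"].mean()
    print(f"{name}: {rate:.0%} deflected, {solved:.0%} of those actually solved")
```

All three definitions report the same healthy-looking deflection rate, even though only a third of the "deflected" customers in this toy log actually got help.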
None of these is the same as "the customer's problem was solved." All three of them go up when you make the AI more aggressive, when you bury the human handoff, or when you lower the bar for the bot to respond so it answers questions it shouldn't. The vendor's dashboard rewards exactly the behaviors that customers complain about.
We've seen this in real deployments. A 2026 CXToday piece quoted multiple support leaders whose CSAT dropped within weeks of a deflection-focused AI rollout. The root cause was almost always the same. The bot was answering questions it should have escalated. Customers who couldn't find a human were giving up. The dashboard was green.
Three metrics that are harder to fake
If deflection rate is the calorie-free number, what's the real thing?
1. Re-contact rate within 48 hours
This is the single most important AI-support metric and almost no vendor leads with it. The definition is simple: of the customers whose interaction was handled entirely by the AI, what share contacted you again about the same issue within 48 hours?
If your "deflection rate" is 70% and your 48-hour re-contact rate is 25%, you're not deflecting; you're delaying. The same customer is back, more annoyed, and now requires a human anyway. The work was just moved into the future and into a worse mood.
Good benchmarks for retail SMBs: under 10% re-contact is excellent, 10-15% is fine, anything above 20% means your AI is hiding problems instead of solving them. Track it by topic. Re-contact on "where is my order" should be near zero. Re-contact on "refund a specific item from a multi-line order" might legitimately be higher because that's a harder ask.
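If your helpdesk can export tickets with a customer ID, a topic, a timestamp, and an AI-only flag, the metric is a short script away. Here's a minimal pandas sketch; the column names are assumptions, so adapt them to whatever your export actually calls these fields.

```python
import pandas as pd

# Hypothetical ticket export; column names are assumptions, adapt to your helpdesk.
tickets = pd.DataFrame({
    "customer_id": ["c1", "c1", "c2", "c3", "c3"],
    "topic":       ["order_status", "order_status", "refund", "refund", "refund"],
    "created_at":  pd.to_datetime(["2026-01-05 09:00", "2026-01-06 14:00",
                                   "2026-01-05 10:00", "2026-01-05 11:00", "2026-01-08 12:00"]),
    "ai_only":     [True, False, True, True, False],
}).sort_values("created_at")

def recontacted_within_48h(row):
    """Did the same customer open another ticket on the same topic within 48 hours?"""
    later = tickets[
        (tickets["customer_id"] == row["customer_id"])
        & (tickets["topic"] == row["topic"])
        & (tickets["created_at"] > row["created_at"])
        & (tickets["created_at"] <= row["created_at"] + pd.Timedelta(hours=48))
    ]
    return not later.empty

ai_only = tickets[tickets["ai_only"]].copy()
ai_only["recontacted"] = ai_only.apply(recontacted_within_48h, axis=1)

# Re-contact rate per topic, for AI-only interactions only.
print(ai_only.groupby("topic")["recontacted"].mean())
```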
2. CSAT split by AI-only vs human-touched
CSAT as an average is useless when you have AI in the mix. The average buries the truth. What matters is the gap between two cohorts:
- Customers whose entire interaction was AI.
- Customers who got a human at any point.
If the gap is small (say, 0 to 4 points), your AI is doing what AI should do: handling the questions it can handle well, and getting out of the way for the others. If the gap is wide (10+ points), your AI is answering questions it shouldn't be, or the handoff to a human is broken. The way you fix this is not by tuning the bot harder. It's by making the bot hand off more readily (raising the confidence it needs before it answers), so it takes on fewer questions and gets the ones it keeps right.
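Computing the split is the easy part once each survey response is tagged with whether a human ever touched the conversation. A minimal sketch, assuming a hypothetical surveys table with that flag:

```python
import pandas as pd

# Hypothetical CSAT responses joined to conversations; schema is illustrative.
surveys = pd.DataFrame({
    "conversation_id": [1, 2, 3, 4, 5, 6],
    "human_touched":   [False, False, True, True, False, True],
    "csat":            [72, 80, 88, 91, 76, 85],   # 0-100 scale
})

surveys["cohort"] = surveys["human_touched"].map({False: "ai_only", True: "human_touched"})
by_cohort = surveys.groupby("cohort")["csat"].agg(["mean", "count"])
gap = by_cohort.loc["human_touched", "mean"] - by_cohort.loc["ai_only", "mean"]

print(by_cohort)
print(f"Human-touched minus AI-only CSAT gap: {gap:.1f} points")
```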
A 2026 Notch study covering 200 mid-market support teams found the median AI-only CSAT was 7 points lower than human-touched. The 90th percentile teams had pulled that gap to under 3. That gap is the work.
3. Cost per resolved ticket, by topic
Not "cost per resolution." We've already explained in a separate post why vendor-defined resolutions are not the same as your customer being helped. Cost per resolved ticket, properly measured, is your total support cost (AI bill, human payroll, tool stack) divided by the number of tickets where the customer's problem was actually fixed and they did not come back within 48 hours.
The interesting move is to slice this by topic. Most teams discover three patterns:
- AI dramatically cheaper: order tracking, return policy, shipping windows, account access. Pure factual lookup. Send everything here to the bot.
- AI slightly cheaper: product fit questions, basic troubleshooting, simple refunds. Worth automating, but the savings are smaller than the deflection dashboard suggested.
- AI more expensive: complex returns, custom orders, anything involving exceptions. The bot makes a mess, a human cleans it up. You're paying for both. Route these to a human immediately.
Once you have this view, your automation strategy stops being "deflect everything" and starts being "automate the things where the math actually works."
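Here's the shape of that calculation once tickets are tagged by topic and by who handled them. A minimal sketch with made-up numbers; "cost" is whatever you allocate to that slice (AI usage plus loaded labor), and "resolved" means fixed with no re-contact within 48 hours.

```python
import pandas as pd

# Hypothetical monthly figures; all numbers and column names are illustrative.
tickets = pd.DataFrame({
    "topic":      ["order_tracking", "order_tracking", "complex_return", "complex_return"],
    "handled_by": ["ai", "human", "ai", "human"],
    "resolved":   [480, 20, 15, 60],      # tickets fixed, no 48-hour re-contact
    "cost":       [120.0, 180.0, 400.0, 900.0],  # AI bill + loaded labor, in dollars
})

per_slice = tickets.groupby(["topic", "handled_by"])[["cost", "resolved"]].sum()
per_slice["cost_per_resolved_ticket"] = per_slice["cost"] / per_slice["resolved"]
print(per_slice["cost_per_resolved_ticket"].round(2))
```

With real data you'd feed in your actual AI invoice and payroll allocation per topic; even this toy output shows the pattern, with the bot far cheaper on tracking and more expensive than a human on complex returns.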
The dashboards you already have are lying to you a little
This isn't the vendors' fault, exactly. Deflection rate is easy to compute, easy to graph, and easy to put in a sales deck. Re-contact rate requires tying together two conversations from the same customer, often across two days. CSAT-split requires you to ask the right people the right question. Cost-per-resolved-by-topic requires categorization that most teams aren't doing rigorously.
But the consequence of the easy number being the default is that AI support teams are getting promoted on a metric that has only a loose relationship to whether customers are happy. We'd rather everyone, including our customers, be looking at the metrics that actually correlate with retention, repeat purchase, and word of mouth.
At Keloa, we ship the three metrics above in the standard reports. Not because they make our product look better than the dashboards. They don't, in the first month. They make it look honest, which is a different argument and the one we want to be having.
What this means for you
If you're already running an AI support tool, the test is simple. Ask your vendor to show you 48-hour re-contact rate by topic for the last 60 days, broken out by whether the conversation was AI-only or had a human touch. If they can't, you don't have a measurement problem; you have a vendor problem. The metric exists; the data exists; the absence is a choice.
If you're evaluating tools right now, ask for these numbers in the demo. The vendor that pulls them up without hesitation, and shows you the gaps as well as the wins, is the one to take seriously.
If you'd like to see what these reports look like in the wild, book a 20-minute demo and we'll run them against a sample workspace. Or start with the Free Starter plan, wire up your help center, and watch your own re-contact rate stabilize over a couple of weeks. That number, once you've seen it, is hard to unsee.