The Margin Labs

Case study

Customer Service AI

Led by NunoCS
  • 3 to 5x volume absorbed
  • No added headcount
  • Faster responses
  • ROI-led tool choice

Challenge

Support volume swung three to five times higher in peak season. Scaling headcount for the peaks was slow and expensive, and quality slipped under pressure.

What was built

A customer-service AI strategy and tooling stack: response automation, AI triage and storefront chatbot integration, all chosen against clear ROI criteria so a lean team could absorb the spikes.

The story

How it ran.

The peak-season pattern was predictable and painful. Support volume swung three to five times higher than baseline for roughly eight weeks. Scaling headcount for the peaks was slow and expensive, and quality slipped anyway because new agents could not learn the product fast enough. The team was burning out by week three, and leadership was looking at the same problem coming round again.

We started with the volume forecast rather than the tooling. The peak was not uniform: roughly half the ticket increase was order status, a quarter was returns and exchanges, and the remaining quarter was genuinely novel questions that needed a human. That breakdown framed the work. The order-status half was a routing and automation problem. The other half, returns, exchanges and novel questions, was a triage and tooling problem.
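A minimal sketch of that breakdown as arithmetic. The category shares come from the paragraph above; the baseline volume and the peak multiplier are hypothetical inputs chosen for illustration, not figures from the engagement.

```python
# Shares of the peak-season ticket increase, as described in the text.
INCREASE_SHARE = {
    "order_status": 0.50,
    "returns_exchanges": 0.25,
    "novel": 0.25,
}


def peak_increase_by_category(baseline_weekly: int, multiplier: float) -> dict[str, float]:
    """Split the extra weekly tickets at peak across the three categories."""
    extra = baseline_weekly * (multiplier - 1)
    return {category: extra * share for category, share in INCREASE_SHARE.items()}


# Hypothetical inputs: 1,000 tickets a week at baseline, 4x at peak.
forecast = peak_increase_by_category(1000, 4.0)
for category, tickets in forecast.items():
    print(f"{category}: {tickets:.0f} extra tickets per week")
```

Splitting the increase this way shows how much of the spike is a candidate for automation before any tool is priced.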

Tool selection ran against explicit ROI criteria. For each candidate (response automation platform, AI triage layer, storefront chatbot) we wrote down the cost per ticket avoided, the implementation effort and the failure mode if it answered wrong. Two of the early candidates failed the failure-mode test outright. The shortlist that survived was smaller than the original brief, but every tool on it had a defensible business case.
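One way to express those criteria as a filter, sketched under assumptions: the candidate names, costs, ticket counts and the cost ceiling are all made up, and the failure-mode test is reduced to a single yes/no flag. The actual scoring used in the engagement is not described here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    total_cost: float        # licence plus implementation effort, costed over one peak season
    tickets_avoided: int     # forecast tickets the tool would take off the agents
    hands_off_cleanly: bool  # failure-mode test: a wrong or unsure answer reaches a human


def cost_per_ticket_avoided(c: Candidate) -> float:
    return c.total_cost / c.tickets_avoided


def shortlist(candidates: list[Candidate], max_cost_per_ticket: float) -> list[Candidate]:
    """Keep tools that pass the failure-mode test and clear the cost bar."""
    return [
        c for c in candidates
        if c.hands_off_cleanly and cost_per_ticket_avoided(c) <= max_cost_per_ticket
    ]


# Hypothetical candidates and figures, for illustration only.
candidates = [
    Candidate("automation_a", 12_000, 6_000, True),   # 2.00 per ticket avoided
    Candidate("triage_b", 9_000, 3_000, True),        # 3.00 per ticket avoided
    Candidate("chatbot_c", 4_000, 4_000, False),      # cheap, but fails the failure-mode test
]
survivors = [c.name for c in shortlist(candidates, max_cost_per_ticket=3.50)]
```

The failure-mode flag is checked before cost, so a cheap tool that answers wrongly without handing off never reaches the shortlist.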

Response automation went first because it had the cleanest payback. Order status questions got handled end to end without human touch: the customer asked, the bot pulled live tracking from the carrier, the response went out within seconds. Returns and exchanges got templated responses with the policy applied automatically, leaving the agent to handle the edge cases.
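The order-status flow can be sketched roughly as follows. The carrier integration itself is not specified in this case study, so the lookup is passed in as a plain function; `answer_order_status`, `fake_lookup` and the sample order ID are hypothetical names and data.

```python
from typing import Callable, Optional

# A tracking lookup takes an order ID and returns a status string, or None
# when the carrier has nothing. The real carrier API is an assumption here,
# so it is injected rather than called directly.
TrackingLookup = Callable[[str], Optional[str]]


def answer_order_status(order_id: str, lookup: TrackingLookup) -> dict:
    status = lookup(order_id)
    if status is None:
        # No tracking data: hand off to an agent rather than guess.
        return {"handled_by": "agent", "reply": None}
    reply = f"Your order {order_id} is currently: {status}."
    return {"handled_by": "bot", "reply": reply}


# Stand-in for a carrier API, with made-up data.
def fake_lookup(order_id: str) -> Optional[str]:
    return {"A1001": "out for delivery"}.get(order_id)


result = answer_order_status("A1001", fake_lookup)
```

The bot only replies when live data exists; anything else goes to a person, which keeps the automated path inside a safe category.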

AI triage was the next layer. Incoming tickets got categorised, prioritised and routed before an agent saw them. The agents stopped triaging and started resolving. The storefront chatbot caught a meaningful share of pre-sales questions that would otherwise have hit the inbox, and the ones it could not answer arrived at the agent already pre-qualified.
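A rough sketch of categorise, prioritise and route. Simple keyword rules stand in for the AI classifier, which is not specified in this case study; the keywords, queue names and escalation words are all hypothetical.

```python
from dataclasses import dataclass

# Keyword rules stand in for the AI classifier (an assumption for illustration).
CATEGORY_KEYWORDS = {
    "order_status": ("where is my order", "tracking", "delivery"),
    "returns_exchanges": ("return", "exchange"),
}
QUEUES = {
    "order_status": "automation",
    "returns_exchanges": "returns_templates",
    "novel": "human_agents",
}
# Crude stand-in for a sentiment flag.
ESCALATION_WORDS = ("angry", "unacceptable", "complaint")


@dataclass
class Routed:
    category: str
    priority: str
    queue: str


def triage(text: str) -> Routed:
    lowered = text.lower()
    category = "novel"  # anything unrecognised goes to a human
    for name, keywords in CATEGORY_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            category = name
            break
    escalate = any(w in lowered for w in ESCALATION_WORDS)
    priority = "high" if escalate else "normal"
    # Escalated tickets skip automation regardless of category.
    queue = "human_agents" if escalate else QUEUES[category]
    return Routed(category, priority, queue)
```

For example, `triage("Where is my order?")` lands in the automation queue at normal priority, while a returns message containing "unacceptable" keeps its category but is routed to a human at high priority.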

Peak season arrived. The team absorbed the spike without adding headcount. Response times improved against baseline rather than degrading. The lessons were less about the AI and more about the discipline that came with it: route before you automate, automate the boring before the interesting, keep AI inside safe categories until the audit log says otherwise.

Methodology

The sequence we ran.

  1. Volume forecast: break the peak into ticket categories before selecting any tool.

  2. ICP review: confirm which customer questions belong in self-service and which need a human.

  3. Tool stack against ROI: cost per ticket avoided, implementation effort and failure mode for every candidate.

  4. Rollout: response automation first, triage second, storefront chatbot last; expand only when the audit log clears.
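The last step, expanding only when the audit log clears, could be checked along these lines. This is a sketch under assumptions: the log format, the minimum sample size and the failure-rate threshold are hypothetical, and the sample log is synthetic.

```python
from collections import Counter


def cleared_categories(audit_log, min_samples=200, max_failure_rate=0.02):
    """Return the categories with enough reviewed answers and a low enough failure rate."""
    totals, failures = Counter(), Counter()
    for entry in audit_log:
        totals[entry["category"]] += 1
        if entry["failed"]:
            failures[entry["category"]] += 1
    return {
        category for category, n in totals.items()
        if n >= min_samples and failures[category] / n <= max_failure_rate
    }


# Synthetic audit log: each entry is one AI-handled ticket after review.
audit_log = (
    [{"category": "order_status", "failed": i < 3} for i in range(300)]          # 1% failures
    + [{"category": "returns_exchanges", "failed": i < 30} for i in range(300)]  # 10% failures
    + [{"category": "pre_sales", "failed": False} for _ in range(50)]            # too few samples
)
cleared = cleared_categories(audit_log)
```

A category with too few reviewed tickets stays gated even with zero failures, so a quiet week cannot clear a category by accident.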

Architecture

What sat behind it.

Routing

  • Gorgias inbox with rule-based first pass
  • Carrier API integration for live order status
  • Returns policy engine with templated responses

AI Triage

  • Category and priority classifier
  • Sentiment flag for escalation
  • Pre-agent enrichment with order and customer context

Bot

  • Storefront chatbot for pre-sales questions
  • Safe-category answer set with explicit handoff rules
  • Conversation log into the ticket history

Reporting

  • Cost per ticket avoided by channel
  • Resolution time by category
  • AI failure log with weekly review

Lessons

What we would carry forward.

  • Route before you automate. Most of the win comes from getting the right ticket to the right place first.
  • Put AI in safe categories first. Order status is safe; refund decisions are not.
  • Failure mode beats accuracy when you pick tools. A high-accuracy tool that fails badly is worse than a lower-accuracy tool that hands off cleanly.
  • Forecast the peak by category. The blunt headline number hides where the real leverage is.
