Case study

Customer Service AI

Led by NunoCS

3 to 5x volume absorbed
No added headcount
Faster responses
ROI-led tool choice

Challenge

Support volume swung three to five times higher in peak season. Scaling headcount for the peaks was slow and expensive, and quality slipped under pressure.

What was built

A customer-service AI strategy and tooling stack: response automation, AI triage and storefront chatbot integration, all chosen against clear ROI criteria so a lean team could absorb the spikes.

The story

How it ran.

The peak-season pattern was predictable and painful. Support volume swung three to five times higher than baseline for roughly eight weeks. Scaling headcount for the peaks was slow and expensive, and quality slipped anyway because new agents could not learn the product fast enough. The team was burning out by week three, and leadership could see the same problem coming round again.

We started with the volume forecast rather than the tooling. The peak was not uniform: roughly half the ticket increase was order status, a quarter was returns and exchanges, and the remaining quarter was genuinely novel questions that needed a human. That breakdown framed the work. Anything in the first half was a routing and automation problem. Anything in the second half was a triage and tooling problem.
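As a rough illustration of that breakdown, the sketch below splits a peak-season increase across the three categories. The baseline volume and the multiplier are hypothetical figures chosen for the example; only the half, quarter, quarter shares come from the case study.

```python
# Hypothetical figures for illustration; the case study reports shares, not counts.
BASELINE_WEEKLY_TICKETS = 1_000  # assumed baseline volume
PEAK_MULTIPLIER = 4              # assumed, inside the reported 3x to 5x range

# Shares of the ticket increase, as described in the text.
INCREASE_SHARES = {
    "order_status": 0.50,
    "returns_exchanges": 0.25,
    "novel": 0.25,
}


def split_peak_increase(baseline, multiplier, shares):
    """Return the extra weekly tickets per category during the peak."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("category shares must sum to 1")
    extra = baseline * (multiplier - 1)  # tickets above baseline
    return {category: round(extra * share) for category, share in shares.items()}


increase = split_peak_increase(BASELINE_WEEKLY_TICKETS, PEAK_MULTIPLIER, INCREASE_SHARES)
for category, tickets in increase.items():
    print(f"{category}: {tickets} extra tickets per week")
```

With these assumed inputs the increase is 3,000 tickets a week, of which 1,500 are order status. That single category is where automation has the most to absorb.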

Tool selection ran against explicit ROI criteria. For each candidate (response automation platform, AI triage layer, storefront chatbot) we wrote down three things: the cost per ticket avoided, the implementation effort and the failure mode if it answered wrong. Two of the early candidates failed the failure-mode test outright. The shortlist that survived was smaller than the original brief, but every tool on it had a defensible business case.
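One way to make those criteria concrete is a small scoring table. This is a minimal sketch, not the evaluation sheet used in the project: the candidate names, costs, volumes and the cost ceiling are all invented for illustration, and the failure-mode test is reduced to a single yes/no flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    annual_cost: float        # assumed: licence plus running cost
    tickets_avoided: int      # assumed: forecast tickets per year handled without an agent
    implementation_days: int  # assumed: effort estimate
    fails_safe: bool          # True if a wrong answer ends in a clean handoff

    @property
    def cost_per_ticket_avoided(self) -> float:
        if self.tickets_avoided == 0:
            return float("inf")
        return self.annual_cost / self.tickets_avoided


def shortlist(candidates, max_cost_per_ticket):
    """Drop unsafe or over-budget tools, then rank by cost and effort."""
    survivors = [
        c for c in candidates
        if c.fails_safe and c.cost_per_ticket_avoided <= max_cost_per_ticket
    ]
    return sorted(survivors, key=lambda c: (c.cost_per_ticket_avoided, c.implementation_days))


# Hypothetical candidates.
CANDIDATES = [
    Candidate("tool_a", 12_000, 20_000, 10, fails_safe=True),
    Candidate("tool_b", 6_000, 15_000, 5, fails_safe=False),  # cheapest, but fails badly
    Candidate("tool_c", 9_000, 6_000, 20, fails_safe=True),
]

for c in shortlist(CANDIDATES, max_cost_per_ticket=2.0):
    print(c.name, round(c.cost_per_ticket_avoided, 2))
```

Note that `tool_b` has the lowest cost per ticket avoided and is still excluded: the failure-mode check is a gate applied before ranking, not one more weighted score.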

Response automation went first because it had the cleanest payback. Order status questions were handled end to end with no human involvement: the customer asked, the bot pulled live tracking from the carrier, and the response went out within seconds. Returns and exchanges got templated responses with the policy applied automatically, leaving the agent to handle the edge cases.

AI triage was the next layer. Incoming tickets got categorised, prioritised and routed before an agent saw them. The agents stopped triaging and started resolving. The storefront chatbot caught a meaningful share of pre-sales questions that would otherwise have hit the inbox, and the ones it could not answer arrived at the agent already pre-qualified.
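The categorise, prioritise and route flow can be sketched in a few lines. The example below uses plain keyword matching as a stand-in for the AI classifier, which the case study does not describe in detail; the keyword lists, queue names and priority labels are all assumptions.

```python
from dataclasses import dataclass

# Assumed keyword lists; a production classifier would be a trained model.
CATEGORY_KEYWORDS = {
    "order_status": ("where is my order", "tracking", "delivery"),
    "returns_exchanges": ("return", "exchange"),
}
ESCALATION_WORDS = ("angry", "unacceptable", "complaint")


@dataclass(frozen=True)
class TriageResult:
    category: str
    priority: str
    queue: str


def triage(text: str) -> TriageResult:
    """Categorise, prioritise and route a ticket before an agent sees it."""
    lowered = text.lower()
    category = next(
        (name for name, words in CATEGORY_KEYWORDS.items()
         if any(word in lowered for word in words)),
        "novel",  # anything unrecognised is treated as needing a human
    )
    priority = "high" if any(word in lowered for word in ESCALATION_WORDS) else "normal"

    # Upset customers and novel questions always go to a person.
    if priority == "high" or category == "novel":
        queue = "agent"
    elif category == "order_status":
        queue = "automation"
    else:
        queue = "templated_reply"
    return TriageResult(category, priority, queue)


print(triage("Where is my order? The tracking page is empty."))
print(triage("This is unacceptable, my delivery is two weeks late."))
```

The second ticket is still categorised as order status, but the escalation flag overrides the automated route and sends it to an agent.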

Peak season arrived. The team absorbed the spike without adding headcount. Response times improved against baseline rather than degrading. The lessons were less about the AI and more about the discipline that came with it: route before you automate, automate the boring before the interesting, keep AI inside safe categories until the audit log says otherwise.

Methodology

The sequence we ran.

1. Volume forecast: break the peak into ticket categories before selecting any tool.
2. ICP review: confirm which customer questions belong in self-service and which need a human.
3. Tool stack against ROI: cost per ticket avoided, implementation effort and failure mode for every candidate.
4. Rollout: response automation first, triage second, storefront chatbot last; expand only when the audit log clears.
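The "expand only when the audit log clears" rule in the rollout step could be expressed as a simple gate. This is a sketch under stated assumptions: the log format (category, was the AI answer correct) and both thresholds are hypothetical, not figures from the project.

```python
from collections import Counter

# Assumed thresholds for clearing a category.
MIN_REVIEWED = 50       # minimum audited answers before a category can clear
MAX_ERROR_RATE = 0.02   # maximum share of wrong answers allowed


def cleared_categories(audit_log, min_reviewed=MIN_REVIEWED, max_error_rate=MAX_ERROR_RATE):
    """Return the categories whose audited AI answers meet both thresholds.

    audit_log is an iterable of (category, was_correct) pairs.
    """
    reviewed = Counter()
    wrong = Counter()
    for category, was_correct in audit_log:
        reviewed[category] += 1
        if not was_correct:
            wrong[category] += 1
    return {
        category
        for category, count in reviewed.items()
        if count >= min_reviewed and wrong[category] / count <= max_error_rate
    }


# Hypothetical audit data.
sample_log = (
    [("order_status", True)] * 99 + [("order_status", False)]  # 1% error, 100 reviewed
    + [("returns_exchanges", True)] * 40                       # clean, but too few reviewed
    + [("pre_sales", True)] * 54 + [("pre_sales", False)] * 6  # 10% error
)
print(cleared_categories(sample_log))
```

Requiring a minimum sample as well as a low error rate stops a category from clearing on the strength of a handful of easy tickets.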

Architecture

What sat behind it.

Routing

Gorgias inbox with rule-based first pass
Carrier API integration for live order status
Returns policy engine with templated responses

AI Triage

Category and priority classifier
Sentiment flag for escalation
Pre-agent enrichment with order and customer context

Bot

Storefront chatbot for pre-sales questions
Safe-category answer set with explicit handoff rules
Conversation log into the ticket history
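A safe-category answer set with an explicit handoff rule might look like the sketch below. The intents, the confidence threshold, the placeholder replies and the shape of the ticket note are all hypothetical; the point is only that anything outside the approved set, or below the threshold, goes to an agent with the conversation attached.

```python
from dataclasses import dataclass, field

# Assumed approved answers; the bracketed text stands in for real approved copy.
SAFE_ANSWERS = {
    "shipping_times": "[approved shipping-times answer]",
    "sizing": "[approved sizing answer]",
}
HANDOFF_REPLY = "[approved handoff message]"


@dataclass
class BotTurn:
    reply: str
    handoff: bool
    ticket_note: dict = field(default_factory=dict)


def respond(intent, confidence, transcript, min_confidence=0.8):
    """Answer only from the safe set; otherwise hand off with context."""
    if intent in SAFE_ANSWERS and confidence >= min_confidence:
        return BotTurn(reply=SAFE_ANSWERS[intent], handoff=False)
    # Out of scope or unsure: pass the conversation into the ticket history.
    note = {"intent": intent, "confidence": confidence, "transcript": list(transcript)}
    return BotTurn(reply=HANDOFF_REPLY, handoff=True, ticket_note=note)


turn = respond("refund_decision", 0.95, ["Can I get my money back?"])
print(turn.handoff, turn.ticket_note["intent"])
```

A refund question is handed off even at high confidence, because the rule keys on the category being approved rather than on how sure the model is.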

Reporting

Cost per ticket avoided by channel
Resolution time by category
AI failure log with weekly review
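The first two reporting measures reduce to small aggregations. The sketch below assumes ticket records are dictionaries with `category` and `resolution_minutes` keys and that tooling cost is tracked per channel; those field names and all the figures are illustrative, not the project's schema.

```python
from collections import defaultdict
from statistics import median


def resolution_time_by_category(tickets):
    """Median resolution time in minutes for each ticket category."""
    grouped = defaultdict(list)
    for ticket in tickets:
        grouped[ticket["category"]].append(ticket["resolution_minutes"])
    return {category: median(times) for category, times in grouped.items()}


def cost_per_ticket_avoided_by_channel(channel_costs, tickets_avoided):
    """Tooling cost divided by tickets avoided, skipping channels with none."""
    return {
        channel: channel_costs[channel] / avoided
        for channel, avoided in tickets_avoided.items()
        if avoided > 0
    }


# Hypothetical data.
tickets = [
    {"category": "order_status", "resolution_minutes": 1},
    {"category": "order_status", "resolution_minutes": 2},
    {"category": "order_status", "resolution_minutes": 3},
    {"category": "novel", "resolution_minutes": 30},
    {"category": "novel", "resolution_minutes": 50},
]
print(resolution_time_by_category(tickets))
print(cost_per_ticket_avoided_by_channel({"email": 500.0, "chat": 300.0},
                                         {"email": 1000, "chat": 0}))
```

The median is used here rather than the mean so that a few long-running tickets do not dominate a category's figure; that is a design choice for the sketch, not something the case study specifies.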

Lessons

What we would carry forward.

Route before you automate. Most of the win comes from getting the right ticket to the right place first.
Put AI in safe categories first. Order status is safe; refund decisions are not.
Failure mode beats accuracy when you pick tools. A high-accuracy tool that fails badly is worse than a lower-accuracy tool that hands off cleanly.
Forecast the peak by category. The blunt headline number hides where the real leverage is.
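The failure-mode lesson above is easiest to see with arithmetic. In the sketch below every number is hypothetical: one tool is more accurate but sends wrong answers straight to the customer, the other is less accurate but hands off cleanly, so each of its errors costs only a little agent time.

```python
def expected_failure_cost(tickets, accuracy_pct, cost_per_failure):
    """Expected cost of wrong answers over a batch of tickets."""
    failures = tickets * (100 - accuracy_pct) / 100
    return failures * cost_per_failure


TICKETS = 1_000

# Assumed: 95% accurate, each wrong answer costs 40 (refund, rework, goodwill).
confident_tool = expected_failure_cost(TICKETS, accuracy_pct=95, cost_per_failure=40)

# Assumed: 90% accurate, each miss is a clean handoff costing 3 of agent time.
handoff_tool = expected_failure_cost(TICKETS, accuracy_pct=90, cost_per_failure=3)

print(confident_tool, handoff_tool)
```

Under these assumptions the more accurate tool costs 2,000 per thousand tickets in failures against 300 for the tool that hands off, even though it makes half as many mistakes.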
