Real problem. Real solution. Real numeric outcome. Client identity protected under mutual NDA.

Case Study: Automating Support Ticket Triage for a Mid-Market B2B SaaS

Client identity protected under NDA. Details available under mutual sign-off in a discovery call.

Composite Company Profile

The client is a Series B vertical SaaS platform serving mid-market operators in the field-services industry. At the time of engagement they were running approximately $18M ARR with a blended ACV of $14,400, roughly 1,250 paying accounts, and a support organization of nine full-time agents plus two team leads. Ticket volume was averaging 4,200 conversations per month across email, in-app chat, and a self-serve help center, with volume growing 11% month over month as the company expanded into two adjacent verticals.

The Problem

Support was drowning. First-response SLA (target: 2 business hours) was being missed on 38% of tickets. Median full-resolution time had crept from 9 hours to 27 hours over two quarters. Customer effort scores were sliding, expansion revenue was stalling because CSMs were pulled into firefighting, and the VP of Customer Success had been given a hard mandate from the CFO: absorb the incoming volume without adding six new hires the company could not afford in the current runway model.

Prior efforts had failed for a predictable reason. The team had bought a helpdesk with native “AI routing” and stood up a set of macros, but macros only work when the underlying taxonomy is right. It wasn’t. Roughly 61% of tickets landed in a catch-all “General” queue, tags were applied inconsistently by agents in a hurry, and the routing rules were built on top of that dirty data. Adding more automation on top of a broken classification layer only sped up the wrong outcomes.

What Our Team Diagnosed

Our analytics team pulled 90 days of ticket data (about 12,600 conversations), ran embedding-based clustering, and reconciled the clusters against the client’s existing tag tree. The finding was not subtle. Seven real intents accounted for 84% of inbound volume, but the client had 41 tags in production, most of which were near-duplicates or aspirational categories nobody used. Three intents (“billing dispute,” “integration failure,” and “permission error”) accounted for 46% of volume and 71% of escalations to engineering, yet none of them had a dedicated queue, a documented playbook, or an owner.
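As a rough illustration of the reconciliation step, the sketch below assumes each ticket has already been assigned a cluster ID by some upstream embedding and clustering pass (not shown) and still carries its legacy tag. The function names, cluster labels, tags, and sample data are all hypothetical; this is not the tooling used on the engagement. It ranks clusters by volume, reports how well the dominant legacy tag covers each cluster, and counts how many clusters are needed to reach a target share of volume.

```python
from collections import Counter, defaultdict

def reconcile(tickets):
    """tickets: iterable of (cluster_id, existing_tag) pairs.

    Returns one row per cluster, largest first, with its share of volume,
    the running cumulative share, the most common legacy tag, and the
    fraction of the cluster that tag covers (a rough purity measure).
    """
    by_cluster = defaultdict(Counter)
    for cluster_id, tag in tickets:
        by_cluster[cluster_id][tag] += 1

    total = sum(sum(tags.values()) for tags in by_cluster.values())
    ordered = sorted(by_cluster.items(), key=lambda kv: -sum(kv[1].values()))

    rows, running = [], 0
    for cluster_id, tags in ordered:
        size = sum(tags.values())
        running += size
        top_tag, top_count = tags.most_common(1)[0]
        rows.append({
            "cluster": cluster_id,
            "share": size / total,
            "cumulative_share": running / total,
            "dominant_tag": top_tag,
            "purity": top_count / size,
        })
    return rows

def clusters_needed(rows, target_share):
    """Smallest number of top clusters whose combined share reaches target_share."""
    for n, row in enumerate(rows, start=1):
        if row["cumulative_share"] >= target_share:
            return n
    return len(rows)

# Hypothetical toy data: most tickets sit under the catch-all "General" tag.
sample = (
    [("billing_dispute", "General")] * 5
    + [("billing_dispute", "Billing")] * 1
    + [("integration_failure", "General")] * 2
    + [("integration_failure", "API")] * 1
    + [("feature_question", "General")] * 1
)
rows = reconcile(sample)
top_n = clusters_needed(rows, 0.84)
```

A cluster whose dominant tag is the catch-all, or whose purity is low, is a candidate for its own intent; legacy tags that never dominate any cluster are candidates for retirement.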

The real problem wasn’t the helpdesk vendor. It was that no one had ever done the classification work with rigor. Automation without a clean intent taxonomy is just faster chaos.

Strategy MV3 Shipped

The engagement was scoped under our AI & Automation service line, with pull-through from RevOps and Analytics. Vance oversaw the engagement; our analytics lead ran the taxonomy work, our automation engineer built the pipeline, and a technical writer produced the playbook content. The strategy had three moves:

  1. Rebuild the intent taxonomy from ticket data, not from a whiteboard. We collapsed 41 tags to 12 intents: the seven that accounted for 84% of volume, plus five rare-but-critical categories (churn signal, security concern, legal, outage report, executive escalation) that required dedicated handling regardless of frequency.
  2. Ship a hybrid classifier: rules first, LLM second. Deterministic rules caught the 38% of tickets with unambiguous signals (specific error codes, billing keywords, integration webhook signatures). An LLM classifier handled the ambiguous remainder using a prompt tuned against 400 labeled examples per intent. Any classification with confidence below 0.75 was routed to a human triager for a same-shift decision.
  3. Wire routing, SLA clocks, and auto-response into the classifier output. High-confidence classifications triggered queue assignment, SLA priority, initial acknowledgement copy tailored to intent, and (for three intents) a suggested reply drafted for the agent to review and send in one click.
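The rules-first, LLM-second flow and the routing that hangs off it can be sketched in Python roughly as follows. Only the 0.75 threshold and the three intent names come from the case study; the regex patterns, queue names, priorities, and the llm callable (standing in for whatever model call the pipeline makes) are hypothetical placeholders, not the production rule set.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

CONFIDENCE_THRESHOLD = 0.75  # from the case study: below this, a human triages

# Hypothetical rule patterns; the real rule set is not disclosed.
RULES = [
    (re.compile(r"\b(invoice|refund|charged twice|chargeback)\b", re.I), "billing_dispute"),
    (re.compile(r"\b(webhook|sync failed|ERR-\d{3,})\b", re.I), "integration_failure"),
    (re.compile(r"\b(403|access denied|permission)\b", re.I), "permission_error"),
]

# Hypothetical queue names and SLA priorities per intent.
QUEUES = {
    "billing_dispute": ("billing", "high"),
    "integration_failure": ("integrations", "high"),
    "permission_error": ("access", "normal"),
}

@dataclass
class Decision:
    intent: Optional[str]
    confidence: float
    source: str  # "rule" or "llm"
    route: str   # "auto" or "human_triage"

def classify(text: str, llm: Callable[[str], Tuple[str, float]]) -> Decision:
    # Pass 1: deterministic rules handle tickets with unambiguous signals.
    for pattern, intent in RULES:
        if pattern.search(text):
            return Decision(intent, 1.0, "rule", "auto")
    # Pass 2: the LLM handles the remainder; low confidence goes to a person.
    intent, confidence = llm(text)
    route = "auto" if confidence >= CONFIDENCE_THRESHOLD else "human_triage"
    return Decision(intent, confidence, "llm", route)

def dispatch(decision: Decision) -> dict:
    """Turn a classification into queue, priority, and auto-acknowledgement."""
    if decision.route == "human_triage":
        return {"queue": "triage", "priority": "normal", "auto_ack": False}
    queue, priority = QUEUES.get(decision.intent, ("general", "normal"))
    return {"queue": queue, "priority": priority, "auto_ack": True}

# Stand-in for the model call: always unsure, so its tickets go to triage.
def unsure_llm(text: str) -> Tuple[str, float]:
    return ("feature_question", 0.62)

by_rule = classify("I was charged twice this month", unsure_llm)
by_llm = classify("How do I export my schedule?", unsure_llm)
```

Because rule hits return before the model is ever called, the cheap path stays cheap, and every decision records whether a rule or the LLM produced it, which is what makes the output auditable.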

Implementation

Deliverables produced over the engagement:

  • A 34-page intent taxonomy specification with definitions, canonical examples, edge cases, and routing rules per intent
  • A production classifier deployed as an n8n workflow calling Claude for LLM inference and posting classifications back to the helpdesk via API
  • Seven agent-facing response playbooks covering the top intents, each with a canonical answer, three variant openings, and an escalation ladder
  • A Supabase-backed labeling and feedback loop so agents could correct misclassifications in one click, with corrections batched weekly as retraining data
  • A supervisor dashboard showing volume by intent, classifier confidence distribution, and SLA attainment by queue
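To make the feedback loop concrete, here is a minimal sketch of how agent corrections could be turned into a per-intent precision figure for the dashboard. The record shape (predicted intent plus an optional correction) and the function name are assumptions for illustration, not the actual Supabase schema. It also assumes an uncorrected prediction was correct, which overstates precision whenever agents skip the correction step.

```python
from collections import defaultdict

def precision_by_intent(feedback):
    """feedback: iterable of (predicted_intent, corrected_intent_or_None).

    Precision for an intent = predictions of that intent that were left
    uncorrected (assumed correct) / all predictions of that intent.
    """
    predicted = defaultdict(int)
    correct = defaultdict(int)
    for predicted_intent, corrected in feedback:
        predicted[predicted_intent] += 1
        if corrected is None or corrected == predicted_intent:
            correct[predicted_intent] += 1
    return {intent: correct[intent] / n for intent, n in predicted.items()}

def corrections(feedback):
    """Only the records an agent changed: candidates for the next retraining batch."""
    return [(p, c) for p, c in feedback if c is not None and c != p]

# Hypothetical week of feedback records.
week = [
    ("billing_dispute", None),
    ("billing_dispute", None),
    ("billing_dispute", "permission_error"),
    ("permission_error", None),
]
precision = precision_by_intent(week)
to_relabel = corrections(week)
```

Tracking the figure per intent rather than in aggregate shows which intents need more labeled examples before the next retraining cycle.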

Cadence: weekly working sessions with the VP of CS and the support ops lead, biweekly reviews with the CFO on cost-to-serve, and full retros at weeks eight and sixteen.

Outcomes (Sixteen Weeks Post-Kickoff)

  • First-response SLA attainment rose from 62% to 94% against the same 2-business-hour target, while inbound volume grew another 9% during the engagement.
  • Median full-resolution time dropped from 27 hours to 8.4 hours, a 69% reduction.
  • Automated first-response coverage reached 71% of tickets, with the LLM classifier maintaining 92.3% precision on the top seven intents after two retraining cycles.
  • Support cost per ticket fell 41%, from a blended $11.80 to $6.95, driven mostly by deflection of billing and permission tickets into self-serve flows the taxonomy work exposed as automatable.
  • The team absorbed 27% volume growth without adding headcount, freeing two CSMs to return to expansion motion. Net revenue retention on the affected book of business improved 6 points over the following quarter.
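The percentage figures above follow directly from the before and after values. A short Python check, using only numbers stated in this case study (the helper name is ours):

```python
def pct_change(before, after):
    """Relative change from before to after, in percent (negative = reduction)."""
    return (after - before) / before * 100

# Median full-resolution time: 27 hours -> 8.4 hours
resolution_change = pct_change(27.0, 8.4)

# Support cost per ticket: $11.80 -> $6.95
cost_change = pct_change(11.80, 6.95)

# SLA attainment is a rate, so the lift is stated in percentage points.
baseline_attainment = 100 - 38          # 38% of tickets missed SLA before
sla_lift_points = 94 - baseline_attainment

print(f"Resolution time: {resolution_change:.1f}%")
print(f"Cost per ticket: {cost_change:.1f}%")
print(f"SLA attainment:  +{sla_lift_points} points")
```

The resolution-time and cost changes round to the 69% and 41% reductions reported, and the 62% baseline is the complement of the 38% miss rate described in the problem statement.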

Timeline

Kickoff to first outcome read: 16 weeks. Taxonomy rebuild and classifier v1 shipped in week 6. Production routing live in week 9. First measurable SLA lift visible in week 11. Full outcome set locked at the week-16 retro. Ongoing optimization runs on a monthly retraining cadence under a lightweight retainer.

Why It Worked

Two reasons. First, we refused to skip the taxonomy work, which is the unglamorous foundation every automation layer sits on top of. Second, we built a hybrid system where deterministic rules handle the easy cases cheaply and the LLM only handles the ambiguous ones, which kept unit economics defensible and made the classifier auditable when leadership asked how decisions were being made.

Composite Testimonial

“We had spent nine months trying to buy our way out of this with better tooling. What we actually needed was someone to do the boring work of figuring out what our tickets were really about. Once that was clean, the automation was almost the easy part.”
— Priya, VP of Customer Success

NDA Framing

Client identity, product category specifics, and internal metrics beyond those disclosed here are protected under a mutual NDA. Deeper detail, including full before/after dashboards and the taxonomy specification, is available in a discovery call once a mutual NDA is signed.

Ready to Fix Your Support Economics?

If your support team is drowning, your CS org is being pulled into firefighting, or you have been sold “AI routing” that sits on top of a broken tag tree, we can help. Book a discovery call or explore our AI & Automation services.
