Case Study — Invoca Call Review Console  ·  12 min read
B2B SaaS AI Product Research

What Should We Build First?

How a focused week of research turned a vague mandate into a product direction — and how a redesigned call review workflow helped grow a $4–6M contact center product to $12–15M ARR.

Company: Invoca
Role: Senior UX Designer
Timeline: 2023–2025 · ~2 years
Platform: Web App · Enterprise SaaS
Invoca Call Review Console

~3x · ARR growth from $4–6M to $12–15M within two years of launch
<6mo · Beta shipped in under six months despite significant organizational headwinds
Core workflow · Redesigned the most-used workflow in the product — the one everything else depended on

A product with customers, but no momentum

In early 2023, the VP of Invoca's contact center product asked a deceptively simple question: What should I build first?

Invoca had a contact center product uncomfortably bolted onto a legacy marketing platform. It was generating only $4–6 million in ARR. It had customers, but it didn't have momentum. That question became the starting point for what would evolve from targeted improvements into a full redesign of the product's core workflow, a shift in how the team worked with engineering, and ultimately a foundation the product could actually grow on.


One week to find the answer

How do you answer a question like that? With research. Not a months-long study — there wasn't appetite for that — but a focused, one-week effort I conceived and pitched to product leadership, looking at the entire contact center product to find where the biggest problems lived.

I started by pulling Pendo session replays to watch how users were actually navigating the product. I talked to the internal CS team — the people fielding customer complaints and workarounds every day. I reviewed existing research materials that had accumulated with minimal synthesis. Within that first week, a clear picture emerged: the Calls Report was the center of gravity. It was the workflow every contact center user touched daily, and it was the one they struggled with the most.

The Pendo data told a stark story. Less than half of customer accounts were even logging in each month. Of those that did, the engagement split was dramatic — a small segment of power users with extremely high event counts, and a long tail of users who barely touched the product. The product's stickiness metric sat at 19.7% DAU/MAU — average for SaaS, but the underlying distribution was unhealthy.

The data also revealed where complexity was creating the biggest barrier. Invoca's planned onboarding was 75 days — the actual average was 92. In 2022, only 11 out of 32 customers completed onboarding within the year. The Calls Report was where users needed to build confidence that the system was working — and it was failing them.

The research caught the attention of Invoca's CEO, Gregg Johnson. He asked to review the project at a high level — we walked him through the Pendo findings, the CS feedback patterns, and the case for the Calls Report as the highest-impact investment. One question he asked stuck with me:

"Can I assume this is more than the typical level of research we conduct before a project?"
Gregg Johnson — CEO, Invoca

Getting buy-in for deeper research was a recurring challenge at Invoca — the culture leaned toward building what users asked for rather than investing time to understand the underlying problem. So we split the research into shorter phases that could each earn continued investment, proving value incrementally rather than asking for trust upfront.


Thirty days of listening

Once we had approval to move forward, my PM partner Erick Montez joined the project. Erick and I would collaborate on call review and adjacent projects for the next year and a half. Jake Rowe and Brittany Choy also joined — supporting research and taking on significant design work. I led UX and collaborated with them regularly to keep the design vision aligned.

We planned a 3–4 week effort: FullStory session analysis, direct customer interviews, competitive design research, internal stakeholder conversations, and a heuristic analysis. We interviewed five customers — eight users in total — ranging from a 9-agent team to 100+ agent organizations, with every conversation focused on their call review workflow and pain points.

Here's what they were doing: contact center QA analysts and sales managers were manually reviewing calls, hunting for coaching opportunities for their agents — looking for bad calls that needed intervention, good calls worth replicating, and anomalies that might signal a larger trend. The coaching was the point. Better call handling meant higher conversion rates. And the pressure was real — some QA reviewers had weekly call review quotas tied directly to their compensation and bonuses. This wasn't optional busywork.

The workflow had two distinct levels. First, find calls of interest within a massive volume — for some customers, millions of calls per month. Then, find moments of interest within a specific call — the fumbled pitch, the brilliantly handled objection. The Calls Report was supposed to make both levels efficient. It wasn't.


Fifty percent confidence in a QA product

Matt Dunn, a QA Manager at Windstream, captured the experience bluntly: "I don't use Invoca directly because it is too complicated and it overwhelms me." He wasn't alone. Across all interviews, a clear hierarchy of problems emerged.

Transcript issues were the #1 pain point. The transcription quality directly undermined everything downstream. If users couldn't trust the transcript, they couldn't trust the signals or scores built on top of it. Signal and scoring validation was next — users couldn't understand why things were scored the way they were, and the workflow forced constant tab-switching between the transcript and the Analysis tab, losing context every time.

Underneath all of it sat the foundational problem: lack of trust in the data. When we asked users directly how much they trusted the product's scoring and transcription, the answers were sobering.

  • 45% · Christus Health — "accurate maybe 45% of the time in some areas"
  • 10→50% · T-Mobile — started at 10%, worked up to 50%, still not comfortable staking decisions on it
  • ~50% · The majority of customers rated their trust at around 50% or below

Without trust in the data, users were manually validating every call before they'd stake their reputation on sharing insights internally or taking action with agents. The automation Invoca offered was being treated as a suggestion, not a source of truth.

Pain points, ranked by number of mentions across interviews.

Twenty-two minutes, but only two per call

I watched hours of real user sessions across eight customer accounts. The average call review session was 22 minutes — but individual calls got just 1 minute 43 seconds each, and 53% got less than a minute. Nobody was listening to full recordings. Users scanned the transcript first and only played short audio snippets if something caught their eye. Some never touched the audio player at all.

Yet these same users were spending entire workdays in the product. One QA Analyst told us she was in it "full eight hours" on days without meetings. A daily-driver tool with a broken daily workflow.

Users also couldn't track which calls they'd already reviewed — the onboarding team had resorted to a workaround, a custom "Call Reviewed = Yes / No" signal. And once a review was finished, users had almost no actions available beyond copying a call ID to match against external systems.


What Kayak knows about finding needles in haystacks

I did substantial competitive research — direct competitors like Observe.ai, but also products outside the industry with structurally similar workflows. Gong's conversation analysis for B2B sales teams. Carvana's vehicle search at scale. Kayak's sophisticated filtering and sorting.

The cross-industry patterns were the most valuable. They showed how mature products had solved the same core problem — helping users find the specific thing they need within a massive body of options. Products like Kayak and Carvana that handled high-volume search well all treated filtering as the primary interaction, not an afterthought bolted onto a data table. That pattern became a foundational principle of the redesign.


We stopped trying to fix it

The VP had initially approved smaller, targeted improvements to the existing Calls Report. That's where we started. But as the deeper research findings accumulated and we began working through solutions, it became clear that the problems weren't surface-level — they were structural. The existing page was a lead weight around our ankles.

I'd been advocating for building something new rather than patching the old system — the research made a compelling case that incremental fixes would inherit every bad pattern that had accumulated over the years. One of the engineering directors helped push the decision over the finish line when he put it directly: sometimes it's easier to build something new than to contend with all the baggage and technical debt. We made the call to build a new flow from the ground up — informed by everything we'd learned, unconstrained by everything that wasn't working.


The instinct at most companies is to answer "what should we build first?" with whatever the loudest stakeholder wants. We answered it with a week of research — and that week changed the trajectory of the product.


Helping users pick the right call, not a random one

We explored multiple directions, made deliberate trade-offs about information density versus simplicity, and prioritized the filter and selection workflows that research had identified as the biggest pain points.

The most important design question we asked: what content do users care about most when deciding whether a call is worth reviewing?

The answer became clear fast: AI summaries. The FullStory data had revealed something important — users were racing through calls at under two minutes each because there was no intermediate layer of information between the call list and the full transcript. They had no way to preview what a call was about before committing to it. AI summaries filled that gap — a quick read on what actually happened before committing to a full review. I pushed for summaries to be the dominant content element in the call list — not tucked behind a click or treated as secondary to metadata. It was the single most valued piece of content in the workflow.

The other critical content was scoring. Automated scorecards already existed in the product, but the old Calls Report buried them. We made scores far more prominent — giving users an at-a-glance quality signal right in the call list, with the ability to override if they disagreed. Human judgment stayed the final authority while AI did the heavy lifting of initial assessment.

Based on the competitive analysis, I advocated strongly for leading with filtering as the primary interaction — matching users' actual mental model rather than forcing them into a table-first layout that research had shown wasn't working.

Finding moments within a call

The call list was only half the workflow. Once a user selected a call to review, the design challenge shifted — now they needed to find moments of interest within a single call, not calls within a list.

FullStory confirmed the transcript was the single most-used feature on the page. We put it at the center and gave users the tools to navigate it efficiently — without forcing them to read it end-to-end. Critically, we showed where signals were triggered directly in the transcript, eliminating the back-and-forth tab-switching the research had flagged as a major workflow breakdown. Users could review AI evaluations, check and override scores, and leave comments on specific moments — creating a feedback trail for other reviewers or the agent's manager.


When research pays off

We brought interactive prototypes back to four customers for concept testing. The reactions were overwhelmingly positive.

Sherita Vance at Christus Health — the same user who had rated her trust at 45% — saw the redesigned call list and said: "I'm loving it, it's interactive." When she clicked into the call detail page, she paused:

"Oh my. Gimme a minute to soak it. Soak it in. This is nice."
Sherita Vance — QA Manager, Christus Health

At BBQ Guys, the group reaction was immediate: "I like it. I really do. I like all the filters. I really like the review status. I like it overall. I'm impressed with it."

Ronnie Brown at T-Mobile — who had started the research phase rating his trust at 10% — summed it up at the end of his session: "Honestly, I love it. I think this is better than what we're doing today for sure. It seems very easy to navigate and to use. I say thumbs up, looking good."

The confusion points that did surface were minor — 75% were related to prototype limitations and placeholder content, not the design concepts themselves. The refinements from testing were mostly specifics, not directional changes. The research foundation was solid.


The hard part wasn't the design

We launched the beta after about six months. Getting there was harder than the design work itself.

Invoca's engineering leadership was accustomed to having significant influence over what got built — and what didn't. Engineering leaders questioned the project repeatedly in the early stages — not the design decisions specifically, but whether this was the right investment at all. Political conversations had to happen between product and engineering leadership to keep it moving forward.

There was also a process challenge. Engineers weren't used to having their implementation reviewed by UX during development. The concept of a "UX review" — checking the built product against the design intent before it shipped — was new and initially met with resistance. I had to sell the process to the team, not just the design.

The engineering manager found the framing that made it click:

"It's like code review, but for UX."
Engineering Manager, Invoca

We weren't just completing a project. We were helping parts of the company evolve into a different way of working — one where research informed decisions, where design had a seat at the table through implementation, and where UX quality was checked before shipping, not after. That cultural shift created real friction. But the precedent it set mattered for everything that came after.


Then we listened again

When we released the redesigned Call Review Console, the response was immediate and overwhelmingly positive. The CS team started hearing praise instead of complaints. But shipping isn't the finish line. We iterated based on real usage patterns.

  • Comment filtering — users wanted to find calls they or colleagues had already annotated
  • "Mark as reviewed" and review status filtering — a simple addition that fundamentally changed how people managed their daily review queue, and eliminated the "Call Reviewed = Yes/No" custom signal workaround the onboarding team had built
  • Helper content and contextual guidance — reducing the learning curve for new users
  • "Last viewed" functionality — so reviewers could pick up where they left off across sessions
  • Usability refinements — the kind of iteration that turns a good product into one people actually enjoy using

Many of these features had been identified in the original research — we just hadn't been able to ship them all at once. The post-launch iteration gave us the chance to deliver on the full vision incrementally.


From question to shipped product

What I owned:

  • Led the initial product-wide research sprint — conceived, pitched to product leadership, and executed
  • Designed and ran the deep-dive research plan: FullStory session analysis across 8 customer accounts (58 individual call reviews analyzed), customer interviews with 5 companies and 8 users, competitive analysis across 3 products plus cross-industry references, heuristic review
  • Synthesized findings into a prioritized problem framework that aligned PM, engineering, and leadership on what to build and why
  • Created the opportunity-solution tree and UX menu of proposed enhancements
  • Designed the Call Review Console from concept through shipped product
  • Conducted concept testing directly with customers
  • Established the UX review process with engineering — a practice that started here and carried forward

What I partnered on:

  • Collaborated with PM Erick Montez across the full research and design cycle
  • Worked with Jake Rowe and Brittany Choy on interview execution and Dovetail synthesis
  • Coordinated with CS, Sales Engineering, and Onboarding teams to gather internal evidence
  • Navigated the product–engineering political dynamics with PM and product leadership

A foundation the product could grow on

The Call Review Console redesign wasn't just a UI facelift. It became the foundation the contact center product needed to grow.

Product growth: The contact center product grew from approximately $4–6M ARR at project start to $12–15M ARR within two years of the redesign launching — roughly 3x growth. The Call Review Console was the core workflow that made the product viable for daily use by QA teams and sales managers. It didn't single-handedly drive that growth, but it removed the biggest barrier to it.

Customer response: Strongly positive reception at launch, with the CS team reporting a notable shift from complaints to praise about the calls workflow.

Process precedent: Established the UX review process with engineering — a practice that started with this project and carried forward into subsequent work.

Organizational influence: Demonstrated to leadership that research-informed design could be done in phases, earn buy-in incrementally, and deliver results — changing the conversation about how product decisions were made.


The work that doesn't fit in a case study

The question that started this project — what should I build first? — is one of the most important questions a product team can answer well. Most companies answer it with whatever users asked for most recently or whatever a sales rep needs to close a deal this quarter. We answered it with a week of research, and that week changed the trajectory of the product.

The hardest part wasn't designing the solution — the research gave us clarity on what to build, and the competitive analysis gave us patterns to draw from. The hardest part was the organizational work. Earning buy-in in phases, navigating the tension between product and engineering, introducing a new process in the middle of a high-stakes project. That's the work that determines whether good design actually ships.

Looking back, this project taught me that the most impactful UX work often isn't the most visually dramatic. The Call Review Console doesn't have a flashy hero image. It's a search-and-filter workflow for reviewing phone calls. But it was the workflow that mattered most to the people who used it every day — and getting it right unlocked everything that came after.
