The Cost of Decision Friction When Exceptions Run the Workflow

The bottleneck in scaling is rarely the main workflow. It is the edge cases that force manual judgment, break consistency, and create compounding operational drag.

Mar 11, 2026

Most scaling companies I talk to have already automated the basics. The “happy path” works, and it usually works well: invoices get generated, orders flow through standard steps, onboarding forms route to the right system, and customer requests are logged, tagged, and queued.

Then growth shows up in the form that process diagrams rarely capture: variability. I keep seeing teams reach a point where the work stops being “follow the steps” and becomes “interpret the situation,” because details are missing, inputs don’t align, and context determines what should happen next.

That is where workflows stall. In my experience it rarely comes down to disorganization; the system simply doesn’t have enough context or decision logic to move forward on its own.


I think of this as decision friction: the cost of pausing a workflow to interpret, reconcile, validate, or approve what should be straightforward, but isn’t in production. Over time, it becomes one of the most expensive sources of operational drag. Why does it get so expensive? Because it compounds quietly. You can see this in delays and repeated work, in results for clients that vary more than they should, and in a growth model that keeps adding headcount because, at the time, it seems cheaper than strengthening the system.

Why happy-path automation stops paying dividends

Traditional automation is great at repeatable steps. Where it tends to struggle is when the work shifts from “execute” to “judge.”

As volume increases, exceptions increase too: not always as a percentage, but as an absolute count, and often in ways that are more complex than the original workflow, especially once multiple systems and teams are involved.

The result is a familiar pattern:

  • The standard path is fast.
  • The exception path creates queues.
  • The exception queue pulls experienced people into constant triage.
  • The organization starts measuring “throughput” while quietly bleeding margin in escalation loops.

In many scaling workflows, roughly 20% of volume can consume 50–80% of the interpretation, reconciliation, and approval effort handled by experienced operators.
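To see what that split implies per case, here is a rough back-of-the-envelope sketch. The function name `per_case_effort_ratio` is made up for illustration, and it assumes effort is spread evenly within each group, which real workflows will not match exactly.

```python
def per_case_effort_ratio(exception_volume_share: float,
                          exception_effort_share: float) -> float:
    """How many times more effort one exception takes than one standard case.

    Assumes effort is spread evenly within each group (a simplification).
    """
    for share in (exception_volume_share, exception_effort_share):
        if not 0 < share < 1:
            raise ValueError("shares must be strictly between 0 and 1")
    effort_per_exception = exception_effort_share / exception_volume_share
    effort_per_standard = (1 - exception_effort_share) / (1 - exception_volume_share)
    return effort_per_exception / effort_per_standard


# 20% of volume consuming 50% and 80% of effort, the range quoted above
low = per_case_effort_ratio(0.20, 0.50)
high = per_case_effort_ratio(0.20, 0.80)
print(f"Each exception costs roughly {low:.0f}x to {high:.0f}x a standard case")
```

Under those assumptions, a single exception costs somewhere between 4 and 16 times what a standard case costs, which is why a modest exception rate can dominate an experienced operator's week.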


It is normal, and appropriate, for managers and directors to make final calls. The part that keeps surprising teams is how early the escalation starts, on decisions that should be consistent, but aren’t, because the workflow doesn’t carry enough context, clear rules, or instrumentation to handle variability.

Leadership rarely becomes the bottleneck by design. It becomes the bottleneck when exceptions turn into the real workflow.

The hidden cost of exceptions

Decision friction rarely makes it onto a roadmap because it doesn’t look like a feature gap. In most orgs it gets filed under “ops being ops,” even when it’s quietly setting the pace of delivery.

But it has real economic consequences:

EXCEPTIONS AT SCALE

  1. Missing context
  2. Unclear decision boundaries
  3. Edge cases

MANUAL DECISION LOOPS

  1. Reconstructing context
  2. Escalations
  3. Rework loops

MARGIN DRAG

  1. Higher unit cost
  2. Delayed revenue
  3. Headcount-driven scaling

When “ops being ops” follows this chain, it starts showing up in the numbers…

  • Higher operational cost per unit of work. Exceptions take longer, require more coordination, and create more touchpoints.
  • Delayed revenue. When fulfillment, onboarding, or case initiation is delayed, revenue is delayed too, and sometimes lost.
  • Inconsistent customer experience. Two customers with the same request get different outcomes based on who handled the exception.
  • Headcount-driven scalability. Work scales linearly with volume because judgment is not embedded in the system.

There is also a human cost. When exceptions dominate the day, teams spend disproportionate time searching for information, recreating context, and duplicating work across systems. The pattern is familiar in most environments I’ve worked in, and APQC’s research puts numbers around it: a surprising amount of time disappears into looking for, recreating, and duplicating information.

Over time, decision friction turns that lost time into a structural tax on growth, because the same ambiguity keeps getting paid for again and again.

Decision friction is not “edge cases,” it is operational reality

A useful way I’ve found to think about decision friction is to stop calling these moments exception handling and start calling them decision points.

A decision point is any moment where the workflow cannot proceed without interpretation, such as:

  • Validating whether an input is “complete enough”
  • Reconciling conflicting data across systems
  • Determining whether a request qualifies under a policy
  • Choosing which path applies based on subtle context
  • Approving or rejecting based on ambiguous criteria

Most organizations have more decision points than they realize because many of them are informal. They live in tribal knowledge, Slack threads, integration contracts that were never written down, or a senior operator’s mental model of “how this really works.”

Over time, that becomes a scaling ceiling. The workflow simply isn’t designed to handle ambiguity without stopping, so uncertainty gets pushed into people and queues.

What changed, and why AI matters specifically for exceptions

AI is relevant here because modern models can handle variable inputs and still produce structured outputs that a workflow can use, especially when you pair them with clear constraints and ownership.

In practice, this enables a different operating model:

  • Repeatable decisions become playbooks (cheap, auditable, consistent).
  • Ambiguous cases use scoped agents (powerful, but deliberately constrained).
  • Humans remain the override (necessary for true judgment, but not the default path).
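The three tiers above can be sketched as a small dispatcher. This is a minimal illustration, not a reference design: the playbook entries, the allowed-action list, the 0.9 confidence floor, and the `stub_agent` stand-in for a model call are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    action: str
    resolved_by: str  # "playbook", "agent", or "human"


# Hypothetical playbook: one fixed rule per known friction type
PLAYBOOKS: dict[str, str] = {
    "missing_po_number": "request_po_from_customer",
    "duplicate_submission": "merge_with_existing",
}

# Hypothetical scope for the agent: what it may decide, and how sure it must be
AGENT_ALLOWED_ACTIONS = {"request_clarification", "reclassify_request"}
AGENT_MIN_CONFIDENCE = 0.9

# An agent takes a case and returns (proposed action, confidence), or None
Agent = Callable[[dict], Optional[tuple[str, float]]]


def resolve(case: dict, agent: Agent) -> Decision:
    # Tier 1: repeatable decisions go through a playbook
    playbook_action = PLAYBOOKS.get(case.get("friction_type", ""))
    if playbook_action is not None:
        return Decision(playbook_action, "playbook")

    # Tier 2: the agent acts only inside its scope and above the confidence floor
    proposal = agent(case)
    if proposal is not None:
        action, confidence = proposal
        if action in AGENT_ALLOWED_ACTIONS and confidence >= AGENT_MIN_CONFIDENCE:
            return Decision(action, "agent")

    # Tier 3: everything else goes to a human owner
    return Decision("escalate_to_owner", "human")


def stub_agent(case: dict) -> Optional[tuple[str, float]]:
    # Stand-in for a model call; it proposes an action outside the agent's scope
    return ("issue_refund", 0.97)


print(resolve({"friction_type": "missing_po_number"}, stub_agent).resolved_by)  # playbook
print(resolve({"friction_type": "rate_dispute"}, stub_agent).resolved_by)       # human
```

Note that the confident but out-of-scope proposal still lands with a human. The constraint sits in the dispatcher rather than in the agent, which is what keeps the agent scoped.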

This is the shift I care about: AI makes it practical to build a decision layer that absorbs variability, so recurring exceptions stop behaving like bespoke projects.


From friction handling to friction intelligence

Many teams treat exceptions as interruptions. A healthier model treats exceptions as signals.

If the same kind of submission keeps failing validation, it’s usually a signal the workflow isn’t set up to handle that variation cleanly. Put simply, friction intelligence means identifying where work slows down, understanding why it happens, and updating the system so the same exception stops consuming time and margin week after week.

The business value tends to be straightforward. Instead of paying for the same reasoning over and over again, you promote that reasoning into the workflow.

A practical decision layer that scales

A scalable decision layer has four properties:

  1. It detects friction events. The system knows when work cannot proceed because something is missing, inconsistent, or unclear.
  2. It triages and proposes the next best action. The workflow does not stop; it branches with intent.
  3. It routes ambiguity to the right owner with full context. Escalation is not a handoff; it is a package.
  4. It learns from outcomes. Every override, correction, and resolution feeds the next iteration of the playbook.
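As a rough sketch of how those four properties might fit together in code, assuming a workflow whose only friction is missing fields: the field names, the owner map, and the `EscalationPackage` shape are all invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional

REQUIRED_FIELDS = ("customer_id", "amount", "due_date")  # hypothetical schema
OWNERS = {"missing_field": "intake_team"}                # hypothetical owner map


@dataclass
class EscalationPackage:
    case_id: str
    owner: str
    reason: str
    proposed_action: str
    context: dict = field(default_factory=dict)


# Counts (reason, final action) pairs where a human overrode the proposal
override_log: Counter = Counter()


def detect_friction(case: dict) -> list[str]:
    """Property 1: name the fields that block progress (None or empty string)."""
    return [name for name in REQUIRED_FIELDS if case.get(name) in (None, "")]


def triage(case_id: str, case: dict) -> Optional[EscalationPackage]:
    missing = detect_friction(case)
    if not missing:
        return None  # nothing blocks the workflow
    # Property 2: propose a next action instead of just stopping
    proposed = f"request {', '.join(missing)} from submitter"
    # Property 3: route to an owner with the context already attached
    return EscalationPackage(
        case_id=case_id,
        owner=OWNERS["missing_field"],
        reason="missing_field",
        proposed_action=proposed,
        context={"missing": missing, "received": case},
    )


def record_outcome(package: EscalationPackage, final_action: str) -> None:
    """Property 4: log overrides so recurring ones can become playbook rules."""
    if final_action != package.proposed_action:
        override_log[(package.reason, final_action)] += 1


package = triage("case-1", {"customer_id": "42", "amount": 100})
print(package.proposed_action)
record_outcome(package, "waive due date")
print(override_log.most_common(1))
```

The reviewer receives the missing fields, the data that did arrive, and a proposed action in one object, and any disagreement with that proposal is counted rather than lost.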

In practice, it looks like a simple flow:

[Figure: decision-layer flow]

This is the part teams underestimate until production forces the issue: they start with agents before they have clarity, telemetry, or decision boundaries, and it creates “agent chaos,” where automation amplifies inconsistency instead of reducing it.

The better sequence is foundations before automation:

  • Define the decision points and owners
  • Instrument the workflow to capture friction signals and outcomes
  • Ship playbooks for repeatable decisions
  • Add scoped agents only where ambiguity is real
  • Design human override and governance from day one

What this looks like in real workflows

Two examples from legal operations make the pattern concrete because they are high volume and exception-heavy.

Example 1: notice intake and job order creation

In many legal workflows, notices arrive in unstructured formats, with missing or inconsistent fields. Job creation depends on manual interpretation of each notice. When volume spikes, queues grow faster than teams can absorb, leading to delayed job starts and missed response windows.

A decision-layer approach looks like this:

  • Parse incoming notices and extract critical fields
  • Auto-create jobs when data is complete
  • Route ambiguous cases to reviewers with pre-filled context
  • Feed every correction back into the playbook so accuracy improves over time

What matters here is the design: resolve what is consistent, route what isn’t, and treat overrides as input for the next iteration.

Example 2: vendor delivery submissions and resubmission loops

Vendor submissions often fail for predictable reasons: wrong format, missing credentials, mismatched rates. Operations teams review manually, flag issues, and wait for resubmission. Resubmission rates drive up support hours and delay delivery.

A scalable pattern is:

  • Validate documents and extract data at intake before human review
  • Resolve common frictions consistently, 24/7, without intervention
  • Escalate only true edge cases that require judgment
  • Convert recurring failures into automated playbooks
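A minimal sketch of the intake check, using the three failure reasons named above. The accepted formats, the field names, and the half-cent rate tolerance are assumptions made for the example, not details of any real vendor system.

```python
from collections import Counter

ALLOWED_FORMATS = {"pdf", "xml"}  # hypothetical accepted formats
RATE_TOLERANCE = 0.005            # hypothetical tolerance for rate comparison


def validate_submission(submission: dict, contracted_rate: float) -> list[str]:
    """Return the reasons a submission would be rejected (empty list = clean)."""
    reasons = []
    if submission.get("format") not in ALLOWED_FORMATS:
        reasons.append("wrong_format")
    if not submission.get("credentials"):
        reasons.append("missing_credentials")
    rate = submission.get("rate")
    if rate is None or abs(rate - contracted_rate) > RATE_TOLERANCE:
        reasons.append("mismatched_rate")
    return reasons


def playbook_candidates(rejections: list[list[str]], min_count: int) -> list[str]:
    """Failure reasons seen at least min_count times, most frequent first."""
    counts = Counter(reason for reasons in rejections for reason in reasons)
    return [reason for reason, n in counts.most_common() if n >= min_count]


# Illustrative submissions, not real data
submissions = [
    {"format": "docx", "credentials": "lic-1", "rate": 120.0},
    {"format": "pdf", "credentials": "", "rate": 120.0},
    {"format": "docx", "credentials": "lic-3", "rate": 125.0},
]
rejections = [validate_submission(s, contracted_rate=120.0) for s in submissions]
print(rejections)
print(playbook_candidates(rejections, min_count=2))  # ['wrong_format']
```

Returning reason codes instead of a pass/fail flag is the design choice that matters here: the same codes can drive the message back to the vendor and the tally of which failures are frequent enough to deserve a playbook.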

The value here is eliminating decision loops that repeat every day, especially the ones that quietly create queues and rework.

Making Sense examples that map directly to decision friction

To keep this grounded, here are two examples I like because they show the same pattern: turning repeated judgment into scalable execution.

Esquire: scaling operations by centralizing data and reducing manual workflows

Esquire Depositions needed to support growth through both organic expansion and acquisitions, without increasing headcount. The work included building a centralized data architecture as a single source of truth and introducing automation to eliminate time-intensive manual workflows. The reported outcomes include a 40% gain in workforce efficiency and a 10% boost in enterprise value, which is exactly what you’d expect when ambiguity and rework stop scaling with volume.

From a decision-friction lens, the takeaway is that scaling required removing ambiguity around data and process, so decisions could be made faster and more consistently across a growing organization.

CCI Puesto de Bolsa: reducing inquiry loops with a self-service platform

At CCI, a large share of operational effort was being consumed by investor inquiries that should not have required manual intervention. When customers couldn’t access the right information in the moment, routine requests turned into exceptions, and those exceptions created loops: back-and-forth, escalations, and slower response times.

Making Sense helped CCI launch a self-service investment platform that gave users direct access to key information and actions, so the workflow could resolve more requests without manual interpretation. The result was a 3× increase in user engagement, 30% investor growth after implementation, and an 80–90% reduction in time spent handling investor inquiries.

How to find decision friction in your organization

If you want to locate the highest-impact opportunities, I wouldn’t start by brainstorming AI use cases. I’d start by mapping where decisions stall work.

These questions typically surface the real friction:

  • Where do workflows slow down because inputs arrive incomplete or inconsistent?
  • Which queues grow fastest during volume spikes?
  • Where do people repeatedly ask, “who can approve this” or “who knows how this works”?
  • Which steps create resubmission cycles, rework loops, or manual reconciliation?
  • Where do customer outcomes vary depending on who handled the case?
  • Which “small” decisions are made hundreds of times per week?

The best targets are rarely the most complex processes. They are the most repeated decision points, especially where ambiguity forces humans to reconstruct context, hunt for the source of truth, or reconcile definitions across systems.
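One simple way to rank candidates is by total weekly time rather than by how hard each decision feels. The sketch below assumes you can estimate frequency and handling time per decision point; the names and numbers are illustrative, not measurements.

```python
from dataclasses import dataclass


@dataclass
class DecisionPoint:
    name: str
    times_per_week: int
    minutes_each: float

    @property
    def hours_per_week(self) -> float:
        return self.times_per_week * self.minutes_each / 60


def rank_by_weekly_cost(points: list[DecisionPoint]) -> list[DecisionPoint]:
    """Most expensive decision points first, by total hours per week."""
    return sorted(points, key=lambda p: p.hours_per_week, reverse=True)


# Illustrative numbers only
points = [
    DecisionPoint("contract exception review", times_per_week=5, minutes_each=90),
    DecisionPoint("is this intake form complete?", times_per_week=400, minutes_each=4),
    DecisionPoint("which rate table applies?", times_per_week=150, minutes_each=6),
]
for point in rank_by_weekly_cost(points):
    print(f"{point.hours_per_week:5.1f} h/week  {point.name}")
```

With these made-up inputs, the four-minute completeness check outranks the 90-minute contract review by a wide margin, which is the point about repeated decisions beating complex ones.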

How to implement without creating new risk

A decision layer should make the system more reliable, not more fragile. Two guardrails matter:

  • Start with playbooks, not agents. Playbooks force clarity, auditability, and measurable outcomes.
  • Keep humans as override by design. The goal is to use human judgment where it adds value, not as workflow glue.

This is also where workflow design and resource design intersect. I keep coming back to the same idea: bottlenecks rarely disappear through one-off fixes. They move, or they come back in a different form, unless you look at the workflow as a system and understand what is constraining it.

That is why I like to anchor AI work in the boring basics: workflow observability, a clear way to label exceptions, and outcome metrics you actually trust. Otherwise you end up automating motion without improving the decision points that set the pace.

Key takeaways

  • Happy-path automation is table stakes. The scaling ceiling lives in exceptions.
  • Decision friction is the compounding cost of stopping workflows for interpretation and judgment.
  • The goal is not “more AI.” The goal is a decision layer that resolves what is consistent and routes ambiguity with context.
  • Treat recurring exceptions as operational signals, then upgrade the system so the same reasoning is not paid for twice.
  • The most valuable AI deployments keep improving after go-live because overrides and outcomes feed the next playbook.

Where to start when exceptions slow the business

If exceptions are quietly driving delays, rework, or inconsistent outcomes, the fastest next step is not a tool evaluation. In my experience, a focused assessment is the most reliable way to map decision points, quantify where friction is accumulating, and identify where playbooks and scoped automation can reduce latency and protect KPIs.

If you want to see what that looks like in practice, the Esquire and CCI case studies are a good starting point, and we can use the same lens to identify the highest-value decision friction in your workflows.

