How to Know Whether an AI Agent Is Working: The Metrics a CFO Can Read Without Help from the Technical Team

The Real Problem Is Not Technical: It Is Visibility

When a company deploys an AI agent, the initial conversation tends to revolve around what the agent can do. Rarely does anyone discuss how to know whether it is doing it well.

The typical result: the technical team has access to logs, traces, and infrastructure metrics. The CFO or COO receives, at best, a monthly report with screenshots. At worst, they receive nothing and assume it is "working" because no one is complaining.

That is not governance. That is faith.

An AI agent is a business process. And like any business process, it must be measurable using indicators that management can read, interpret, and act on — without requiring technical intermediaries.

This article describes the four metrics that make exactly that possible.

Metric 1: Autonomous Resolution Rate

This is the most direct metric. It measures what percentage of tasks assigned to the agent are completed without human intervention.

If the agent handles expense approval requests, for example, the autonomous resolution rate indicates how many of those requests it processes on its own, versus how many require a team member to step in.

A well-calibrated agent should autonomously resolve between 70% and 90% of the cases it was designed for. If that number drops below 60%, there is a problem: either the agent is not properly trained for the variety of cases it actually receives, or the scope defined during implementation does not match what the business is actually asking of it.

This metric requires no technical access. It is a number. It can be placed in a management dashboard alongside the rest of your operational KPIs.
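To make the calculation concrete, here is a minimal sketch in Python. It assumes a hypothetical task log in which each record carries a flag indicating whether a person had to step in; the `TaskRecord` type, its field names, and the intermediate "watch" band between 60% and 70% are illustrative assumptions, not part of any specific agent platform.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    # Hypothetical record shape; real task logs will differ.
    task_id: str
    human_intervened: bool


def autonomous_resolution_rate(tasks):
    """Share of tasks completed without human intervention (0.0 to 1.0)."""
    if not tasks:
        return 0.0
    autonomous = sum(1 for t in tasks if not t.human_intervened)
    return autonomous / len(tasks)


def classify_rate(rate):
    """Map a rate to a dashboard status.

    The 60% and 70% thresholds come from the article; the "watch"
    label for the band in between is an assumption of this sketch.
    """
    if rate < 0.60:
        return "investigate"
    if rate < 0.70:
        return "watch"
    return "healthy"


if __name__ == "__main__":
    # 100 tasks, 22 of which needed a person to step in.
    log = [TaskRecord(str(i), human_intervened=(i < 22)) for i in range(100)]
    rate = autonomous_resolution_rate(log)
    print(f"{rate:.0%} -> {classify_rate(rate)}")
```

The point of the sketch is that the input is a count of tasks and a count of interventions, both of which can come from the business system rather than from infrastructure logs.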

Metric 2: Escalation Rate

This metric complements the previous one. It measures how frequently the agent transfers a task to a human because it cannot — or should not — resolve it on its own.

Escalation is not necessarily a problem. A well-designed agent escalates when appropriate: cases outside its scope, situations requiring judgment, exceptions that were not anticipated. The problem arises when the escalation rate is systematically high or rises without apparent cause.

A sustained increase in the escalation rate is an early warning signal. It may indicate that the volume of atypical cases is growing, that the agent needs adjustment, or that a change in the business process was not reflected in the agent's configuration.

For the CFO, this metric functions as an operational risk indicator. If the escalation rate rises, there is something to review before it becomes a larger problem.
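One simple way to turn "sustained increase" into a rule a dashboard can flag is sketched below. It assumes the escalation rate is recorded once per period (for example, weekly) and treats three consecutive rises as the warning condition; both the function names and the three-period window are assumptions chosen for illustration, and a real deployment may prefer a different rule.

```python
def escalation_rate(escalated, total):
    """Share of tasks handed to a human (0.0 to 1.0)."""
    return escalated / total if total else 0.0


def sustained_increase(rates, periods=3):
    """Return True if the rate rose in each of the last `periods` periods.

    `rates` is a chronological list of per-period escalation rates.
    The default of 3 periods is an assumption, not a standard.
    """
    if len(rates) < periods + 1:
        return False  # not enough history to judge
    recent = rates[-(periods + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


if __name__ == "__main__":
    weekly = [0.20, 0.19, 0.21, 0.23, 0.26]
    if sustained_increase(weekly):
        print("Escalation rate has risen three weeks in a row: review.")
```

A rule like this does not explain why escalations are rising. It only tells management when to ask the question.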

Metric 3: Cycle Time per Task

This metric measures how long the agent takes to complete a task from the moment it receives it to the moment it delivers the result.

It has two uses. The first is comparative: how long did that same process take before the agent? If the invoice reconciliation process took 4 hours with manual intervention and now takes 12 minutes, that is a concrete business data point — not a marketing promise.

The second use is ongoing monitoring. If cycle time begins to increase without any change in workload volume, it may indicate a performance issue, a failing external dependency, or a backlog of unresolved cases.

Cycle time is a metric any COO understands immediately. It requires no technical translation.
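Both uses reduce to simple arithmetic on two timestamps per task. The sketch below assumes each task has a hypothetical "received" and "delivered" timestamp, and it uses the median rather than the mean for monitoring so that a few unusually slow tasks do not distort the figure; that choice is an assumption of this example.

```python
from datetime import datetime
from statistics import median


def cycle_time_minutes(received_at, delivered_at):
    """Minutes elapsed between receiving a task and delivering its result."""
    return (delivered_at - received_at).total_seconds() / 60


def median_cycle_time(tasks):
    """Median cycle time in minutes.

    `tasks` is a non-empty list of (received_at, delivered_at) pairs.
    """
    return median(cycle_time_minutes(r, d) for r, d in tasks)


def speedup_factor(before_minutes, after_minutes):
    """How many times faster the process is with the agent."""
    return before_minutes / after_minutes


if __name__ == "__main__":
    # The article's comparison: 4 hours manually versus 12 minutes.
    print(speedup_factor(4 * 60, 12))  # 20.0
```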

Metric 4: Cost per Task

This is the metric that matters most to a CFO and, paradoxically, the one least often instrumented in agent deployments.

Every time an agent executes a task, it consumes resources: calls to language models, compute time, integrations with external systems. That consumption has a cost. If that cost is not measured per task, it is impossible to calculate the agent's true ROI.

The calculation is not complex. If the agent processes 800 requests per month and the total monthly infrastructure cost of the agent is EUR 400, the cost per task is EUR 0.50. If the same process cost EUR 8 per task with human intervention, the saving is EUR 7.50 per task — or EUR 6,000 per month on that specific process.

Those figures vary depending on the type of process, volume, and complexity. But the calculation mechanism is the same. And it is a calculation the CFO can perform independently, without needing the technical team to translate it.
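The arithmetic from the example above can be written out as a short Python sketch. The function names are illustrative, and the sketch assumes the monthly cost figure already includes every cost attributable to the agent (model calls, compute, integrations).

```python
def cost_per_task(monthly_cost_eur, tasks_per_month):
    """Average cost of one task handled by the agent, in EUR."""
    if tasks_per_month <= 0:
        raise ValueError("tasks_per_month must be positive")
    return monthly_cost_eur / tasks_per_month


def monthly_saving(manual_cost_per_task_eur, agent_cost_per_task_eur, tasks_per_month):
    """Monthly saving in EUR versus handling the same volume manually."""
    return (manual_cost_per_task_eur - agent_cost_per_task_eur) * tasks_per_month


if __name__ == "__main__":
    # Figures from the example: EUR 400 per month, 800 requests,
    # EUR 8 per task when handled manually.
    agent_cost = cost_per_task(400, 800)
    print(agent_cost)                          # 0.5
    print(8 - agent_cost)                      # 7.5 saved per task
    print(monthly_saving(8, agent_cost, 800))  # 6000.0
```

The same three inputs (monthly agent cost, monthly task volume, manual cost per task) fit in a spreadsheet, which is the form most finance teams will actually use.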

A Concrete Example: Procurement at an Industrial Company

An industrial distribution company with 80 employees deployed an agent to manage the purchase order validation process: verifying approval limits, cross-checking against available budget, and routing the request to the appropriate approver.

Before the agent, that process took between 6 and 24 hours depending on team availability. With the agent, the average cycle time dropped to 18 minutes for 78% of cases. The remaining 22% are escalated to a responsible party because they involve exceptions or amounts outside policy.

The monthly cost of the agent, including infrastructure and governance, fell in the range of EUR 300 to EUR 500. The procurement team recovered between 25 and 35 monthly hours previously spent on routine validations.

Those four metrics — resolution rate, escalation rate, cycle time, and cost per task — were available in a dashboard the COO reviewed every week. No conversation with the technical team required.

What to Do If Your Company Does Not Have These Metrics Today

If your company already has agents in production and cannot answer these four questions with concrete data, you have a governance problem — not a technology problem.

If you are evaluating deploying agents and no one has discussed how you will measure their performance, that is a signal that the proposal on the table is incomplete.

Metrics are not an add-on. They are part of the agent's design. An agent without business instrumentation is a process without controls.

If you want to review which metrics make sense to instrument in your case — or assess whether the agents you already have are generating the expected return — request a free diagnostic. It is a short form, with no immediate call required, and we will return a concrete evaluation in under 48 hours.
