The Pattern That Keeps Repeating
A mid-size manufacturing company in Spain invests three months in an AI pilot to automate invoice reconciliation with its ERP. The vendor delivers the agent. The team tests it. It works in 60% of cases. The other 40% generate exceptions that someone has to resolve manually. The pilot is abandoned. The internal conclusion: "AI isn't ready for this."
Twelve months later, another vendor arrives with a similar proposal. The cycle repeats.
The problem was not the technology. It was that no one mapped the real process before building anything.
This pattern is more common than it appears. And it has a structural cause: AI vendors are incentivized to reach the demo quickly, not to ask uncomfortable questions about how the process they are about to touch actually works.
Why the Diagnosis Never Happens
When a vendor walks into a sales meeting, their immediate objective is to demonstrate capability. The demo impresses. The use case seems obvious. The proposal arrives within a week.
What does not happen in that cycle is the most important question: how does this process work today, including everything that is not documented?
It goes unasked because the answer typically reveals three things that complicate the sale:
1. The process has more exceptions than the team remembers. What appears to be a simple rule — "if the invoice matches the purchase order, approve it" — has in practice a dozen variants the team manages tacitly. No one wrote them down. They exist in the heads of two people.
2. Input data quality is not what everyone assumes. PDFs arrive in different formats. ERP fields contain inconsistent values. Dates are recorded three different ways depending on who entered them.
3. The success criterion is undefined. What does it mean for the pilot to "work"? 80% automation? Zero errors? Reduction in team hours? Without that number agreed upon before the start, any result can be interpreted as failure.
A vendor that asks these questions before proposing is, in effect, delaying its own close. That is why most vendors don't ask them.
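The data-quality point in item 2 can be checked cheaply before any proposal is written. The sketch below is a minimal illustration, not a description of any vendor's tooling: it profiles which date formats appear in a sample of ERP values. The list of candidate formats and the sample values are hypothetical; a real check would start from whatever the actual export contains.

```python
from collections import Counter
from datetime import datetime

# Hypothetical candidate formats, tried in order; adjust to what the ERP export shows.
CANDIDATE_FORMATS = ["%d/%m/%Y", "%Y-%m-%d", "%d-%m-%y"]


def detect_format(value: str) -> str:
    """Return the first candidate format that parses the value, or 'unrecognized'."""
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value.strip(), fmt)
            return fmt
        except ValueError:
            continue
    return "unrecognized"


def format_profile(values):
    """Count how many values match each date format."""
    return Counter(detect_format(v) for v in values)


# Illustrative sample only, not real data.
sample = ["03/02/2024", "2024-02-03", "03-02-24", "3 Feb 2024", "15/01/2024"]
profile = format_profile(sample)
for fmt, count in profile.most_common():
    print(f"{fmt}: {count}")
```

If more than one format shows up, or anything lands in "unrecognized", that is a data problem to scope before the agent is designed.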
What Should Happen Before Any Pilot
The pre-pilot diagnosis is not an academic exercise. It is a two-to-four-hour working session with the people who operate the process, aimed at answering four concrete questions:
What is the real process, not the documented one? This means following a transaction from start to finish, including informal steps, manual corrections, and decisions made "by judgment."
Where are the exceptions, and what volume do they represent? If 30% of cases require human intervention for reasons that cannot be codified, that pilot will have an automation ceiling of 70% from day one. Knowing this in advance allows the agent to be designed realistically.
What is the quality of the input data? An agent that processes invoices requires those invoices to be readable, structured, and consistent. If they are not, the problem to solve first is a data problem, not an AI problem.
What is the measurable success criterion? Not "that it works well," but a number: percentage of cases processed without intervention, team hours freed per week, reduction in errors at month-end close.
With those four answers, the pilot design changes entirely. Scope is adjusted. Expectations are aligned. And the success criterion exists before a single line of code is written.
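To make the second and fourth questions concrete, here is a minimal sketch of how a sample of transactions walked through in the session could be turned into an automation ceiling and compared with the agreed target. The `Transaction` record, the `needed_judgment` flag, and the sample itself are assumptions made for illustration; the numbers mirror the 30%/70% example above.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    tx_id: str
    # True if a person had to make a decision that no codifiable rule covers.
    needed_judgment: bool


def automation_ceiling(transactions):
    """Share of sampled transactions that followed a codifiable path."""
    if not transactions:
        raise ValueError("sample is empty")
    codifiable = sum(1 for t in transactions if not t.needed_judgment)
    return codifiable / len(transactions)


def target_is_reachable(ceiling, target_rate):
    """A success criterion above the ceiling cannot be met, however good the agent is."""
    return target_rate <= ceiling


# Illustrative sample: 3 of 10 transactions needed human judgment.
sample = [Transaction(f"T{i}", needed_judgment=(i < 3)) for i in range(10)]
ceiling = automation_ceiling(sample)
print(f"Automation ceiling: {ceiling:.0%}")
print("80% target reachable:", target_is_reachable(ceiling, 0.80))
```

The point of the comparison is ordering: the target is checked against the ceiling before the build, so a criterion the process cannot support is caught in the session rather than in the pilot review.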
An Example with ROI Assumptions
Consider a food distribution company with a five-person administration team that spends between 15 and 20 hours per week reconciling delivery notes, invoices, and purchase orders between its ERP and the spreadsheets the sales team uses.
A three-hour pre-pilot diagnosis reveals that 65% of those transactions are entirely routine and always follow the same pattern. The remaining 35% are exceptions, but more than half of them, about 20% of total volume, follow three specific rules that can be codified.
Expected outcome of a well-designed pilot: automation of 80–85% of total volume. Estimated release of 12 to 16 team hours per week. At a conservative opportunity cost of €25/hour, that represents roughly €15,000 to €20,000 in recovered capacity annually, not counting the reduction in errors at month-end close.
Without the pre-pilot diagnosis, that same pilot would have been built assuming 100% of the process was automatable. The result would have been an agent that fails in 35% of cases, a frustrated team, and an incorrect conclusion about the viability of AI.
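The arithmetic behind this example can be reproduced in a few lines. This is a sketch under stated assumptions: a 52-week year (the example does not say how many weeks it counts) and the codifiable exceptions read as 20 percentage points of total volume, which is the reading consistent with the 80–85% target. The function names are illustrative.

```python
WEEKS_PER_YEAR = 52  # assumption: the example does not state the number of weeks


def automatable_share(routine_pct, codifiable_exception_pct):
    """Upper bound on automation, in percent of total volume."""
    return routine_pct + codifiable_exception_pct


def annual_value_eur(hours_freed_per_week, hourly_cost_eur, weeks=WEEKS_PER_YEAR):
    """Recovered capacity per year, valued at the opportunity cost per hour."""
    return hours_freed_per_week * hourly_cost_eur * weeks


share = automatable_share(65, 20)  # routine + codifiable exceptions
low = annual_value_eur(12, 25)     # lower end: 12 hours/week at 25 EUR/hour
high = annual_value_eur(16, 25)    # upper end: 16 hours/week at 25 EUR/hour

print(f"Automatable share: up to {share}% of volume")
print(f"Recovered capacity: {low} to {high} EUR per year")
```

Under these assumptions the range comes out at 15,600 to 20,800 € per year, in line with the rounded figures in the example. Changing the weeks or the hourly cost shifts the result proportionally, which is why those inputs should be agreed on together with the success criterion.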
What This Means for Your Next Evaluation
If you are considering a second AI pilot — or evaluating proposals for the first — there is one question you can ask before moving forward with any vendor:
What process diagnosis do you conduct before proposing the scope?
If the answer is "the demo covers that" or "we address it during implementation," you have relevant information about how that project is going to end.
A pre-pilot diagnosis does not guarantee pilot success. But its absence almost guarantees that problems which were predictable from the start will surface only during implementation.
The difference between a pilot that gets abandoned and one that reaches production is usually measured in hours of upfront work, not in the sophistication of the technology.
Conclusion
The second pilot fails for the same reasons as the first when no one changes the step that precedes the build. The technology improves. Vendors multiply. But the diagnosis process remains the most frequently skipped link — and the one that most determines the outcome.
If you want to assess whether your next AI use case has the conditions to work in production, the starting point is an honest process diagnosis, not a demo.
Request a free diagnosis with OuroAI. In 15 minutes, we identify whether your case has the conditions for a pilot with a measurable result.