As enterprises rush to put generative AI into production, many are discovering that early wins in pilots don’t translate into scalable, trusted business value. Executives can point to AI experiments and off‑the‑shelf tools in use across the organization, yet struggle to show measurable ROI once initiatives move beyond proof of concept.
In this conversation with ERP Today, Fern Halper, Ph.D., Founder of the AI Foundations Group, VP of Research at TDWI and author of Data Makes the World Go ’Round: The Data, Tech, and Trust Behind AI Success, explains why so many AI programs stall at the point of scale—and what separates organizations that treat AI as a one‑off tool from those that build it as an enterprise capability grounded in data foundations, governance, and operational readiness.
ERP Today: You have extensive experience helping leaders understand AI success, risk, and trust. What do you see as the biggest gap between an organization’s AI ambition and the outcomes they’re actually achieving today?
Dr. Fern Halper: The first question is always, what do you mean by AI? Until a couple of years ago, AI usually meant machine learning, predictive analytics, or natural language processing. Then generative AI arrived, and now most organizations have deployed it in some way and believe it has made them more productive, often through off‑the‑shelf, consumer tools.
The gap I see is between perceived value and true value. The organizations that actually measure value and get ROI are the ones treating AI not just as technology, but as a set of enterprise capabilities. In contrast, many organizations simply buy a generative AI tool to summarize call center notes or help write marketing copy and expect productivity gains. They eventually hit a value ceiling because they haven’t built the capabilities that traditional machine learning adopters had to develop.
In my TDWI research, about 35–45% of organizations have deployed machine learning and predictive analytics, and to do that they needed to be ready in five areas: organizational readiness, data readiness, skills and tools readiness, operational readiness, and governance readiness. Today, many companies say they are “doing AI,” but they haven’t put these foundations in place, so they are not positioned for long‑term success.
If you had to pick one foundational gap, it’s the data foundation. Over 90% of organizations we survey at TDWI say they are using generative AI, but only about 35% are using machine learning. That tells you they’ve started with consumerized tools without building the underlying capabilities. They quickly discover problems because they lack those foundations. Off‑the‑shelf tools can provide value, but they can also create “work slop.” The real value comes from intentionally implementing data foundations, skills, executive sponsorship, and a cultural understanding of what AI will mean. If you think of this as a system of enterprise capabilities, you are much more likely to succeed and measure ROI over the long term.
Q: That’s a powerful distinction between layering AI on top and actually building for ROI. Can you share an example where a company had strong AI ambition, “layered on” AI, and then struggled to deliver outcomes? When did they realize it wasn’t working, and what business impact did they see before they had to step back and reevaluate?
FH: A pattern I see often starts with pressure from the board or executive team to “do something with AI.” In one case, a CEO announced that AI was here and declared that the organization should be AI‑first. He had strong ambitions and solicited use cases, but there was no strategy, no clearly defined business need, no data infrastructure, and no governance.
Within about six months, the company was deploying tools in a kind of shadow AI manner. It didn’t really know which tools were being used and had provided no literacy training, so people had no guidance on guardrails. Somewhere between six months and a year later, they realized they had a problem.
This was a mid‑sized company, and the business impact was essentially that they couldn’t measure any impact. They hadn’t put thought into how they would do this, let alone how they would measure it. Eventually the head of IT said, “We need training and governance,” but it took them time to get there.
More broadly, when I ask organizations about the impact of AI missteps, maybe 10–20% say there was a major mistake with financial consequences, although they rarely give details. Many more report moderate impacts but say they recovered. Often this ties back to data and data quality: they realize they are getting wrong answers or that no one is monitoring the AI output to see if it is getting stale.
I also see many organizations that say they have data quality problems and conclude they cannot move forward with AI because of them. Once they start measuring what the problems are and defining what “good” looks like—which many organizations struggle to do in terms of meaningful metrics—they can make faster progress. I’ve seen large companies, within a year, turn governance into an enabler rather than a barrier, and then really start doing AI in a way they trust. I’ve seen this repeatedly and measured it in surveys: as you put the foundations in place, that’s what leads to measurable value.
Q: You’ve mentioned data quality, lack of strategy, and lack of clarity on what good looks like as factors that prevent pilots from scaling. Is there a repeatable pattern you see when companies roll out an AI pilot and it stalls? Do they recognize it quickly, or only after months of investment?
FH: There are two main issues. First, we see a lot of pilots and experimentation. In my current research on agentic AI, it looks like maybe half the pilots make it to production, although it’s early days. Most activity right now is around single‑agent systems; many organizations are not yet working with multi‑agent systems.
Organizations often think their pilots are succeeding because they are very narrow. But when they try to move beyond the pilot, the data foundation becomes a barrier. In TDWI research, when we ask about obstacles, about a third of organizations say that siloed systems and lack of a unified platform are the top barrier. They realize that their systems are siloed, their data doesn’t line up across systems, and they haven’t tried to unify it. “Customer” still means different things in different systems. Quality, governance, and semantics all become blockers, and that’s when pilots start to stall.
The second big issue is that they haven’t thought through what it will take to operationalize AI in production. They focus on the model, not on how to deploy and run it. Take a popular generative AI use case: analyzing call center notes and trouble tickets. Many organizations feed those notes into a large language model and classify types of problems. That’s useful insight. But they haven’t thought about how to take all of those tickets, day after day, and run them through a production‑grade RAG (retrieval‑augmented generation) system. They don’t have the engineers or ops people for that. They’ve stopped at the model and haven’t designed for production.
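The daily batch flow Halper describes—retrieving relevant context for each ticket and classifying it—could be sketched roughly as below. This is a toy illustration, not any vendor’s implementation: the keyword-overlap retrieval and rule-based classifier are stand-ins for the vector store and LLM call a production RAG system would use behind the same interfaces.

```python
# Minimal sketch of a daily batch RAG-style flow over trouble tickets.
# retrieve_context and classify_ticket are illustrative stand-ins for
# a vector store lookup and an LLM call, respectively.

def retrieve_context(ticket: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Rank knowledge-base snippets by naive word overlap with the ticket."""
    ticket_words = set(ticket.lower().split())
    scored = [(len(ticket_words & set(doc.lower().split())), doc) for doc in knowledge_base]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score > 0][:k]

def classify_ticket(ticket: str, context: list[str]) -> str:
    """Stand-in for an LLM call: tag the ticket using retrieved context."""
    text = (ticket + " " + " ".join(context)).lower()
    if "refund" in text or "billing" in text:
        return "billing"
    if "password" in text or "login" in text:
        return "access"
    return "other"

def run_daily_batch(tickets: list[str], knowledge_base: list[str]) -> dict[str, int]:
    """Classify every ticket in the day's batch and return category counts."""
    counts: dict[str, int] = {}
    for ticket in tickets:
        category = classify_ticket(ticket, retrieve_context(ticket, knowledge_base))
        counts[category] = counts.get(category, 0) + 1
    return counts

kb = ["Refund policy: billing disputes go to finance.",
      "Password resets are handled by the identity team."]
tickets = ["Customer wants a refund", "Cannot login after password change"]
print(run_daily_batch(tickets, kb))  # {'billing': 1, 'access': 1}
```

The point of the sketch is the shape, not the logic: running this "day after day" means someone owns the batch job, the knowledge base refresh, and the failure handling—the operational work Halper notes organizations skip.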
We saw the same thing with machine learning and predictive analytics: organizations thought about models, not about what happens after you build and deploy them. That lack of operational thinking is a key reason pilots stall.
Q: A lot of this comes back to data. From your experience in data analytics and AI, how do data issues directly limit the success of AI rollouts? What actually breaks?
FH: What breaks is the system, not just the model. Data quality always shows up in surveys, but now organizations are also thinking about context. They’re very concerned about semantic consistency, because newer systems need to understand business definitions and context. Semantic consistency, governance, and observability are all critical.
If you have a definition of revenue, the AI system needs to understand whether you mean gross or net revenue, and what you mean by a customer. It needs that business context. This year, organizations are very focused on how to implement semantics as part of the solution, because otherwise generative AI will hallucinate and produce bad results—still “garbage in, garbage out.”
A couple of years ago, the emphasis was on unifying data, often into a data lakehouse, though you’ll never get all data into a single source. Now many organizations are exploring data fabrics to unify data and adding a semantic layer on top to help them scale. Governance becomes an enabler around that data.
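A semantic layer of the kind described can be as simple, conceptually, as a governed mapping from business terms to explicit definitions that an AI system consults before answering. The sketch below is a minimal illustration with invented definitions; real semantic layers live in catalogs, metric stores, or fabric products rather than a Python dictionary.

```python
# Toy semantic layer: business terms mapped to governed definitions, so a
# downstream AI query resolves "revenue" unambiguously. All names and
# definitions here are illustrative assumptions.

SEMANTIC_LAYER = {
    "revenue": {
        "meaning": "net revenue",
        "formula": "gross_sales - returns - discounts",
    },
    "customer": {
        "meaning": "active account",
        "formula": "accounts with >= 1 order in the last 12 months",
    },
}

def resolve_term(term: str) -> dict:
    """Return the governed definition for a business term, or fail loudly."""
    key = term.strip().lower()
    if key not in SEMANTIC_LAYER:
        raise KeyError(f"'{term}' has no governed definition; add one before use")
    return SEMANTIC_LAYER[key]

print(resolve_term("Revenue")["meaning"])  # net revenue
```

Failing loudly on undefined terms is the governance-as-enabler idea in miniature: the system refuses to guess at business context it doesn’t have.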
It’s also not only about structured data. Organizations are more advanced with structured data, but many of the key use cases we’ve discussed—call center notes, trouble tickets—are built on unstructured data. In early results from a survey on the state of agentic AI readiness, we see that everyone is using unstructured data in these applications. If that unstructured data isn’t ready, you have a big problem.
Organizations have invested heavily in measures such as accuracy, completeness, timeliness, and consistency for structured data, but those metrics can mean something different for unstructured data. They’re now considering new metrics for unstructured data, such as document plausibility. Many organizations don’t trust their unstructured data; there is roughly a 20% trust gap between structured and unstructured data. That’s an important part of the data foundation they still need to address.
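What readiness checks for unstructured data look like in practice varies, but a first pass often resembles the sketch below: simple per-corpus scores for completeness, substance, and duplication. The specific metrics and the five-word threshold are illustrative assumptions, not established standards like those for structured data.

```python
# Sketch of simple readiness checks for a corpus of unstructured documents
# (e.g., call center notes). Metric names and thresholds are assumptions.

def doc_quality(docs: list[str], min_words: int = 5) -> dict[str, float]:
    """Score a document corpus on completeness, substance, and uniqueness."""
    total = len(docs)
    non_empty = [d for d in docs if d.strip()]
    substantive = [d for d in non_empty if len(d.split()) >= min_words]
    unique = {d.strip().lower() for d in non_empty}
    return {
        "completeness": len(non_empty) / total if total else 0.0,   # non-blank share
        "substantive": len(substantive) / total if total else 0.0,  # long enough to be useful
        "uniqueness": len(unique) / len(non_empty) if non_empty else 0.0,  # duplicate detection
    }

notes = [
    "Customer reported repeated login failures on the mobile app.",
    "Customer reported repeated login failures on the mobile app.",  # duplicate
    "ok",   # too short to analyze
    "",     # blank record
]
print(doc_quality(notes))
```

Even crude scores like these give teams the "what does good look like" baseline Halper mentions, and a way to watch the structured-versus-unstructured trust gap narrow over time.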
Q: You’ve described how organizations have shifted their approach to AI implementations, especially around unstructured data and wanting AI grounded in their own semantics. How should an organization approach its AI scaling journey so it goes as smoothly as possible, even if there’s no such thing as a perfect rollout?
FH: The starting point is a clearly defined business need and a measurable outcome. You need to think up front about what you want to measure. Success builds on success: when you can show a positive, measurable result, people will want to do more. So start with a business problem and measurable outcomes.
Next, assess your readiness honestly: Is your data ready? Is your governance ready? Do you have the necessary skills? Do you have the right architecture and a path toward a unified, governed data foundation? That foundation doesn’t have to cover all your data; it needs to cover the data relevant to the business problem and the outcome you want to measure.
You then put that governed data foundation in place, pilot your use case, and run it. If it works, you scale it with monitoring, operations, and clear ownership—something many organizations lack. As you put the application into production, you monitor outputs, make sure they are reasonable, and set alerts when there are problems. You can start small, for example with a RAG application over call center notes, measure the outcomes, pilot it, and then scale.
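The monitor-and-alert loop described above can be made concrete with a small sketch: track a rolling window of output checks and fire an alert when the failure rate crosses a threshold. The window size and the 10% threshold are arbitrary assumptions for illustration; the "is this output reasonable?" check itself is whatever validation the use case demands.

```python
# Minimal sketch of production output monitoring for an AI application:
# keep a rolling window of pass/fail checks and alert when the failure
# rate exceeds a threshold. Window size and threshold are assumptions.

from collections import deque

class OutputMonitor:
    def __init__(self, window: int = 100, alert_rate: float = 0.10):
        self.results: deque[bool] = deque(maxlen=window)  # rolling window
        self.alert_rate = alert_rate

    def record(self, output_ok: bool) -> bool:
        """Record one output check; return True if an alert should fire."""
        self.results.append(output_ok)
        failures = self.results.count(False)
        return failures / len(self.results) > self.alert_rate

monitor = OutputMonitor(window=10, alert_rate=0.10)
alerts = [monitor.record(ok) for ok in [True] * 8 + [False, False]]
print(alerts[-1])  # True: 2 failures in a 10-check window exceeds 10%
```

The rolling window matters for the staleness problem Halper raises: a model that was fine at launch shows up here as a slowly rising failure rate, with a named owner on the receiving end of the alert.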
In practice, we see organizations moving from off‑the‑shelf tools, to custom assistants, to RAG, to agents, and then to multi‑agent systems, with value increasing as they integrate more of their own data. The key is to be intentional and disciplined. Organizations that understand this will take effort, and that plan accordingly, are the ones that succeed. Those that expect overnight results generally don’t.
Q: Could you share an example—without naming the company—where an organization moved beyond automation, truly changed how it works, and even how it competes in its space through AI?
FH: One area where I’ve seen significant progress is in contact centers, especially when organizations don’t try to build everything themselves but partner with a provider. More broadly, if you look at ERP and CRM systems, vendors such as SAP are building prebuilt templates that run against your ERP and CRM data. If you trust that data and have prebuilt agents to work with it, that can help you move faster. Other vendors are integrating CRM with additional data sources. SAP’s strong partnership with Snowflake is an example, helping organizations bring sources together and share data without moving it, which is a big advantage.
I spoke with one service provider that helps contact centers transform their operations. They described reducing a four‑week manual analysis of call center notes to four hours. For companies with tens of thousands of agents, that translated into savings on the order of a million dollars a day. It was a workflow transformation story: automating and re‑imagining how call center notes were analyzed. What enabled it was grounding AI in enterprise data, embedding it into workflows, and measuring outcomes.
Looking ahead, I think supply chain is going to be a major area for transformation with generative and multi‑agent systems. Right now, organizations are building agents to request pricing from multiple suppliers or compare bids. I expect many early agentic AI use cases that show real transformation to be in supply chain, because they’re repeatable and still keep humans in the loop.
At the same time, a lot of organizations are using vendor tools in ways that are helpful but not yet truly transformative—it’s part of the journey. Many companies I speak with are focused on getting data foundations and guardrails right and running agentic pilots. Some have let teams “go to town” building co‑pilots and agents, ended up with more than 10,000 agents across the company, and later realized that was not ideal.
When you build agents, you need to think in systems terms so you don’t create duplicates and can design an architecture that makes sense. There is a lot of experimentation, but many organizations are now putting more guardrails in place. They’re concerned about how agents interact and are considering where systems might need to be more deterministic rather than purely probabilistic. Agentic AI is the biggest investment area we see in 2026, but companies are approaching it step by step.
Q: For companies just starting their AI journey while others are already exploring agentic AI, what would you advise them to prioritize so their AI implementation is scalable, sustainable, and delivers lasting business value?
FH: I would come back to the idea of treating AI as an enterprise capability, not a tool. That means prioritizing a trusted, unified data foundation; semantic consistency and business context; governance and control; and integration into real workflows. AI value will come from doing the hard enterprise work well enough that you can trust AI at scale.
It may sound boring, but in following companies over the last decade, that is what leads to maturity and measurable success. You can’t just flip a switch and say, “AI is here now, it’s going to be great,” and expect sustained enterprise value. You have to do the work. That’s not to say off‑the‑shelf tools aren’t useful—they can be—but leaders need to understand the broader picture.
That’s one reason I wrote the book. I saw a lot of self‑proclaimed AI experts who didn’t really understand AI, focusing on “new AI” versus “old AI,” when in reality new AI needs old AI. That has always been my perspective, going back to my time in the labs analyzing customer data. The data foundations support everything.