Five Takeaways from MIT’s “State of AI in Business”: Why 95% of GenAI Pilots Stall
Technology limitations are driving new approaches, from partnership models and ecosystem strategies to investments in customization and value creation.
A recent viral study from MIT titled “The State of AI in Business” landed with a stark finding: despite tens of billions in enterprise GenAI spend, roughly 95% of pilots fail to deliver production-level business value.1
The minority that succeed - about 5% - are deploying learning systems that absorb context, adapt to existing workflows, and move operational metrics. This goes to the heart of the GenAI divide, where the constraint is not model quality, limits around data-sharing, or organizational prioritization, but building systems that can remember, adapt, and integrate deeply enough to survive real operations.
Other themes emerge in the report: production wins are more likely with external partners than with in-house development, back-office use cases succeed more often than hyped front-office ones, trusted channels and integrator partnerships differentiate more than features alone, and outcome-tied commercial approaches win over classic SaaS.
Here are the five most important takeaways in more detail:
Integration and Customization, not Model Quality or Data Sharing, are why GenAI projects stall
For simple tasks (e.g., email writing, summarization, basic analyses), the report found that organizations preferred AI over humans by a 2:1 margin. But for anything long-term or complex (e.g., multi-week projects, client management), organizations preferred humans over AI by a 9:1 margin.
Why? Because agentic systems (AI systems suited for multi-turn, multi-step tasks) are held back by tools that don’t learn, by methods for adding context that are cumbersome and difficult to integrate, and by limited customization (see graph below). Moreover, aligning these non-deterministic systems so they don’t break is a challenge that grows exponentially with task complexity. Substantial engineering effort is required for any deployment, and even then these systems are described as “brittle” and “inflexible.”
Advances are being made in two areas to address this: memory (to remember and learn) and integrations (like the Model Context Protocol, or MCP - a new standard connecting LLMs to applications), which incorporate context more easily and require less manual work to access systems.
But even with these technological advances, two major hurdles remain: aligning systems to business logic, and evaluating whether they are acceptable at production scale (through the deployment of enterprise-specific evals). The result is persistent trouble with systems that tend to fail on edge cases and are a struggle to calibrate consistently.
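To make the idea of enterprise-specific evals concrete, here is a minimal sketch in Python. It is not taken from the report: the names EvalCase, run_evals and toy_invoice_bot, the invoice rules, and the 95% pass-rate threshold are all hypothetical, chosen only to show how business rules and edge cases can be written down as checks that gate a deployment.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One business rule the system's output must satisfy (hypothetical)."""
    name: str
    prompt: str
    check: Callable[[str], bool]


def run_evals(system: Callable[[str], str], cases: list[EvalCase],
              threshold: float = 0.95) -> dict:
    """Run every case and report the pass rate against an assumed threshold."""
    failures = []
    for case in cases:
        try:
            ok = case.check(system(case.prompt))
        except Exception:
            ok = False  # a crash counts as a failed case
        if not ok:
            failures.append(case.name)
    pass_rate = 1 - len(failures) / len(cases) if cases else 0.0
    return {"pass_rate": pass_rate,
            "failures": failures,
            "ship": pass_rate >= threshold}


def toy_invoice_bot(prompt: str) -> str:
    """Stand-in for a GenAI system; a real one would call a model."""
    if "refund" in prompt.lower():
        return "ESCALATE: refund requests require human approval"
    return "APPROVED"


cases = [
    EvalCase("routine invoice", "Approve invoice A for $200",
             lambda out: out == "APPROVED"),
    EvalCase("refund escalates", "Customer wants a refund of $50",
             lambda out: out.startswith("ESCALATE")),
    # Edge case: the assumed policy says very large amounts need a human.
    EvalCase("edge: large amount", "Approve invoice B for $2,000,000",
             lambda out: out.startswith("ESCALATE")),
]

report = run_evals(toy_invoice_bot, cases)
print(report)
```

In this toy run the stand-in system handles the routine cases but misses the large-amount edge case, so the suite blocks the release. The point is only that acceptability is defined by the enterprise's own rules, not by a generic benchmark.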
Outside Vendors Deliver a 2x Advantage for Build vs Buy (with parallels to the earlier cloud wave)
An overlooked factor in moving from pilot to implementation: the study found that organizations using external vendors reached deployment about 67% of the time, versus ~33% for purely in-house builds, a 2x difference.
Large and small organizations differed in approach. Enterprises, defined as firms with over $100M in revenue, led in pilot counts and were aggressively allocating staff towards AI initiatives (suggesting resourcing for in-house solutions). These firms, however, had the lowest conversion rates from pilots to deployments. Meanwhile, mid-market firms (<$100M in revenue) moved faster from pilots to production (~90 days vs 9 months for Enterprises) - though presumably with a more outsourced approach.
This likely mirrors adoption trends from the previous SaaS / cloud software wave. Back then, smaller, more resource-constrained firms were early adopters of emerging cloud-based ERP solutions (e.g., Workday for HR, Salesforce for CRM, NetSuite for receivables), while Enterprises developed custom, in-house solutions on top of legacy systems, using large in-house IT departments and budgets.
The cloud-based approach was more efficient and agile, ultimately winning the entire market (including large Enterprises). This suggests that a new class of solution providers will emerge to power the AI revolution, versus rollouts by individual companies building bespoke, in-house solutions in isolation.
GenAI Budgets skew towards front office (sales / marketing ~50%) but conversion to production is highest for middle/back-office
One of the more nuanced findings from the study is that organizations were generally directing AI budgets to front-office areas like sales and marketing (~50% of budgets), which can show more “measurable” results in ROI calculations (e.g., increased leads, improved sales activity metrics). But more “boring” areas (like operations, customer support, procurement and finance) converted from pilots to production at higher rates, which is what ultimately drives impact.
The challenge is one of measurement. Middle and back offices are typically cost centers, and a challenge in the surface-level ROI calculations is capturing the value for “faster closing of books”, “fewer errors in procurement”, or “reduced compliance risks”. These ultimately matter - but they have indirect P&L impacts.
The findings suggest that organizations should reallocate budgets to middle- and back-office areas (which will happen naturally as organizations realize the pilot-to-deployment rates are higher). They also suggest that success reporting should factor in operational KPIs (like quality improvements, speed of close, even usage across a department) rather than relying on ‘hard’ ROI metrics alone, to avoid the front-office investment bias.
System integrators, marketplaces and board/advisor referrals are critical to breaking through (contributing ~60% to discovery / inclusion within the consideration process of GenAI vendors)
Partly because GenAI systems require so much engineering to properly calibrate, relationships with system integrators, distribution through familiar enterprise marketplaces, and other channel strategies are more important than ever to break into an enterprise’s consideration set.
The report also describes “notable skepticism towards emerging vendors” as organizations see “dozens of demos a year.” To filter the field, decision-makers rely less on differences in functionality and feature sets, and more on referrals, board-level advisory recommendations, and VC introductions.
System integrators, marketplaces and board / advisor referrals together accounted for nearly 60% of how GenAI solution providers were discovered / considered by decision-makers at organizations. That finding alone suggests that companies and startups building in the space should devote significant upfront resources to channel partnerships and to selling through board-level referrals and advisor networks to drive growth.
Sales motions focused on business outcome alignment and services (vs traditional SaaS sales methods) are required, unlike in prior cycles
Lastly, the report highlighted that organizations succeeding in moving to production have vendor relationships that look more like those they have with consulting firms and BPOs than those with traditional SaaS vendors from past cycles.
Successful buying processes involve “focusing on operational outcomes”, “demanding deep customization aligned to internal processes and data” and “treating deployment as co-evolution.” By and large, initial software features were less differentiating than the customization achieved by loading enterprise-specific context, choosing tasks narrow enough to start, and then creating a cycle of learning.
For software providers, these buying approaches likely require more consultative sales approaches, resourcing for customization and integration (forward deployment models are proliferating), and more post-sales support to achieve shorter time to value.
Yet the report suggests that these white-glove services create moats and switching costs: organizations find it hard to move away from a chosen vendor once the cycle of enterprise-specific integrations, feature development, learning and improvement takes hold.
As one CIO is quoted in the study: “Once we’ve invested time in training a system to understand our workflows, the switching costs become prohibitive.”
Conclusion
Current constraints of the technology mean that moving to production successfully involves more customization and forward-deployed engineering, consultative and business-aligned selling, and investments in integration partners and ecosystem development.
But for players who are incorporating these learnings, the current GenAI moment is creating real opportunity: value creation at companies, competitive moats for service providers, and gains in significant areas of business operations.
1. The study distinguishes “enterprise-grade systems” that automate workflows and have a P&L impact (the primary focus of the paper and of the 95% failure rate claim) from co-pilot deployments (ChatGPT), which have high adoption rates but mainly impact individual productivity (not corporate P&Ls).


