Enterprise brands are publicly promising AI-powered, always-on, human-like resolution, while the data shows most AI deployments fall short. Here's what that gap costs you.
The enterprise race to deploy AI customer service has produced a new, largely unexamined form of Promise Drift: brands declare 24/7 AI resolution in their marketing, onboarding flows, and press releases, then deliver chatbots that frustrate the majority of the people who use them. That gap is not a minor operational nuisance. It is a brand-level trust event, and according to the data, it is accelerating in 2025 and 2026.
This post maps the AI customer service over-promise pattern through the Promise Alignment System (PAS) lens, names the specific Drift Zone where the damage accumulates, and describes what responsible AI promise governance actually looks like in practice.
The internal confidence among business leaders is high. According to Twilio's November 2025 Inside the Conversational AI Revolution report, based on a global survey of 457 business leaders and 4,800 consumers across 15 countries, 83% of business leaders believe conversational AI can replace human agents. The customer reality runs in the opposite direction: 64% of customers would prefer companies did not use AI at all, and 50% say they would cancel a service they discovered was solely AI-driven.
The disconnect deepens when you look at actual consumer sentiment. SurveyMonkey's December 2025 study of 2,017 U.S. adults found that 79% of Americans strongly prefer interacting with a human over an AI agent. That same study reported that 56% of people hold negative feelings about companies that use AI as part of their customer experience, and 81% believe AI is used primarily to save money rather than improve service.
On the execution side, the deployment gap is equally stark. S&P Global Market Intelligence's 2025 survey of over 1,000 enterprises across North America and Europe found that 42% of companies abandoned most of their AI initiatives in 2025, up from 17% in 2024. The average organization scrapped 46% of AI proof-of-concepts before they reached production. According to McKinsey's early 2024 State of AI report, 44% of organizations had already experienced at least one negative consequence from generative AI implementation, with inaccuracy cited as the most common harm.
These are not failure rates from experimental edge cases. They are the baseline for enterprise AI in production today.
In the Promise Alignment System, every customer-facing commitment lives inside a Promise Stack, and every place where that commitment can break down is called a Drift Zone. The fifth Drift Zone, AI & Automation, is the newest and, right now, the least governed.
The pattern of drift in this zone is distinct from the others. In the Sales & Marketing Drift Zone, a salesperson over-promises because they are trying to close a deal. In the AI & Automation Drift Zone, the organization itself encodes the over-promise into its product interface, its website copy, and its public announcements. The bot says it can resolve your issue. The press release says it matches human agent satisfaction scores. The marketing page says it is available 24/7 to help with anything. These are Conditional and Supporting layer promises, specific enough to set measurable expectations, broad enough to fail in countless unverifiable ways.
The damage compounds because AI failures happen at the exact moments when customers need the most help. Air Canada's chatbot case is the clearest public example. In February 2024, the British Columbia Civil Resolution Tribunal ruled against Air Canada after its chatbot told a grieving passenger, Jake Moffatt, that he could retroactively apply for a bereavement fare within 90 days of purchase. That was incorrect. When Moffatt tried to claim the discount, Air Canada refused and argued, remarkably, that its chatbot was "a separate legal entity responsible for its own actions." The tribunal rejected that argument and awarded damages. The ruling established a precedent that companies are liable for information their AI tools provide, whether the information comes from a static page or a chatbot.
The Air Canada case illustrates the core risk: the chatbot made a Conditional promise ("you can apply within 90 days") that was inconsistent with the company's actual policy. No governance layer caught the discrepancy before it reached a customer in an emotionally charged moment.
Klarna is the most-cited AI customer service success story, and it is also one of the most instructive examples of promise recalibration. In February 2024, Klarna announced that its OpenAI-powered AI assistant had handled 2.3 million conversations in its first month, two-thirds of all customer service interactions, doing the equivalent work of 700 full-time agents with resolution times dropping from 11 minutes to under 2 minutes.
The announcement was framed as a near-total replacement of human customer service. By May 2025, Klarna reversed course. As CX Dive reported, the company turned back to human customer service representatives after CEO Sebastian Siemiatkowski acknowledged that "cost was a predominant evaluation factor" in organizing support, resulting in "lower quality." Klarna began rehiring human agents and shifting to an explicit human-plus-AI model.
Kate Leggett, VP principal analyst at Forrester, summarized the episode directly: the company "overpivoted to cost containment, without thinking about the longer-term impact of customer experience."
Klarna's numbers were real. The efficiency gains were genuine for routine queries. The over-promise was in how those gains were framed publicly: as proof that AI could wholesale replace human agents for a global financial services platform. That framing set customer expectations the system could not consistently meet for complex or sensitive interactions.
“They overpivoted to cost containment, without thinking about the longer-term impact of customer experience. If you have a poor customer experience, at some point, customers are going to get fed up and leave.” — Kate Leggett, Forrester
Forrester's 2026 B2C Marketing, CX, and Digital Business Predictions name this precisely: in 2026, a third of companies will harm customer experiences with frustrating AI self-service, because cost-cutting pressure is causing organizations to deploy customer-facing AI chatbots and virtual agents prematurely, in contexts where they are unlikely to succeed. The result, Forrester states, will erode brand and customer experience, harming both acquisition and retention.
Separately, Forrester's customer-service-specific 2026 predictions warn that service quality will dip as organizations struggle to scale AI's remit against operational realities. Most organizations, the analysis notes, have eagerly embraced vendor promises that AI will empower customers to self-serve, but scaling AI across customer service functions exposes operational gaps that were invisible when deployments were small.
This is the structural problem: the promise is made at the platform level ("our AI handles everything"), but the failure surfaces at the individual interaction level (the specific customer, the specific query, the specific moment the bot loops or escalates into a dead end). Customers do not experience the aggregate satisfaction score. They experience the single conversation that did not resolve their issue.
According to SurveyMonkey's 2026 customer service statistics page, 41% of consumers feel customer service has worsened due to AI, and 63% do not believe AI could ever replace human beings in customer service roles. These are not anti-technology positions. They are responses to real, accumulated experiences with under-governed AI deployments.
The Gartner April 2026 survey of 782 infrastructure and operations leaders reinforces the execution gap: only 28% of AI use cases fully succeed and meet ROI expectations. Of leaders who reported at least one failure, 57% said their AI initiatives failed because "they expected too much, too fast." Additionally, Gartner predicted in June 2025 that over 40% of agentic AI projects will be canceled by end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. These are not predictions about exotic use cases. Agentic AI in customer service, the category Salesforce Agentforce, Intercom Fin, and Zendesk AI are actively selling, is exactly the terrain where these cancellations are most likely.
The solution is not to avoid AI in customer service. AI handles routine, high-volume, low-complexity queries well. The solution is to govern what you promise about it.
In PAS terms, that means placing AI customer service commitments inside the correct layer of the Promise Stack before you communicate them externally, and then assigning explicit ownership in the AI & Automation Drift Zone.
Here is what that looks like in practice:
**Scope the promise to what the system actually handles.** If your AI resolves password resets, shipping status queries, and return initiation at high confidence, promise exactly that. Do not say "AI that resolves any issue." The Air Canada chatbot failed because it was deployed to answer open-ended policy questions it was not reliably trained on, and no governance layer flagged the mismatch.
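Scoping can be enforced mechanically rather than by policy memo. A minimal sketch, assuming a hypothetical intent classifier that emits an intent label and a confidence score; the intent names and the 0.85 threshold are illustrative assumptions, not part of any vendor's API:

```python
# Illustrative scope enforcement: the bot only answers intents it is
# explicitly approved for, and only above a confidence threshold.
# Everything else escalates to a human by default.

APPROVED_INTENTS = {"password_reset", "shipping_status", "return_initiation"}
CONFIDENCE_THRESHOLD = 0.85  # assumed cutoff; tune against your own data

def route(intent: str, confidence: float) -> str:
    """Return 'bot' only for approved, high-confidence intents; else 'human'."""
    if intent in APPROVED_INTENTS and confidence >= CONFIDENCE_THRESHOLD:
        return "bot"
    return "human"  # anything outside the promised scope escalates

print(route("shipping_status", 0.93))         # in scope, confident -> bot
print(route("bereavement_fare_policy", 0.97)) # out of scope -> human
```

Note the asymmetry by design: high classifier confidence on an unapproved intent still routes to a human, which is exactly the check that would have caught the Air Canada policy answer.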
**Make the human escalation path part of the promise, not an afterthought.** SurveyMonkey's research found that 89% of consumers believe companies should always offer the option to speak with a human. Promising AI-first service without a clear, low-friction human escalation path is not a cost-saving measure. It is a trust-destruction mechanism. Klarna's 2025 correction was not a retreat from AI. It was the addition of a human layer that should have been part of the promise architecture from the start.
**Disclose when customers are interacting with AI.** SurveyMonkey found that 14% of consumers would lose trust in a business that uses AI without disclosing it. Disclosure is not a legal nicety. It is a promise alignment action: it brings the customer's expectation ("I know this is a bot with these limitations") into alignment with the actual delivery.
**Treat AI outputs as a Drift Zone requiring ongoing monitoring.** The Air Canada chatbot gave incorrect information for an undetermined period before the Moffatt case surfaced the problem. Governing the AI & Automation Drift Zone means sampling outputs, tracking escalation rates, and auditing for policy misalignment, continuously, not at launch.
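The monitoring loop can start very small. A minimal sketch of escalation-rate tracking over a rolling window of conversations; the window size and 30% alert threshold are illustrative assumptions that each team would calibrate against its own baseline:

```python
# Illustrative Drift Zone monitoring: track how often the bot escalates
# (or fails to resolve) over a rolling window, and flag when that rate
# breaches an alert threshold implied by the public promise.

from collections import deque

class DriftMonitor:
    def __init__(self, window: int = 1000, escalation_alert: float = 0.30):
        self.outcomes = deque(maxlen=window)  # rolling window of recent conversations
        self.escalation_alert = escalation_alert

    def record(self, escalated: bool) -> None:
        self.outcomes.append(escalated)

    def escalation_rate(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def drifting(self) -> bool:
        """True when the bot escalates more often than the promise implies."""
        return self.escalation_rate() > self.escalation_alert

monitor = DriftMonitor(window=5)
for escalated in [False, True, True, False, True]:
    monitor.record(escalated)

print(monitor.escalation_rate())  # 0.6
print(monitor.drifting())         # True: time to audit outputs and promise copy
```

The point is not the specific metric. It is that the AI & Automation Drift Zone gets a number, an owner, and an alert, the same as any other production system.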
**Run a Promise Stack audit before any AI expansion.** Before Salesforce Agentforce or Zendesk AI handles anything beyond tier-one queries, verify that the commitments in your Supporting and Conditional Promise Stack layers accurately describe what the system will do, and what it will not. The marketing page, the onboarding email, and the bot's own opening message are all promise surfaces.
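At its core, that audit is a diff between what each promise surface claims and what the system actually supports. A minimal sketch, with hypothetical surface names and capability labels standing in for your real promise inventory:

```python
# Illustrative promise-surface audit: compare what each surface (marketing
# page, onboarding email, bot greeting) claims against what the system
# actually supports, and flag any surface promising unsupported capabilities.

SUPPORTED = {"order_status", "returns", "faq"}  # assumed bot capabilities

PROMISE_SURFACES = {
    "marketing_page":  {"order_status", "returns", "faq", "billing_disputes"},
    "onboarding_email": {"order_status", "faq"},
    "bot_greeting":    {"order_status", "returns", "faq"},
}

def audit(surfaces: dict, supported: set) -> dict:
    """Return, per surface, the promised capabilities the system lacks."""
    return {name: sorted(claims - supported)
            for name, claims in surfaces.items() if claims - supported}

print(audit(PROMISE_SURFACES, SUPPORTED))
# Flags marketing_page for promising billing_disputes the bot cannot handle.
```

A clean audit returns an empty dict; anything else is a drift candidate to fix before the expansion ships, not after.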
| Common AI Promise | Governed AI Promise |
|---|---|
| AI that resolves any issue, 24/7 | AI handles order status, returns, and FAQs. Complex issues go to a human within 2 minutes. |
| Human-level satisfaction scores | Satisfaction on par with human agents for tier-one queries |
| No mention of escalation path | You can reach a human anytime by typing 'agent' or calling [number] |
| AI deployed; monitoring TBD | Monthly output audits; escalation rate tracked weekly |
The difference between the left column and the right column is not product capability. It is promise discipline. Most brands already have the AI. What they are missing is the governance layer that keeps external commitments aligned with actual delivery.
Promise Drift in the AI & Automation Drift Zone is not a technology problem. It is a communication and governance problem, which means it is solvable before the first customer complaint, not after the lawsuit.
If you want to map your AI customer service commitments against your actual delivery capabilities, the Promise Alignment System gives you a structured way to do it. Start with the PAS platform to identify where your AI promises are drifting, and what to do about it.