AI Frenzy Fails to Match Cold Reality: Companies Lower Expectations for AI Agents, Full Automation Still Years Away

```

Media reports indicate that AI is changing the way people work through general-purpose chatbots and AI programming tools, driving revenue growth for companies like OpenAI and Microsoft. Various companies have been trying to delegate employee tasks to artificial intelligence agents (AI agents).

However, many enterprises run into difficulties with more complex AI agents, which often "aren't up to the job," forcing AI vendors to step in and troubleshoot alongside clients to keep the AI from "messing things up."

For example, European retailer Fnac faced challenges using AI customer service agents. Fnac had tested models from OpenAI, Google, and other labs, but results were poor. The company’s Chief Digital and Ecommerce Officer, Olivier Theulle, told the media that reliability was a problem: when customers reported product defects, the AI asked for product serial numbers, but then confused these serial numbers with those of other products, even though the numbers differed by only one digit.

Fnac has annual revenues of $10 billion. Theulle said the performance of the AI agent only stabilized after partnering with Israeli company AI21 Labs and receiving assistance from its engineers. AI21 Co-CEO Ori Goshen said,

"The issue is that the models perform very well in benchmark tests straight out of the box, but not in real enterprise environments."

"A significant degree of customization is needed."

Some companies told the media that only after their own software engineers spent months deploying AI agents and received direct technical support from AI providers did they genuinely benefit from them. Nowadays, tech company leaders also say enterprises cannot expect complex AI projects to run smoothly without "hands-on support" from AI vendors.

Venture capitalist Vinod Khosla said in an interview in October,

"It’s like saying ‘we have a race car, anyone can drive it,’ but ordinary people can’t unlock the car’s full performance."

Khosla is an early investor in OpenAI and recently invested in Distyl, an AI consulting startup that dispatches engineers to large organizations such as T-Mobile to help them implement AI. Distyl is just one of many startups emerging in this space, providing high-tech consulting services to enterprises in need of support. AI developers and agent providers such as OpenAI, Anthropic, Salesforce, and Snowflake have also started hiring forward deployed engineers (FDEs) or offering similar consulting services, but this often raises their costs.

Another example is Cox Automotive, which provides software for car dealerships and has annual sales of $9 billion. The company previously developed an AI agent to create marketing websites for dealers. As one of Amazon Web Services’ largest automotive industry clients, Cox received "white-glove service."

Cox’s Chief Product Officer, Marianne Johnson, told the media that engineers from AWS and from Anthropic, which provided the AI technology for the agent, flew to Cox's Atlanta headquarters and worked side by side with Cox’s software developers for several days to build the tool. She declined to disclose how much Cox paid AWS and Anthropic, but estimated that over the coming years the company could save millions in labor costs, since it no longer needs people to manually create websites for customers.

"It speaks nonsense with great confidence"

AI agents are meant to handle customer service issues, manage IT systems, and take on other tasks. AI and cloud service providers are betting on increased enterprise revenue from AI agents, citing this as a reason to invest hundreds of billions of dollars in AI data centers over the next year or two.

But these vendors and some client executives say AI agents are too difficult to configure and often behave unpredictably, which makes them unsuitable for tasks where mistakes can have serious consequences. As a result, clients have lowered their expectations: they no longer expect AI agents to automate large amounts of work, and they are delaying deployment of AI agents in critical roles such as customer support and cybersecurity.

For example, IT service giant Kyndryl began testing Microsoft’s Security Copilot this year—a chatbot designed to connect with enterprise IT systems and explain potential security vulnerabilities in plain English, essentially automating the work of cybersecurity analysts. But Scott Owenby, who is in charge of internal cybersecurity, told the media that when Kyndryl employees asked basic questions like "which company devices are running outdated software," Security Copilot’s answers were clearly wrong. Owenby said,

"It confidently spouts nonsense—and I admire the confidence, but I can't trust its data."

Kyndryl spent around $50,000 testing Security Copilot over six months before deciding to stop using the software. Owenby said,

"I basically burned $50,000. That’s not a lot, and we would’ve kept going if it had been even somewhat useful, but we didn't expect it to be totally unusable."

Owenby added that other AI tools performed better, such as software from Palo Alto Networks, which can automatically handle repetitive cybersecurity tasks like investigating staff logins from new locations or capturing screenshots of sensitive data. This allowed him to cut some staff from the security team over the past year, but he said people still need to monitor these AI tools rather than let AI take full control.

"Some hype involved"

Bosch Power Tools has annual revenue exceeding $5.7 billion. The company’s Head of Digital Customer Experience, Florian Haustein, told the media that for over a year the firm has been testing a chatbot to answer customer questions about using tools and troubleshooting problems.

But Haustein said the chatbot still frequently gives incorrect answers, some of which could even cause user injury. Therefore, the project remains in the pilot stage. He also stated that Bosch is testing models from Google, OpenAI, and other labs.

Haustein told the media that Bosch has had better results with a less ambitious customer service chatbot that only answers basic questions, such as where to buy a particular product, and with an AI tool from SAP that reads customer queries and automatically assigns them to the appropriate human staff. Haustein said,

"I think 'fully automating customer service with AI' is a bit of hype."

"You have to ensure answers are close to 100% accurate… We still see hallucinations and incorrect answers. I don't think we've reached the level of confidence needed for full automation."

Some tech vendors also acknowledge that AI agents are not yet mature. Amazon CEO Andy Jassy said on last Thursday’s earnings call:

"At this stage, building AI agents is still harder than expected."

"But over time, much of the value companies get from AI will be from AI agents."

AI agent product revenue is hard to calculate

Currently, the adoption of general-purpose chatbots, programming assistants, AI search, and AI video generation tools has already helped engineering, marketing, and product management teams to improve efficiency, according to company executives interviewed by the media.

This has driven new revenue growth for AI providers: according to the media’s generative AI database, 20 AI-native startups led by OpenAI and Anthropic have reached annualized revenue of $23 billion from AI for workplace use, compared to virtually zero three years ago.

But calculating revenue specifically from "AI agents" is difficult. At cloud giants like Google, Microsoft, and Amazon, most of the revenue growth comes from server rentals by big AI developers like OpenAI, Anthropic, and Meta, rather than enterprise AI applications.

Among enterprise software companies selling AI agents, results are mixed. Salesforce said earlier this year that its Agentforce product—which automates sales emails, invoice tracking, and other tasks—generates annual revenue of over $100 million. ServiceNow stated that its AI software for automatically handling IT service tickets is expected to achieve $1 billion in revenue by the end of 2026. However, both companies have seen revenue growth slow in recent quarters compared to most of 2023.

SAP has yet to disclose standalone AI product revenue, but CEO Christian Klein said on this month's earnings call that AI will bring "double-digit revenue growth" over the next two years.

Many software companies offering AI agents, including Salesforce, Snowflake, and Xero, aren’t even charging for such products yet. They hope to wait until clients truly recognize their value before introducing fees.

ServiceNow’s President of Global Customer Operations, Paul Fipps, told the media that customers have recently become less excited about piloting AI features and more pragmatic about which tasks AI agents can realistically automate. Fipps said,

"In the past 12 to 18 months, the rapid development of generative AI pushed many clients to trial these AI capabilities, and the pendulum swung to an extreme."

"Now you’re seeing the pendulum swing back."

He remains optimistic that as AI agents improve, enterprises will continue to invest heavily over the coming years.

Currently, AI agents are most successful in software development. AI coding agents are becoming standard for many engineering teams. But software engineers still need to check AI-generated code, since AI makes mistakes, meaning tasks aren’t yet fully automated.

"Stay realistic"

Palo Alto Networks CEO Nikesh Arora said companies selling AI tools must be careful not to overpromise how much work AI can automate. He believes full automation in cybersecurity roles will take years to achieve.

"We’re staying realistic: it will take more effort to achieve full automation. We must be absolutely certain that actions taken by AI are correct, because in cybersecurity mistakes have consequences."

Nevertheless, companies still recognize the benefits brought by AI agents—even if "someone needs to watch them." For example, Canadian company Cirque du Soleil is using an AI agent from SAP to track invoices from costume and stage set suppliers.

When suppliers email asking about invoice status, the AI agent checks SAP to see if the invoice has been processed and drafts a reply email. Previously, the company had two full-time staff doing this; now, both have been transferred to other departments, and only one person needs to review the AI’s draft before sending.

The operating cost of this tool is lower than the salary of a full-time employee, Vice President Philippe Lalumière told the media:

"Sometimes the emails written by AI are not very polite, but suppliers get faster and clearer replies, so overall satisfaction is higher. We haven't made layoffs because of this, but the productivity boost is obvious."

Meanwhile, other AI agent providers are reminding clients to treat these tools as experimental projects rather than immediate-return investments.

Last week, Asha Sharma, Microsoft’s President of Core AI Product Development, said at The Information’s WTF Summit:

"Treat AI agents as part of your R&D budget—a type of investment that will pay off in the next 5 to 10 years."

"I think we are still at a very early stage... We now have millions of AI agents in production, but everyone is still figuring out how to make them truly useful."

```