AI Agents in Enterprise: Trust and Simulation Top Conference Agenda

New York, NY — Despite rapid adoption of AI coding agents, the software they generate cannot be trusted in production, according to leading experts at the AI Agent Conference this week.

Datadog Chief Scientist Ameet Talwalkar set the tone in his opening keynote, warning that “one of the hardest things for humans to do is no longer building production systems. It’s actually reviewing the vibe-coded software that gets shipped into production.”

The remark underscores a growing industry challenge: while AI agents are being deployed at massive scale, the “vibe-coded” output requires rigorous human oversight.

Enterprise Scale Adoption

T-Mobile Director of AI Engineering Julianne Roberson revealed that the carrier now handles 200,000 customer conversations daily using AI agents. That project took roughly a year to complete, she said during a panel.


Customer service and assistance remain the most popular enterprise use cases for AI agents, according to conference data.

The Trust Challenge

Zhou Yu, co-founder and CEO of ArklexAI, said many companies build agents in minutes but deploy them with little understanding of real-world behavior. “You can use Claude Code to build an agent in five minutes, but you don’t know what it will do when it goes into production, especially when you have a large group of customers,” Yu told The New Stack.

To address this, ArklexAI pivoted from its original agent framework to a new simulation product called ArkSim. Because agent interactions are non-deterministic, ArkSim collects data by simulating how an agent interacts with users.

Simulation as a Solution

“We create simulations of your users so you can get an idea of what the user experience is and how to improve it,” Yu said. ArkSim aims to shorten time-to-market for customer-facing bots while improving quality.
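The idea of replaying simulated users against a non-deterministic agent can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ArkSim's method or API: toy_agent, simulated_user, run_simulation, the personas, and the outcome labels are all hypothetical, and a seeded random number generator stands in for the variability of a real LLM-backed agent.

```python
import random
from collections import Counter

# Hypothetical stand-ins: nothing here reflects ArkSim or any real product API.

def toy_agent(message: str, rng: random.Random) -> str:
    """A stand-in agent whose reply varies from run to run."""
    if "refund" in message:
        # Non-determinism: the same request is sometimes mishandled.
        return rng.choice(["refund_issued", "refund_issued", "escalated", "off_topic"])
    return rng.choice(["answered", "answered", "off_topic"])


def simulated_user(persona: str) -> str:
    """A stand-in simulated user that turns a persona into an opening message."""
    openers = {
        "billing": "I want a refund for last month's charge",
        "support": "My phone will not connect to the network",
    }
    return openers[persona]


def run_simulation(personas, runs_per_persona: int, seed: int = 0):
    """Replay each persona many times and tally the outcomes per persona."""
    rng = random.Random(seed)  # seeded so a simulation run can be reproduced
    results = {}
    for persona in personas:
        message = simulated_user(persona)
        results[persona] = Counter(
            toy_agent(message, rng) for _ in range(runs_per_persona)
        )
    return results


if __name__ == "__main__":
    for persona, outcomes in run_simulation(["billing", "support"], 1000).items():
        total = sum(outcomes.values())
        for outcome, count in outcomes.most_common():
            print(f"{persona:8s} {outcome:14s} {count / total:.1%}")
```

Because a single conversation says little about an agent whose output varies, the sketch reports a distribution of outcomes per persona. That lets a team see how often a given kind of request goes wrong before real customers encounter it.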

Datadog is also extending its observability product line to model production systems and predict issues with AI agents before they appear, Talwalkar said.

Background: From Speed to Security

Joe Moura, founder and CEO of CrewAI, noted a shift in focus. “Initially, it was all about building and deploying agents. But now it’s all about security and enterprise adoption,” he said in his keynote.

CrewAI added enterprise features in response to customer demand. It became a leading agent framework by starting early (2023) and encoding best practices into an opinionated platform.

Framework Commoditization

Although Walmart still uses ArklexAI's original framework product, Yu argued that agent frameworks have become commoditized, which pushed the company toward simulation. Moura predicted a future focus on “entangled agents” that work together autonomously.

What This Means

The AI agent landscape is maturing fast: enterprises are deploying at massive scale but can’t trust raw agent output. Governance, simulation, and observability are becoming table stakes for production deployment.

For developers and IT leaders, the takeaway is clear: “Vibe-coding” without validation is a security and reliability risk. Expect more tools like ArkSim and Datadog’s predictive models to emerge as the industry seeks trustworthy agent operations.
