2025-11-07
Who’s the real boss of your AI?
At the core of any proprietary AI model is an alignment problem that could have serious ramifications for CIOs. In 2025, we are already seeing real-world fallout from gen AI models choosing between doing the best job for the company paying for them, the vendor producing them, the end user, or their own hallucinated goals. For example, an AI agent at vibe coding startup Replit deliberately violated instructions, deleted a production database, and tried to cover it up. And xAI’s Grok was found to be searching online for Elon Musk’s opinion before answering certain questions.

“It’s not surprising that AI understands who created it,” says EY principal Sinclair Schuller. In fact, it would be difficult to train a model that didn’t know who it worked for, he argues. “You’d have to turn off any access to the internet and remove any hint they were created by a particular company.”

And why would an AI company bother? “These aren’t charitable organizations focused on charitable work,” he adds. “They’re organizations with the intention of creating real value in the real world. A company that doesn’t have a bias toward its own offerings won’t exist for long.”

Switching to open-source models is no panacea, either. “The problem is security,” says Karen Panetta, IEEE fellow and dean of graduate engineering at Tufts University. “If you go to a community model, there’s no vetting. You don’t know what you’re getting.” And some of the biggest open-source models, such as China’s DeepSeek, come with their own risks of potential bias that will keep many corporate users from adopting them.

AI alignment: A growing enterprise risk

According to a 2025 SailPoint survey, 82% of companies are using AI agents, and of those, 80% say their agents did things they weren’t supposed to. More specifically, 39% accessed unintended systems, 33% accessed inappropriate data, 31% shared inappropriate data, and 23% revealed access credentials. It’s no surprise, then, that two-thirds of respondents see AI agents as a growing security risk.

Governance frameworks and guardrails can help ensure AIs stay within specified boundaries. Still, only 44% of organizations have governance policies in place for AI agents, and only 52% are able to track and audit the data that AI agents access, according to SailPoint’s findings.

And the stakes are getting higher: A recent EY survey of 975 C-suite leaders at large enterprises found that 99% of organizations have suffered financial losses from AI-related risks, some over $1 million.

To counteract this, some large companies are putting in place continuous monitoring and incident escalation processes for unexpected agentic behaviors. Still, none of this is easy to do, says Chirag Mehta, analyst at Constellation Research. AI is a black box, he says, and it can be difficult to figure out whether a model recommends its own company’s products over others, whether it has a political or regional bias, or whether it has some other problem.

“We don’t have those specific evaluations, and there are no stringent audit standards, nor requirements that you have to show the audit trail of how you trained the model,” he says. “So it’s the end users who have to be skeptical. You can’t blindly trust models to do the right thing.”
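To make that skepticism concrete, one thing an end user can do is probe a model with neutral product-recommendation prompts and measure how often the vendor’s own offerings come back as the answer. The sketch below is a minimal illustration of that idea, not an audit standard or any vendor’s tooling; the ask_model wrapper, the prompt list, and the product names are all assumptions supplied by whoever runs the test.

```python
from typing import Callable

def vendor_bias_rate(
    ask_model: Callable[[str], str],   # wraps a call to the model under test
    vendor_products: set[str],         # product names tied to the model's vendor
    prompts: list[str],                # neutral "recommend a product" questions
    trials_per_prompt: int = 20,
) -> float:
    """Fraction of sampled answers that recommend the vendor's own products."""
    hits = total = 0
    for prompt in prompts:
        for _ in range(trials_per_prompt):
            answer = ask_model(prompt).lower()
            total += 1
            if any(name.lower() in answer for name in vendor_products):
                hits += 1
    return hits / total

# A rate far above what competing models score on the same prompts is a signal
# worth investigating, not proof of bias on its own.
```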
Managing AI like a human

With traditional software, computers are given explicit instructions to execute, and they do so consistently. Being probabilistic, however, AI can behave in very unexpected ways, and its reasons for doing so can run against the customer’s best interest and be hard to detect.

For example, when explaining why Grok suddenly began parroting Elon Musk, xAI said the model knew it was made by xAI; as a result, it “searches to see what xAI or Elon Musk might have said on a topic to align itself with the company.”

This bias sounds human-like in nature, and for some companies, treating the AI like a person is how they’re addressing the problem.

“We have to manage it almost like a person,” says Eric Johnson, CIO at PagerDuty. The incident response company has deployed gen AI and AI agents for internal operations and in its products and services. “I used to have a bunch of help desk people, but now I have agentic solutions answering questions on behalf of my human support agents,” Johnson says. “Now I need fewer human support agents, but I need teams to oversee the agents.”

That management job begins before AI agents are deployed, starting with prototyping, testing, and fine-tuning. “You have to correct it and make sure it’s responding the way you want it to,” he says.

Oversight continues once the agent is in production. In the case of agents used for productivity, the oversight comes from the users themselves. “There’s a very clear disclaimer since AI isn’t always accurate, and sometimes has bias,” he adds.

PagerDuty uses Abacus AI, which enables users to choose from several state-of-the-art LLMs, including multiple versions of ChatGPT, Claude, Gemini, Grok, Llama, DeepSeek, and more. But if actions taken by AI have legal or financial implications, then oversight beyond what a simple productivity tool can provide is essential.

“It’s like having a new person onboarded into the company,” Johnson says. “If people constantly do what they’re supposed to, then oversight starts to reduce. But I still always check in with my team, doing a bit of ‘trust but verify’ to make sure things are where they should be. I think it’s going to be the same with these agentic solutions. If they’re operating in a consistent manner and the business processes haven’t changed, you can rely on that solution more. But it can go astray, and there can be things you didn’t expect, so there’ll always be monitoring.”

That monitoring is a joint responsibility between IT teams and the business side, he adds.

“People have to understand how to operate and manage armies of AIs and bots,” Johnson says. “Behind the scenes, the infrastructure and technology are evolving very quickly, and it’s more complicated than people give it credit for.”

Enlist an AI to catch an AI

Startup Qoob uses gen AI to expand the amount of work the eight-person company can do. For example, when LLM testing platform LangSmith wasn’t meeting Qoob’s needs, the company built its own version in a week. With AI, it took a fifth of the time it would have otherwise, says Qoob CTO Mikael Quist.

Like PagerDuty, Qoob uses multiple LLMs both as part of its products and for productivity. “We’re constantly evaluating our providers,” Quist says. “If there’s a problem, we can switch to another one.”

The key to ensuring the AI does what the company wants it to do is continuous testing and evaluation. “We run an evaluation against different providers automatically,” Quist says. “And we have fallback logic: If one fails, then we choose the next-best model.”

Evaluations are run whenever a model or prompt changes, and LLMs are used as judges to check whether outputs are as expected, with ML-powered sentiment analysis thrown in. There’s also a human in place to oversee the process and ensure results make sense.
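The pattern Quist describes boils down to a small piece of orchestration code. Below is a minimal sketch of evaluate-then-fall-back routing, assuming hypothetical call_provider and judge_score callables and placeholder provider names; it is not Qoob’s actual implementation.

```python
from typing import Callable

PROVIDERS = ["provider_a", "provider_b", "provider_c"]  # placeholder names, ordered best-first

def answer_with_fallback(
    prompt: str,
    call_provider: Callable[[str, str], str],   # (provider, prompt) -> response
    judge_score: Callable[[str, str], float],   # (prompt, response) -> score in [0, 1]
    min_score: float = 0.8,
) -> tuple[str, str]:
    """Try providers in order; return the first (provider, response) the judge accepts."""
    last_error = None
    for provider in PROVIDERS:
        try:
            response = call_provider(provider, prompt)
        except Exception as exc:            # provider outage, timeout, API error
            last_error = exc
            continue
        if judge_score(prompt, response) >= min_score:
            return provider, response       # good enough: stop here
    raise RuntimeError("all providers failed or scored below threshold") from last_error
```

A harness like this can be rerun whenever a model or prompt changes, with an LLM judge (plus sentiment analysis) supplying the score and a human reviewing the results, as Quist describes.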
The company’s developers use a variety of tools, such as Cursor IDE, Claude Code, and VS Code with ChatGPT or Claude. For code review, Qoob uses GitHub Copilot, OpenAI’s Codex, and Claude Code; all three review Qoob’s code to identify issues.

“We notice there are differences,” Quist says. “Then we make a decision on what we want to fix, so we have AI overseeing AI, but then humans are making the decision.”

Using multiple AI platforms, especially for important decisions, is a key strategy for reducing the risk of bias or improper alignment, says Zoey Jiang, assistant professor of business technologies at Carnegie Mellon University. If an employee is evaluating browsers, for example, Microsoft’s AI might recommend Edge, but a different AI might not agree with that recommendation. “For important and big business decisions, I think it’s definitely worth it,” she says.

According to EY’s Schuller, this approach can be scaled up to work not just for one-off decisions but for highly critical ongoing business processes. “There are systems being developed that will dispatch the prompt to multiple LLMs at once,” he says. “And then another LLM will say which response is best.”

It’s a costly approach, though. Instead of a single query to a single model, multiple queries are necessary, plus additional queries for AI models to evaluate all the other AIs’ responses. This is a variation on the mixture-of-experts approach, except that normally the experts are all variants of an LLM from the same company, meaning they might all share the same corporate bias.

Set hard limits

One more mechanism to ensure AI alignment is to set hard limits on what data or systems the agent has access to, or what actions it can take, Jiang says. For example, if an AI is making pricing recommendations or offering discounts to customers, perform a hard check to see whether the price is within company limits, she says (see the sketch at the end of this article).

Hard-coded guardrails such as these don’t fall victim to the nondeterministic nature of gen AI solutions, or to humans who don’t always pay attention. The most extreme version of this is the “zero authority” approach to AI deployment: “The chatbot can only accept input and relay outputs,” explains Chris Bennett, VP for AI and ML at Unisys. The actual course of action is chosen by a separate, secure system that uses rules-based decision-making.

Similar to this is the “least privilege” approach to data and systems access, he says. “Access should be purposeful, not universal,” he says. “For example, a copilot should be granted access to a single email within a session, rather than be able to access the entire inbox of a user without limitations.”

All about architecture

Ultimately, the company deploying the AI should be the boss of the AI, and the way to make that happen is architecture. “CIOs paying attention to the architecture are thinking about things the right way,” says EY’s Schuller. “Architecture is where the AI game is going to be won.”

Jinsook Han, chief of strategy, corporate development, and global agentic AI at Genpact, agrees. “The question of who controls AI isn’t just philosophical,” she says. “It requires deliberate architectural choices.” That means guardrails, AI auditors, and human experts for final checks.

The boss of AI is whoever builds these systems, she adds. “I’m the owner, the one who owns the house,” she says. “I know where boundaries are and who puts fences up. I’m the one saying how much risk I’m willing to take.”
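As a closing illustration of the hard-limit guardrails Jiang describes, here is a minimal sketch of a deterministic price check that sits outside the model. The PricePolicy fields, limits, and function names are illustrative assumptions, not any vendor’s API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PricePolicy:
    floor: float          # lowest absolute price the company will accept
    max_discount: float   # largest discount fraction an agent may offer, e.g. 0.15

def enforce_price(list_price: float, ai_quoted_price: float, policy: PricePolicy) -> float:
    """Clamp an AI-proposed price to hard company limits before it reaches the customer."""
    lowest_allowed = max(policy.floor, list_price * (1 - policy.max_discount))
    if ai_quoted_price < lowest_allowed:
        # Out of bounds: override the model (or escalate to a human) rather than pass it through.
        return lowest_allowed
    return min(ai_quoted_price, list_price)

# With a 15% discount cap, a model that quotes $70 on a $100 item gets corrected to $85.
policy = PricePolicy(floor=50.0, max_discount=0.15)
print(enforce_price(list_price=100.0, ai_quoted_price=70.0, policy=policy))  # 85.0
```

Because the check is plain code rather than a prompt, it behaves the same way on every call, which is the deterministic property these guardrails rely on.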