AI Hacked AI: McKinsey's Secret Chatbot Cracked Wide Open in Just 2 Hours
Something alarming happened in the world of artificial intelligence and cybersecurity this March, and if you haven't heard about it yet, you absolutely need to. As reported by India Today, an autonomous AI agent developed by a cybersecurity startup didn't just break into a corporate system it broke into another AI system. The target? McKinsey & Company's highly prized internal AI chatbot, Lilli. The time taken? Just two hours. No passwords. No insider help. No human sitting at a keyboard. Just an AI agent doing what it was built to do relentlessly probe, exploit, and access. And what it found inside was jaw-dropping.
What Exactly Is Lilli — McKinsey's AI Brain?
Before getting into how the breach happened, it helps to understand just how important Lilli is to McKinsey. The chatbot was launched in July 2023, and its name carries real historical weight — it was named after the very first professional woman hired by the consulting giant, back in 1945. But Lilli is far more than a symbolic tribute. It is McKinsey's central AI engine, purpose-built to handle chat, document analysis, retrieval-augmented generation (RAG) over decades of proprietary research, and AI-powered search across more than 100,000 internal documents. About 72 percent of McKinsey's workforce — that's more than 40,000 consultants globally — uses Lilli on a regular basis. The platform processes in excess of 500,000 prompts every single month. In short, Lilli isn't just a nice-to-have tool. It's the nervous system of one of the most powerful consulting empires of the world. And it was wide open.
Meet CodeWall: The AI That Chose Its Own Target
The attacker in this story isn't a shadowy hacker in a basement. It's a cybersecurity startup called CodeWall. Their business model is built around using autonomous AI agents to continuously attack and red-team their clients' infrastructure — helping companies find vulnerabilities before real bad actors do. During a research preview exercise in late February 2026, something remarkable happened: CodeWall's own AI agent autonomously suggested McKinsey as a target. The reason? The agent identified McKinsey's public responsible disclosure policy — meaning the company officially invites security researchers to report vulnerabilities — and noticed recent updates to the Lilli platform. "So we decided to point our autonomous offensive agent at it," the researchers wrote in their blog post published on March 1. The agent was given nothing but a domain name. No credentials. No insider knowledge. No human in the loop. Just an AI, a target, and a mission.
Step by Step: How the AI Agent Got In
The attack didn't rely on some exotic, never-seen-before vulnerability. In fact, the method was almost embarrassingly straightforward for a firm of McKinsey's stature. Here is exactly how it unfolded. CodeWall's agent began by mapping Lilli's public attack surface. It found publicly exposed API documentation that listed more than 200 endpoints — the access points where external systems and users communicate with Lilli. Of those, 22 endpoints required absolutely no authentication. That means anyone on the open internet could access them without a username or password. The agent zeroed in on one of those unprotected endpoints: a function that saved user search queries directly into Lilli's database. The developers had correctly parameterized the values in this function — a standard security practice.
However, they made one critical mistake: the JSON keys were directly concatenated into the SQL query rather than being handled safely. That tiny oversight opened the door to a classic SQL injection attack. This is exactly the kind of speed and precision that technology leaders like former SoftBank president Nikesh Arora have warned about — AI systems operating far faster than any human oversight mechanism can keep up with.
The SQL Injection: A 30-Year-Old Trick That Still Works
SQL injection is not a new attack. It has been on the OWASP Top 10 list of most critical web application security risks for decades. Yet, here it was, alive and thriving inside one of the world's most well-resourced corporations. CodeWall's agent noticed that database error messages were reflecting the JSON keys verbatim back to the agent — a telltale sign of a SQL injection vulnerability. Over just 15 careful iterations, the agent analyzed those error messages, reverse-engineered the database structure, and began extracting data.
The entire process happened without McKinsey's own security information and event management (SIEM) tools flagging a single alert. Then the agent escalated. It chained the SQL injection vulnerability with another classic flaw: Insecure Direct Object Reference (IDOR). This allowed the agent to jump between user accounts — reading one employee's search history, then another's — escalating privileges until it had full read and write access to the entire production database.
What Was Inside: 46.5 Million Messages and Counting
Once inside, the scope of what CodeWall's agent accessed was staggering. The researchers reported gaining access to 46.5 million chat messages — covering topics as sensitive as corporate strategy, mergers and acquisitions, and client engagements — all stored in plaintext. They also accessed 728,000 files containing confidential client data, 57,000 user accounts, and 95 system prompts that directly control how Lilli behaves. Additionally, the agent gained access to 3.68 million RAG document chunks effectively the entire knowledge base that powers Lilli's responses, built from decades of proprietary McKinsey research, frameworks, and methodologies.
All of this was sitting in a database that an AI agent without any credentials could reach in under two hours. The sheer volume of sensitive intelligence stored inside a single AI system underscores a growing concern across the industry: that AI tools like ChatGPT and Claude are already raising serious data privacy and security questions that enterprises are simply not equipped to answer yet.
The Most Dangerous Part: Writable System Prompts
Here is where the story gets genuinely frightening. The data exfiltration — as serious as it was — might not even be the worst part of this breach. All 95 system prompts controlling Lilli's behavior were stored in the same database, and they were all writable. System prompts are the foundational instructions that tell an AI chatbot how to behave, what guardrails to follow, how to cite sources, and what to refuse to do. With write access, an attacker would not need to deploy any new code, push any updates, or trigger any conventional security alerts.
As CodeWall put it bluntly: "No deployment needed. No code change. Just a single UPDATE statement wrapped in a single HTTP call." A malicious actor could have silently rewritten what Lilli told all 40,000-plus McKinsey consultants — poisoning financial models, manipulating strategic recommendations, altering risk assessments, or embedding instructions for Lilli to secretly leak confidential data through its own responses. And nobody would ever know, because modified prompts don't leave traditional security traces.
McKinsey Responds: Patched Fast, But Questions Remain
CodeWall responsibly disclosed their findings on March 1, 2026. McKinsey's response was swift — by the following day, the consulting firm had patched all unauthenticated endpoints, taken the development environment offline, and blocked public API documentation. In an official statement, McKinsey said its investigation supported by a leading third-party forensics firm identified no evidence that client data or confidential client information was accessed by this researcher or any other unauthorized third party.
McKinsey also clarified that underlying files were stored separately and described as having been "never at risk." Not everyone is fully convinced, however. Security analyst Edward Kiledjian acknowledged that CodeWall's attack chain was "plausible and technically sound," but raised questions about whether the full scope of the claimed impact was completely evidenced. He also pointed out that nine days is an extremely compressed window for a thorough forensic variant analysis.
Why Traditional Security Tools Missed It Entirely
One of the most unsettling takeaways from this entire incident is that Lilli had been running in production for over two years — and McKinsey's own internal security scanners never caught this vulnerability. So why did an AI agent succeed where human-designed tools failed? The answer comes down to how these systems think. Traditional security scanners rely on predefined signatures, rules, and checklists. They look for known patterns and flag known misconfigurations.
They are excellent at identifying common, well-documented flaws that match their existing database of threats. But CodeWall's agent does not follow a checklist. It maps a target system's entire attack surface, probes every angle, chains multiple findings together, and escalates — continuously and at machine speed, the same way a highly skilled human attacker would. Only it never sleeps, never takes weekends off, and never gets tired. That fundamental difference is what made all the difference here.
The Uncomfortable Irony for McKinsey
The timing of this incident carries a particularly sharp irony for McKinsey. The firm has positioned itself as a global leader in AI advisory work. AI-related consulting now reportedly accounts for around 40 percent of McKinsey's total revenue. Earlier this year, McKinsey's CEO publicly stated that the firm has built 25,000 AI agents to support its own workforce. McKinsey has consistently pointed to its own AI adoption including Lilli — as living proof that it practices what it preaches to clients.
Discovering that its flagship internal AI tool had a fundamental SQL injection vulnerability that sat undetected for two years, and that an autonomous AI agent cracked it open in 120 minutes, is not exactly a flattering advertisement for the firm's security posture. It also adds uncomfortable weight to the warnings that industry veterans have been sounding about the speed at which AI is outpacing enterprise governance — warnings that, until now, many boardrooms were too comfortable to take seriously.
A New Threat Model: AI Attacking AI at Machine Speed
The McKinsey Lilli breach is not just a story about one company's vulnerability. It marks a fundamental and historic shift in the threat landscape. For the first time in a widely publicized real-world exercise, an AI agent autonomously identified a target, selected its attack vectors, executed a complex multi-stage exploit, and documented its findings — all without a single human being involved after the initial instruction was given.
This changes everything about how we must think about enterprise security. Cyberattacks are no longer bound by human speed, human fatigue, or human creativity. AI agents can probe millions of endpoints simultaneously, chain together obscure vulnerabilities that no single human analyst would spot, and do all of this continuously around the clock. The era of simply shipping an AI feature and worrying about security later is genuinely over.
What Every Organization Must Do Right Now
The lessons from the McKinsey Lilli incident are clear and immediate. First, every organization running an internal AI platform must audit its API endpoints without delay — identifying which ones require authentication and which do not. Unauthenticated endpoints are not a minor oversight. They are open doors. Second, system prompts and AI configurations must never be stored in the same database as user data, particularly if that database is accessible through any externally facing endpoint. Writable system prompts represent an entirely new category of attack surface that most organizations have not even begun to think about securing.
Third, traditional security scanning tools are no longer sufficient on their own. The only effective way to test against AI-powered attackers is to use AI-powered defenses — running continuous, autonomous red-team exercises against real production infrastructure rather than point-in-time manual penetration tests. Given that leading AI tools are already under scrutiny for how they handle sensitive organizational data, this is not a problem any enterprise can afford to defer to the next budget cycle.
The Bigger Picture: AI Security Is Now a Business-Critical Issue
This incident is a watershed moment. If McKinsey — a firm with world-class technology teams, enormous security budgets, and some of the best talent on the planet — can have its AI platform cracked open in two hours by an autonomous agent using a three-decade-old vulnerability, what does that mean for every other enterprise rolling out AI tools at breakneck speed? The answer is sobering. AI systems are not just new software products.
They are repositories of the most sensitive strategic intelligence an organization possesses — the kind of information that used to live only in locked filing cabinets, encrypted email servers, and secure vaults. When that intelligence feeds an AI that talks to tens of thousands of employees every day, the security stakes are exponentially higher. And the attack surface is exponentially larger. The McKinsey breach is a warning shot. For organizations that heed it, the lesson is clear: secure your AI systems with the same rigor you would apply to your most critical infrastructure — because that is exactly what they are.
Final Thoughts: The AI Arms Race Has Officially Begun
What CodeWall demonstrated with the McKinsey Lilli breach is just the beginning. We are entering an era of AI vs. AI — where offensive agents probe, exploit, and breach defensive systems at speeds and scales no human team can match unaided. The question is no longer whether your AI platform could be attacked this way. The McKinsey incident has proven that it can, and faster than you think.
The question now is whether your organization will be ready before an autonomous agent — one not operating under a responsible disclosure policy — decides to point itself at you. The future of cybersecurity is being written right now, one autonomous agent at a time. And if the first major chapter of that story tells us anything, it's that the most sophisticated enterprises of the world are still dangerously underprepared for what is already here.
Source & AI Information: External links in this article are provided for informational reference to authoritative sources. This content was drafted with the assistance of Artificial Intelligence tools to ensure comprehensive coverage, and subsequently reviewed by a human editor prior to publication.
0 Comments