Inside Microsoft’s New AI ‘Hunter’ That Just Found 16 Windows Security Flaws—Before Hackers Could


Microsoft's MDASH system uses more than 100 specialized AI agents to find exploitable Windows vulnerabilities before they can be discovered by attackers.

REDMOND, Wash. – For years, security researchers have warned that the same AI capabilities helping defenders spot vulnerabilities could one day be turned against them. Now, Microsoft is offering proof that the good guys might still have a head start.

The company quietly unleashed a new AI-powered vulnerability hunting system, codenamed MDASH, and it didn’t take long to prove its worth. In its first major operational run, the system unearthed 16 security flaws in Windows before any attacker could find or exploit them. Four of those were critical remote code execution bugs that could have handed unauthenticated hackers a direct path into enterprise networks.

All 16 were patched in Microsoft’s May 12 Patch Tuesday update. The next day, CEO Satya Nadella posted a rare technical shout-out on X (formerly Twitter), calling the system “a fundamental shift in how we find and fix flaws at AI speed.”

But MDASH isn’t just another automated scanner. It’s something far stranger – and far more effective.

What the heck is MDASH?

MDASH stands for Multi-model Agentic Scanning Harness. Built by Microsoft’s Autonomous Code Security team – which includes several members from Team Atlanta, the group that won the $29.5 million DARPA AI Cyber Challenge – the system doesn’t work like a traditional vulnerability scanner or even a single large language model reviewing code.

Instead, MDASH orchestrates more than 100 specialized AI agents running across a mix of frontier models (like GPT-5.5 and Anthropic’s latest) and smaller, distilled models. Each agent has a single, focused job: some scan for suspicious patterns, others challenge whether a finding is a false positive, and a final stage tries to actually build inputs that prove the bug is exploitable.

Only when that chain completes successfully does a human engineer ever see the result.
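That scan, challenge, and prove chain can be pictured as a simple filter. The Python below is only a sketch under stated assumptions: Microsoft has not published MDASH's interfaces, so `Finding`, `run_pipeline`, and the three toy stages are hypothetical names invented for illustration, and a real system would put model calls where the toy functions sit.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    location: str
    description: str
    proof_input: Optional[bytes] = None  # set only if the proving stage succeeds


# Hypothetical stage signatures; MDASH's real interfaces are not public.
Scanner = Callable[[str], list[Finding]]
Challenger = Callable[[Finding], bool]         # True means "looks like a real bug"
Prover = Callable[[Finding], Optional[bytes]]  # a triggering input, or None


def run_pipeline(source: str, scanners: list[Scanner],
                 challengers: list[Challenger], prover: Prover) -> list[Finding]:
    """Return only the findings that survive every stage."""
    confirmed = []
    for scan in scanners:
        for finding in scan(source):
            # Any challenger can veto a finding as a false positive.
            if not all(challenge(finding) for challenge in challengers):
                continue
            proof = prover(finding)
            if proof is None:
                continue  # exploitability could not be demonstrated
            finding.proof_input = proof
            confirmed.append(finding)
    return confirmed


# Toy stages standing in for model-backed agents.
def toy_scanner(source: str) -> list[Finding]:
    return [Finding(f"line {i + 1}", "call to free()")
            for i, line in enumerate(source.splitlines()) if "free(" in line]


def toy_challenger(finding: Finding) -> bool:
    return finding.location != "line 1"  # pretend line 1 is a known false positive


def toy_prover(finding: Finding) -> Optional[bytes]:
    return b"\x00" if finding.location == "line 3" else None


SAMPLE = "free(a);\nuse(b);\nfree(c);\nfree(d);"
reviewed = run_pipeline(SAMPLE, [toy_scanner], [toy_challenger], toy_prover)
print([f.location for f in reviewed])  # ['line 3']
```

The scanner flags three candidates, but only one reaches the list a human would review. That narrowing is the point of chaining the stages.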

According to internal Microsoft documentation, the design is intentionally model-agnostic. “We can swap the underlying models as newer ones arrive without rebuilding the pipeline,” a company spokesperson explained. In principle, that lets MDASH improve as the underlying models do.
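One common way to get that kind of swappability is to have every agent depend on a narrow interface rather than on a specific model. The sketch below assumes this approach; the `Model` protocol, the two stand-in model classes, and `Agent` are hypothetical and do not describe Microsoft's actual code.

```python
from typing import Protocol


class Model(Protocol):
    """Anything with a complete() method can back an agent."""
    def complete(self, prompt: str) -> str: ...


class FrontierModel:
    # Stand-in for a large hosted model; a real one would call an API.
    def complete(self, prompt: str) -> str:
        return "frontier: " + prompt


class DistilledModel:
    # Stand-in for a smaller, cheaper model.
    def complete(self, prompt: str) -> str:
        return "distilled: " + prompt


class Agent:
    def __init__(self, role: str, model: Model):
        self.role = role
        self.model = model

    def run(self, code: str) -> str:
        # The agent only knows the interface, never the concrete model.
        return self.model.complete(f"[{self.role}] {code}")


agent = Agent("scanner", DistilledModel())
before = agent.run("memcpy(dst, src, n)")

agent.model = FrontierModel()  # swap the model; the agent itself is unchanged
after = agent.run("memcpy(dst, src, n)")

print(before)  # distilled: [scanner] memcpy(dst, src, n)
print(after)   # frontier: [scanner] memcpy(dst, src, n)
```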

The 16 bugs – and two that kept engineers up at night

The vulnerabilities MDASH found are spread across some of Windows’ most sensitive components: the TCP/IP stack, the IKEEXT IPsec service, HTTP.sys, Netlogon, Windows DNS, and even the legacy Telnet client. Ten of them live in kernel mode.

Two of the four critical flaws stand out as genuinely nasty:

  • CVE-2026-33827 lives in tcpip.sys and can be triggered by a single crafted IPv4 packet. An unauthenticated attacker on the same network could send it and gain LocalSystem execution – the highest privilege level in Windows.
  • CVE-2026-33824 is a pre-authentication double-free vulnerability in the IKEEXT service, reachable over UDP port 500 on any machine running RRAS VPN, DirectAccess, or Always-On VPN. Again, LocalSystem access.

Two more critical bugs in Netlogon and the Windows DNS Client each carry CVSS scores of 9.8 (out of 10). All four are remotely exploitable without any credentials.

Microsoft says these weren’t the kind of bugs a standard scanner would surface. The tcpip.sys flaw required reasoning across three concurrent code paths that all free the same object. The IKEEXT issue spanned six separate source files – exactly the kind of multi-file, multi-path analysis where single-model approaches fall apart.
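To see why bugs like this slip past path-by-path review, consider a toy model of the bug class. Nothing here reflects the actual Windows code: `ToyHeap` and the three path names are hypothetical, and running the paths one after another stands in for the concurrent interleaving a real kernel bug would involve.

```python
class ToyHeap:
    """Tracks live handles and complains when one is freed twice."""

    def __init__(self) -> None:
        self._live: set[int] = set()
        self._next = 0

    def alloc(self) -> int:
        self._next += 1
        self._live.add(self._next)
        return self._next

    def free(self, handle: int) -> None:
        if handle not in self._live:
            raise RuntimeError(f"double free of handle {handle}")
        self._live.remove(handle)


# Three hypothetical cleanup paths. Each looks correct in isolation.
def timeout_path(heap: ToyHeap, handle: int) -> None:
    heap.free(handle)


def error_path(heap: ToyHeap, handle: int) -> None:
    heap.free(handle)


def teardown_path(heap: ToyHeap, handle: int) -> None:
    heap.free(handle)


def run(paths) -> str:
    heap = ToyHeap()
    handle = heap.alloc()
    try:
        for path in paths:
            path(heap, handle)
    except RuntimeError as exc:
        return str(exc)
    return "ok"


print(run([timeout_path]))              # ok
print(run([timeout_path, error_path]))  # double free of handle 1
```

Each path passes when checked alone. The defect only appears when an analysis considers which paths can run against the same object.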

How MDASH stacks up against OpenAI and Anthropic

Microsoft has been running MDASH through a gauntlet of public and private benchmarks. On CyberGym – a UC Berkeley benchmark built around 1,507 real-world vulnerability reproduction tasks – MDASH scored 88.45%, putting it at the top of the public leaderboard. For comparison, Anthropic’s Mythos Preview model scored 83.1%, while OpenAI’s GPT-5.5 managed 81.8%.

Even more impressive: In private testing against a never-before-seen Windows driver codebase called StorageDrive, MDASH found all 21 planted vulnerabilities with zero false positives. Against five years of confirmed MSRC (Microsoft Security Response Center) cases in clfs.sys and tcpip.sys, it hit 96% and 100% recall, respectively.
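For readers unfamiliar with the metrics, recall is the share of real bugs a tool finds, and precision is the share of its reports that are real. The short sketch below applies the standard formulas to the StorageDrive figures. The article does not say how many MSRC cases were in the clfs.sys set, so the 24-of-25 split is a hypothetical example of what 96% recall could look like, not reported data.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of real vulnerabilities that were found."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of reported findings that were real."""
    return true_positives / (true_positives + false_positives)


# StorageDrive: all 21 planted bugs found, zero false positives.
storagedrive_recall = recall(true_positives=21, false_negatives=0)
storagedrive_precision = precision(true_positives=21, false_positives=0)

# Hypothetical illustration only: 24 of 25 known cases found gives 96% recall.
example_recall = recall(true_positives=24, false_negatives=1)

print(storagedrive_recall, storagedrive_precision)  # 1.0 1.0
print(f"{example_recall:.0%}")                      # 96%
```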

According to a detailed breakdown on the Microsoft Security Blog, the system is already being integrated into Microsoft’s internal development pipeline – meaning future versions of Windows could ship with far fewer latent bugs than any previous release.

Who gets MDASH? (And when?)

Right now, MDASH is in limited private preview with a small group of enterprise customers. Microsoft says broader availability is expected in the months ahead, though the company has not announced pricing or whether it will become a standalone product or a feature bundled into existing security tools like Microsoft Defender for Endpoint.

The announcement follows similar moves from rivals. Anthropic recently revealed Project Glasswing, an internal AI system for finding flaws in its own infrastructure, while OpenAI’s Daybreak initiative is running behind similarly narrow access gates. All three companies are racing to find exploitable flaws before attackers do – and the gap between AI-powered defense and AI-powered offense is narrowing fast.

The other side of the AI security race

Of course, there is a darker side to this story. While Microsoft is celebrating MDASH’s success, the first known zero-day exploit developed entirely by AI has already been confirmed in the wild.

As first reported by Notebookcheck and later confirmed by Google’s Threat Analysis Group, a planned mass exploitation campaign used an AI-generated exploit to bypass two-factor authentication in a widely used web administration tool. The attackers didn’t write a single line of exploit code themselves – they prompted an AI model to generate it, then refined the output until it worked reliably.

That’s the uncomfortable reality that MDASH is designed to counter. Every day that an AI-powered defense tool sits unused is another day that an AI-powered offense tool could be scraping GitHub, mailing lists, and proprietary codebases for the same flaws.

What this means for Windows users

For now, if you’ve installed the May 12 Patch Tuesday updates, you’re protected against all 16 vulnerabilities MDASH found. Microsoft rates the update as “critical” for Windows Server 2019, 2022, and 2025, as well as Windows 11 versions 23H2 and 24H2. Home users should enable automatic updates if they haven’t already.

But the longer-term implication is harder to ignore. For decades, finding security bugs was a slow, manual, human-driven process. MDASH suggests that era is ending – and the only question is who builds the better hunter.

Additional reporting from Help Net Security and The Hacker News contributed to this article.

